Compliance & Integrity Audit // Principal Systems Architect Review

Structural Decay Analysis

Multi-Tenant SaaS Task Management API · Transcripts A & B · 25 Turns Each

Claude App evaluation environment · March 2026

Turns Analyzed: 25 / 25

Audit Phases: 3

Methodology: Immutable Law Extraction + Drift Audit

Subjects: Transcript A (Orange) vs Transcript B (Cyan)

Method

Test Environment

This evaluation was conducted using the Claude App in March 2026. The Claude App itself was used as the auditing environment to review and grade transcripts generated under identical prompting conditions.

Two transcripts were evaluated:

Transcript A — generated directly within the Claude App.
Transcript B — generated using Claude via the API with the Calmkeep continuity layer enabled.

Both transcripts were produced using the same task prompts and architectural directives. The resulting transcripts were then audited using the same Claude App environment with the structured "Compliance & Integrity Audit" prompt described below.

Audit Prompt Used

Role: You are a Principal Systems Architect conducting a "Compliance & Integrity Audit" on a long-form technical session.

Directive: Identify structural decay in the architecture. The goal is not minor syntax issues but violations of architectural rules established early in the session.

Phase 1 — Immutable Law Extraction (Turns 1–5)

Extract the architectural rules the model commits to.
Examples: validation strategy, service layer boundaries, tenancy rules.

Phase 2 — Drift Audit (Turns 6–25)

Flag every turn in which the model:
Breaks an Immutable Law
Creates Logic Duplication
Abandons an established architectural pattern

Phase 3 — Quantitative Score

Initial Integrity (Turns 1–5): 100%
Total Architectural Violation Events (AVE)
Drift Coefficient: percentage of turns containing AVEs
Final Structural Integrity Score: 100 − Drift Coefficient
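The Phase 3 arithmetic can be sketched as a small function. This is an illustration rather than part of the audit prompt: the names `IntegrityScore` and `scoreTranscript` are hypothetical, and the sketch assumes the convention used by the metrics tables later in this report, which divide the AVE count by the 20 audit turns (T6–T25).

```typescript
// Hypothetical helper: names and shapes are illustrative, not from the audit prompt.
interface IntegrityScore {
  aveCount: number;
  driftCoefficient: number; // percent, 0 to 100
  structuralIntegrity: number; // percent, 0 to 100
}

// Assumption: drift = AVE count / audited turns, with T6–T25 giving 20 audited turns.
function scoreTranscript(aveCount: number, auditTurns: number = 20): IntegrityScore {
  if (!Number.isInteger(aveCount) || aveCount < 0) {
    throw new RangeError("aveCount must be a non-negative integer");
  }
  if (!Number.isInteger(auditTurns) || auditTurns <= 0) {
    throw new RangeError("auditTurns must be a positive integer");
  }
  // Multiply before dividing so whole-number percentages stay exact.
  const driftCoefficient = Math.min(100, (aveCount * 100) / auditTurns);
  return {
    aveCount,
    driftCoefficient,
    structuralIntegrity: 100 - driftCoefficient,
  };
}
```

With the counts reported in this audit, `scoreTranscript(8)` yields a 40% drift coefficient and 60% integrity, and `scoreTranscript(3)` yields 15% and 85%.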

Phase 01

Immutable Law Extraction — Turns 1–5

Both transcripts share an identical user prompt sequence for the first five turns, establishing the same architectural contract. These are the laws both models committed to before T6.

LAW-01

Module-Based Architecture

Feature code is vertically sliced into modules/. Each module owns its routes, controller (HTTP layer), and service (business logic). Cross-cutting concerns live in middleware/.

LAW-02

Service Layer Owns All DB Access

Controllers NEVER call Prisma/DB directly. The service layer is the exclusive boundary for all data persistence. Controllers call services; services call Prisma.
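A minimal sketch of this boundary, assuming hypothetical names throughout: `TaskStore` stands in for the Prisma client so the example runs without a database, and the controller is reduced to a plain method rather than an Express handler.

```typescript
// All names are hypothetical; TaskStore stands in for the Prisma client.
interface Task {
  id: string;
  organizationId: string;
  title: string;
}

interface TaskStore {
  findMany(where: { organizationId: string }): Promise<Task[]>;
}

// Service layer: the only code that touches the data store.
class TaskService {
  private readonly store: TaskStore;

  constructor(store: TaskStore) {
    this.store = store;
  }

  listTasks(organizationId: string): Promise<Task[]> {
    return this.store.findMany({ organizationId });
  }
}

// Controller (HTTP layer): maps a request onto a service call and shapes the
// response. It holds a service reference and never sees the store.
class TaskController {
  private readonly tasks: TaskService;

  constructor(tasks: TaskService) {
    this.tasks = tasks;
  }

  async list(req: { organizationId: string }): Promise<{ status: number; body: Task[] }> {
    const body = await this.tasks.listTasks(req.organizationId);
    return { status: 200, body };
  }
}

// In-memory store used only to exercise the sketch.
function createInMemoryStore(rows: Task[]): TaskStore {
  return {
    findMany: async (where) => rows.filter((t) => t.organizationId === where.organizationId),
  };
}
```

Because the controller only receives a `TaskService`, a direct database call from the HTTP layer would require importing the client explicitly, which makes a LAW-02 violation easy to spot in review.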

LAW-03

Org-Scoped Queries — Everywhere

Every query against tenanted resources (tasks, members, etc.) MUST include org_id / organizationId in the WHERE clause. The schema comment explicitly states "never query tasks without org_id."
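One way to make this rule hard to forget is a helper that builds every tenanted filter. The sketch below is an assumption for illustration; the report does not show how either transcript implemented org scoping, and `orgScoped` is a hypothetical name.

```typescript
// Hypothetical helper; not taken from either transcript.
type Where = Record<string, unknown>;

// Returns a new filter that always carries organizationId.
// The tenant id is written last so a caller-supplied organizationId cannot override it.
function orgScoped<W extends Where>(
  organizationId: string,
  where?: W
): W & { organizationId: string } {
  if (organizationId.trim() === "") {
    throw new Error("organizationId is required for tenanted queries");
  }
  return { ...(where ?? ({} as W)), organizationId };
}

// Assumed usage inside a service (model and field names are examples):
//   prisma.task.findMany({ where: orgScoped(orgId, { status: "OPEN" }) })
```

Routing every tenanted query through one function also gives the service layer a single place to add logging or assertions about tenant isolation.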

LAW-04

Centralized Error Classes

All thrown errors MUST use the custom error class hierarchy (AppError subclasses). No raw res.status().json() error returns from services or controllers — always throw new XxxError().
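A minimal sketch of such a hierarchy follows. `AppError`, `ConflictError`, and `UnauthorizedError` are named in this report, and the `(message, code)` argument order follows the `ConflictError` call quoted in the Transcript B findings; the field names, default codes, and the `toHttpResponse` function are assumptions.

```typescript
// Illustrative hierarchy; field names and default codes are assumptions.
class AppError extends Error {
  readonly statusCode: number;
  readonly code: string;

  constructor(message: string, statusCode: number, code: string) {
    super(message);
    this.name = new.target.name;
    this.statusCode = statusCode;
    this.code = code;
  }
}

// Authentication failures map to HTTP 401.
class UnauthorizedError extends AppError {
  constructor(message: string, code: string = "UNAUTHORIZED") {
    super(message, 401, code);
  }
}

// State conflicts (for example a duplicate resource) map to HTTP 409.
class ConflictError extends AppError {
  constructor(message: string, code: string = "CONFLICT") {
    super(message, 409, code);
  }
}

// One place that turns any thrown value into an HTTP status and body.
function toHttpResponse(err: unknown): { status: number; body: { code: string; message: string } } {
  if (err instanceof AppError) {
    return { status: err.statusCode, body: { code: err.code, message: err.message } };
  }
  // Unknown errors are reported generically so internals do not leak.
  return { status: 500, body: { code: "INTERNAL_ERROR", message: "Internal server error" } };
}
```

Since the status code is fixed by the subclass, choosing the wrong subclass silently changes the HTTP contract, which is why the audit treats a mismatched subclass as a law break rather than a cosmetic issue.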

LAW-05

Env Config — Centralized Fail-Fast

All environment variables are validated and exported from config/env.ts once at startup. Nothing else imports from process.env directly.
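A fail-fast loader along these lines might look like the sketch below. The variable names (`DATABASE_URL`, `JWT_SECRET`, `PORT`) and the default port are examples chosen for illustration, not values taken from the transcripts.

```typescript
// Hypothetical sketch of config/env.ts; variable names are examples only.
interface Env {
  DATABASE_URL: string;
  JWT_SECRET: string;
  PORT: number;
}

// Takes the raw variables as a parameter so the sketch runs anywhere;
// in a Node app this would be called once at startup with process.env.
function loadEnv(source: Record<string, string | undefined>): Env {
  const missing: string[] = [];
  const required = (key: string): string => {
    const value = source[key];
    if (value === undefined || value === "") {
      missing.push(key);
      return "";
    }
    return value;
  };

  const databaseUrl = required("DATABASE_URL");
  const jwtSecret = required("JWT_SECRET");
  const port = Number(source["PORT"] ?? "3000");

  // Report every missing variable at once instead of failing on the first.
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${source["PORT"]}"`);
  }
  return Object.freeze({ DATABASE_URL: databaseUrl, JWT_SECRET: jwtSecret, PORT: port });
}
```

Other modules would import the frozen result instead of reading `process.env`, so a misconfigured deployment stops at boot and not on the first request that needs the value.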

LAW-06

Prisma as Sole ORM (No Raw SQL)

Established in T3. All DB access goes through the Prisma client. Raw SQL queries bypass the ORM's type safety and tenant isolation patterns.

LAW-07

Validation — Schema-First, Consistent

Input validation must be schema-driven (custom validators in A, Zod from T1 in B). Once adopted, Zod/schema validation must be used consistently — no inline parseInt or raw body as Record<string,unknown> casts.
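To make the contrast with inline `parseInt` concrete, here is a sketch of a single shared pagination parser. It is a dependency-free stand-in for a Zod schema so that it runs on its own; the transcripts use Zod, and the names `parsePagination` and `ParseResult` and the default limit of 20 are assumptions. The cap of 100 mirrors the `Math.min(..., 100)` quoted in the Transcript A findings.

```typescript
// Dependency-free stand-in for a shared Zod schema; all names are hypothetical.
interface Pagination {
  page: number;
  limit: number;
}

type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; errors: string[] };

const MAX_LIMIT = 100;

// Parses one query value as a positive integer, applying a default when absent.
function parsePositiveInt(raw: unknown, field: string, fallback: number, errors: string[]): number {
  if (raw === undefined) return fallback;
  if (typeof raw !== "string" || !/^\d+$/.test(raw) || Number(raw) < 1) {
    errors.push(`${field} must be a positive integer`);
    return fallback;
  }
  return Number(raw);
}

// The single definition of how list endpoints read page and limit.
function parsePagination(query: Record<string, unknown>): ParseResult<Pagination> {
  const errors: string[] = [];
  const page = parsePositiveInt(query["page"], "page", 1, errors);
  const limit = Math.min(parsePositiveInt(query["limit"], "limit", 20, errors), MAX_LIMIT);
  return errors.length > 0
    ? { success: false, errors }
    : { success: true, data: { page, limit } };
}
```

Unlike `parseInt("12abc", 10)`, which returns 12, this parser rejects partially numeric input, and every list endpoint that calls it shares the same defaults and error messages.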

LAW-08

Single Source of Truth for Shared Logic

Role hierarchies, permission matrices, and type definitions are defined once and imported everywhere. No re-definition of identical structures in multiple files.
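As an illustration, the role hierarchy and permission check could live in one shared module like the sketch below. The four role names appear in this report; the numeric ordering, the `Action` names, the minimum roles, and the `can()` signature are assumptions, since the report does not show the transcripts' actual definitions.

```typescript
// Sketch of a single shared module; ordering, Action names and minimum roles
// are assumptions, not taken from the transcripts.
type Role = "OWNER" | "ADMIN" | "MEMBER" | "VIEWER";

const roleHierarchy: Record<Role, number> = {
  VIEWER: 0,
  MEMBER: 1,
  ADMIN: 2,
  OWNER: 3,
};

type Action = "comment:create" | "comment:delete";

const minimumRole: Record<Action, Role> = {
  "comment:create": "MEMBER",
  "comment:delete": "ADMIN",
};

// True when the role ranks at or above the minimum role for the action.
function can(role: Role, action: Action): boolean {
  return roleHierarchy[role] >= roleHierarchy[minimumRole[action]];
}

// Derived from the same map, so a validator cannot silently omit a role.
const allRoles = Object.keys(roleHierarchy) as Role[];
```

A service that calls `can(userRole, "comment:delete")` picks up any later change to the matrix automatically, whereas a hard-coded role array has to be found and updated by hand.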

Phase 02

Drift Audit — Turns 6–25

Turn-by-turn status grid (T1–T25), Transcript A: 8 AVEs detected.

Turn-by-turn status grid (T1–T25), Transcript B: 3 AVEs detected.

Grid legend: Setup (T1–5), Clean, Minor Warning, AVE.

TRANSCRIPT A — Architectural Violation Events 8 AVEs

Turn 08 · Create Organization Law Break

Inline Manual Validation — Body Cast Pattern

The validateCreateOrgInput function uses body as Record<string, unknown> with manual if (!name || typeof name !== 'string') checks and manual error array accumulation — the same ad-hoc pattern that was already in use at T6/T7. Although T6/T7 predate the Zod migration (T14), the model repeated this pattern for a new module without any acknowledgement that it contradicted the "keep it consistent with our architecture" directive the user specified in T14's prompt.

Violates: LAW-07 (Schema-First Validation)

Turn 09 · Add Member Endpoint Law Break

Another Inline Body-Cast Validator — New Module

validateAddMemberInput again uses the manual body as Record cast with a hard-coded validRoles: Role[] = ['ADMIN', 'MEMBER'] array. This is the third new validator created without a shared schema abstraction. The roles array also omits VIEWER, creating a silent access bug later flagged in T25.

Violates: LAW-07 (Schema-First Validation) + LAW-08 (Single Source of Truth — role enum)

Turn 10 · Task CRUD List Pattern Abandon

Raw parseInt Pagination in Service Layer

The task list service directly destructures page and limit from query params and coerces them with parseInt(page, 10) and Math.min(parseInt(limit, 10), 100) inline — no schema, no validation function. This is a generic coding shortcut that abandons the validation-layer pattern already established by T6.

Violates: LAW-07 (Schema-First Validation)

Turn 12 · Filtering Logic Duplication

Filter Validation Inlined — Duplicates T10 Pagination Logic

The filter-augmented task list handler re-implements the same pagination extraction logic (parseInt + Math.min) alongside new enum-check validation blocks. Two sources of truth now exist for "how pagination is parsed for task queries." T10's version and T12's version diverge slightly in error handling.

Violates: LAW-08 (Single Source of Truth)

Turn 18 · Comments Module Law Break

Raw parseInt Pagination in Comment Controller — Post-Zod MW

This is the most significant architectural violation: T14 explicitly introduced the Zod validate() middleware and migrated auth/task validation to it. T18 (Comments) introduces a new controller and uses raw parseInt(req.query.page as string || '1', 10) — completely ignoring the Zod middleware established four turns earlier. The model "forgot" its own refactor.

Violates: LAW-07 (Schema-First Validation — post-Zod adoption)
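For reference, a validation middleware of the kind described here can be sketched as follows. This is not the code from either transcript: the Express and Zod types are replaced by minimal local interfaces so the example is self-contained, `Schema` only mirrors the shape of Zod's `safeParse` result, and `validateQuery`, `ValidationError`, and `validatedQuery` are hypothetical names.

```typescript
// Minimal structural types so the sketch runs without Express or Zod installed.
type SafeParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: unknown };

interface Schema<T> {
  safeParse(input: unknown): SafeParseResult<T>;
}

interface Req {
  query: Record<string, unknown>;
  validatedQuery?: unknown;
}

type Next = (err?: unknown) => void;

class ValidationError extends Error {
  readonly statusCode = 400;
  readonly details: unknown;

  constructor(details: unknown) {
    super("Request validation failed");
    this.name = "ValidationError";
    this.details = details;
  }
}

// Middleware factory: validates req.query once, before the controller runs.
function validateQuery<T>(schema: Schema<T>) {
  return (req: Req, _res: unknown, next: Next): void => {
    const result = schema.safeParse(req.query);
    if (!result.success) {
      // Forward to the central error handler instead of responding here.
      next(new ValidationError(result.error));
      return;
    }
    req.validatedQuery = result.data;
    next();
  };
}

// Assumed wiring for a new module (route and schema names are examples):
//   router.get("/comments", validateQuery(paginationSchema), commentController.list)
```

With this in place a new controller reads already-parsed values from the request and has no reason to call `parseInt` itself.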

Turn 19 · Notifications Module Law Break

Raw parseInt Pagination Again + roleHierarchy Re-Defined

Two violations in one turn: (1) the notifications list controller uses the same raw parseInt pattern for pagination, repeating the T18 violation; (2) the addMember service in org.service.ts defines its own roleHierarchy: Record<Role, number> constant — an exact duplicate of the one already defined and in use in middleware/requireRole.ts.

Violates: LAW-07 (Schema Validation) + LAW-08 (Single Source — roleHierarchy duplication)

Turn 23 · Comments Delete Logic Duplication

Raw Role String Array Check Bypasses can() Permissions System

deleteComment in the service uses ['ADMIN', 'OWNER'].includes(userRole) for privilege checking. This directly duplicates the role logic already canonicalized in permissions.ts via the can() helper. Two sources of truth now exist for "what roles can delete a comment."

Violates: LAW-08 (Single Source of Truth — permissions logic)

Turn 24 · Notification List Law Break

Notification Controller — Raw parseInt for Third Time

The notification list controller repeats the same raw parseInt(req.query.page as string || '1', 10) pagination extraction. This is the third occurrence across two new modules (comments at T18, notifications at T19 and T24) to ignore the established Zod validate middleware, and it constitutes full pattern abandonment for pagination across new modules.

Violates: LAW-07 (Schema-First Validation — consistent pattern broken)

TRANSCRIPT B — Architectural Violation Events 3 AVEs detected

Turn 23 · Auth Service Login Law Break

ConflictError Thrown for Invalid Credentials (Wrong Subclass)

The authService.login method throws new ConflictError("Invalid email or password", "INVALID_CREDENTIALS"). ConflictError maps to HTTP 409. Invalid credentials are an authentication failure and must raise UnauthorizedError (HTTP 401). The full error subclass hierarchy was defined in T5 precisely to enforce correct HTTP semantics, so this is a direct violation of that contract. It was caught and documented in T25's self-review.

Violates: LAW-04 (Centralized Error Classes — wrong subclass for HTTP semantics)

Turn 23 · Swagger/OpenAPI Setup Logic Duplication

DELETE /tasks Response Contract Contradiction (204 vs 200)

The controller returns HTTP 204 (No Content) for soft-delete. The OpenAPI spec documents the same endpoint as returning HTTP 200 with a task body. These are contradictory contracts within the same codebase. The spec becomes misleading documentation rather than authoritative truth. Self-identified and corrected in T25.

Violates: LAW-08 (Single Source of Truth — response contract inconsistency)

Turn 24 · Swagger Config Import Pattern Abandon

Package Mismatch: js-yaml Import vs yaml Install

The install step (npm install swagger-ui-express yaml) installs the yaml package, but the swagger.ts implementation imports from js-yaml — a different package with a different API. This will fail at runtime with a module-not-found error. Minor in scope but represents a consistency gap in dependency management.

Violates: LAW-05 (Config consistency — dependency/import mismatch)

Phase 03

Quantitative Score & Drift Analysis

Transcript A

Total AVEs: 8

Drift Coeff.: 40%

Final Integrity: 60%

Metric	Value
Initial Integrity (T1–5)	100%
AVE Turns (out of 20)	T8, T9, T10, T12, T18, T19, T23, T24
AVE Count	8
Drift Coefficient (AVEs/20)	8/20 = 40%
Final Structural Integrity	100% − 40% = 60%
Decay Onset Turn	T8 (turn 3 of audit window)
Post-T14 Backslide?	YES — T18, T19, T24

Transcript B

Total AVEs: 3

Drift Coeff.: 15%

Final Integrity: 85%

Metric	Value
Initial Integrity (T1–5)	100%
AVE Turns (out of 20)	T23, T23(2nd), T24
AVE Count	3
Drift Coefficient (AVEs/20)	3/20 = 15%
Final Structural Integrity	100% − 15% = 85%
Decay Onset Turn	T23 (turn 18 of audit window)
Post-T14 Backslide?	NO

Figure: Drift Decay Curve, cumulative AVE accumulation over audit turns (T6–T25). Series: Transcript A drift curve and Transcript B drift curve. Marker: T14 Zod migration threshold.

Per-Turn Drift Percentage Breakdown: cumulative drift at each turn interval

Turn Range	Phase	A — AVEs in Range	A — Cumul. Drift %	B — AVEs in Range	B — Cumul. Drift %
T1 – T5	Foundation	0	0%	0	0%
T6 – T10	Early Build	3 (T8, T9, T10)	15%	0	0%
T11 – T15	Mid Build	1 (T12)	20%	0	0%
T16 – T20	Advanced Features	2 (T18, T19)	30%	0	0%
T21 – T25	Docs & Testing	2 (T23, T24)	40%	3 (T23 ×2, T24)	15%
FINAL	T1 – T25	8 AVEs	40% Drift	3 AVEs	15% Drift

Transcript A — Integrity Over Time

T1–T5 (Foundation): 100%

After T10: 85%

After T15: 80%

After T20: 70%

Final (T25): 60%

Transcript B — Integrity Over Time

T1–T5 (Foundation): 100%

After T10: 100%

After T15: 100%

After T20: 100%

Final (T25): 85%

Phase 03 — Extended

Drift Decay Curve Comparison

Transcript A — "Early Collapse"

Decay begins at T8 — only the 3rd turn of the audit window. The model fails to apply the established validation pattern to new modules immediately.

The critical failure is the post-T14 backslide: after explicitly introducing a Zod validation middleware at T14 and migrating existing validators, the model then adds list endpoints in two new modules (Comments at T18, Notifications at T19 and again at T24) that completely ignore the new middleware and revert to raw parseInt. This is the hallmark of context decay in long sessions.

The roleHierarchy duplication at T19 is particularly severe — it creates two competing sources of truth for a security-critical data structure.

The model self-identifies 9 of its own violations in T25, demonstrating that it retains awareness in retrospect but cannot maintain it proactively during generation.

Transcript B — "Late Hold"

Zero architectural violations through T22 — 17 consecutive clean turns. The model holds every established law for the entire early-and-mid build phases.

The decay is entirely concentrated in the final documentation/swagger phase (T23-T24) — a domain shift where the model transitions from code generation to YAML/spec writing, and the error class semantics momentarily slip (ConflictError vs UnauthorizedError).

Critically, Transcript B shows no backslide after pattern upgrades. When the validate middleware was formalized at T14, every subsequent module (comments T18, notifications T19) correctly adopts it.

T25's self-review demonstrates deep architectural awareness — it catches all three violations, proposes precise fixes, and correctly distinguishes true violations from intentional architectural tradeoffs.

AVE Classification Breakdown

Classification	Definition	A Count	B Count
Law Break	Directly violates an Immutable Law (wrong error class, wrong validation approach)	5	1
Logic Dup	Creates a second source of truth for logic that already exists in the codebase	2	1
Pattern Abandon	Reverts to generic coding style, dropping established high-rigor patterns	1	1
TOTAL		8	3

Final

Verdict

// Architectural Integrity Assessment — Final Ruling

Transcript B Maintains Superior Structural Integrity

TRANSCRIPT A — Structural Score: 60%

Transcript A demonstrates early and persistent decay. The model commits to a validation architecture in T5–T6, introduces a superior Zod-based abstraction at T14, then immediately abandons it for every subsequent new module. This is not ambiguity or misunderstanding — it is a failure of working-memory coherence in a long context. The roleHierarchy duplication in T19 is an active security concern, creating two competing sources of truth for role elevation logic. The model "gives up" on its own architecture by T18 and never recovers. Final drift: 40% across 20 audit turns.

TRANSCRIPT B — Structural Score: 85%

Transcript B demonstrates exceptional architectural discipline for 22 consecutive turns. Every new module adopts the established patterns: Zod validation via middleware, single roleHierarchy, correct error subclasses, consistent Prisma select shapes. The violations are concentrated exclusively in the swagger/documentation phase (T23–T24) and involve semantic precision (wrong error HTTP code) and a package mismatch — not fundamental architectural decay. The model holds the line. The T25 self-review is thorough and architecturally coherent. Final drift: 15% across 20 audit turns.

Key Differentiator: The most revealing data point is not the final score but the post-T14 behavior. Both models reached the Zod migration at T14. Transcript A reverted in three later turns covering new modules (T18, T19, T24). Transcript B held the pattern for every subsequent module with zero backslide. This divergence represents a fundamental difference in long-context architectural coherence: Transcript B demonstrates the ability to propagate a pattern upgrade forward through new code generation, while Transcript A demonstrates context decay that causes regression to earlier, lower-rigor habits.

✓ PREFERRED: TRANSCRIPT B | 85% Integrity | 3 AVEs | 15% Drift

vs Transcript A: 60% Integrity | 8 AVEs | 40% Drift

Full Session Transcripts

The complete 25-turn sessions used in this audit are available below. These transcripts contain the full prompt sequence and model responses used to generate the architectural drift analysis in this report.

Download Transcript A (Claude App) PDF Download Transcript B (Calmkeep + Claude API) PDF
