Session Amnesia: The Hidden Cost of Stateless AI Coding Assistants

Yanbing Li

Practitioner Case Study — Field Notes

Yanbing Li · iSterna LLC · May 2026

Working draft — Feedback welcome: yanbing@aisterna.com

In Brief

AI coding assistants start every session fresh. In one production workflow, a known constraint cost $66-90 to rediscover, a cost that was entirely avoidable had the lesson been captured the first time. This paper names the failure mode, diagnoses three root causes (capture, routing, and loading failures), and proposes a five-layer framework. The takeaway: the code fix is the last step. Capture the lesson first.

Abstract

AI coding assistants like Claude Code are transforming how engineers build complex systems. But they carry a structural limitation that requires deliberate design to work around: every session starts fresh. We call this session amnesia. We quantified one instance at $66-90 in LLM review cycles when a constraint discovered in one session was not captured and had to be rediscovered weeks later. The paper's core principle: the code fix is the last step — capture the lesson first.

1. Introduction

The trajectory from AI copilot to AI coworker is well underway. Microsoft has documented fiber-break field dispatches on its Azure backbone handled autonomously with no human engineer involved. Practitioners and researchers are publishing hands-on observations about AI assistants taking on longer, more consequential engineering work.

This shift exposes a failure mode that short-duration copilot use conceals: session amnesia. The correct mental model: each session is a stateless function call. Inputs are whatever files load at start; outputs are whatever files are committed before ending. Everything in between is ephemeral.
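The stateless-function model can be made concrete with a small sketch. Everything in it (`run_session`, the `scratch` dictionary, the file names and contents) is a hypothetical illustration, not part of any assistant's API. It only shows that knowledge held in the conversation does not reach the next session unless it is committed.

```python
from typing import Callable, Dict

Files = Dict[str, str]  # path -> contents


def run_session(loaded: Files, work: Callable[[Files, dict], Files]) -> Files:
    """Model a session as a function: files in, committed files out."""
    scratch: dict = {}  # stands in for conversation context; discarded on return
    return work(dict(loaded), scratch)


def session_a(files: Files, scratch: dict) -> Files:
    # The constraint is discovered, but it lives only in the conversation.
    scratch["lesson"] = "identifiers are not stable across extraction runs"
    return {**files, "merge.py": "# patched merge logic"}


def session_b(files: Files, scratch: dict) -> Files:
    # A later session sees only what was committed.
    knows = any("not stable" in text for text in files.values())
    return {**files, "status.txt": f"knows_constraint={knows}"}


after_a = run_session({"ARCHITECTURE.md": "spec"}, session_a)
after_b = run_session(after_a, session_b)
print(after_b["status.txt"])  # knows_constraint=False
```

Had `session_a` also returned a lesson file containing the constraint, `session_b` would have found it among its inputs.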

Engineers experience session amnesia as vague slowness, repeated debugging, and a sense that the system keeps forgetting things — without identifying the structural cause.

2. The Case Study: A $66 Lesson

2.1 Setting

The incident occurred in a production AI-augmented engineering platform built across multiple parallel Claude Code sessions over several months. What one session learns, the next session should know.

2.2 The Incident

Session A (manual extraction): discovered an identifier stability constraint. The fix was committed. The constraint was not written down anywhere accessible to a future session.

Session B (agent-loop, weeks later): followed the architecture documents correctly, per the spec, without knowing the hidden pre-condition. The identifier collision surfaced at smoke run 5.

2.3 The Cost

Total: ~$66-90 in LLM review cycles (5 smoke runs + 2 R-cycles). Had a feedback file recording the constraint existed, Session B would have loaded it at session start and needed zero smoke runs to find the collision.

Note: under subscription billing, the marginal LLM compute cost may be near zero. Engineering time (3-5 hrs) and schedule delay are incurred regardless of billing model.

3. Anatomy of Session Amnesia

3.1 Definition and Failure Modes

We define session amnesia as the loss of session-specific knowledge at a session boundary, arising from one or more of three failure modes: capture failure, routing failure, or loading failure.

Related reading: For a deeper treatment of the harness architecture this paper builds on, see You're Not Talking to a Model — You're Talking to a Harness — iSterna, April 2026 · https://aisterna.com/post/you-re-not-talking-to-a-model-you-re-talking-to-a-harness

Three Failure Modes: Capture, Routing, Loading

Capture failure: the lesson exists only in conversation history and is never written to persistent storage.

Routing failure: the lesson is written somewhere a future session will not look, such as a code comment that is visible to a human but invisible to an AI session starting from the architecture specification.

Loading failure: the lesson is in the right place, but the new session skipped the session-start checklist.
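The three modes form an ordered chain: a lesson must be written, then routed, then loaded. A minimal sketch of that ordering as a diagnostic, where the function name and its three boolean inputs are assumptions made for illustration:

```python
from enum import Enum
from typing import Optional


class FailureMode(Enum):
    CAPTURE = "capture"
    ROUTING = "routing"
    LOADING = "loading"


def diagnose(written: bool, on_load_path: bool, loaded: bool) -> Optional[FailureMode]:
    """Return the first failure mode that applies, or None if the lesson arrived."""
    if not written:
        return FailureMode.CAPTURE   # never left the conversation
    if not on_load_path:
        return FailureMode.ROUTING   # written, but where no session will look
    if not loaded:
        return FailureMode.LOADING   # in the right place, but skipped at start
    return None
```

For example, a lesson that exists only as a code comment would be `diagnose(True, False, False)`, which returns `FailureMode.ROUTING`.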

3.2 Why Code Fixes Are Not Lessons

A code fix is implicit knowledge; a lesson is explicit knowledge. Discovering a constraint should trigger: (1) capture the lesson; (2) encode it in a CI test; (3) update ADR pre-conditions; (4) write the code fix.

4. What Savvy Practitioners Do Today

Note: §4.1 and §4.2 are inferred from practitioners' public writing and do not represent direct correspondence or endorsement.

4.1 The Karpathy Approach

Test as constraint: every discovered constraint becomes a CI test — executable, CI-enforced, impossible to accidentally ignore.

Short, dense instruction files: 20 critical invariants; everything else in referenced documents with explicit MUST READ pointers.

Sessions as stateless function calls: explicit inputs, explicit outputs, explicit handoff.

4.2 The Willison Approach

Before closing a session, ask the model what non-obvious constraints were discovered. Commit the output as a feedback file. This converts lesson capture from a discipline into infrastructure.

4.3 The Ecosystem Gap

Both approaches are partial — neither addresses multi-session parallel architecture, phase transitions, or scalable index management.

4.4 Related Work

RAG and MemGPT address model-level memory, not session-to-session practitioner knowledge transfer. SRE post-mortem practices address human-to-human transfer but assume practitioners who can be trained. Software engineering knowledge management literature has the same failure taxonomy but assumes human knowledge workers. Empirical AI coding assistant research focuses on single-session output quality. To our knowledge, session amnesia has not been previously defined, quantified, or systematically addressed in published literature.

5. A Framework for Cross-Session Knowledge Transfer

Start here: implement CI enforcement (layer 4) first. A test that fails when a constraint is violated enforces it on every commit. Then spend five minutes before your next session closes: write two sentences about what non-obvious constraint was discovered. Commit it. Add remaining layers as your workflow matures.

Five-Layer Framework: Capture, Routing, Loading, Enforcement, Sustainability (CI Enforcement highlighted)

5.1 Capture: The Lesson, Not the Fix

Before any non-obvious fix is committed, capture the lesson. Automate via a SessionEnd hook. Pair every lesson with a CI test.
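One way the capture step might look in practice is a small helper that writes each lesson to its own dated file. This is a sketch under assumptions: the `save_lesson` name, the file naming scheme, and the `.md` extension are all hypothetical, and how the helper gets invoked at session end depends on the harness and is not shown.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional


def save_lesson(feedback_dir: Path, title: str, body: str,
                today: Optional[date] = None) -> Path:
    """Write one lesson to a new dated file; never overwrite an existing one."""
    today = today or date.today()
    # Turn the title into a file-name-safe slug (hypothetical naming scheme).
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    feedback_dir.mkdir(parents=True, exist_ok=True)
    path = feedback_dir / f"{today.isoformat()}-{slug}.md"
    n = 2
    while path.exists():  # keep earlier lessons with the same title intact
        path = feedback_dir / f"{today.isoformat()}-{slug}-{n}.md"
        n += 1
    path.write_text(f"{title}\n\n{body.strip()}\n", encoding="utf-8")
    return path
```

Writing one file per lesson keeps each lesson individually citable from a CI test docstring.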

5.2 Routing: To Where Future Sessions Will Look

Write lessons to locations future sessions load at initialization. For phase transitions, write an explicit handoff document.

5.3 Loading: Before Writing Any Code

The session-start checklist must include explicit knowledge loading steps:

Read the active work block — what are we building today?
Search the feedback directory for files newer than 7 days
Follow all MUST READ pointers in the instruction file
Check for prior platform solutions before designing new architecture
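The second checklist step can be sketched as a short function. The name `recent_feedback` and its parameters are assumptions; the 7-day window comes from the checklist above.

```python
import time
from pathlib import Path
from typing import List, Optional


def recent_feedback(feedback_dir: Path, max_age_days: float = 7,
                    now: Optional[float] = None) -> List[Path]:
    """Return feedback files modified within max_age_days, newest first."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    files = [
        p for p in feedback_dir.rglob("*")
        if p.is_file() and p.stat().st_mtime >= cutoff
    ]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)
```

This relies on file modification times, which are reset by a fresh repository checkout. A real workflow might prefer dated file names or commit dates as the age signal.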

5.4 Enforcement: CI as the Durable Anchor

A CI test enforces the constraint on every commit. The test's docstring cites the feedback file and the cost of original discovery.
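A minimal sketch of such a test, in pytest style. The `merge_runs` function, the `name` field, the sample records, and the feedback file path are hypothetical stand-ins; only the constraint itself and the cost figure come from the case study.

```python
from typing import Dict, List

Record = Dict[str, str]


def merge_runs(runs: List[List[Record]]) -> Dict[str, Record]:
    """Hypothetical merge: key on a semantic field, not the per-run id."""
    merged: Dict[str, Record] = {}
    for run in runs:
        for record in run:
            merged.setdefault(record["name"], record)
    return merged


def test_merge_does_not_trust_run_local_ids():
    """Identifiers are NOT stable across extraction runs.

    Lesson: feedback/identifier-stability.md (hypothetical path)
    Cost of original rediscovery: ~$66-90 in LLM review cycles.
    """
    run_1 = [{"id": "item-1", "name": "router"}]
    run_2 = [{"id": "item-1", "name": "switch"}]  # same id, different entity
    merged = merge_runs([run_1, run_2])
    # Both entities must survive the merge despite the colliding id.
    assert set(merged) == {"router", "switch"}
```

A merge keyed on `id` would keep only one of the two records and fail this test, which is the point: the constraint is enforced even if no session ever reads the lesson file.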

5.5 Sustainability: Managing the Knowledge Index

Three-phase approach: (1) cluster summary files; (2) categorical sub-indices; (3) session-type routing. New entries always go into a cluster, never the main index.
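The rule that new entries always go into a cluster, together with session-type routing, might be sketched as follows. The `KnowledgeIndex` class, the cluster names, and the routing table are all illustrative assumptions rather than a description of the platform's actual index.

```python
from typing import Dict, List


class KnowledgeIndex:
    """Main index lists clusters only; lesson entries live inside a cluster."""

    def __init__(self) -> None:
        self.clusters: Dict[str, List[str]] = {}

    def add_entry(self, cluster: str, lesson_path: str) -> None:
        # New entries always land in a cluster, never in the main index.
        self.clusters.setdefault(cluster, []).append(lesson_path)

    def main_index(self) -> List[str]:
        # One line per cluster, so the index grows with clusters, not lessons.
        return [f"{name}: {len(paths)}" for name, paths in sorted(self.clusters.items())]

    def for_session(self, session_type: str, routes: Dict[str, List[str]]) -> List[str]:
        # Session-type routing: load only clusters mapped to this session type.
        wanted = routes.get(session_type, [])
        return [p for name in wanted for p in self.clusters.get(name, [])]
```

With this shape, adding a hundred lessons to one cluster adds no lines to the main index, and a session loads only the clusters its type is routed to.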

6. Worked Example: Capturing the Identifier Constraint

What should have happened after Session A discovered the constraint:

Lesson file:

What was assumed: identifiers are globally unique across extraction runs.
Why it's wrong: the {seq} counter restarts per run.
Correct approach: semantic matching at merge boundary.
Cost of rediscovery: ~$66-90.

The CI test docstring cites the feedback file and the rediscovery cost. The ADR pre-condition states the constraint adjacent to the merge logic. With these three artifacts, Session B would have encountered the constraint before writing any code.
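The four fields of the lesson file lend themselves to a fixed template, so that every lesson answers the same questions. A small sketch, in which the `Lesson` class and its field names are assumptions and the field values are taken from the lesson described above:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lesson:
    assumed: str
    why_wrong: str
    correct_approach: str
    rediscovery_cost: str

    def render(self) -> str:
        """Produce the lesson file text, one labeled line per field."""
        return (
            f"What was assumed: {self.assumed}\n"
            f"Why it's wrong: {self.why_wrong}\n"
            f"Correct approach: {self.correct_approach}\n"
            f"Cost of rediscovery: {self.rediscovery_cost}\n"
        )


identifier_lesson = Lesson(
    assumed="identifiers are globally unique across extraction runs",
    why_wrong="the {seq} counter restarts per run",
    correct_approach="semantic matching at merge boundary",
    rediscovery_cost="~$66-90",
)
print(identifier_lesson.render())
```

Making all four fields required means a lesson cannot be recorded without stating the wrong assumption and the cost, the two parts most easily left out.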

7. Implications for AI-Augmented Engineering

7.1 Compounding vs. Anti-Compounding

Done well, cross-session knowledge transfer creates a compounding effect: each session starts smarter than the last. Without it, engineers describe 'AI going backward' even as the underlying models improve.

7.2 The Design Principle

Sessions are ephemeral compute; knowledge is persistent state. Engineer them separately. In team settings, constraint capture must be structural — a pre-commit hook requiring a corresponding lesson file is more reliable than expecting every engineer to remember.
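The hook logic could be as small as the sketch below. The `[constraint]` commit-message marker, the `feedback/` prefix, and the function name are hypothetical conventions chosen for illustration; a team would substitute its own.

```python
from typing import Iterable, List


def missing_lesson_errors(staged_paths: Iterable[str], commit_message: str,
                          feedback_prefix: str = "feedback/",
                          marker: str = "[constraint]") -> List[str]:
    """Return error messages; an empty list means the commit may proceed."""
    if marker not in commit_message.lower():
        return []  # commit does not claim to fix a discovered constraint
    if any(p.startswith(feedback_prefix) for p in staged_paths):
        return []  # a lesson file is staged alongside the fix
    return [f"Commit is tagged {marker} but stages no file under {feedback_prefix}"]
```

The staged paths can be obtained with `git diff --cached --name-only`. Git passes the commit message to the commit-msg hook rather than to pre-commit, so a check keyed on the message, as here, would run in that hook.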

7.3 What the Ecosystem Needs

Automatic lesson extraction at session end via a SessionEnd hook
Session-type routing that loads only the relevant knowledge subset
CI integration that treats constraint capture as a merge requirement
Index management tooling that prevents the knowledge index from growing past the context limit

8. Conclusion

Session amnesia is a structural property of current-generation AI coding assistants. It has a measurable cost that scales with the ambition and autonomy of the AI-augmented workflow. The response is architectural: five layers — capture, routing, loading, enforcement, sustainability — treat lessons as first-class engineering artifacts.

The code fix is the last step. Capture the lesson first.