How Memory Actually Works

Source-code-verified internals of Claude Code's memory system.

There is no magic. Claude does not "remember" in a human sense. It re-reads plain markdown files every session. No embeddings, no vector store, no persistent internal state.

The Two Memory Systems

| Property | CLAUDE.md | Auto Memory (MEMORY.md) |
| --- | --- | --- |
| Author | You | Claude |
| Contains | Instructions and rules | Learnings and patterns |
| Scope | Project, user, or org | Per working tree / git repo |
| Loaded | Every session, in full | First 200 lines or 25KB |
| Injection point | System prompt context section | System prompt + per-query attachments |

CLAUDE.md Loading Chain

Files load from broadest to most specific scope (source: src/utils/claudemd.ts). The harness walks up the directory tree from cwd to root, then processes root-to-cwd for precedence:

  1. Managed policy: /etc/claude-code/CLAUDE.md (Linux), cannot be excluded
  2. User-level: ~/.claude/CLAUDE.md (personal, all projects)
  3. User rules: ~/.claude/rules/*.md (loaded before project rules)
  4. Project-level: ./CLAUDE.md or ./.claude/CLAUDE.md (shared via git)
  5. Project rules: ./.claude/rules/*.md (can have paths: frontmatter for conditional loading)
  6. Subdirectory CLAUDE.md: lazy-loaded when Claude reads files in those directories
  7. Local overrides: ./CLAUDE.local.md (personal, gitignored)

CLAUDE.md content is injected as a system prompt context section (not a user message). It fully survives compaction because it is re-read from disk after /compact.
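
The walk-up, then root-to-cwd ordering described above can be sketched in a few lines. This is an illustration only, not the code in src/utils/claudemd.ts: the function name `project_candidates` is hypothetical, it covers only the project-level files (item 4), and the order of the two candidates within one directory is an assumption.

```python
from pathlib import PurePosixPath


def project_candidates(cwd: PurePosixPath) -> list[PurePosixPath]:
    """List project-level CLAUDE.md candidates, broadest directory first."""
    # Walk up: cwd first, then each parent until the filesystem root.
    walk_up = [cwd, *cwd.parents]
    ordered: list[PurePosixPath] = []
    # Process in reverse (root first) so files nearer to cwd come last.
    for directory in reversed(walk_up):
        ordered.append(directory / "CLAUDE.md")
        # Assumed order within a directory; the source does not state it.
        ordered.append(directory / ".claude" / "CLAUDE.md")
    return ordered


if __name__ == "__main__":
    for candidate in project_candidates(PurePosixPath("/repo/pkg")):
        print(candidate)
```

Files that do not exist would simply be skipped by the caller; the point is only that later entries are the more specific ones.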

Import System

The @path/to/file syntax imports additional files. Paths resolve relative to the importing file. Max recursion depth: 5. HTML comments are stripped before injection. Imports do NOT work inside code blocks.
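
A minimal sketch of these import rules, under stated assumptions. The names `collect_imports` and `strip_html_comments`, the regex for an import token, and the in-memory `files` dict standing in for the filesystem are all hypothetical. The sketch also assumes the root file counts as depth 0 (so imports may nest five levels below it) and only recognizes fenced code blocks.

```python
import posixpath
import re

MAX_IMPORT_DEPTH = 5  # from the text above
# Assumed token shape: "@" at the start of a word, followed by a path.
IMPORT_RE = re.compile(r"(?<!\S)@(\S+)")
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def collect_imports(path: str, files: dict[str, str], depth: int = 0) -> list[str]:
    """Return the files loaded, starting at `path`, in import order."""
    if depth > MAX_IMPORT_DEPTH or path not in files:
        return []
    loaded = [path]
    in_fence = False
    for line in files[path].splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # entering or leaving a fenced block
            continue
        if in_fence:
            continue  # imports inside code blocks are ignored
        for target in IMPORT_RE.findall(line):
            # Resolve relative to the importing file, not the cwd.
            base = posixpath.dirname(path)
            resolved = posixpath.normpath(posixpath.join(base, target))
            loaded += collect_imports(resolved, files, depth + 1)
    return loaded


def strip_html_comments(text: str) -> str:
    """Remove HTML comments before the text is injected."""
    return COMMENT_RE.sub("", text)
```

The depth cap also bounds import cycles, which is one plausible reason for having it.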

Size Limits

There is no enforced size limit on CLAUDE.md. A 40KB flag exists but only triggers a warning in the /doctor command. Files of any size are loaded in full. That said, adherence degrades beyond ~200 lines — this is a practical recommendation, not a hard limit.

Auto Memory (MEMORY.md)

Stored in ~/.claude/projects/<project-slug>/memory/. All worktrees within the same repo share one memory directory.

MEMORY.md Truncation

Dual caps applied (source: src/memdir/memdir.ts:35-103):

| Cap | Value | Purpose |
| --- | --- | --- |
| Line cap | 200 lines | Catches index sprawl |
| Byte cap | 25,000 bytes | Catches long-line indexes |

When truncated, a WARNING is appended explaining which cap fired. Topic files (individual memory files) are NOT loaded at startup.
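
The dual cap can be sketched as follows. This is not the code in src/memdir/memdir.ts: the function name `truncate_index`, the order in which the caps are checked, the mid-line byte cut, and the warning wording are assumptions made for illustration.

```python
MAX_LINES = 200      # line cap from the table above
MAX_BYTES = 25_000   # byte cap from the table above


def truncate_index(text: str) -> str:
    """Apply the line cap, then the byte cap, and note which one fired."""
    lines = text.splitlines()
    fired = []
    if len(lines) > MAX_LINES:
        lines = lines[:MAX_LINES]
        fired.append(f"{MAX_LINES}-line cap")
    kept = "\n".join(lines)
    encoded = kept.encode("utf-8")
    if len(encoded) > MAX_BYTES:
        # Cut on a byte boundary; a partial trailing character is dropped.
        kept = encoded[:MAX_BYTES].decode("utf-8", errors="ignore")
        fired.append(f"{MAX_BYTES}-byte cap")
    if fired:
        # Hypothetical wording; the real warning text is not quoted in the source.
        kept += "\n\nWARNING: MEMORY.md truncated (" + " and ".join(fired) + ")."
    return kept
```

The two caps cover different failure modes: a 300-line index of short entries trips the line cap, while a 50-line index of very long lines trips only the byte cap.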

Per-File Recall Limits

When topic files are surfaced by the selection agent:

| Limit | Value |
| --- | --- |
| Per-file lines | 200 |
| Per-file bytes | 4,096 (~4KB) |
| Session total | 61,440 bytes (60KB) |
| Max files scanned | 200 |

Truncated files get a note: "Use FileRead to view the complete file at: [path]"
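
To make the interaction between the per-file and session limits concrete, here is a rough sketch. The helper names `clip_topic_file` and `attach_within_budget` are hypothetical, and two behaviors are assumptions: that the truncation note counts toward the session total, and that attachment simply stops once the budget would be exceeded.

```python
PER_FILE_LINES = 200
PER_FILE_BYTES = 4_096
SESSION_BYTES = 61_440


def clip_topic_file(path: str, text: str) -> str:
    """Clip one topic file to the per-file caps and append the note."""
    lines = text.splitlines()
    truncated = len(lines) > PER_FILE_LINES
    body = "\n".join(lines[:PER_FILE_LINES])
    raw = body.encode("utf-8")
    if len(raw) > PER_FILE_BYTES:
        truncated = True
        body = raw[:PER_FILE_BYTES].decode("utf-8", errors="ignore")
    if truncated:
        body += f"\nUse FileRead to view the complete file at: {path}"
    return body


def attach_within_budget(files: list[tuple[str, str]]) -> list[str]:
    """Return the paths that fit in the session budget, in order."""
    attached: list[str] = []
    used = 0
    for path, text in files:
        size = len(clip_topic_file(path, text).encode("utf-8"))
        if used + size > SESSION_BYTES:
            break  # assumed behavior: stop once the budget is spent
        attached.append(path)
        used += size
    return attached
```

Under these assumptions, a session can hold at most 14 files that each hit the 4,096-byte cap, since 15 x 4,096 already equals the 61,440-byte total before any notes are added.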

The Memory Selection Agent

Claude Code uses Claude Sonnet as a dedicated sub-agent for deciding which memory files to attach (source: src/memdir/findRelevantMemories.ts).

| Aspect | Details |
| --- | --- |
| Input | User query + list of memory file headers (filename, description, type, mtime) |
| Output | Up to 5 filenames (hard-coded max) |
| De-duplication | Files shown in prior turns are excluded from re-selection |
| Tool awareness | If a tool is in recentTools, usage docs deprioritized, but warnings/gotchas kept |
| Injected as | <system-reminder> attachments |

The type field does NOT influence selection: The frontmatter type (user/feedback/project/reference) is used for display only in the memory manifest. It does NOT filter or influence the Sonnet selector's decision. Only filename and description matter for routing.
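
The bookkeeping around the selector (not the model call itself) might look like the sketch below. Everything here is illustrative: `MemoryHeader`, `pick_memories`, and the `selector` callable standing in for the Sonnet sub-agent are hypothetical names, and validating returned filenames against the candidate list is an assumption.

```python
from dataclasses import dataclass
from typing import Callable

MAX_SELECTED = 5  # hard-coded max from the table above


@dataclass(frozen=True)
class MemoryHeader:
    filename: str
    description: str
    type: str     # shown in the manifest only; not used for routing
    mtime: float


# Stand-in for the model call: takes the query and headers, returns filenames.
Selector = Callable[[str, list[MemoryHeader]], list[str]]


def pick_memories(
    query: str,
    headers: list[MemoryHeader],
    already_surfaced: set[str],
    selector: Selector,
) -> list[str]:
    """Filter out files shown in prior turns, then cap the selection."""
    candidates = [h for h in headers if h.filename not in already_surfaced]
    known = {h.filename for h in candidates}
    picked: list[str] = []
    for name in selector(query, candidates):
        # Assumed safeguard: ignore names the selector made up or repeated.
        if name in known and name not in picked:
            picked.append(name)
    return picked[:MAX_SELECTED]
```

Because only filename and description influence routing, descriptive filenames and a precise description line are what make a topic file findable.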

Memory Extraction (Auto-Save)

Runs as a forked agent (shares prompt cache with main agent) at end of each query loop. Has restricted tool permissions: Read/Grep/Glob, read-only Bash, and FileEdit/FileWrite only within the memory directory. Skipped if the main agent already wrote to memory that turn.
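
The write restriction amounts to a path-confinement check. The sketch below is an assumption about how such a check could be written, not the harness's actual permission code: `is_allowed` is a hypothetical name, read-only Bash is not modeled (the sketch conservatively rejects it), and symlinks are not resolved.

```python
import posixpath

READ_ONLY_TOOLS = {"Read", "Grep", "Glob"}
WRITE_TOOLS = {"FileEdit", "FileWrite"}


def is_allowed(tool: str, target_path: str, memory_dir: str) -> bool:
    """Allow reads anywhere, but writes only inside the memory directory."""
    if tool in READ_ONLY_TOOLS:
        return True
    if tool in WRITE_TOOLS:
        # Normalize first so "../" segments cannot escape the directory.
        target = posixpath.normpath(target_path)
        root = posixpath.normpath(memory_dir)
        return target == root or target.startswith(root + "/")
    return False  # anything else, including Bash, is out of scope here
```

Checking `root + "/"` rather than a bare prefix keeps a sibling directory such as `/mem2` from matching `/m`.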

Dream: Memory Consolidation

Background agent that synthesizes daily logs/transcripts into durable memories. Four phases: Orient → Gather → Consolidate → Prune.

Gates (cheapest checked first):

  1. Time gate: 24 hours since last consolidation (default)
  2. Session gate: 5 transcript files touched since last consolidation (default)
  3. Lock gate: PID-based lock with 1-hour stale threshold

Feature-gated. User override: autoDreamEnabled in settings.json.
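
Checking the gates cheapest-first can be sketched as a chain of early returns. This is a simplified illustration: `DreamState` and `should_dream` are hypothetical names, the thresholds are treated as inclusive, and the lock is reduced to a timestamp (the real lock is PID-based, which this sketch does not model).

```python
from dataclasses import dataclass
from typing import Optional

MIN_SECONDS_BETWEEN_RUNS = 24 * 3600  # time gate default
MIN_SESSIONS = 5                      # session gate default
LOCK_STALE_SECONDS = 3600             # 1-hour stale threshold


@dataclass
class DreamState:
    last_consolidated_at: float        # epoch seconds
    sessions_touched_since: int
    lock_acquired_at: Optional[float]  # None when no lock is held


def should_dream(state: DreamState, now: float) -> bool:
    """Return True only if every gate passes, checking the cheapest first."""
    # 1. Time gate: a single subtraction.
    if now - state.last_consolidated_at < MIN_SECONDS_BETWEEN_RUNS:
        return False
    # 2. Session gate: in practice this means scanning transcript files.
    if state.sessions_touched_since < MIN_SESSIONS:
        return False
    # 3. Lock gate: a fresh lock means another run is in progress.
    if state.lock_acquired_at is not None:
        if now - state.lock_acquired_at < LOCK_STALE_SECONDS:
            return False
    return True
```

Ordering the gates this way means most invocations exit on the time check without touching the filesystem.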

Session Memory (Separate System)

Distinct from auto-memory. A periodic background sub-agent extracts key session info into .session/memory.md. Triggered by token thresholds (~30K init, ~20K between updates). Does not interrupt the main conversation. This is for within-session continuity, not cross-session persistence.
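
The token-threshold trigger can be expressed as a small predicate. The figures are the approximate values quoted above; the function name `should_update_session_memory` and the exact comparison (inclusive thresholds, measured from the last update) are assumptions.

```python
from typing import Optional

INIT_THRESHOLD = 30_000  # approximate: first extraction
UPDATE_DELTA = 20_000    # approximate: growth needed between updates


def should_update_session_memory(
    current_tokens: int, tokens_at_last_update: Optional[int]
) -> bool:
    """Decide whether the background extractor should run."""
    if tokens_at_last_update is None:
        # No session memory yet: wait for the initial threshold.
        return current_tokens >= INIT_THRESHOLD
    # Otherwise require enough new context since the last update.
    return current_tokens - tokens_at_last_update >= UPDATE_DELTA
```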

The Harness Architecture

System Prompt Assembly Order

  1. Attribution header
  2. CLI prefix
  3. Introduction section
  4. System section (tools, compression, hooks)
  5. Doing tasks section
  6. Tools section (per-tool instructions)
  7. SYSTEM_PROMPT_DYNAMIC_BOUNDARY — cache scope split point
  8. User/project context (CLAUDE.md, memory) — not cacheable
  9. Output style, language, MCP instructions
  10. Append system prompt

Cache scope split: Content before the boundary gets global cache scope (shared). Content after (your CLAUDE.md, memory) gets org-level or no caching and is re-processed every API call.
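
The split at SYSTEM_PROMPT_DYNAMIC_BOUNDARY can be pictured as partitioning an ordered list of sections. The sketch below is only a mental model: `split_at_boundary` and the section labels are hypothetical, and the real harness works with prompt content rather than label strings.

```python
BOUNDARY = "SYSTEM_PROMPT_DYNAMIC_BOUNDARY"


def split_at_boundary(sections: list[str]) -> tuple[list[str], list[str]]:
    """Split ordered sections into (globally cacheable, dynamic) parts."""
    index = sections.index(BOUNDARY)  # raises ValueError if the marker is missing
    return sections[:index], sections[index + 1:]


if __name__ == "__main__":
    static, dynamic = split_at_boundary(
        ["attribution", "cli_prefix", "tools", BOUNDARY, "claude_md", "output_style"]
    )
    print("shared cache:", static)
    print("per-user:", dynamic)
```

Everything in the second list differs per user or project, which is why a large CLAUDE.md adds cost on every call rather than once.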

Auto-Compact

Triggers when token usage ≥ (effective context window - 13,000 tokens). Circuit breaker stops after 3 consecutive failures. CLAUDE.md and MEMORY.md survive compaction (re-read from disk).
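
The trigger and circuit breaker can be sketched together. The class name `AutoCompact` and its methods are hypothetical, the 200,000-token window in the usage line is an example value rather than a documented one, and resetting the failure count on success is an assumption implied by "consecutive".

```python
COMPACT_BUFFER_TOKENS = 13_000
MAX_CONSECUTIVE_FAILURES = 3


class AutoCompact:
    def __init__(self, effective_window: int) -> None:
        self.threshold = effective_window - COMPACT_BUFFER_TOKENS
        self.failures = 0

    def should_compact(self, tokens_used: int) -> bool:
        """True when usage reaches the threshold and the breaker is closed."""
        if self.failures >= MAX_CONSECUTIVE_FAILURES:
            return False  # circuit breaker open: stop retrying
        return tokens_used >= self.threshold

    def record(self, succeeded: bool) -> None:
        """Track consecutive failures; any success resets the count."""
        self.failures = 0 if succeeded else self.failures + 1


if __name__ == "__main__":
    compactor = AutoCompact(effective_window=200_000)  # example window size
    print(compactor.threshold, compactor.should_compact(187_000))
```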