How Memory Actually Works

Source-code-verified internals of Claude Code's memory system.

There is no magic. Claude does not "remember" in a human sense. It re-reads plain markdown files every session. No embeddings, no vector store, no persistent internal state.

The Two Memory Systems

Property	CLAUDE.md	Auto Memory (MEMORY.md)
Author	You	Claude
Contains	Instructions and rules	Learnings and patterns
Scope	Project, user, or org	Per working tree / git repo
Loaded	Every session, in full	First 200 lines or 25,000 bytes, whichever cap is hit first
Injection point	System prompt context section	System prompt + per-query attachments

CLAUDE.md Loading Chain

Files load from broadest to most specific scope (source: src/utils/claudemd.ts). The harness walks up the directory tree from cwd to root, then processes root-to-cwd for precedence:

Managed policy — /etc/claude-code/CLAUDE.md (Linux) — cannot be excluded
User-level — ~/.claude/CLAUDE.md — personal, all projects
User rules — ~/.claude/rules/*.md — loaded before project rules
Project-level — ./CLAUDE.md or ./.claude/CLAUDE.md — shared via git
Project rules — ./.claude/rules/*.md — can have paths: frontmatter for conditional loading
Subdirectory CLAUDE.md — lazy-loaded when Claude reads files in those directories
Local overrides — ./CLAUDE.local.md — personal, gitignored

CLAUDE.md content is injected as a system prompt context section (not a user message). It fully survives compaction — re-read from disk after /compact.
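The directory walk described above can be sketched as follows. This is a minimal illustration, not the code in src/utils/claudemd.ts: the function name `claudeMdCandidates` is hypothetical, it assumes POSIX-style paths, and it covers only the per-directory CLAUDE.md files (not the managed policy, user-level files, or rules directories).

```typescript
// Hypothetical helper (not from the source): lists the CLAUDE.md paths that a
// root-to-cwd pass would check for a POSIX-style working directory.
function claudeMdCandidates(cwd: string): string[] {
  const parts = cwd.split("/").filter((p) => p.length > 0);
  const dirs: string[] = [];
  // Walk up from cwd to root: /a/b/c, /a/b, /a, /
  for (let i = parts.length; i >= 0; i--) {
    dirs.push("/" + parts.slice(0, i).join("/"));
  }
  // Reverse so the broadest directory is processed first and cwd last.
  dirs.reverse();
  return dirs.map((dir) => (dir === "/" ? "/CLAUDE.md" : dir + "/CLAUDE.md"));
}

// Broadest first, most specific last:
// ["/CLAUDE.md", "/home/CLAUDE.md", "/home/me/CLAUDE.md", "/home/me/repo/CLAUDE.md"]
const candidateOrder = claudeMdCandidates("/home/me/repo");
```

Collecting while walking up and then reversing keeps the walk simple while still letting more specific files be processed last.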

Import System

The @path/to/file syntax imports additional files. Paths resolve relative to the importing file. The maximum recursion depth is 5. HTML comments are stripped before injection. Imports do NOT work inside code blocks.
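A rough sketch of these import rules, using an in-memory map in place of the filesystem. Everything here is illustrative: `FileMap`, `resolveRelative`, and `expandImports` are hypothetical names, only whole-line `@path` imports are recognized (a simplification), and the exact way the real loader counts depth is an assumption.

```typescript
// In-memory stand-in for the filesystem: path -> contents (hypothetical).
type FileMap = { [path: string]: string };

const MAX_IMPORT_DEPTH = 5; // recursion cap stated in the text
const FENCE = "\u0060\u0060\u0060"; // three backticks

// Resolve `target` relative to the directory of `fromFile` (POSIX-style paths).
function resolveRelative(fromFile: string, target: string): string {
  if (target.charAt(0) === "/") return target;
  const segments = fromFile.split("/").slice(0, -1); // drop the file name
  for (const part of target.split("/")) {
    if (part === "..") segments.pop();
    else if (part !== "." && part !== "") segments.push(part);
  }
  return segments.join("/");
}

function expandImports(path: string, files: FileMap, depth: number = 0): string {
  const raw = files[path];
  if (raw === undefined) return ""; // missing file: contribute nothing
  // Strip HTML comments before injection.
  const text = raw.replace(/<!--[\s\S]*?-->/g, "");
  const out: string[] = [];
  let inFence = false;
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.indexOf(FENCE) === 0) inFence = !inFence;
    const match = /^@(\S+)$/.exec(trimmed);
    // Imports are ignored inside code blocks and once the depth cap is reached.
    if (match && !inFence && depth < MAX_IMPORT_DEPTH) {
      out.push(expandImports(resolveRelative(path, match[1]), files, depth + 1));
    } else {
      out.push(line);
    }
  }
  return out.join("\n");
}

const demoFiles: FileMap = {
  "/repo/CLAUDE.md": "# Rules\n@docs/style.md\n<!-- internal note -->",
  "/repo/docs/style.md": "Use tabs.",
};
const expanded = expandImports("/repo/CLAUDE.md", demoFiles);
```

The depth cap also bounds circular imports, so the sketch needs no separate cycle detection.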

Size Limits

There is no enforced size limit on CLAUDE.md. A 40KB flag exists but only triggers a warning in the /doctor command. Files of any size are loaded in full. That said, adherence degrades beyond ~200 lines — this is a practical recommendation, not a hard limit.

Auto Memory (MEMORY.md)

Stored in ~/.claude/projects/<project-slug>/memory/. All worktrees within the same repo share one memory directory.

MEMORY.md Truncation

Dual caps applied (source: src/memdir/memdir.ts:35-103):

Cap	Value	Purpose
Line cap	200 lines	Catches index sprawl
Byte cap	25,000 bytes	Catches long-line indexes

When truncated, a WARNING is appended explaining which cap fired. Topic files (individual memory files) are NOT loaded at startup.
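The dual-cap behavior can be sketched like this. It is an illustration under assumptions, not the code in src/memdir/memdir.ts: the function names, the order in which the caps are checked, the choice to cut at line boundaries, and the warning wording are all hypothetical.

```typescript
const MAX_LINES = 200;   // line cap from the table above
const MAX_BYTES = 25000; // byte cap from the table above

// Counts UTF-8 bytes without relying on platform APIs.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) bytes += 1;
    else if (code < 0x800) bytes += 2;
    else if (code >= 0xd800 && code <= 0xdbff) { bytes += 4; i++; } // surrogate pair
    else bytes += 3;
  }
  return bytes;
}

interface TruncatedIndex {
  content: string;
  capFired: "none" | "lines" | "bytes";
}

function truncateIndex(raw: string): TruncatedIndex {
  let capFired: "none" | "lines" | "bytes" = "none";
  let lines = raw.split("\n");
  if (lines.length > MAX_LINES) {
    lines = lines.slice(0, MAX_LINES);
    capFired = "lines";
  }
  let content = lines.join("\n");
  // Assumption: drop whole trailing lines until the byte cap is satisfied.
  while (lines.length > 0 && utf8ByteLength(content) > MAX_BYTES) {
    lines.pop();
    content = lines.join("\n");
    capFired = "bytes";
  }
  if (capFired !== "none") {
    // Hypothetical wording; the real warning text is not quoted in this document.
    content += "\n\nWARNING: MEMORY.md truncated (" + capFired + " cap)";
  }
  return { content: content, capFired: capFired };
}
```

Two caps are needed because a 200-line index can still be very large if individual lines are long, which is the case the byte cap catches.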

Per-File Recall Limits

When topic files are surfaced by the selection agent:

Limit	Value
Per-file lines	200
Per-file bytes	4,096 (4 KiB)
Session total bytes	61,440 (60 KiB)
Max files scanned	200

Truncated files get a note: "Use FileRead to view the complete file at: [path]"
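As an illustration of how these limits might interact, here is a small sketch of a per-session budget. The names `clipFile` and `RecallBudget` are hypothetical, characters are treated as bytes (accurate for ASCII only), and whether the appended note counts toward the session total is an assumption.

```typescript
const PER_FILE_LINES = 200;
const PER_FILE_BYTES = 4096;
const SESSION_BYTES = 61440;

// Simplification: treats one character as one byte (true for ASCII only).
function clipFile(path: string, body: string): string {
  let clipped = body.split("\n").slice(0, PER_FILE_LINES).join("\n");
  if (clipped.length > PER_FILE_BYTES) clipped = clipped.slice(0, PER_FILE_BYTES);
  if (clipped.length < body.length) {
    // Note text as quoted in the document.
    clipped += "\nUse FileRead to view the complete file at: " + path;
  }
  return clipped;
}

// Tracks how much recalled memory one session has attached so far.
class RecallBudget {
  private usedBytes = 0;

  // Returns the text to attach, or null once the session total would be exceeded.
  attach(path: string, body: string): string | null {
    const clipped = clipFile(path, body);
    if (this.usedBytes + clipped.length > SESSION_BYTES) return null;
    this.usedBytes += clipped.length;
    return clipped;
  }
}
```

With roughly 4 KB per file and a 60 KB session total, about fifteen full-size topic files fit in one session before the budget is exhausted.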

The Memory Selection Agent

Claude Code uses Claude Sonnet as a dedicated sub-agent for deciding which memory files to attach (source: src/memdir/findRelevantMemories.ts).

Aspect	Details
Input	User query + list of memory file headers (filename, description, type, mtime)
Output	Up to 5 filenames (hard-coded max)
De-duplication	Files shown in prior turns are excluded from re-selection
Tool awareness	If a tool is in `recentTools`, usage docs deprioritized, but warnings/gotchas kept
Injected as	`<system-reminder>` attachments

The type field does NOT influence selection. The frontmatter type (user/feedback/project/reference) is used for display only in the memory manifest. It does NOT filter or influence the Sonnet selector's decision. Only filename and description matter for routing.
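The flow around the selector (de-duplication in, at most five filenames out) can be sketched as below. This does not reproduce src/memdir/findRelevantMemories.ts: `selectMemories`, `Ranker`, and `keywordRanker` are hypothetical, and the keyword matcher is only a toy stand-in for the model call.

```typescript
interface MemoryHeader {
  filename: string;
  description: string;
  type: string;  // display only; deliberately unused for routing
  mtime: number;
}

const MAX_SELECTED = 5; // hard-coded maximum from the table above

// Stand-in for the model call: returns filenames in preference order.
type Ranker = (query: string, candidates: MemoryHeader[]) => string[];

function selectMemories(
  query: string,
  headers: MemoryHeader[],
  alreadyShown: string[],
  rank: Ranker
): string[] {
  // Files surfaced in prior turns are excluded before ranking.
  const candidates = headers.filter((h) => alreadyShown.indexOf(h.filename) === -1);
  const known = candidates.map((h) => h.filename);
  return rank(query, candidates)
    .filter((name) => known.indexOf(name) !== -1) // drop names not in the manifest
    .slice(0, MAX_SELECTED);
}

// Toy ranker: matches query words against filename and description only.
const keywordRanker: Ranker = (query, candidates) => {
  const words = query.toLowerCase().split(/\s+/);
  return candidates
    .filter((h) => {
      const haystack = (h.filename + " " + h.description).toLowerCase();
      return words.some((w) => w.length > 0 && haystack.indexOf(w) !== -1);
    })
    .map((h) => h.filename);
};
```

Validating the ranker's output against the manifest guards against a selector returning a filename that does not exist.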

Memory Extraction (Auto-Save)

Runs as a forked agent (sharing the prompt cache with the main agent) at the end of each query loop. It has restricted tool permissions: Read/Grep/Glob, read-only Bash, and FileEdit/FileWrite only within the memory directory. It is skipped if the main agent already wrote to memory that turn.
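One way to picture the restricted permissions is as a single allow/deny check. This is a sketch under assumptions: `ToolCall`, its `readOnly` flag, and `isAllowedForExtraction` are hypothetical, and the path check is a simple prefix match rather than whatever the harness actually does.

```typescript
interface ToolCall {
  tool: string;       // e.g. "Read", "Bash", "FileWrite"
  path?: string;      // target path for file tools
  readOnly?: boolean; // hypothetical: a Bash command already classified as read-only
}

function isAllowedForExtraction(call: ToolCall, memoryDir: string): boolean {
  if (call.tool === "Read" || call.tool === "Grep" || call.tool === "Glob") return true;
  if (call.tool === "Bash") return call.readOnly === true;
  if (call.tool === "FileEdit" || call.tool === "FileWrite") {
    if (call.path === undefined) return false;
    // Reject ".." segments so the prefix match below cannot be escaped.
    if (call.path.split("/").indexOf("..") !== -1) return false;
    const prefix =
      memoryDir.charAt(memoryDir.length - 1) === "/" ? memoryDir : memoryDir + "/";
    return call.path.indexOf(prefix) === 0;
  }
  return false; // any other tool is denied
}
```

Denying by default means a newly added tool is unavailable to the extraction agent until it is explicitly allowed.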

Dream: Memory Consolidation

Background agent that synthesizes daily logs/transcripts into durable memories. Four phases: Orient → Gather → Consolidate → Prune.

Gates (cheapest checked first):

Time gate — 24 hours since last consolidation (default)
Session gate — 5 transcript files touched since (default)
Lock gate — PID-based lock with 1-hour stale threshold

Feature-gated. User override: autoDreamEnabled in settings.json.
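The gate ordering above can be sketched as a chain of early returns, cheapest check first. The `DreamState` shape and `shouldDream` are hypothetical names; the lock is modeled only by its age (the PID check itself is omitted), and treating `enabled` as the result of the feature gate plus the autoDreamEnabled override is an assumption.

```typescript
interface DreamState {
  enabled: boolean;                // feature gate combined with the user override
  lastConsolidatedAt: number;      // ms since epoch
  transcriptsTouchedSince: number; // transcript files touched since last run
  lockHeldSince: number | null;    // ms since epoch, or null if no lock exists
}

const HOUR_MS = 60 * 60 * 1000;
const MIN_HOURS_BETWEEN_RUNS = 24; // default time gate
const MIN_TRANSCRIPTS = 5;         // default session gate
const STALE_LOCK_MS = HOUR_MS;     // lock older than this is treated as stale

function shouldDream(state: DreamState, now: number): boolean {
  if (!state.enabled) return false;
  // Time gate: a single subtraction.
  if (now - state.lastConsolidatedAt < MIN_HOURS_BETWEEN_RUNS * HOUR_MS) return false;
  // Session gate: in practice this needs a directory scan.
  if (state.transcriptsTouchedSince < MIN_TRANSCRIPTS) return false;
  // Lock gate: a fresh lock means another process is already consolidating.
  if (state.lockHeldSince !== null && now - state.lockHeldSince < STALE_LOCK_MS) return false;
  return true;
}
```

Ordering the gates by cost means most invocations exit after the time comparison and never touch the filesystem.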

Session Memory (Separate System)

Distinct from auto-memory. A periodic background sub-agent extracts key session info into .session/memory.md. Triggered by token thresholds (~30K init, ~20K between updates). Does not interrupt the main conversation. This is for within-session continuity, not cross-session persistence.
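One plausible reading of the two thresholds is sketched below. The interpretation (about 30K tokens before the first extraction, then about 20K new tokens between updates) is an assumption, as are the names `shouldExtractSessionMemory` and `tokensAtLastExtract`.

```typescript
const INIT_THRESHOLD_TOKENS = 30000;   // approximate, per the text
const UPDATE_THRESHOLD_TOKENS = 20000; // approximate, per the text

// tokensAtLastExtract is null until the first extraction has run.
function shouldExtractSessionMemory(
  totalTokens: number,
  tokensAtLastExtract: number | null
): boolean {
  if (tokensAtLastExtract === null) {
    return totalTokens >= INIT_THRESHOLD_TOKENS;
  }
  return totalTokens - tokensAtLastExtract >= UPDATE_THRESHOLD_TOKENS;
}
```

Under this reading, a session that reaches 30K tokens gets its first summary, and the next update is due once the conversation has grown by another 20K.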

The Harness Architecture

System Prompt Assembly Order

Attribution header
CLI prefix
Introduction section
System section (tools, compression, hooks)
Doing tasks section
Tools section (per-tool instructions)
SYSTEM_PROMPT_DYNAMIC_BOUNDARY — cache scope split point
User/project context (CLAUDE.md, memory) — not cacheable
Output style, language, MCP instructions
Append system prompt

Cache scope split: content before the boundary gets global cache scope (shared). Content after it (your CLAUDE.md, memory) gets org-level or no caching, so it is re-processed on every API call.
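To make the split concrete, here is a minimal sketch that partitions an ordered list of section names at the boundary marker. The function `splitAtBoundary` and the `PromptParts` shape are hypothetical; the section labels are taken from the assembly order above.

```typescript
const BOUNDARY = "SYSTEM_PROMPT_DYNAMIC_BOUNDARY";

interface PromptParts {
  cacheable: string[]; // before the boundary: global cache scope
  dynamic: string[];   // after the boundary: org-level or no caching
}

function splitAtBoundary(sections: string[]): PromptParts {
  const index = sections.indexOf(BOUNDARY);
  // Assumption: with no marker present, everything is treated as cacheable.
  if (index === -1) return { cacheable: sections.slice(), dynamic: [] };
  return {
    cacheable: sections.slice(0, index),
    dynamic: sections.slice(index + 1),
  };
}

const promptParts = splitAtBoundary([
  "Attribution header",
  "CLI prefix",
  "Introduction section",
  "System section",
  "Doing tasks section",
  "Tools section",
  BOUNDARY,
  "User/project context",
  "Output style, language, MCP instructions",
  "Append system prompt",
]);
// promptParts.cacheable has 6 entries; promptParts.dynamic has 3.
```

Because CLAUDE.md and memory land in the dynamic part, their size adds to the content processed on every call, which is one practical reason to keep them short.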

Auto-Compact

Triggers when token usage ≥ (effective context window - 13,000 tokens). Circuit breaker stops after 3 consecutive failures. CLAUDE.md and MEMORY.md survive compaction (re-read from disk).
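The trigger condition and circuit breaker can be sketched as a small state holder. The class name `AutoCompactor` and its methods are hypothetical, and resetting the failure count on a success is an assumption implied by "consecutive".

```typescript
const COMPACT_BUFFER_TOKENS = 13000;
const MAX_CONSECUTIVE_FAILURES = 3;

class AutoCompactor {
  private consecutiveFailures = 0;
  private effectiveWindow: number;

  constructor(effectiveWindow: number) {
    this.effectiveWindow = effectiveWindow;
  }

  shouldCompact(tokensUsed: number): boolean {
    // Circuit breaker: stop trying after three consecutive failures.
    if (this.consecutiveFailures >= MAX_CONSECUTIVE_FAILURES) return false;
    return tokensUsed >= this.effectiveWindow - COMPACT_BUFFER_TOKENS;
  }

  recordResult(succeeded: boolean): void {
    // Assumption: any success resets the consecutive-failure count.
    this.consecutiveFailures = succeeded ? 0 : this.consecutiveFailures + 1;
  }
}
```

For an effective window of 200,000 tokens, for example, this triggers at 187,000 tokens used.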