How the memory system actually works, what the harness controls, and where the gaps are.
Claude Code has two complementary memory mechanisms. Neither involves a database, vector store, or persistent internal state. Claude re-reads plain markdown files every session.
| Property | CLAUDE.md | Auto Memory (MEMORY.md) |
|---|---|---|
| Author | User | Claude |
| Contains | Instructions and rules | Learnings and patterns |
| Scope | Project, user, or org | Per working tree / git repo |
| Loaded | Every session, in full | First 200 lines or 25KB |
CLAUDE.md files load in a strict hierarchy, from broadest to most specific scope. More specific locations take precedence:
- /etc/claude-code/CLAUDE.md
- /Library/Application Support/ClaudeCode/CLAUDE.md
- ~/.claude/CLAUDE.md (personal preferences across all projects)
- ~/.claude/rules/*.md (loaded before project rules)
- ./CLAUDE.md or ./.claude/CLAUDE.md (team-shared via version control)
- ./.claude/rules/*.md (can have paths: frontmatter for conditional loading)
- ./CLAUDE.local.md (personal, not committed to git)

The harness walks up the directory tree from cwd, loading CLAUDE.md from each ancestor. Files are loaded in full regardless of length, though Anthropic recommends keeping them under 200 lines for best adherence.
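The ancestor walk described above can be sketched roughly as follows. This is an illustration only, not the harness's code: the function name `ancestor_claude_md` is hypothetical, and the order in which the two per-directory locations are checked is an assumption.

```python
from pathlib import Path


def ancestor_claude_md(cwd: Path) -> list[Path]:
    """Collect CLAUDE.md candidates from the filesystem root down to cwd.

    Broadest scope first, most specific last, mirroring the precedence
    order described in the text. Hypothetical helper, not the real loader.
    """
    cwd = cwd.resolve()
    # Path.parents is nearest-first; reverse it so the root comes first.
    directories = [*reversed(cwd.parents), cwd]
    found = []
    for directory in directories:
        # Assumption: both spellings are checked in each directory.
        for candidate in (directory / "CLAUDE.md",
                          directory / ".claude" / "CLAUDE.md"):
            if candidate.is_file():
                found.append(candidate)
    return found
```

Calling `ancestor_claude_md(Path.cwd())` from a nested package directory would list the repository-level file before the package-level one.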
CLAUDE.md files support @path/to/file syntax to import additional files. Imports resolve relative to the importing file, support recursive imports (max depth 5), and expand at launch time. HTML comments are stripped before injection to save tokens.
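A minimal sketch of import expansion under the rules stated above (relative resolution, depth limit of 5, HTML comments stripped). It assumes an import occupies a whole line, which may be narrower than the real syntax; `expand` and the regexes are illustrative names, not the harness's.

```python
import re
from pathlib import Path

MAX_DEPTH = 5  # recursion limit stated in the text
IMPORT_RE = re.compile(r"^@(\S+)$")  # assumption: one import per line
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def expand(path: Path, depth: int = 0) -> str:
    """Return the file's text with HTML comments removed and imports inlined."""
    text = COMMENT_RE.sub("", path.read_text(encoding="utf-8"))
    out = []
    for line in text.splitlines():
        match = IMPORT_RE.match(line.strip())
        # Imports resolve relative to the importing file, not the cwd.
        target = (path.parent / match.group(1)) if match else None
        if target is not None and depth < MAX_DEPTH and target.is_file():
            out.append(expand(target, depth + 1))
        else:
            # Not an import, missing target, or depth limit reached:
            # keep the line as written.
            out.append(line)
    return "\n".join(out)
```

The depth counter also bounds import cycles, so no separate visited set is needed in this sketch.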
CLAUDE.md fully survives compaction. After /compact, Claude re-reads CLAUDE.md from disk and re-injects it fresh. Instructions given only in conversation (not written to CLAUDE.md) may be lost.
Rules are modular instruction files in .claude/rules/. They support YAML frontmatter with a paths field for conditional loading:
---
paths:
- "src/api/**/*.ts"
---
# API Development Rules
Always validate input at the handler level...
Rules without paths frontmatter load unconditionally at launch. Path-scoped rules trigger when Claude reads files matching the glob pattern. Rules are re-injected as system-reminders every time Claude accesses a matching file — unlike CLAUDE.md which loads once.
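To make the conditional-loading rule concrete, here is a rough sketch of matching a file path against a rule's paths globs. The frontmatter parser handles only the simple list form shown above, and the glob semantics (`**/` spanning zero or more directories) are an assumption about how the harness interprets patterns; all function names are hypothetical.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob with ** support into a compiled regex."""
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:.*/)?")  # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")  # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def rule_paths(rule_text: str) -> list[str]:
    """Extract the paths list from simple frontmatter; [] means unconditional."""
    lines = rule_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    paths, in_paths = [], False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break
        if stripped == "paths:":
            in_paths = True
        elif in_paths and stripped.startswith("- "):
            paths.append(stripped[2:].strip().strip("\"'"))
        else:
            in_paths = False
    return paths


def rule_applies(rule_text: str, file_path: str) -> bool:
    """True if the rule is unconditional or any of its globs match the path."""
    paths = rule_paths(rule_text)
    return not paths or any(glob_to_regex(p).fullmatch(file_path) for p in paths)


RULE = '---\npaths:\n- "src/api/**/*.ts"\n---\n# API Development Rules\n'
print(rule_applies(RULE, "src/api/v1/users.ts"))
print(rule_applies(RULE, "src/web/app.ts"))
```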
Storage location: ~/.claude/projects/<project>/memory/ where <project> is derived from the git repository root. All worktrees and subdirectories within the same repo share one auto memory directory. Outside git repos, the project root path is used.
~/.claude/projects/<project>/memory/
├── MEMORY.md # Concise index, loaded every session
├── debugging.md # Topic file (loaded on demand)
├── api-conventions.md # Topic file (loaded on demand)
└── ...
If MEMORY.md exceeds either cap (200 lines or 25KB), a WARNING is appended explaining which cap fired. Topic files are never loaded at startup; they're surfaced on demand by the memory selection agent.
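The index truncation can be pictured with a short sketch. The warning wording, the reading of "25KB" as 25 × 1024 bytes, and the order in which the caps are checked are all assumptions made for illustration.

```python
MAX_LINES = 200
MAX_BYTES = 25 * 1024  # assumption: "25KB" read as 25 KiB


def truncate_index(text: str) -> str:
    """Clip an index to the line and byte caps, noting which cap fired."""
    lines = text.splitlines()
    if len(lines) <= MAX_LINES and len(text.encode("utf-8")) <= MAX_BYTES:
        return text  # under both caps: returned untouched
    cap = None
    if len(lines) > MAX_LINES:
        lines = lines[:MAX_LINES]
        cap = f"{MAX_LINES} lines"
    body = "\n".join(lines)
    data = body.encode("utf-8")
    if len(data) > MAX_BYTES:
        # errors="ignore" drops a multi-byte character cut in half.
        body = data[:MAX_BYTES].decode("utf-8", errors="ignore")
        cap = f"{MAX_BYTES} bytes"
    # Hypothetical wording; the real warning text is not given in the source.
    return body + f"\n\nWARNING: MEMORY.md truncated at {cap}."
```

When both caps are exceeded, this sketch reports the byte cap because it is applied last.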
When topic files ARE surfaced, they're also truncated:
| Limit | Value | Scope |
|---|---|---|
| Per-file lines | 200 lines | Each surfaced memory file |
| Per-file bytes | 4,096 bytes (~4KB) | Each surfaced memory file |
| Session total | 60 KB | All surfaced memories combined |
| Max files scanned | 200 files | Memory directory scan limit |
Truncated files get a note pointing to the full path for FileRead.
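The per-file and session limits in the table could combine along these lines. Whether the harness skips or clips a file once the session budget runs out, and whether the truncation note counts toward the budget, are not stated; this sketch assumes "skip" and "counts". The note wording is hypothetical.

```python
from typing import Optional

PER_FILE_LINES = 200
PER_FILE_BYTES = 4096
SESSION_BYTES = 60 * 1024


def surface(path: str, text: str, used_bytes: int) -> tuple[Optional[str], int]:
    """Return (text to inject or None, updated session byte count)."""
    lines = text.splitlines()
    clipped = "\n".join(lines[:PER_FILE_LINES]).encode("utf-8")
    truncated = len(lines) > PER_FILE_LINES or len(clipped) > PER_FILE_BYTES
    body = clipped[:PER_FILE_BYTES].decode("utf-8", errors="ignore")
    if truncated:
        # Point the reader at the full file, as the text describes.
        body += f"\n[truncated; full file at {path}]"
    size = len(body.encode("utf-8"))
    if used_bytes + size > SESSION_BYTES:
        return None, used_bytes  # assumption: skip once the budget is spent
    return body, used_bytes + size
```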
Claude Code uses Claude Sonnet as a dedicated sub-agent for memory selection. Its system prompt reads:
"You are selecting memories that will be useful to Claude Code as it processes a user's query. You will be given the user's query and a list of available memory files with their filenames and descriptions. Return a list of filenames for the memories that will clearly be useful (up to 5). Only include memories that you are certain will be helpful based on their name and description."
- The alreadySurfaced set prevents re-selecting files shown in prior turns
- With recentTools, usage reference docs are deprioritized, but warnings/gotchas are kept
- Results are delivered as <system-reminder> attachments (not in main conversation)

The type field in frontmatter is used for display in the manifest ([feedback] filename) but is NOT used for filtering. Only filenames and descriptions influence selection.
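The bookkeeping around the selector can be sketched as below. The `MemoryFile` dataclass, the function names, and the exact manifest line format beyond the `[type] filename` prefix are assumptions; the model call itself is left out.

```python
from dataclasses import dataclass

MAX_SELECTED = 5  # "up to 5" in the selector prompt


@dataclass
class MemoryFile:
    filename: str
    description: str
    type: str = ""  # shown in the manifest, never used for filtering


def manifest(files: list[MemoryFile], already_surfaced: set[str]) -> list[str]:
    """Lines offered to the selector; files surfaced earlier are left out."""
    lines = []
    for f in files:
        if f.filename in already_surfaced:
            continue
        prefix = f"[{f.type}] " if f.type else ""
        lines.append(f"{prefix}{f.filename}: {f.description}")
    return lines


def accept(selected: list[str], files: list[MemoryFile],
           already_surfaced: set[str]) -> list[str]:
    """Keep known, not-yet-surfaced picks (at most five) and record them."""
    known = {f.filename for f in files} - already_surfaced
    # dict.fromkeys removes duplicate picks while preserving order.
    picked = [n for n in dict.fromkeys(selected) if n in known][:MAX_SELECTED]
    already_surfaced.update(picked)
    return picked
```

Validating the selector's reply against the known filenames guards against a model returning a name that was never in the manifest.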
Claude Code includes a "dream" system (related to the KAIROS feature flag) that performs memory consolidation as a background agent with four phases:
- ls the memory directory, read the index, skim topic files

Dream runs when ALL gates pass (cheapest checked first):

- minHours since last consolidation (default: 24h)
- minSessions transcript files touched since last run (default: 5)
- Lock file (.consolidate-lock) with 1-hour stale threshold

Feature-gated via tengu_onyx_plover. User override: autoDreamEnabled in settings.json.
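The gate sequence might look like the following sketch, with checks ordered cheapest first. The function signature, the use of file modification times to count touched transcripts, and the lock's staleness test are assumptions for illustration.

```python
import time
from pathlib import Path
from typing import Optional

MIN_HOURS = 24             # default stated in the text
MIN_SESSIONS = 5           # default stated in the text
LOCK_STALE_SECONDS = 3600  # 1-hour stale threshold


def should_dream(last_run: float, transcript_mtimes: list[float], lock: Path,
                 enabled: bool = True, now: Optional[float] = None) -> bool:
    """Return True only if every gate passes; cheapest checks come first."""
    now = time.time() if now is None else now
    if not enabled:  # stands in for the feature flag and user setting
        return False
    if now - last_run < MIN_HOURS * 3600:
        return False
    touched = sum(1 for mtime in transcript_mtimes if mtime > last_run)
    if touched < MIN_SESSIONS:
        return False
    # A fresh lock means another consolidation is in progress;
    # a lock older than the stale threshold is ignored.
    if lock.exists() and now - lock.stat().st_mtime < LOCK_STALE_SECONDS:
        return False
    return True
```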
Claude Code is the agentic harness around Claude. The harness provides tools, context management, and execution environment that turn a language model into a coding agent. The model reasons; the harness acts.
| Component | Token Cost | Details |
|---|---|---|
| System prompt | ~2,300-3,600 tokens | Core instructions, behavior rules |
| Tool definitions | ~14-17K tokens | 18+ built-in tools, deferred MCP tools |
| CLAUDE.md content | Varies | Injected as system prompt context section (NOT a user message) |
| MEMORY.md index | First 200 lines | Always loaded at session start |
| Selected memory files | Up to 5 files | Chosen by memory selection agent |
| Rules | Varies | Unconditional at start, path-scoped on demand |
| Hook stdout | Varies | SessionStart output added to context |
Distinct from auto-memory, session memory runs as a periodic background forked sub-agent that extracts key information into ~/.claude/projects/<slug>/.session/memory.md. It is triggered by token thresholds (~30K to initialize, ~20K between updates) and does not interrupt the main conversation.
Auto-memory saving runs as a forked agent (sharing the prompt cache with the main agent) at the end of each query loop. It has restricted tools: Read/Grep/Glob, read-only Bash, and FileEdit/FileWrite only within the memory directory. It is skipped if the main agent already wrote to memory that turn.
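One plausible reading of the two thresholds, sketched under the assumption that the first extraction waits for about 30K tokens of conversation and later ones for about 20K tokens of growth since the previous extraction. The function name and arguments are hypothetical.

```python
from typing import Optional

INIT_TOKENS = 30_000    # approximate first-extraction threshold
UPDATE_TOKENS = 20_000  # approximate growth required between updates


def should_extract(total_tokens: int,
                   tokens_at_last_extract: Optional[int]) -> bool:
    """Decide whether the background session-memory pass is due."""
    if tokens_at_last_extract is None:
        # No extraction yet: wait for the initial threshold.
        return total_tokens >= INIT_TOKENS
    return total_tokens - tokens_at_last_extract >= UPDATE_TOKENS
```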
The harness actively optimizes for prompt caching. promptCacheBreakDetection.ts uses a 2-phase system: pre-call state recording (hashing system prompt, tool schemas, model, betas) and post-call cache break detection (>5% drop in cache read tokens). "Sticky latches" prevent mode toggles from invalidating the cache. System prompt content before a dynamic boundary marker gets global cache scope; content after gets org-level or no caching.
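The two-phase idea can be illustrated with a small sketch: fingerprint the cache-relevant inputs before the call, then compare cache read tokens afterwards. This is not the contents of promptCacheBreakDetection.ts; the function names, the choice of SHA-256, and the diagnostic labels are assumptions.

```python
import hashlib
import json

DROP_THRESHOLD = 0.05  # a drop of more than 5% counts as a cache break


def fingerprint(system_prompt: str, tool_schemas: list[dict],
                model: str, betas: list[str]) -> str:
    """Phase 1: hash the inputs that determine the cached prefix."""
    payload = json.dumps(
        {"system": system_prompt, "tools": tool_schemas,
         "model": model, "betas": sorted(betas)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cache_broke(prev_read: int, curr_read: int) -> bool:
    """Phase 2: flag a relative drop in cache read tokens above the threshold."""
    if prev_read <= 0:
        return False  # nothing was cached before, so nothing can break
    return (prev_read - curr_read) / prev_read > DROP_THRESHOLD


def diagnose(prev_fp: str, curr_fp: str, prev_read: int, curr_read: int) -> str:
    """Combine both phases into a coarse explanation."""
    if not cache_broke(prev_read, curr_read):
        return "ok"
    return "inputs changed" if prev_fp != curr_fp else "unexplained"
```

Recording the fingerprint before the call is what lets a later drop be attributed to a specific input change rather than to cache expiry.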
The memory selection agent operates only on filenames and one-line descriptions. No semantic search over memory file contents. If a memory file's index entry doesn't clearly match the query by keyword, it won't be selected.
At session start, the harness loads all ancestor CLAUDE.md files in full, all unconditional rules, and the first 200 lines of MEMORY.md. This happens regardless of whether the task is about deployment, debugging, Svelte, or CV writing.
All memory entries have equal weight. A debugging insight from 6 months ago occupies the same space as one from yesterday.
No relationships, no tags beyond the filename, no cross-references, no structured metadata.
Each session starts with a fresh context window. The only bridges across sessions are CLAUDE.md and auto memory files.
| Project | Approach |
|---|---|
| claude-code-vector-memory | Semantic memory via vector search of session summaries |
| claude-mem | ChromaDB + MCP integration, automatic compression, ~10x token efficiency |
| claude-code-memory | Knowledge graphs + Tree-sitter + Qdrant vector search |
| claude-context | Code search MCP, ~40% token reduction |