A comprehensive analysis of AI coding assistant memory systems, the memory routing problem, and how to build an optimal harness. April 2026. Updated with source code verification.
How the memory system actually works: CLAUDE.md loading chain, auto-memory, the memory selection agent, KAIROS/dream consolidation, and what the harness controls.
Inventory of your 30 memory files, trigger coverage analysis, routing gaps, scaling concerns, and what a fresh Claude instance will miss.
The core challenge: automatically loading the right context at the right time. Approaches range from keyword matching through semantic search to LLM re-ranking.
Complete reference for Claude Code hooks, settings, MCP servers, path-scoped rules, and practical patterns for memory augmentation.
How Cursor, Windsurf, Copilot, Aider, Continue, Cline/Roo, Codex CLI, Amazon Q, and Zed handle memory and context.
State of the art: memory types, RAG for code, knowledge graphs, memory platforms (Mem0, Zep, Hindsight, Letta), and the MemGPT paradigm.
Concrete, phased plan: from immediate CLAUDE.md fixes (30 min) to a full semantic memory routing pipeline (weekend project).
Corrections from reading the actual source: hard-coded constants, memory data flow, system prompt assembly order, feature gates, and key file references.
| Metric | Your Current Setup | After Recommendations |
|---|---|---|
| Memory files | 30 files, ~51 KB | ~24 files after consolidation |
| Files with explicit triggers | 7 of 30 (23%) | 15+ of 24 (62%+) |
| Behavioral rules routed | 0 of 13 (0%) | 5 inlined + rest triggered |
| Routing method | LLM reads MEMORY.md one-liners | Always-on rules + keyword hooks + semantic search |
| User intervention needed | Often ("read X.md") | Rarely (pipeline handles it) |
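
To make the "always-on rules + keyword hooks" row concrete, here is a minimal sketch of keyword-trigger routing and of how the trigger-coverage percentage in the table is computed. It is an illustration under stated assumptions, not Claude Code's actual selection logic: the `MemoryFile` type, the `route` and `trigger_coverage` functions, and the example file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MemoryFile:
    """Hypothetical record for one memory file (not a Claude Code type)."""
    name: str
    always_on: bool = False          # inlined rules that load on every prompt
    triggers: tuple[str, ...] = ()   # keywords that should pull this file in


def route(prompt: str, files: list[MemoryFile]) -> list[str]:
    """Return names of files to load: always-on files plus keyword matches."""
    text = prompt.lower()
    selected = []
    for f in files:
        # Plain case-insensitive substring match; a real pipeline might
        # fall back to semantic search when no trigger fires.
        if f.always_on or any(t.lower() in text for t in f.triggers):
            selected.append(f.name)
    return selected


def trigger_coverage(files: list[MemoryFile]) -> float:
    """Fraction of files that declare at least one explicit trigger."""
    if not files:
        return 0.0
    return sum(1 for f in files if f.triggers) / len(files)


# Hypothetical example files, for illustration only.
example_files = [
    MemoryFile("style_rules.md", always_on=True),
    MemoryFile("deploy_notes.md", triggers=("deploy", "staging")),
    MemoryFile("db_migrations.md", triggers=("migration", "schema")),
]

loaded = route("How do I deploy to staging?", example_files)
```

With these example files, a deployment question loads the always-on rules and the deployment notes but leaves the migration notes out. Files with neither the always-on flag nor any trigger are never selected by this sketch, which is the routing gap the coverage metric is meant to expose.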