Claude-Mem: Persistent Memory for AI Coding Assistants
How an open-source plugin gives Claude Code the ability to remember your entire development history
TL;DR
claude-mem is an open-source memory system for Claude Code that automatically captures your coding sessions, compresses them with AI, and injects relevant context into future sessions. Think of it as giving Claude a long-term memory that survives across restarts.
Key Stats:
- 🧠 Automatic capture of all tool usage
- 📊 ~70-80% reduction in context tokens via progressive disclosure
- 🔍 Natural language search across your entire project history
- 🔒 Local-only storage with privacy controls
- ⚡ Built with TypeScript + SQLite + Bun
The Context Problem
AI coding assistants like Claude Code are incredibly powerful, but they share a fundamental limitation: they forget everything when the session ends.
The Traditional Workaround:
Developers have relied on several manual approaches:
- Maintaining a `CLAUDE.md` file with project instructions
- Copy-pasting relevant code into each conversation
- Re-explaining architectural decisions repeatedly
- Starting from scratch after every restart
The Cost:
- Time wasted re-establishing context
- Inconsistent knowledge across sessions
- Lost insights from previous interactions
- High token costs from redundant file reads
Enter Claude-Mem
Claude-mem solves this by implementing a persistent memory layer that sits between you and Claude Code. It automatically:
- Captures every tool execution (file reads, writes, searches)
- Compresses observations into semantic summaries using AI
- Indexes everything with full-text and vector search
- Injects relevant context at the start of each new session
Result: Claude remembers your project history without you lifting a finger.
Architecture Deep Dive
System Components
┌─────────────────────────────────────────────────────────┐
│ Claude Code CLI │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ 5 Lifecycle Hooks │
│ SessionStart | UserPromptSubmit | PostToolUse │
│ Stop | SessionEnd │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ Worker Service │
│ HTTP API (port 37777) + Background Processing │
└─────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────┐
│ Storage Layer │
│ SQLite + FTS5 Full-Text Search │
│ ChromaDB Vector Store (optional) │
└─────────────────────────────────────────────────────────┘
Technology Stack
| Layer | Technology | Purpose |
|---|---|---|
| Language | TypeScript (ES2022) | Type-safe plugin code |
| Runtime | Node.js 18+ | Hook execution |
| Process Manager | Bun | Worker service management |
| Database | SQLite 3 + bun:sqlite | Persistent storage |
| Full-Text Search | FTS5 | Fast text queries |
| Vector Search | ChromaDB (optional) | Semantic similarity |
| HTTP Server | Express.js 4.18 | Web API + viewer UI |
| Real-time | Server-Sent Events | Live memory stream |
| AI SDK | @anthropic-ai/claude-agent-sdk | Observation processing |
| Build Tool | esbuild | TypeScript bundling |
How It Works: The Memory Pipeline
1. Session Start - Context Injection
When you start Claude Code:
// Context Hook (SessionStart)
1. Start Bun worker service if needed
2. Query last 10 sessions from SQLite
3. Retrieve top 50 observations (configurable)
4. Format as compressed summaries
5. Inject into Claude's system prompt
What Claude sees:
<claude-mem-context>
<session id="123" date="2024-01-15">
<observation type="bugfix">
Fixed authentication race condition in auth.ts
- Added mutex lock to token refresh
- Prevents duplicate API calls
[Read cost: 250 tokens | Created from: 1,200 tokens]
</observation>
</session>
</claude-mem-context>
Token Economics:
- Without claude-mem: Re-read the entire `auth.ts` file every session (1,200 tokens)
- With claude-mem: Inject a compressed summary (250 tokens)
- Savings: 950 tokens (~79% reduction)
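To make the injection step concrete, here is a minimal sketch of what the SessionStart query might look like using `bun:sqlite`. The table and column names (`sessions`, `observations`, `narrative`, and so on) are assumptions for illustration, not claude-mem's actual schema.

```typescript
import { Database } from "bun:sqlite";
import { homedir } from "os";
import { join } from "path";

// Hypothetical schema: sessions(id, created_at), observations(session_id, type, title, narrative, created_at)
const db = new Database(join(homedir(), ".claude-mem", "claude-mem.db"));

// Pull the most recent sessions (limit of 10 taken from the flow above)
const sessions = db
  .query("SELECT id FROM sessions ORDER BY created_at DESC LIMIT ?")
  .all(10) as { id: number }[];

// Fetch the top N observations across those sessions (50 is the default CONTEXT_OBSERVATIONS)
const ids = sessions.map((s) => s.id).join(",") || "-1";
const observations = db
  .query(
    `SELECT type, title, narrative FROM observations
     WHERE session_id IN (${ids})
     ORDER BY created_at DESC LIMIT ?`
  )
  .all(50) as { type: string; title: string; narrative: string }[];

// Format as the <claude-mem-context> block injected into the system prompt
const context = [
  "<claude-mem-context>",
  ...observations.map((o) => `<observation type="${o.type}">${o.title}: ${o.narrative}</observation>`),
  "</claude-mem-context>",
].join("\n");

console.log(context);
```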
2. User Prompt - Session Creation
When you type a prompt:
// New Hook (UserPromptSubmit)
1. Create session record in SQLite
2. Save raw prompt for full-text search
3. Associate with current project/folder
3. Tool Execution - Observation Capture
Every time Claude uses a tool (can fire 100+ times per session):
// Save Hook (PostToolUse)
1. Capture tool input/output
2. Strip <private> tags (edge processing)
3. Queue observation for processing
4. Send to worker service via HTTP
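As a rough sketch of this edge processing, the hook might strip `<private>` blocks and forward the rest to the worker over HTTP. The `/observations` endpoint path and payload shape below are assumptions, not the plugin's documented API.

```typescript
// Strip <private>...</private> blocks before anything leaves the hook process
function stripPrivate(text: string): string {
  return text.replace(/<private>[\s\S]*?<\/private>/g, "[redacted]");
}

async function onPostToolUse(tool: string, input: unknown, output: string): Promise<void> {
  const observation = {
    tool,
    input: stripPrivate(JSON.stringify(input)),
    output: stripPrivate(output),
    capturedAt: new Date().toISOString(),
  };

  // Fire-and-forget POST to the background worker (endpoint path is hypothetical)
  await fetch("http://127.0.0.1:37777/observations", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(observation),
  }).catch(() => {
    // Never block or fail the Claude Code session because of a memory error
  });
}
```

Keeping this hook fast and non-blocking is what allows it to fire 100+ times per session without slowing Claude down.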
4. Background Processing - AI Compression
Worker service processes observations asynchronously:
// Worker Service (Claude Agent SDK)
1. Batch observations for efficiency
2. Send to Claude API for analysis
3. Extract structured learnings:
- Type: bugfix | feature | refactor | discovery
- Concepts: how-it-works | gotcha | trade-off
- Narrative: Human-readable summary
- Facts: Key technical details
4. Store compressed results in SQLite
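claude-mem performs this step through the Claude Agent SDK; as a stand-in, here is roughly what a compression request could look like using the plain `@anthropic-ai/sdk` Messages API instead. The prompt, model choice, and output shape are illustrative only.

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// Compress a batch of raw tool observations into structured learnings (shape is illustrative)
async function compressBatch(rawObservations: string[]): Promise<unknown> {
  const response = await client.messages.create({
    model: "claude-sonnet-4-5",
    max_tokens: 1024,
    messages: [
      {
        role: "user",
        content:
          "Summarize these tool executions as a JSON array of objects with fields " +
          "{type, concepts, title, narrative, facts}:\n\n" +
          rawObservations.join("\n---\n"),
      },
    ],
  });

  const text = response.content[0].type === "text" ? response.content[0].text : "[]";
  return JSON.parse(text); // the worker would then store these rows in SQLite
}
```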
Example Transformation:
Raw Tool Execution (1,500 tokens):
{
"tool": "Read",
"path": "src/auth/middleware.ts",
"content": "... (entire file contents) ..."
}
Compressed Observation (300 tokens):
{
"type": "discovery",
"concepts": ["how-it-works", "gotcha"],
"title": "Auth middleware token validation flow",
"narrative": "Middleware checks JWT expiry before route access. Gotcha: Clock skew tolerance of 60s can cause confusion.",
"facts": {
"file": "src/auth/middleware.ts",
"key_function": "validateToken()",
"edge_case": "Clock skew within 60s accepted"
}
}
5. Session End - Summary Generation
When Claude stops or you end the session:
// Summary Hook (Stop)
1. Generate session-level summary
2. Aggregate all observations
3. Store completions, learnings, next steps
4. Mark session as complete
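A minimal sketch of the bookkeeping this hook might do, again using `bun:sqlite` with hypothetical table and column names:

```typescript
import { Database } from "bun:sqlite";

// Mark the session complete and attach the aggregated summary (schema is illustrative)
function finalizeSession(db: Database, sessionId: number, summary: string): void {
  db.query(
    "UPDATE sessions SET status = 'complete', summary = ?, ended_at = ? WHERE id = ?"
  ).run(summary, new Date().toISOString(), sessionId);
}
```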
Progressive Disclosure: The Token-Efficiency Secret
One of claude-mem's most clever features is progressive disclosure - a three-layer retrieval pattern that minimizes token usage:
The 3-Layer Workflow
Layer 1: Index Search (~50-100 tokens/result)
search(query="authentication bug", type="bugfix", limit=20)
Returns compact index with IDs, titles, types, dates.
Layer 2: Timeline Context (~200 tokens/result)
timeline(observation_id=123, before=2, after=2)
Shows chronological context around interesting observations.
Layer 3: Full Details (~500-1,000 tokens/result)
get_observations(ids=[123, 456, 789])
Fetches complete narratives and facts for selected IDs only.
Token Savings Example
Without Progressive Disclosure:
Fetch 20 full observations upfront: 10,000-20,000 tokens
With Progressive Disclosure:
1. Search index: ~1,000 tokens
2. Review, identify 3 relevant IDs
3. Fetch only those 3: ~1,500-3,000 tokens
Total: 2,500-4,000 tokens (~75% savings)
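Expressed as code, the three layers chain together roughly like this. `search`, `timeline`, and `get_observations` stand in for the MCP tool calls shown above, and the result shapes are assumptions:

```typescript
// Hypothetical result shape for the index search
type Hit = { id: number; title: string; type: string; date: string };

declare function search(q: { query: string; type?: string; limit?: number }): Promise<Hit[]>;
declare function timeline(q: { observation_id: number; before: number; after: number }): Promise<Hit[]>;
declare function get_observations(q: { ids: number[] }): Promise<unknown[]>;

async function investigate() {
  // Layer 1: cheap index search (~50-100 tokens per result)
  const hits = await search({ query: "authentication bug", type: "bugfix", limit: 20 });

  // Layer 2: optional chronological context around the most promising hit
  const context = await timeline({ observation_id: hits[0].id, before: 2, after: 2 });

  // Layer 3: fetch full narratives only for the handful of IDs that actually matter
  const details = await get_observations({ ids: hits.slice(0, 3).map((h) => h.id) });
  return { context, details };
}
```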
MCP Search Tools
Claude-mem provides three Model Context Protocol (MCP) tools that Claude can invoke automatically:
Available Tools
1. search - Query the Index
// Natural language or structured queries
search({
query: "authentication bug",
type: "bugfix",
dateFrom: "2024-01-01",
limit: 20
})
2. timeline - Chronological Context
// What happened before/after an observation?
timeline({
observation_id: 123,
before: 2, // 2 observations before
after: 2 // 2 observations after
})
3. get_observations - Full Details
// Batch fetch by IDs
get_observations({
ids: [123, 456, 789]
})
Auto-Invocation
Claude recognizes natural language queries and automatically uses these tools:
You: "What bugs did we fix last week?"
Claude (internally):
// 1. Search for recent bugfixes
search({ query: "bug", type: "bugfix", limit: 10 })
// 2. Review results, identify relevant IDs
// 3. Fetch full details
get_observations({ ids: [104, 107, 112] })
Claude (to you): "Last week we fixed three bugs: [detailed summary]..."
Configuration & Control
Core Settings
Managed in ~/.claude-mem/settings.json:
{
"CLAUDE_MEM_MODEL": "sonnet",
"CLAUDE_MEM_PROVIDER": "claude",
"CLAUDE_MEM_CONTEXT_OBSERVATIONS": 50,
"CLAUDE_MEM_WORKER_PORT": 37777,
"CLAUDE_MEM_LOG_LEVEL": "INFO"
}
Context Injection Control
Fine-grained control over what gets injected:
Loading Settings:
- `CONTEXT_OBSERVATIONS` (1-200): Number of observations to inject
- `CONTEXT_SESSION_COUNT` (1-50): Number of recent sessions to pull from
Filter Settings:
- Types: bugfix, feature, refactor, discovery, decision, change
- Concepts: how-it-works, why-it-exists, gotcha, pattern, trade-off
Display Settings:
- `CONTEXT_FULL_COUNT` (0-20): How many observations show full details
- `CONTEXT_FULL_FIELD`: narrative | facts
- Token economics visibility toggles
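Taken together, a context-injection configuration built from these settings might look like the sketch below. Only `CLAUDE_MEM_CONTEXT_OBSERVATIONS` appears in the settings file shown earlier; the other key names are guesses based on the option names above.

```typescript
// Hypothetical slice of ~/.claude-mem/settings.json focused on context injection
const contextSettings = {
  CLAUDE_MEM_CONTEXT_OBSERVATIONS: 50,                           // 1-200: observations injected per session
  CLAUDE_MEM_CONTEXT_SESSION_COUNT: 10,                          // 1-50: recent sessions to pull from (assumed key name)
  CLAUDE_MEM_CONTEXT_TYPES: ["bugfix", "feature", "discovery"],  // type filter (assumed key name)
  CLAUDE_MEM_CONTEXT_CONCEPTS: ["gotcha", "trade-off"],          // concept filter (assumed key name)
  CLAUDE_MEM_CONTEXT_FULL_COUNT: 5,                              // 0-20: observations shown in full (assumed key name)
  CLAUDE_MEM_CONTEXT_FULL_FIELD: "narrative",                    // narrative | facts (assumed key name)
};
```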
Privacy Control
Manual Privacy Tags:
# This won't be stored in memory
<private>
API_KEY = "sk-abc123..."
DATABASE_PASSWORD = "super-secret"
</private>
System-Level Tags:
<!-- Prevents recursive observation storage -->
<claude-mem-context>
Past observations here...
</claude-mem-context>
Edge Processing: Privacy tag stripping happens at the hook layer before data reaches the worker or database.
Web Viewer UI
Real-time memory visualization at http://localhost:37777:
Features:
- 🔴 Live stream of observations via Server-Sent Events
- 🔍 Full-text search across all stored data
- 📊 Project filtering - View memory by project/folder
- ⚙️ Settings panel - Configure context injection
- 📈 Token economics - See read costs, work investment, savings
- 🔗 Citations - Reference observations by ID
- 🎨 GPU-accelerated animations for smooth scrolling
Terminal Preview:
Shows exactly what will be injected at the start of your next Claude Code session for the selected project.
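For the curious, the live stream could in principle be consumed outside the viewer with the browser's EventSource API. The `/events` endpoint path and payload fields below are assumptions; the real route isn't documented here.

```typescript
// Subscribe to the worker's Server-Sent Events stream (endpoint path is hypothetical)
const stream = new EventSource("http://localhost:37777/events");

stream.onmessage = (event: MessageEvent<string>) => {
  const observation = JSON.parse(event.data); // assumed payload: { type, title, ... }
  console.log(`[${observation.type}] ${observation.title}`);
};

stream.onerror = () => {
  console.warn("claude-mem worker stream disconnected");
};
```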
Security & Privacy Analysis
✅ Good Security Practices
Local-Only Storage:
- All data stored in `~/.claude-mem/` on your machine
- No external API calls (except the Claude API for processing)
- No telemetry or tracking
- No cloud sync
Privacy Controls:
- `<private>` tags for excluding sensitive content
- Edge processing (stripping before the database)
- Configurable skip tools (exclude certain tool types)
- Manual control over what gets captured
Open Source:
- AGPL-3.0 license
- Code is auditable on GitHub
- Active community development
- Transparent architecture
⚠️ Security Considerations
Unencrypted Database:
- SQLite DB at `~/.claude-mem/claude-mem.db` is not encrypted
- Mitigation: Use full disk encryption (FileVault on macOS, BitLocker on Windows)
Localhost HTTP API:
- Worker service runs on port 37777 without authentication by default
- Mitigation: Firewall the port or bind to 127.0.0.1 only (default)
Automatic Capture:
- Everything Claude does is recorded unless you use `<private>` tags
- Risk: Forgetting to tag sensitive content
- Mitigation: Review stored data regularly via web viewer
AI Processing:
- Observations sent to Claude API for compression
- Uses your API key (same as Claude Code itself)
- Note: This is inherent to the compression feature
🔒 Security Hardening Recommendations
- Enable full disk encryption on your development machine
- Use `<private>` tags liberally for any sensitive data
- Firewall the worker port if you're on a shared network
- Set `CLAUDE_MEM_SKIP_TOOLS` to exclude tools you don't want captured
- Regularly audit stored data via the web viewer
- Review git commits before pushing (in case memory got committed)
- Add `~/.claude-mem/` to your global `.gitignore`
Performance Characteristics
Disk Space
Typical Usage:
- Light user (10 sessions/week): ~10-20 MB/month
- Heavy user (100 sessions/week): ~100-200 MB/month
- Database growth: Linear with observation count
- Storage location: `~/.claude-mem/claude-mem.db`
Maintenance:
# Check database size
du -h ~/.claude-mem/claude-mem.db
# Vacuum to reclaim space (safe, but locks DB temporarily)
sqlite3 ~/.claude-mem/claude-mem.db "VACUUM;"
Memory & CPU
Worker Service:
- RAM: ~100-200 MB typical, ~500 MB peak during heavy processing
- CPU: Minimal when idle, spikes during AI compression
- Process: Managed by Bun, auto-restarts on crashes
Hook Execution:
- SessionStart: 10-500ms (cached dependencies vs. fresh install)
- UserPromptSubmit: <10ms
- PostToolUse: <5ms per execution (async processing)
- Stop: 100-300ms (summary generation)
Token Usage
Context Injection (per session):
- 50 observations @ ~250 tokens each = ~12,500 tokens
- Cost (Claude Sonnet): ~$0.0375 per session start
- Savings vs. re-reading files: ~70-80% reduction
AI Compression (background):
- Processing 100 observations: ~50,000 tokens
- Cost (Claude Sonnet): ~$0.15 per 100 observations
- Amortized over reuse: Pays for itself in 2-3 sessions
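As a back-of-the-envelope check of these figures (assuming Claude Sonnet input pricing of roughly $3 per million tokens, which may change):

```typescript
// Rough token-economics math; the price constant is an assumption and will drift over time
const SONNET_INPUT_USD_PER_MILLION = 3.0;

const injectionTokens = 50 * 250;                 // 12,500 tokens per session start
const injectionCost = (injectionTokens / 1_000_000) * SONNET_INPUT_USD_PER_MILLION;     // ~$0.0375

const compressionTokens = 50_000;                 // per 100 observations processed
const compressionCost = (compressionTokens / 1_000_000) * SONNET_INPUT_USD_PER_MILLION; // ~$0.15

console.log({ injectionCost, compressionCost });
```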
Use Cases & Workflows
1. Long-Running Projects
Problem: Working on a project for weeks/months, Claude forgets past decisions.
With claude-mem:
- "Why did we choose Redis over Memcached?" → Instant answer from past discussion
- "What was that authentication gotcha we hit?" → Retrieved from discovery observations
- Consistent architectural decisions across sessions
2. Team Onboarding
Problem: New team members ask the same questions repeatedly.
With claude-mem:
- Share your `~/.claude-mem/claude-mem.db` (or exports)
- New teammates get institutional knowledge automatically
- Reduce context-gathering time by 70%+
3. Bug Investigation
Problem: Recurring bugs, hard to remember past fixes.
With claude-mem:
search({ query: "timeout error", type: "bugfix" })
- Instantly find similar past bugs
- See what solutions worked
- Avoid repeating failed approaches
4. Code Review Assistance
Problem: Reviewers lack context about design decisions.
With claude-mem:
- Claude knows why code was written a certain way
- Can explain trade-offs made during implementation
- References specific past discussions
5. Documentation Generation
Problem: Writing docs requires remembering entire project history.
With claude-mem:
- "Generate architecture docs for this project"
- Claude draws from all past sessions
- Includes decisions, trade-offs, gotchas
- Accurate because it witnessed the development
Comparison: Manual Memory vs. Claude-Mem
| Aspect | Manual (CLAUDE.md) | Claude-Mem |
|---|---|---|
| Setup effort | Write project docs manually | Install plugin, automatic |
| Maintenance | Update docs as code changes | Automatic observation capture |
| Coverage | Only what you document | Everything Claude does |
| Searchability | Ctrl+F in markdown | Full-text + semantic search |
| Token efficiency | Re-reads entire file | Progressive disclosure |
| Granularity | Project-level guidance | Observation-level detail |
| Privacy | You control content | Requires privacy tags |
| Cross-session | Static context | Dynamic, contextual |
| Learning curve | Minimal | Moderate (concepts, tools) |
Best Practice: Use both!
- `CLAUDE.md` for high-level project guidance
- claude-mem for detailed session history
Installation & Setup
Quick Start
# Install plugin via Claude Code CLI
/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem
# Restart Claude Code
# Memory will now persist automatically!
Verify Installation
# Check worker is running
ps aux | grep worker-service
# View logs
tail -f ~/.claude-mem/logs/worker-out.log
# Access web viewer
open http://localhost:37777
Configuration
# Edit settings
vim ~/.claude-mem/settings.json
# Restart worker to apply changes
cd ~/.claude/plugins/marketplaces/thedotmack
npm run worker:restart
Advanced Features
Folder Context Files
Auto-generates CLAUDE.md in project folders with activity timelines:
# Project Context
Last updated: 2024-01-15
## Recent Activity
- Fixed auth middleware race condition (2024-01-15)
- Refactored database connection pool (2024-01-14)
- Added rate limiting (2024-01-13)
## Key Decisions
- Using Redis for session storage (2024-01-10)
- Chose JWT over session cookies (2024-01-08)
Enable:
{
"CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED": true
}
Multilingual Support
Claude-mem supports 28 languages via mode configuration:
{
"CLAUDE_MEM_MODE": "code--es" // Spanish code mode
}
Supported languages: English, Spanish, Chinese, French, German, Japanese, Korean, Portuguese, Russian, Arabic, Hebrew, and more.
Mode System
Switch between workflow profiles:
- `code` - Standard development (default)
- `email-investigation` - Email/communication analysis
- `chill` - Casual coding sessions
{
"CLAUDE_MEM_MODE": "email-investigation"
}
Beta Channel
Try experimental features like "Endless Mode" (biomimetic memory architecture):
- Open the web viewer: http://localhost:37777
- Click the Settings gear icon
- Switch to Beta channel
- Your data is preserved, only plugin code changes
Troubleshooting
Worker Won't Start
# Check for port conflicts
lsof -i :37777
# Kill conflicting process
kill -9 <PID>
# Or change port
export CLAUDE_MEM_WORKER_PORT=38000
npm run worker:restart
Missing Context
# Verify observations are being captured
sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations;"
# Check last session
sqlite3 ~/.claude-mem/claude-mem.db "SELECT * FROM sessions ORDER BY created_at DESC LIMIT 1;"
# Enable debug logging
export CLAUDE_MEM_LOG_LEVEL=DEBUG
npm run worker:restart
Memory Growing Too Large
# Check database size
du -h ~/.claude-mem/claude-mem.db
# Archive old sessions before deleting (exports rows as plain text, not a restorable SQL dump)
sqlite3 ~/.claude-mem/claude-mem.db "SELECT * FROM sessions WHERE created_at < '2024-01-01';" > old-sessions.txt
# Delete old sessions (dangerous! backup first)
sqlite3 ~/.claude-mem/claude-mem.db "DELETE FROM sessions WHERE created_at < '2024-01-01';"
# Vacuum to reclaim space
sqlite3 ~/.claude-mem/claude-mem.db "VACUUM;"
Limitations & Trade-offs
⚠️ Current Limitations
Claude Code CLI Only:
- Does not work with Claude.ai web interface
- Does not work with other IDEs (VS Code, Cursor)
- Specifically designed for Claude Code CLI hooks
Local Storage Only:
- No cloud sync between machines
- Manual export/import for sharing
- No cross-device persistence
Manual Privacy:
- Requires remembering to use `<private>` tags
- Easy to accidentally capture sensitive data
- No automatic credential detection
API Costs:
- Background processing uses Claude API tokens
- ~$0.15 per 100 observations processed
- Can add up for very active projects
Resource Usage:
- Background worker service always running
- ~100-200 MB RAM minimum
- Database grows indefinitely (manual cleanup needed)
🎯 Design Trade-offs
Automatic vs. Manual:
- Benefit: Zero-effort memory capture
- Cost: Potential for capturing unwanted data
Compression vs. Fidelity:
- Benefit: 70-80% token reduction
- Cost: Some nuance lost in summarization
Local vs. Cloud:
- Benefit: Privacy and control
- Cost: No multi-device sync
Background Processing:
- Benefit: Non-blocking, async compression
- Cost: Slightly delayed memory availability
Future Directions
Based on the GitHub roadmap and community discussion:
Planned Features:
- IDE integrations (VS Code, Cursor via plugins)
- Cloud sync option (opt-in, encrypted)
- Automatic sensitive data detection
- Memory export/import tools
- Team memory sharing workflows
- Memory pruning/archival automation
- Enhanced vector search with better embeddings
- Multi-agent collaboration support
Community Requests:
- Integration with other AI assistants (GPT-4, Gemini)
- Memory visualization tools (graph view)
- Cost optimization (cheaper compression models)
- Privacy-preserving memory sharing (anonymization)
Alternatives & Comparisons
Claude.ai Projects (Web)
Similarities:
- Project-specific context
- File awareness
Differences:
- No session-to-session memory in web (yet)
- No automatic capture of tool usage
- No searchable history
VS Code Workspace Settings
Similarities:
- Project-level configuration
- Persists across sessions
Differences:
- Static configuration only
- No dynamic memory
- No AI compression
Git Commit History
Similarities:
- Historical record of changes
- Searchable
Differences:
- Captures code changes, not reasoning
- No AI summaries
- No connection to AI assistant context
Custom CLAUDE.md Files
Similarities:
- Persistent context
- Manual curation
Differences:
- Static vs. dynamic
- Requires manual updates
- No observation-level detail
Verdict: Claude-mem is complementary to all of these. Use it alongside existing tools.
Is Claude-Mem Right for You?
✅ Good Fit If You:
- Use Claude Code CLI regularly for development
- Work on long-running projects (weeks/months)
- Want automatic session memory without manual effort
- Need searchable project history
- Are comfortable with local database storage
- Have Claude API quota to spare for compression
- Value token efficiency (progressive disclosure)
❌ Not a Fit If You:
- Primarily use Claude.ai web interface (it won't work)
- Prefer explicit control over all stored data
- Work on highly sensitive projects requiring audit trails
- Are concerned about unencrypted local storage
- Don't want another background service running
- Have very limited Claude API budget
- Only do quick, one-off coding tasks
🤔 Try It If You're:
- Curious about AI agent memory systems
- Experimenting with workflow optimization
- Building a personal knowledge base of coding decisions
- Interested in the architecture (TypeScript/SQLite/MCP)
Getting Started: A Practical Workflow
Week 1: Installation & Familiarization
Day 1-2: Install & Observe
/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem
# Use Claude Code normally, don't change workflow yet
Day 3-4: Explore Memory
- Visit http://localhost:37777
- Browse captured observations
- See what types/concepts are being extracted
Day 5-7: Test Search
"What did we work on yesterday?"
"Find all bugfixes from last week"
"Show me database-related changes"
Week 2: Optimization
Configure Context Injection:
- Open Settings in web viewer
- Adjust observation count (start with 50)
- Filter by relevant types/concepts
- Monitor token usage
Add Privacy Tags:
<private>
API_KEY = "..."
</private>
Skip Noisy Tools:
{
"CLAUDE_MEM_SKIP_TOOLS": "ListMcpResourcesTool,SlashCommand,Skill,TodoWrite"
}
Week 3+: Advanced Usage
Leverage Progressive Disclosure:
"Search for authentication issues"
# Review index results
"Get full details for observations 104, 107, 112"
Use Timeline Queries:
"Show me what happened around observation 156"
Analyze Token Economics:
- Check "Savings" column in web viewer
- Optimize observation count vs. context quality
- Tune full observation display count
Conclusion
Claude-mem represents a significant step forward in AI agent memory systems. By automatically capturing, compressing, and intelligently retrieving context, it eliminates one of the biggest pain points in AI-assisted development: the loss of knowledge between sessions.
Key Takeaways:
- Automatic > Manual - Zero-effort capture beats manual documentation for session-level detail
- Compression Works - 70-80% token reduction proves AI summarization is effective
- Progressive Disclosure - Layer-based retrieval is the key to token efficiency
- Local-First - Privacy-conscious design with no cloud dependencies
- Open Source - Auditable, extensible, community-driven
The Future of AI Memory:
Claude-mem is pioneering techniques that will likely become standard in AI assistants:
- Lifecycle hooks for observation capture
- AI-driven compression of tool usage
- Progressive disclosure for context retrieval
- Local-first, privacy-respecting storage
As AI coding assistants become more capable, giving them better memory will be crucial for truly collaborative development. Claude-mem shows us what that future looks like.
Resources
- GitHub: github.com/thedotmack/claude-mem
- Documentation: docs.claude-mem.ai
- Discord: discord.com/invite/J4wttp9vDu
- Author: Alex Newman (@thedotmack)
- License: AGPL-3.0
- Latest Version: v9.0.0 (as of Feb 2026)
Have you tried claude-mem? What's your experience with AI agent memory systems? Share your thoughts in the comments!