Claude-Mem: Persistent Memory for AI Coding Assistants

How an open-source plugin gives Claude Code the ability to remember your entire development history

TL;DR

claude-mem is an open-source memory system for Claude Code that automatically captures your coding sessions, compresses them with AI, and injects relevant context into future sessions. Think of it as giving Claude a long-term memory that survives across restarts.

Key Features:

  • 🧠 Automatic capture of all tool usage
  • 📊 ~70-80% reduction in context tokens via AI compression and progressive disclosure
  • 🔍 Natural language search across your entire project history
  • 🔒 Local-only storage with privacy controls
  • ⚡ Built with TypeScript + SQLite + Bun

The Context Problem

AI coding assistants like Claude Code are incredibly powerful, but they share a fundamental limitation: they forget everything when the session ends.

The Traditional Workaround:

Developers have relied on several manual approaches:

  • Maintaining a CLAUDE.md file with project instructions
  • Copy-pasting relevant code into each conversation
  • Re-explaining architectural decisions repeatedly
  • Starting from scratch after every restart

The Cost:

  • Time wasted re-establishing context
  • Inconsistent knowledge across sessions
  • Lost insights from previous interactions
  • High token costs from redundant file reads

Enter Claude-Mem

Claude-mem solves this by implementing a persistent memory layer that sits between you and Claude Code. It automatically:

  1. Captures every tool execution (file reads, writes, searches)
  2. Compresses observations into semantic summaries using AI
  3. Indexes everything with full-text and vector search
  4. Injects relevant context at the start of each new session

Result: Claude remembers your project history without you lifting a finger.


Architecture Deep Dive

System Components

┌─────────────────────────────────────────────────────────┐
│                    Claude Code CLI                      │
└─────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────┐
│                  5 Lifecycle Hooks                      │
│  SessionStart | UserPromptSubmit | PostToolUse          │
│  Stop | SessionEnd                                      │
└─────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────┐
│                   Worker Service                        │
│  HTTP API (port 37777) + Background Processing          │
└─────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────┐
│                  Storage Layer                          │
│  SQLite + FTS5 Full-Text Search                         │
│  ChromaDB Vector Store (optional)                       │
└─────────────────────────────────────────────────────────┘

Technology Stack

| Layer            | Technology                      | Purpose                    |
|------------------|---------------------------------|----------------------------|
| Language         | TypeScript (ES2022)             | Type-safe plugin code      |
| Runtime          | Node.js 18+                     | Hook execution             |
| Process Manager  | Bun                             | Worker service management  |
| Database         | SQLite 3 + bun:sqlite           | Persistent storage         |
| Full-Text Search | FTS5                            | Fast text queries          |
| Vector Search    | ChromaDB (optional)             | Semantic similarity        |
| HTTP Server      | Express.js 4.18                 | Web API + viewer UI        |
| Real-time        | Server-Sent Events              | Live memory stream         |
| AI SDK           | @anthropic-ai/claude-agent-sdk  | Observation processing     |
| Build Tool       | esbuild                         | TypeScript bundling        |

How It Works: The Memory Pipeline

1. Session Start - Context Injection

When you start Claude Code:

// Context Hook (SessionStart)
1. Start Bun worker service if needed
2. Query last 10 sessions from SQLite
3. Retrieve top 50 observations (configurable)
4. Format as compressed summaries
5. Inject into Claude's system prompt
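
A minimal sketch of the query-and-format steps (2-4), written against bun:sqlite from the technology stack above; the table and column names here are assumptions, not the plugin's actual schema:

// context-assembly sketch (not the plugin's actual code); schema names are assumed
import { Database } from "bun:sqlite";
import { homedir } from "node:os";
import { join } from "node:path";

const db = new Database(join(homedir(), ".claude-mem", "claude-mem.db"), { readonly: true });

// Pull the most recent compressed observations (limit is configurable, default 50).
const observations = db.query(
  `SELECT o.id, o.type, o.title, o.narrative, s.created_at AS date
   FROM observations o
   JOIN sessions s ON s.id = o.session_id
   ORDER BY s.created_at DESC
   LIMIT ?`
).all(50) as Array<{ id: number; type: string; title: string; narrative: string; date: string }>;

// Format as a compact context block; printing it is one way a SessionStart hook
// could hand the block to Claude Code.
const body = observations
  .map((o) => `  <observation id="${o.id}" type="${o.type}" date="${o.date}">${o.title}: ${o.narrative}</observation>`)
  .join("\n");

console.log(`<claude-mem-context>\n${body}\n</claude-mem-context>`);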

What Claude sees:

<claude-mem-context>
  <session id="123" date="2024-01-15">
    <observation type="bugfix">
      Fixed authentication race condition in auth.ts
      - Added mutex lock to token refresh
      - Prevents duplicate API calls
      [Read cost: 250 tokens | Created from: 1,200 tokens]
    </observation>
  </session>
</claude-mem-context>

Token Economics:

  • Without claude-mem: Re-read entire auth.ts file every session (1,200 tokens)
  • With claude-mem: Inject compressed summary (250 tokens)
  • Savings: 950 tokens (~79% reduction)

2. User Prompt - Session Creation

When you type a prompt:

// New Hook (UserPromptSubmit)
1. Create session record in SQLite
2. Save raw prompt for full-text search
3. Associate with current project/folder
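
The session-creation step is essentially a single insert; a sketch with bun:sqlite and assumed column names:

// session creation sketch -- column names are assumptions
import { Database } from "bun:sqlite";

export function createSession(db: Database, project: string, prompt: string): void {
  // The raw prompt is stored so it stays reachable via full-text search.
  db.query(
    "INSERT INTO sessions (project, prompt, created_at) VALUES (?, ?, datetime('now'))"
  ).run(project, prompt);
}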

3. Tool Execution - Observation Capture

Every time Claude uses a tool (this hook can fire 100+ times per session):

// Save Hook (PostToolUse)
1. Capture tool input/output
2. Strip <private> tags (edge processing)
3. Queue observation for processing
4. Send to worker service via HTTP
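
A hedged sketch of that hook body; the payload field names and the worker route are assumptions (only the port number comes from the configuration shown later):

// post-tool-use sketch -- payload fields and endpoint path are assumptions
type ToolEvent = { tool_name: string; tool_input: unknown; tool_output: unknown };

async function queueObservation(event: ToolEvent): Promise<void> {
  // Edge processing: drop <private>...</private> blocks before data leaves the hook.
  const sanitized = JSON.stringify(event).replace(/<private>[\s\S]*?<\/private>/g, "");

  const port = process.env.CLAUDE_MEM_WORKER_PORT ?? "37777";
  // Hand off to the worker service; "/observations" is a hypothetical route.
  await fetch(`http://127.0.0.1:${port}/observations`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: sanitized,
  });
}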

4. Background Processing - AI Compression

Worker service processes observations asynchronously:

// Worker Service (Claude Agent SDK)
1. Batch observations for efficiency
2. Send to Claude API for analysis
3. Extract structured learnings:
   - Type: bugfix | feature | refactor | discovery
   - Concepts: how-it-works | gotcha | trade-off
   - Narrative: Human-readable summary
   - Facts: Key technical details
4. Store compressed results in SQLite
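
A rough sketch of that loop; the observation schema is assumed and the summarize call is a stand-in for the real @anthropic-ai/claude-agent-sdk usage, whose API is not reproduced here:

// background compression sketch -- schema assumed, SDK call stubbed out
import { Database } from "bun:sqlite";

type Compressed = {
  type: "bugfix" | "feature" | "refactor" | "discovery";
  concepts: string[];
  narrative: string;
  facts: Record<string, string>;
};

// Stand-in for the real call made through the Claude Agent SDK.
async function summarizeWithClaude(prompt: string): Promise<Compressed[]> {
  throw new Error("wire this up to @anthropic-ai/claude-agent-sdk");
}

async function processPending(db: Database, batchSize = 20): Promise<void> {
  // Batch unprocessed observations so one API call covers many tool executions.
  const pending = db.query(
    "SELECT id, tool, payload FROM observations WHERE processed = 0 LIMIT ?"
  ).all(batchSize) as Array<{ id: number; tool: string; payload: string }>;
  if (pending.length === 0) return;

  const prompt =
    "Summarize each tool execution as {type, concepts, narrative, facts}:\n" +
    pending.map((o) => `#${o.id} [${o.tool}] ${o.payload}`).join("\n");
  const summaries = await summarizeWithClaude(prompt);

  // Persist the compressed results next to the raw observations.
  summaries.forEach((summary, i) => {
    db.query("UPDATE observations SET summary = ?, processed = 1 WHERE id = ?")
      .run(JSON.stringify(summary), pending[i].id);
  });
}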

Example Transformation:

Raw Tool Execution (1,500 tokens):

{
  "tool": "Read",
  "path": "src/auth/middleware.ts",
  "content": "... (entire file contents) ..."
}

Compressed Observation (300 tokens):

{
  "type": "discovery",
  "concepts": ["how-it-works", "gotcha"],
  "title": "Auth middleware token validation flow",
  "narrative": "Middleware checks JWT expiry before route access. Gotcha: Clock skew tolerance of 60s can cause confusion.",
  "facts": {
    "file": "src/auth/middleware.ts",
    "key_function": "validateToken()",
    "edge_case": "Clock skew within 60s accepted"
  }
}

5. Session End - Summary Generation

When Claude stops or you end the session:

// Summary Hook (Stop)
1. Generate session-level summary
2. Aggregate all observations
3. Store completions, learnings, next steps
4. Mark session as complete
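
Sketched as an aggregate-and-mark-complete step; the summary and status columns are assumptions:

// session summary sketch -- summary/status columns are assumptions
import { Database } from "bun:sqlite";

function finalizeSession(db: Database, sessionId: number): void {
  // Collect this session's compressed observations.
  const rows = db.query(
    "SELECT title FROM observations WHERE session_id = ? ORDER BY id"
  ).all(sessionId) as Array<{ title: string }>;

  const summary = rows.map((r) => `- ${r.title}`).join("\n");

  // Store the session-level summary and mark the session complete.
  db.query("UPDATE sessions SET summary = ?, status = 'complete' WHERE id = ?")
    .run(summary, sessionId);
}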

Progressive Disclosure: The Token-Efficiency Secret

One of claude-mem's cleverest features is progressive disclosure, a three-layer retrieval pattern that minimizes token usage:

The 3-Layer Workflow

Layer 1: Index Search (~50-100 tokens/result)

search(query="authentication bug", type="bugfix", limit=20)

Returns compact index with IDs, titles, types, dates.

Layer 2: Timeline Context (~200 tokens/result)

timeline(observation_id=123, before=2, after=2)

Shows chronological context around interesting observations.

Layer 3: Full Details (~500-1,000 tokens/result)

get_observations(ids=[123, 456, 789])

Fetches complete narratives and facts for selected IDs only.

Token Savings Example

Without Progressive Disclosure:

Fetch 20 full observations upfront: 10,000-20,000 tokens

With Progressive Disclosure:

1. Search index: ~1,000 tokens
2. Review, identify 3 relevant IDs
3. Fetch only those 3: ~1,500-3,000 tokens
Total: 2,500-4,000 tokens (~75% savings)
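
In code form, the pattern looks roughly like this, written against a hypothetical client wrapper around the MCP tools described in the next section (the wrapper names and the shortlist heuristic are illustrative):

// three-layer retrieval sketch -- client wrapper is hypothetical
type IndexEntry = { id: number; title: string; type: string; date: string };

interface MemoryClient {
  search(args: { query: string; type?: string; limit?: number }): Promise<IndexEntry[]>;
  getObservations(args: { ids: number[] }): Promise<unknown[]>;
}

async function investigate(mem: MemoryClient, topic: string) {
  // Layer 1: cheap index search (~50-100 tokens per result).
  const index = await mem.search({ query: topic, type: "bugfix", limit: 20 });

  // Layer 2 would widen context around interesting hits via timeline();
  // here we simply shortlist a few entries.
  const shortlist = index.slice(0, 3);

  // Layer 3: fetch full narratives and facts only for the shortlist.
  return mem.getObservations({ ids: shortlist.map((e) => e.id) });
}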

MCP Search Tools

Claude-mem provides three Model Context Protocol (MCP) tools that Claude can invoke automatically:

Available Tools

1. search - Query the Index

// Natural language or structured queries
search({
  query: "authentication bug",
  type: "bugfix",
  dateFrom: "2024-01-01",
  limit: 20
})

2. timeline - Chronological Context

// What happened before/after an observation?
timeline({
  observation_id: 123,
  before: 2,  // 2 observations before
  after: 2    // 2 observations after
})

3. get_observations - Full Details

// Batch fetch by IDs
get_observations({
  ids: [123, 456, 789]
})

Auto-Invocation

Claude recognizes natural language queries and automatically uses these tools:

You: "What bugs did we fix last week?"

Claude (internally):

// 1. Search for recent bugfixes
search({ query: "bug", type: "bugfix", limit: 10 })

// 2. Review results, identify relevant IDs

// 3. Fetch full details
get_observations({ ids: [104, 107, 112] })

Claude (to you): "Last week we fixed three bugs: [detailed summary]..."


Configuration & Control

Core Settings

Managed in ~/.claude-mem/settings.json:

{
  "CLAUDE_MEM_MODEL": "sonnet",
  "CLAUDE_MEM_PROVIDER": "claude",
  "CLAUDE_MEM_CONTEXT_OBSERVATIONS": 50,
  "CLAUDE_MEM_WORKER_PORT": 37777,
  "CLAUDE_MEM_LOG_LEVEL": "INFO"
}

Context Injection Control

Fine-grained control over what gets injected:

Loading Settings:

  • CONTEXT_OBSERVATIONS (1-200): Number of observations to inject
  • CONTEXT_SESSION_COUNT (1-50): Number of recent sessions to pull from

Filter Settings:

  • Types: bugfix, feature, refactor, discovery, decision, change
  • Concepts: how-it-works, why-it-exists, gotcha, pattern, trade-off

Display Settings:

  • CONTEXT_FULL_COUNT (0-20): How many observations show full details
  • CONTEXT_FULL_FIELD: narrative | facts
  • Token economics visibility toggles
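
Combined, those knobs live in ~/.claude-mem/settings.json; CLAUDE_MEM_CONTEXT_OBSERVATIONS appears in the core settings above, while the exact spelling of the other keys is an assumption based on the names listed in this section:

{
  "CLAUDE_MEM_CONTEXT_OBSERVATIONS": 50,
  "CLAUDE_MEM_CONTEXT_SESSION_COUNT": 10,  // key name assumed
  "CLAUDE_MEM_CONTEXT_FULL_COUNT": 5,      // key name assumed
  "CLAUDE_MEM_CONTEXT_FULL_FIELD": "narrative"  // key name assumed
}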

Privacy Control

Manual Privacy Tags:

# This won't be stored in memory
<private>
API_KEY = "sk-abc123..."
DATABASE_PASSWORD = "super-secret"
</private>

System-Level Tags:

<!-- Prevents recursive observation storage -->
<claude-mem-context>
  Past observations here...
</claude-mem-context>

Edge Processing: Privacy tag stripping happens at the hook layer before data reaches the worker or database.


Web Viewer UI

Real-time memory visualization at http://localhost:37777:

Features:

  • 🔴 Live stream of observations via Server-Sent Events
  • 🔍 Full-text search across all stored data
  • 📊 Project filtering - View memory by project/folder
  • ⚙️ Settings panel - Configure context injection
  • 📈 Token economics - See read costs, work investment, savings
  • 🔗 Citations - Reference observations by ID
  • 🎨 GPU-accelerated animations for smooth scrolling

Terminal Preview:
Shows exactly what will be injected at the start of your next Claude Code session for the selected project.


Security & Privacy Analysis

✅ Good Security Practices

Local-Only Storage:

  • All data stored in ~/.claude-mem/ on your machine
  • No external network calls other than the Claude API used for compression
  • No telemetry or tracking
  • No cloud sync

Privacy Controls:

  • <private> tags for excluding sensitive content
  • Edge processing (stripping before database)
  • Configurable skip tools (exclude certain tool types)
  • Manual control over what gets captured

Open Source:

  • AGPL-3.0 license
  • Code is auditable on GitHub
  • Active community development
  • Transparent architecture

⚠️ Security Considerations

Unencrypted Database:

  • SQLite DB at ~/.claude-mem/claude-mem.db is plain text
  • Mitigation: Use full disk encryption (FileVault on macOS, BitLocker on Windows)

Localhost HTTP API:

  • Worker service runs on port 37777 without authentication by default
  • Mitigation: The service binds to 127.0.0.1 by default; firewall the port if you're on a shared network

Automatic Capture:

  • Everything Claude does is recorded unless you use <private> tags
  • Risk: Forgetting to tag sensitive content
  • Mitigation: Review stored data regularly via web viewer

AI Processing:

  • Observations sent to Claude API for compression
  • Uses your API key (same as Claude Code itself)
  • Note: This is inherent to the compression feature

🔒 Security Hardening Recommendations

  1. Enable full disk encryption on your development machine
  2. Use <private> tags liberally for any sensitive data
  3. Firewall the worker port if you're on a shared network
  4. Set CLAUDE_MEM_SKIP_TOOLS to exclude tools you don't want captured
  5. Regularly audit stored data via the web viewer
  6. Review git commits before pushing (in case memory got committed)
  7. Add ~/.claude-mem/ to .gitignore globally

Performance Characteristics

Disk Space

Typical Usage:

  • Light user (10 sessions/week): ~10-20 MB/month
  • Heavy user (100 sessions/week): ~100-200 MB/month
  • Database growth: Linear with observation count
  • Storage location: ~/.claude-mem/claude-mem.db

Maintenance:

# Check database size
du -h ~/.claude-mem/claude-mem.db

# Vacuum to reclaim space (safe, but locks DB temporarily)
sqlite3 ~/.claude-mem/claude-mem.db "VACUUM;"

Memory & CPU

Worker Service:

  • RAM: ~100-200 MB typical, ~500 MB peak during heavy processing
  • CPU: Minimal when idle, spikes during AI compression
  • Process: Managed by Bun, auto-restarts on crashes

Hook Execution:

  • SessionStart: 10-500ms (cached dependencies vs. fresh install)
  • UserPromptSubmit: <10ms
  • PostToolUse: <5ms per execution (async processing)
  • Stop: 100-300ms (summary generation)

Token Usage

Context Injection (per session):

  • 50 observations @ ~250 tokens each = ~12,500 tokens
  • Cost (Claude Sonnet): ~$0.0375 per session start
  • Savings vs. re-reading files: ~70-80% reduction

AI Compression (background):

  • Processing 100 observations: ~50,000 tokens
  • Cost (Claude Sonnet): ~$0.15 per 100 observations
  • Amortized over reuse: Pays for itself in 2-3 sessions

Use Cases & Workflows

1. Long-Running Projects

Problem: Working on a project for weeks/months, Claude forgets past decisions.

With claude-mem:

  • "Why did we choose Redis over Memcached?" → Instant answer from past discussion
  • "What was that authentication gotcha we hit?" → Retrieved from discovery observations
  • Consistent architectural decisions across sessions

2. Team Onboarding

Problem: New team members ask the same questions repeatedly.

With claude-mem:

  • Share your ~/.claude-mem/claude-mem.db (or exports)
  • New teammates get institutional knowledge automatically
  • Reduce context-gathering time by 70%+

3. Bug Investigation

Problem: Recurring bugs, hard to remember past fixes.

With claude-mem:

search({ query: "timeout error", type: "bugfix" })
  • Instantly find similar past bugs
  • See what solutions worked
  • Avoid repeating failed approaches

4. Code Review Assistance

Problem: Reviewers lack context about design decisions.

With claude-mem:

  • Claude knows why code was written a certain way
  • Can explain trade-offs made during implementation
  • References specific past discussions

5. Documentation Generation

Problem: Writing docs requires remembering entire project history.

With claude-mem:

  • "Generate architecture docs for this project"
  • Claude draws from all past sessions
  • Includes decisions, trade-offs, gotchas
  • Accurate because it witnessed the development

Comparison: Manual Memory vs. Claude-Mem

| Aspect           | Manual (CLAUDE.md)           | Claude-Mem                    |
|------------------|------------------------------|-------------------------------|
| Setup effort     | Write project docs manually  | Install plugin, automatic     |
| Maintenance      | Update docs as code changes  | Automatic observation capture |
| Coverage         | Only what you document       | Everything Claude does        |
| Searchability    | Ctrl+F in markdown           | Full-text + semantic search   |
| Token efficiency | Re-reads entire file         | Progressive disclosure        |
| Granularity      | Project-level guidance       | Observation-level detail      |
| Privacy          | You control content          | Requires privacy tags         |
| Cross-session    | Static context               | Dynamic, contextual           |
| Learning curve   | Minimal                      | Moderate (concepts, tools)    |

Best Practice: Use both!

  • CLAUDE.md for high-level project guidance
  • claude-mem for detailed session history

Installation & Setup

Quick Start

# Install plugin via Claude Code CLI
/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem

# Restart Claude Code
# Memory will now persist automatically!

Verify Installation

# Check worker is running
ps aux | grep worker-service

# View logs
tail -f ~/.claude-mem/logs/worker-out.log

# Access web viewer
open http://localhost:37777

Configuration

# Edit settings
vim ~/.claude-mem/settings.json

# Restart worker to apply changes
cd ~/.claude/plugins/marketplaces/thedotmack
npm run worker:restart

Advanced Features

Folder Context Files

Auto-generates CLAUDE.md in project folders with activity timelines:

# Project Context

Last updated: 2024-01-15

## Recent Activity
- Fixed auth middleware race condition (2024-01-15)
- Refactored database connection pool (2024-01-14)
- Added rate limiting (2024-01-13)

## Key Decisions
- Using Redis for session storage (2024-01-10)
- Chose JWT over session cookies (2024-01-08)

Enable:

{
  "CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED": true
}

Multilingual Support

Claude-mem supports 28 languages via mode configuration:

{
  "CLAUDE_MEM_MODE": "code--es"  // Spanish code mode
}

Supported languages: English, Spanish, Chinese, French, German, Japanese, Korean, Portuguese, Russian, Arabic, Hebrew, and more.

Mode System

Switch between workflow profiles:

  • code - Standard development (default)
  • email-investigation - Email/communication analysis
  • chill - Casual coding sessions

{
  "CLAUDE_MEM_MODE": "email-investigation"
}

Beta Channel

Try experimental features like "Endless Mode" (biomimetic memory architecture):

  1. Open web viewer: http://localhost:37777
  2. Click Settings gear icon
  3. Switch to Beta channel
  4. Your data is preserved; only the plugin code changes

Troubleshooting

Worker Won't Start

# Check for port conflicts
lsof -i :37777

# Kill conflicting process
kill -9 <PID>

# Or change port
export CLAUDE_MEM_WORKER_PORT=38000
npm run worker:restart

Missing Context

# Verify observations are being captured
sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations;"

# Check last session
sqlite3 ~/.claude-mem/claude-mem.db "SELECT * FROM sessions ORDER BY created_at DESC LIMIT 1;"

# Enable debug logging
export CLAUDE_MEM_LOG_LEVEL=DEBUG
npm run worker:restart

Memory Growing Too Large

# Check database size
du -h ~/.claude-mem/claude-mem.db

# Archive old sessions (exports matching rows as plain text)
sqlite3 ~/.claude-mem/claude-mem.db "SELECT * FROM sessions WHERE created_at < '2024-01-01';" > old-sessions.txt

# Delete old sessions (dangerous! backup first)
sqlite3 ~/.claude-mem/claude-mem.db "DELETE FROM sessions WHERE created_at < '2024-01-01';"

# Vacuum to reclaim space
sqlite3 ~/.claude-mem/claude-mem.db "VACUUM;"

Limitations & Trade-offs

⚠️ Current Limitations

Claude Code CLI Only:

  • Does not work with Claude.ai web interface
  • Does not work with other IDEs (VS Code, Cursor)
  • Specifically designed for Claude Code CLI hooks

Local Storage Only:

  • No cloud sync between machines
  • Manual export/import for sharing
  • No cross-device persistence

Manual Privacy:

  • Requires remembering to use <private> tags
  • Easy to accidentally capture sensitive data
  • No automatic credential detection

API Costs:

  • Background processing uses Claude API tokens
  • ~$0.15 per 100 observations processed
  • Can add up for very active projects

Resource Usage:

  • Background worker service always running
  • ~100-200 MB RAM minimum
  • Database grows indefinitely (manual cleanup needed)

🎯 Design Trade-offs

Automatic vs. Manual:

  • Benefit: Zero-effort memory capture
  • Cost: Potential for capturing unwanted data

Compression vs. Fidelity:

  • Benefit: 70-80% token reduction
  • Cost: Some nuance lost in summarization

Local vs. Cloud:

  • Benefit: Privacy and control
  • Cost: No multi-device sync

Background Processing:

  • Benefit: Non-blocking, async compression
  • Cost: Slightly delayed memory availability

Future Directions

Based on the GitHub roadmap and community discussion:

Planned Features:

  • IDE integrations (VS Code, Cursor via plugins)
  • Cloud sync option (opt-in, encrypted)
  • Automatic sensitive data detection
  • Memory export/import tools
  • Team memory sharing workflows
  • Memory pruning/archival automation
  • Enhanced vector search with better embeddings
  • Multi-agent collaboration support

Community Requests:

  • Integration with other AI assistants (GPT-4, Gemini)
  • Memory visualization tools (graph view)
  • Cost optimization (cheaper compression models)
  • Privacy-preserving memory sharing (anonymization)

Alternatives & Comparisons

Claude.ai Projects (Web)

Similarities:

  • Project-specific context
  • File awareness

Differences:

  • No session-to-session memory in web (yet)
  • No automatic capture of tool usage
  • No searchable history

VS Code Workspace Settings

Similarities:

  • Project-level configuration
  • Persists across sessions

Differences:

  • Static configuration only
  • No dynamic memory
  • No AI compression

Git Commit History

Similarities:

  • Historical record of changes
  • Searchable

Differences:

  • Captures code changes, not reasoning
  • No AI summaries
  • No connection to AI assistant context

Custom CLAUDE.md Files

Similarities:

  • Persistent context
  • Manual curation

Differences:

  • Static vs. dynamic
  • Requires manual updates
  • No observation-level detail

Verdict: Claude-mem is complementary to all of these. Use it alongside existing tools.


Is Claude-Mem Right for You?

✅ Good Fit If You:

  • Use Claude Code CLI regularly for development
  • Work on long-running projects (weeks/months)
  • Want automatic session memory without manual effort
  • Need searchable project history
  • Are comfortable with local database storage
  • Have Claude API quota to spare for compression
  • Value token efficiency (progressive disclosure)

❌ Not a Fit If You:

  • Primarily use Claude.ai web interface (it won't work)
  • Prefer explicit control over all stored data
  • Work on highly sensitive projects requiring audit trails
  • Are concerned about unencrypted local storage
  • Don't want another background service running
  • Have very limited Claude API budget
  • Only do quick, one-off coding tasks

🤔 Try It If You're:

  • Curious about AI agent memory systems
  • Experimenting with workflow optimization
  • Building a personal knowledge base of coding decisions
  • Interested in the architecture (TypeScript/SQLite/MCP)

Getting Started: A Practical Workflow

Week 1: Installation & Familiarization

Day 1-2: Install & Observe

/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem
# Use Claude Code normally, don't change workflow yet

Day 3-4: Explore Memory

  • Visit http://localhost:37777
  • Browse captured observations
  • See what types/concepts are being extracted

Day 5-7: Test Search

"What did we work on yesterday?"
"Find all bugfixes from last week"
"Show me database-related changes"

Week 2: Optimization

Configure Context Injection:

  • Open Settings in web viewer
  • Adjust observation count (start with 50)
  • Filter by relevant types/concepts
  • Monitor token usage

Add Privacy Tags:

<private>
API_KEY = "..."
</private>

Skip Noisy Tools:

{
  "CLAUDE_MEM_SKIP_TOOLS": "ListMcpResourcesTool,SlashCommand,Skill,TodoWrite"
}

Week 3+: Advanced Usage

Leverage Progressive Disclosure:

"Search for authentication issues"
# Review index results
"Get full details for observations 104, 107, 112"

Use Timeline Queries:

"Show me what happened around observation 156"

Analyze Token Economics:

  • Check "Savings" column in web viewer
  • Optimize observation count vs. context quality
  • Tune full observation display count

Conclusion

Claude-mem represents a significant step forward in AI agent memory systems. By automatically capturing, compressing, and intelligently retrieving context, it eliminates one of the biggest pain points in AI-assisted development: the loss of knowledge between sessions.

Key Takeaways:

  1. Automatic > Manual - Zero-effort capture beats manual documentation for session-level detail
  2. Compression Works - 70-80% token reduction proves AI summarization is effective
  3. Progressive Disclosure - Layer-based retrieval is the key to token efficiency
  4. Local-First - Privacy-conscious design with no cloud dependencies
  5. Open Source - Auditable, extensible, community-driven

The Future of AI Memory:

Claude-mem is pioneering techniques that will likely become standard in AI assistants:

  • Lifecycle hooks for observation capture
  • AI-driven compression of tool usage
  • Progressive disclosure for context retrieval
  • Local-first, privacy-respecting storage

As AI coding assistants become more capable, giving them better memory will be crucial for truly collaborative development. Claude-mem shows us what that future looks like.


Have you tried claude-mem? What's your experience with AI agent memory systems? Share your thoughts in the comments!