Claude-Mem: Persistent Memory for AI Coding Assistants

How an open-source plugin gives Claude Code the ability to remember your entire development history

TL;DR

claude-mem is an open-source memory system for Claude Code that automatically captures your coding sessions, compresses them with AI, and injects relevant context into future sessions. Think of it as giving Claude a long-term memory that survives across restarts.

Key Features:

  • 🧠 Automatic capture of all tool usage
  • 📊 ~70-80% reduction in context tokens via AI compression and progressive disclosure
  • 🔍 Natural language search across your entire project history
  • 🔒 Local-only storage with privacy controls
  • ⚡ Built with TypeScript + SQLite + Bun

The Context Problem

AI coding assistants like Claude Code are incredibly powerful, but they share a fundamental limitation: they forget everything when the session ends.

The Traditional Workaround:

Developers have relied on several manual approaches:

  • Maintaining a CLAUDE.md file with project instructions
  • Copy-pasting relevant code into each conversation
  • Re-explaining architectural decisions repeatedly
  • Starting from scratch after every restart

The Cost:

  • Time wasted re-establishing context
  • Inconsistent knowledge across sessions
  • Lost insights from previous interactions
  • High token costs from redundant file reads

Enter Claude-Mem

Claude-mem solves this by implementing a persistent memory layer that sits between you and Claude Code. It automatically:

  1. Captures every tool execution (file reads, writes, searches)
  2. Compresses observations into semantic summaries using AI
  3. Indexes everything with full-text and vector search
  4. Injects relevant context at the start of each new session

Result: Claude remembers your project history without you lifting a finger.


Architecture Deep Dive

System Components

┌─────────────────────────────────────────────────────────┐
│                    Claude Code CLI                      │
└─────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────┐
│                  5 Lifecycle Hooks                      │
│  SessionStart | UserPromptSubmit | PostToolUse          │
│  Stop | SessionEnd                                      │
└─────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────┐
│                   Worker Service                        │
│  HTTP API (port 37777) + Background Processing          │
└─────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────┐
│                  Storage Layer                          │
│  SQLite + FTS5 Full-Text Search                         │
│  ChromaDB Vector Store (optional)                       │
└─────────────────────────────────────────────────────────┘

Technology Stack

| Layer            | Technology                      | Purpose                    |
|------------------|---------------------------------|----------------------------|
| Language         | TypeScript (ES2022)             | Type-safe plugin code      |
| Runtime          | Node.js 18+                     | Hook execution             |
| Process Manager  | Bun                             | Worker service management  |
| Database         | SQLite 3 + bun:sqlite           | Persistent storage         |
| Full-Text Search | FTS5                            | Fast text queries          |
| Vector Search    | ChromaDB (optional)             | Semantic similarity        |
| HTTP Server      | Express.js 4.18                 | Web API + viewer UI        |
| Real-time        | Server-Sent Events              | Live memory stream         |
| AI SDK           | @anthropic-ai/claude-agent-sdk  | Observation processing     |
| Build Tool       | esbuild                         | TypeScript bundling        |

How It Works: The Memory Pipeline

1. Session Start - Context Injection

When you start Claude Code:

// Context Hook (SessionStart)
1. Start Bun worker service if needed
2. Query last 10 sessions from SQLite
3. Retrieve top 50 observations (configurable)
4. Format as compressed summaries
5. Inject into Claude's system prompt
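
A minimal sketch of the query-and-format steps (2-4), written against bun:sqlite from the technology stack above; the table and column names here are assumptions, not the plugin's actual schema:

// context-assembly sketch (not the plugin's actual code); schema names are assumed
import { Database } from "bun:sqlite";
import { homedir } from "node:os";
import { join } from "node:path";

const db = new Database(join(homedir(), ".claude-mem", "claude-mem.db"), { readonly: true });

// Pull the most recent compressed observations (limit is configurable, default 50).
const observations = db.query(
  `SELECT o.id, o.type, o.title, o.narrative, s.created_at AS date
   FROM observations o
   JOIN sessions s ON s.id = o.session_id
   ORDER BY s.created_at DESC
   LIMIT ?`
).all(50) as Array<{ id: number; type: string; title: string; narrative: string; date: string }>;

// Format as a compact context block; printing it is one way a SessionStart hook
// could hand the block to Claude Code.
const body = observations
  .map((o) => `  <observation id="${o.id}" type="${o.type}" date="${o.date}">${o.title}: ${o.narrative}</observation>`)
  .join("\n");

console.log(`<claude-mem-context>\n${body}\n</claude-mem-context>`);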

What Claude sees:

<claude-mem-context>
  <session id="123" date="2024-01-15">
    <observation type="bugfix">
      Fixed authentication race condition in auth.ts
      - Added mutex lock to token refresh
      - Prevents duplicate API calls
      [Read cost: 250 tokens | Created from: 1,200 tokens]
    </observation>
  </session>
</claude-mem-context>

Token Economics:

  • Without claude-mem: Re-read entire auth.ts file every session (1,200 tokens)
  • With claude-mem: Inject compressed summary (250 tokens)
  • Savings: 950 tokens (~79% reduction)

2. User Prompt - Session Creation

When you type a prompt:

// New Hook (UserPromptSubmit)
1. Create session record in SQLite
2. Save raw prompt for full-text search
3. Associate with current project/folder
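
The session-creation step is essentially a single insert; a sketch with bun:sqlite and assumed column names:

// session creation sketch -- column names are assumptions
import { Database } from "bun:sqlite";

export function createSession(db: Database, project: string, prompt: string): void {
  // The raw prompt is stored so it stays reachable via full-text search.
  db.query(
    "INSERT INTO sessions (project, prompt, created_at) VALUES (?, ?, datetime('now'))"
  ).run(project, prompt);
}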

3. Tool Execution - Observation Capture

Every time Claude uses a tool (this hook can fire 100+ times per session):

// Save Hook (PostToolUse)
1. Capture tool input/output
2. Strip <private> tags (edge processing)
3. Queue observation for processing
4. Send to worker service via HTTP
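
A hedged sketch of that hook body; the payload field names and the worker route are assumptions (only the port number comes from the configuration shown later):

// post-tool-use sketch -- payload fields and endpoint path are assumptions
type ToolEvent = { tool_name: string; tool_input: unknown; tool_output: unknown };

async function queueObservation(event: ToolEvent): Promise<void> {
  // Edge processing: drop <private>...</private> blocks before data leaves the hook.
  const sanitized = JSON.stringify(event).replace(/<private>[\s\S]*?<\/private>/g, "");

  const port = process.env.CLAUDE_MEM_WORKER_PORT ?? "37777";
  // Hand off to the worker service; "/observations" is a hypothetical route.
  await fetch(`http://127.0.0.1:${port}/observations`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: sanitized,
  });
}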

4. Background Processing - AI Compression

Worker service processes observations asynchronously:

// Worker Service (Claude Agent SDK)
1. Batch observations for efficiency
2. Send to Claude API for analysis
3. Extract structured learnings:
   - Type: bugfix | feature | refactor | discovery
   - Concepts: how-it-works | gotcha | trade-off
   - Narrative: Human-readable summary
   - Facts: Key technical details
4. Store compressed results in SQLite
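
A rough sketch of that loop; the observation schema is assumed and the summarize call is a stand-in for the real @anthropic-ai/claude-agent-sdk usage, whose API is not reproduced here:

// background compression sketch -- schema assumed, SDK call stubbed out
import { Database } from "bun:sqlite";

type Compressed = {
  type: "bugfix" | "feature" | "refactor" | "discovery";
  concepts: string[];
  narrative: string;
  facts: Record<string, string>;
};

// Stand-in for the real call made through the Claude Agent SDK.
async function summarizeWithClaude(prompt: string): Promise<Compressed[]> {
  throw new Error("wire this up to @anthropic-ai/claude-agent-sdk");
}

async function processPending(db: Database, batchSize = 20): Promise<void> {
  // Batch unprocessed observations so one API call covers many tool executions.
  const pending = db.query(
    "SELECT id, tool, payload FROM observations WHERE processed = 0 LIMIT ?"
  ).all(batchSize) as Array<{ id: number; tool: string; payload: string }>;
  if (pending.length === 0) return;

  const prompt =
    "Summarize each tool execution as {type, concepts, narrative, facts}:\n" +
    pending.map((o) => `#${o.id} [${o.tool}] ${o.payload}`).join("\n");
  const summaries = await summarizeWithClaude(prompt);

  // Persist the compressed results next to the raw observations.
  summaries.forEach((summary, i) => {
    db.query("UPDATE observations SET summary = ?, processed = 1 WHERE id = ?")
      .run(JSON.stringify(summary), pending[i].id);
  });
}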

Example Transformation:

Raw Tool Execution (1,500 tokens):

{
  "tool": "Read",
  "path": "src/auth/middleware.ts",
  "content": "... (entire file contents) ..."
}

Compressed Observation (300 tokens):

{
  "type": "discovery",
  "concepts": ["how-it-works", "gotcha"],
  "title": "Auth middleware token validation flow",
  "narrative": "Middleware checks JWT expiry before route access. Gotcha: Clock skew tolerance of 60s can cause confusion.",
  "facts": {
    "file": "src/auth/middleware.ts",
    "key_function": "validateToken()",
    "edge_case": "Clock skew within 60s accepted"
  }
}

5. Session End - Summary Generation

When Claude stops or you end the session:

// Summary Hook (Stop)
1. Generate session-level summary
2. Aggregate all observations
3. Store completions, learnings, next steps
4. Mark session as complete
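
Sketched as an aggregate-and-mark-complete step; the summary and status columns are assumptions:

// session summary sketch -- summary/status columns are assumptions
import { Database } from "bun:sqlite";

function finalizeSession(db: Database, sessionId: number): void {
  // Collect this session's compressed observations.
  const rows = db.query(
    "SELECT title FROM observations WHERE session_id = ? ORDER BY id"
  ).all(sessionId) as Array<{ title: string }>;

  const summary = rows.map((r) => `- ${r.title}`).join("\n");

  // Store the session-level summary and mark the session complete.
  db.query("UPDATE sessions SET summary = ?, status = 'complete' WHERE id = ?")
    .run(summary, sessionId);
}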

Progressive Disclosure: The Token-Efficiency Secret

One of claude-mem's cleverest features is progressive disclosure, a three-layer retrieval pattern that minimizes token usage:

The 3-Layer Workflow

Layer 1: Index Search (~50-100 tokens/result)

search(query="authentication bug", type="bugfix", limit=20)

Returns compact index with IDs, titles, types, dates.

Layer 2: Timeline Context (~200 tokens/result)

timeline(observation_id=123, before=2, after=2)

Shows chronological context around interesting observations.

Layer 3: Full Details (~500-1,000 tokens/result)

get_observations(ids=[123, 456, 789])

Fetches complete narratives and facts for selected IDs only.

Token Savings Example

Without Progressive Disclosure:

Fetch 20 full observations upfront: 10,000-20,000 tokens

With Progressive Disclosure:

1. Search index: ~1,000 tokens
2. Review, identify 3 relevant IDs
3. Fetch only those 3: ~1,500-3,000 tokens
Total: 2,500-4,000 tokens (~75% savings)
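
In code form, the pattern looks roughly like this, written against a hypothetical client wrapper around the MCP tools described in the next section (the wrapper names and the shortlist heuristic are illustrative):

// three-layer retrieval sketch -- client wrapper is hypothetical
type IndexEntry = { id: number; title: string; type: string; date: string };

interface MemoryClient {
  search(args: { query: string; type?: string; limit?: number }): Promise<IndexEntry[]>;
  getObservations(args: { ids: number[] }): Promise<unknown[]>;
}

async function investigate(mem: MemoryClient, topic: string) {
  // Layer 1: cheap index search (~50-100 tokens per result).
  const index = await mem.search({ query: topic, type: "bugfix", limit: 20 });

  // Layer 2 would widen context around interesting hits via timeline();
  // here we simply shortlist a few entries.
  const shortlist = index.slice(0, 3);

  // Layer 3: fetch full narratives and facts only for the shortlist.
  return mem.getObservations({ ids: shortlist.map((e) => e.id) });
}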

MCP Search Tools

Claude-mem provides three Model Context Protocol (MCP) tools that Claude can invoke automatically:

Available Tools

1. search - Query the Index

// Natural language or structured queries
search({
  query: "authentication bug",
  type: "bugfix",
  dateFrom: "2024-01-01",
  limit: 20
})

2. timeline - Chronological Context

// What happened before/after an observation?
timeline({
  observation_id: 123,
  before: 2,  // 2 observations before
  after: 2    // 2 observations after
})

3. get_observations - Full Details

// Batch fetch by IDs
get_observations({
  ids: [123, 456, 789]
})

Auto-Invocation

Claude recognizes natural language queries and automatically uses these tools:

You: "What bugs did we fix last week?"

Claude (internally):

// 1. Search for recent bugfixes
search({ query: "bug", type: "bugfix", limit: 10 })

// 2. Review results, identify relevant IDs

// 3. Fetch full details
get_observations({ ids: [104, 107, 112] })

Claude (to you): "Last week we fixed three bugs: [detailed summary]..."


Configuration & Control

Core Settings

Managed in ~/.claude-mem/settings.json:

{
  "CLAUDE_MEM_MODEL": "sonnet",
  "CLAUDE_MEM_PROVIDER": "claude",
  "CLAUDE_MEM_CONTEXT_OBSERVATIONS": 50,
  "CLAUDE_MEM_WORKER_PORT": 37777,
  "CLAUDE_MEM_LOG_LEVEL": "INFO"
}

Context Injection Control

Fine-grained control over what gets injected:

Loading Settings:

  • CONTEXT_OBSERVATIONS (1-200): Number of observations to inject
  • CONTEXT_SESSION_COUNT (1-50): Number of recent sessions to pull from

Filter Settings:

  • Types: bugfix, feature, refactor, discovery, decision, change
  • Concepts: how-it-works, why-it-exists, gotcha, pattern, trade-off

Display Settings:

  • CONTEXT_FULL_COUNT (0-20): How many observations show full details
  • CONTEXT_FULL_FIELD: narrative | facts
  • Token economics visibility toggles
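
Combined, those knobs live in ~/.claude-mem/settings.json; CLAUDE_MEM_CONTEXT_OBSERVATIONS appears in the core settings above, while the exact spelling of the other keys is an assumption based on the names listed in this section:

{
  "CLAUDE_MEM_CONTEXT_OBSERVATIONS": 50,
  "CLAUDE_MEM_CONTEXT_SESSION_COUNT": 10,  // key name assumed
  "CLAUDE_MEM_CONTEXT_FULL_COUNT": 5,      // key name assumed
  "CLAUDE_MEM_CONTEXT_FULL_FIELD": "narrative"  // key name assumed
}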

Privacy Control

Manual Privacy Tags:

# This won't be stored in memory
<private>
API_KEY = "sk-abc123..."
DATABASE_PASSWORD = "super-secret"
</private>

System-Level Tags:

<!-- Prevents recursive observation storage -->
<claude-mem-context>
  Past observations here...
</claude-mem-context>

Edge Processing: Privacy tag stripping happens at the hook layer before data reaches the worker or database.


Web Viewer UI

Real-time memory visualization at http://localhost:37777:

Features:

  • 🔴 Live stream of observations via Server-Sent Events
  • 🔍 Full-text search across all stored data
  • 📊 Project filtering - View memory by project/folder
  • ⚙️ Settings panel - Configure context injection
  • 📈 Token economics - See read costs, work investment, savings
  • 🔗 Citations - Reference observations by ID
  • 🎨 GPU-accelerated animations for smooth scrolling

Terminal Preview:
Shows exactly what will be injected at the start of your next Claude Code session for the selected project.


Security & Privacy Analysis

✅ Good Security Practices

Local-Only Storage:

  • All data stored in ~/.claude-mem/ on your machine
  • No external network calls other than the Claude API used for compression
  • No telemetry or tracking
  • No cloud sync

Privacy Controls:

  • <private> tags for excluding sensitive content
  • Edge processing (stripping before database)
  • Configurable skip tools (exclude certain tool types)
  • Manual control over what gets captured

Open Source:

  • AGPL-3.0 license
  • Code is auditable on GitHub
  • Active community development
  • Transparent architecture

⚠️ Security Considerations

Unencrypted Database:

  • SQLite DB at ~/.claude-mem/claude-mem.db is plain text
  • Mitigation: Use full disk encryption (FileVault on macOS, BitLocker on Windows)

Localhost HTTP API:

  • Worker service runs on port 37777 without authentication by default
  • Mitigation: The service binds to 127.0.0.1 by default; firewall the port if you're on a shared network

Automatic Capture:

  • Everything Claude does is recorded unless you use <private> tags
  • Risk: Forgetting to tag sensitive content
  • Mitigation: Review stored data regularly via web viewer

AI Processing:

  • Observations sent to Claude API for compression
  • Uses your API key (same as Claude Code itself)
  • Note: This is inherent to the compression feature

🔒 Security Hardening Recommendations

  1. Enable full disk encryption on your development machine
  2. Use <private> tags liberally for any sensitive data
  3. Firewall the worker port if you're on a shared network
  4. Set CLAUDE_MEM_SKIP_TOOLS to exclude tools you don't want captured
  5. Regularly audit stored data via the web viewer
  6. Review git commits before pushing (in case memory got committed)
  7. Add ~/.claude-mem/ to .gitignore globally

Performance Characteristics

Disk Space

Typical Usage:

  • Light user (10 sessions/week): ~10-20 MB/month
  • Heavy user (100 sessions/week): ~100-200 MB/month
  • Database growth: Linear with observation count
  • Storage location: ~/.claude-mem/claude-mem.db

Maintenance:

# Check database size
du -h ~/.claude-mem/claude-mem.db

# Vacuum to reclaim space (safe, but locks DB temporarily)
sqlite3 ~/.claude-mem/claude-mem.db "VACUUM;"

Memory & CPU

Worker Service:

  • RAM: ~100-200 MB typical, ~500 MB peak during heavy processing
  • CPU: Minimal when idle, spikes during AI compression
  • Process: Managed by Bun, auto-restarts on crashes

Hook Execution:

  • SessionStart: 10-500ms (cached dependencies vs. fresh install)
  • UserPromptSubmit: <10ms
  • PostToolUse: <5ms per execution (async processing)
  • Stop: 100-300ms (summary generation)

Token Usage

Context Injection (per session):

  • 50 observations @ ~250 tokens each = ~12,500 tokens
  • Cost (Claude Sonnet): ~$0.0375 per session start
  • Savings vs. re-reading files: ~70-80% reduction

AI Compression (background):

  • Processing 100 observations: ~50,000 tokens
  • Cost (Claude Sonnet): ~$0.15 per 100 observations
  • Amortized over reuse: Pays for itself in 2-3 sessions

Use Cases & Workflows

1. Long-Running Projects

Problem: Working on a project for weeks/months, Claude forgets past decisions.

With claude-mem:

  • "Why did we choose Redis over Memcached?" → Instant answer from past discussion
  • "What was that authentication gotcha we hit?" → Retrieved from discovery observations
  • Consistent architectural decisions across sessions

2. Team Onboarding

Problem: New team members ask the same questions repeatedly.

With claude-mem:

  • Share your ~/.claude-mem/claude-mem.db (or exports)
  • New teammates get institutional knowledge automatically
  • Reduce context-gathering time by 70%+

3. Bug Investigation

Problem: Recurring bugs, hard to remember past fixes.

With claude-mem:

search({ query: "timeout error", type: "bugfix" })
  • Instantly find similar past bugs
  • See what solutions worked
  • Avoid repeating failed approaches

4. Code Review Assistance

Problem: Reviewers lack context about design decisions.

With claude-mem:

  • Claude knows why code was written a certain way
  • Can explain trade-offs made during implementation
  • References specific past discussions

5. Documentation Generation

Problem: Writing docs requires remembering entire project history.

With claude-mem:

  • "Generate architecture docs for this project"
  • Claude draws from all past sessions
  • Includes decisions, trade-offs, gotchas
  • Accurate because it witnessed the development

Comparison: Manual Memory vs. Claude-Mem

| Aspect           | Manual (CLAUDE.md)           | Claude-Mem                    |
|------------------|------------------------------|-------------------------------|
| Setup effort     | Write project docs manually  | Install plugin, automatic     |
| Maintenance      | Update docs as code changes  | Automatic observation capture |
| Coverage         | Only what you document       | Everything Claude does        |
| Searchability    | Ctrl+F in markdown           | Full-text + semantic search   |
| Token efficiency | Re-reads entire file         | Progressive disclosure        |
| Granularity      | Project-level guidance       | Observation-level detail      |
| Privacy          | You control content          | Requires privacy tags         |
| Cross-session    | Static context               | Dynamic, contextual           |
| Learning curve   | Minimal                      | Moderate (concepts, tools)    |

Best Practice: Use both!

  • CLAUDE.md for high-level project guidance
  • claude-mem for detailed session history

Installation & Setup

Quick Start

# Install plugin via Claude Code CLI
/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem

# Restart Claude Code
# Memory will now persist automatically!

Verify Installation

# Check worker is running
ps aux | grep worker-service

# View logs
tail -f ~/.claude-mem/logs/worker-out.log

# Access web viewer
open http://localhost:37777

Configuration

# Edit settings
vim ~/.claude-mem/settings.json

# Restart worker to apply changes
cd ~/.claude/plugins/marketplaces/thedotmack
npm run worker:restart

Advanced Features

Folder Context Files

Auto-generates CLAUDE.md in project folders with activity timelines:

# Project Context

Last updated: 2024-01-15

## Recent Activity
- Fixed auth middleware race condition (2024-01-15)
- Refactored database connection pool (2024-01-14)
- Added rate limiting (2024-01-13)

## Key Decisions
- Using Redis for session storage (2024-01-10)
- Chose JWT over session cookies (2024-01-08)

Enable:

{
  "CLAUDE_MEM_FOLDER_CLAUDEMD_ENABLED": true
}

Multilingual Support

Claude-mem supports 28 languages via mode configuration:

{
  "CLAUDE_MEM_MODE": "code--es"  // Spanish code mode
}

Supported languages: English, Spanish, Chinese, French, German, Japanese, Korean, Portuguese, Russian, Arabic, Hebrew, and more.

Mode System

Switch between workflow profiles:

  • code - Standard development (default)
  • email-investigation - Email/communication analysis
  • chill - Casual coding sessions

{
  "CLAUDE_MEM_MODE": "email-investigation"
}

Beta Channel

Try experimental features like "Endless Mode" (biomimetic memory architecture):

  1. Open web viewer: http://localhost:37777
  2. Click Settings gear icon
  3. Switch to Beta channel
  4. Your data is preserved; only the plugin code changes

Troubleshooting

Worker Won't Start

# Check for port conflicts
lsof -i :37777

# Kill conflicting process
kill -9 <PID>

# Or change port
export CLAUDE_MEM_WORKER_PORT=38000
npm run worker:restart

Missing Context

# Verify observations are being captured
sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations;"

# Check last session
sqlite3 ~/.claude-mem/claude-mem.db "SELECT * FROM sessions ORDER BY created_at DESC LIMIT 1;"

# Enable debug logging
export CLAUDE_MEM_LOG_LEVEL=DEBUG
npm run worker:restart

Memory Growing Too Large

# Check database size
du -h ~/.claude-mem/claude-mem.db

# Archive old sessions (exports matching rows as plain text)
sqlite3 ~/.claude-mem/claude-mem.db "SELECT * FROM sessions WHERE created_at < '2024-01-01';" > old-sessions.txt

# Delete old sessions (dangerous! backup first)
sqlite3 ~/.claude-mem/claude-mem.db "DELETE FROM sessions WHERE created_at < '2024-01-01';"

# Vacuum to reclaim space
sqlite3 ~/.claude-mem/claude-mem.db "VACUUM;"

Limitations & Trade-offs

⚠️ Current Limitations

Claude Code CLI Only:

  • Does not work with Claude.ai web interface
  • Does not work with other IDEs (VS Code, Cursor)
  • Specifically designed for Claude Code CLI hooks

Local Storage Only:

  • No cloud sync between machines
  • Manual export/import for sharing
  • No cross-device persistence

Manual Privacy:

  • Requires remembering to use <private> tags
  • Easy to accidentally capture sensitive data
  • No automatic credential detection

API Costs:

  • Background processing uses Claude API tokens
  • ~$0.15 per 100 observations processed
  • Can add up for very active projects

Resource Usage:

  • Background worker service always running
  • ~100-200 MB RAM minimum
  • Database grows indefinitely (manual cleanup needed)

🎯 Design Trade-offs

Automatic vs. Manual:

  • Benefit: Zero-effort memory capture
  • Cost: Potential for capturing unwanted data

Compression vs. Fidelity:

  • Benefit: 70-80% token reduction
  • Cost: Some nuance lost in summarization

Local vs. Cloud:

  • Benefit: Privacy and control
  • Cost: No multi-device sync

Background Processing:

  • Benefit: Non-blocking, async compression
  • Cost: Slightly delayed memory availability

Future Directions

Based on the GitHub roadmap and community discussion:

Planned Features:

  • IDE integrations (VS Code, Cursor via plugins)
  • Cloud sync option (opt-in, encrypted)
  • Automatic sensitive data detection
  • Memory export/import tools
  • Team memory sharing workflows
  • Memory pruning/archival automation
  • Enhanced vector search with better embeddings
  • Multi-agent collaboration support

Community Requests:

  • Integration with other AI assistants (GPT-4, Gemini)
  • Memory visualization tools (graph view)
  • Cost optimization (cheaper compression models)
  • Privacy-preserving memory sharing (anonymization)

Alternatives & Comparisons

Claude.ai Projects (Web)

Similarities:

  • Project-specific context
  • File awareness

Differences:

  • No session-to-session memory in web (yet)
  • No automatic capture of tool usage
  • No searchable history

VS Code Workspace Settings

Similarities:

  • Project-level configuration
  • Persists across sessions

Differences:

  • Static configuration only
  • No dynamic memory
  • No AI compression

Git Commit History

Similarities:

  • Historical record of changes
  • Searchable

Differences:

  • Captures code changes, not reasoning
  • No AI summaries
  • No connection to AI assistant context

Custom CLAUDE.md Files

Similarities:

  • Persistent context
  • Manual curation

Differences:

  • Static vs. dynamic
  • Requires manual updates
  • No observation-level detail

Verdict: Claude-mem is complementary to all of these. Use it alongside existing tools.


Is Claude-Mem Right for You?

✅ Good Fit If You:

  • Use Claude Code CLI regularly for development
  • Work on long-running projects (weeks/months)
  • Want automatic session memory without manual effort
  • Need searchable project history
  • Are comfortable with local database storage
  • Have Claude API quota to spare for compression
  • Value token efficiency (progressive disclosure)

❌ Not a Fit If You:

  • Primarily use Claude.ai web interface (it won't work)
  • Prefer explicit control over all stored data
  • Work on highly sensitive projects requiring audit trails
  • Are concerned about unencrypted local storage
  • Don't want another background service running
  • Have very limited Claude API budget
  • Only do quick, one-off coding tasks

🤔 Try It If You're:

  • Curious about AI agent memory systems
  • Experimenting with workflow optimization
  • Building a personal knowledge base of coding decisions
  • Interested in the architecture (TypeScript/SQLite/MCP)

Getting Started: A Practical Workflow

Week 1: Installation & Familiarization

Day 1-2: Install & Observe

/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem
# Use Claude Code normally, don't change workflow yet

Day 3-4: Explore Memory

  • Visit http://localhost:37777
  • Browse captured observations
  • See what types/concepts are being extracted

Day 5-7: Test Search

"What did we work on yesterday?"
"Find all bugfixes from last week"
"Show me database-related changes"

Week 2: Optimization

Configure Context Injection:

  • Open Settings in web viewer
  • Adjust observation count (start with 50)
  • Filter by relevant types/concepts
  • Monitor token usage

Add Privacy Tags:

<private>
API_KEY = "..."
</private>

Skip Noisy Tools:

{
  "CLAUDE_MEM_SKIP_TOOLS": "ListMcpResourcesTool,SlashCommand,Skill,TodoWrite"
}

Week 3+: Advanced Usage

Leverage Progressive Disclosure:

"Search for authentication issues"
# Review index results
"Get full details for observations 104, 107, 112"

Use Timeline Queries:

"Show me what happened around observation 156"

Analyze Token Economics:

  • Check "Savings" column in web viewer
  • Optimize observation count vs. context quality
  • Tune full observation display count

Conclusion

Claude-mem represents a significant step forward in AI agent memory systems. By automatically capturing, compressing, and intelligently retrieving context, it eliminates one of the biggest pain points in AI-assisted development: the loss of knowledge between sessions.

Key Takeaways:

  1. Automatic > Manual - Zero-effort capture beats manual documentation for session-level detail
  2. Compression Works - 70-80% token reduction proves AI summarization is effective
  3. Progressive Disclosure - Layer-based retrieval is the key to token efficiency
  4. Local-First - Privacy-conscious design with no cloud dependencies
  5. Open Source - Auditable, extensible, community-driven

The Future of AI Memory:

Claude-mem is pioneering techniques that will likely become standard in AI assistants:

  • Lifecycle hooks for observation capture
  • AI-driven compression of tool usage
  • Progressive disclosure for context retrieval
  • Local-first, privacy-respecting storage

As AI coding assistants become more capable, giving them better memory will be crucial for truly collaborative development. Claude-mem shows us what that future looks like.


Have you tried claude-mem? What's your experience with AI agent memory systems? Share your thoughts in the comments!