Claude Self-Reflect
Claude forgets everything. This fixes that.
Give Claude perfect memory of all your conversations. Search past discussions instantly. Never lose context again.
100% Local by Default • 20x Faster • Zero Configuration • Production Ready
Latest: v7.1.9 Cross-Project Iteration Memory - Ralph loops now share memory across ALL projects automatically.
Why This Exists
Claude starts fresh every conversation. You've solved complex bugs, designed architectures, made critical decisions - all forgotten. Until now.
Table of Contents
- Quick Install
- Performance
- The Magic
- Before & After
- Real Examples
- Ralph Loop Memory
- Key Features
- Code Quality Insights
- Architecture
- Requirements
- Documentation
- Keeping Up to Date
- Troubleshooting
- Contributors
Quick Install
```bash
# Install and run automatic setup (5 minutes, everything automatic)
npm install -g claude-self-reflect
claude-self-reflect setup

# That's it! The setup will:
# - Run everything in Docker (no Python issues!)
# - Configure everything automatically
# - Install the MCP in Claude Code
# - Start monitoring for new conversations
# - Keep all data local - no API keys needed
```
[!TIP] Auto-Migration: updates automatically handle breaking changes. Simply run `npm update -g claude-self-reflect`.
Cloud Mode (Better Search Accuracy)
```bash
# Step 1: Get your free Voyage AI key
# Sign up at https://www.voyageai.com/ - it takes 30 seconds

# Step 2: Install with Voyage key
npm install -g claude-self-reflect
claude-self-reflect setup --voyage-key=YOUR_ACTUAL_KEY_HERE
```
[!NOTE] Cloud mode provides 1024-dimensional embeddings (vs 384 local) for more accurate semantic search but sends conversation data to Voyage AI for processing.
Performance
| Metric | Before | After | Improvement |
|---|---|---|---|
| Status Check | 119ms | 6ms | 20x faster |
| Storage Usage | 100MB | 50MB | 50% reduction |
| Import Speed | 10/sec | 100/sec | 10x faster |
| Memory Usage | 500MB | 50MB | 90% reduction |
| Search Latency | 15ms | 3ms | 5x faster |
Competitive Comparison
| Feature | Claude Self-Reflect | MemGPT | LangChain Memory |
|---|---|---|---|
| Local-first | Yes | No | Partial |
| No API keys | Yes | No | No |
| Real-time indexing | Yes (2-sec) | Manual | No |
| Search speed | <3ms | ~50ms | ~100ms |
| Setup time | 5 min | 30+ min | 20+ min |
| Runtime required | Docker | Python | Python |
The Magic
Before & After
Real Examples
You: "How did we fix that 100% CPU usage bug?"
Claude: "Found it - we fixed the circular reference causing 100% CPU usage
in the server modularization. Also fixed store_reflection dimension
mismatch by creating separate reflections_local and reflections_voyage."
You: "What about that Docker memory issue?"
Claude: "The container was limited to 2GB but only using 266MB. We found
the issue only happened with MAX_QUEUE_SIZE=1000 outside Docker.
With proper Docker limits, memory stays stable at 341MB."
You: "Have we worked with JWT authentication?"
Claude: "Found conversations about JWT patterns including User.authenticate
methods, TokenHandler classes, and concepts like token rotation,
PKCE, and social login integration."
Ralph Loop Memory
The difference between spinning in circles and building on every iteration.
Use the ralph-wiggum plugin for long tasks? CSR gives your Ralph loops persistent memory across sessions and projects.
Without CSR: The Hamster Wheel
- Each context compaction = everything forgotten
- Same mistakes repeated across iterations
- No learning from past sessions
- Cross-project insights lost forever
With CSR: The Upward Spiral
- Automatic backup before context compaction
- Anti-pattern injection - "DON'T RETRY THESE" surfaces first
- Success pattern learning - reuse what worked before
- Cross-project memory - learn from ALL your projects
Quick Setup
```bash
./scripts/ralph/install_hooks.sh          # Install hooks globally
./scripts/ralph/install_hooks.sh --check  # Verify installation
```
How It Works
- Start a Ralph loop: `/ralph-wiggum:ralph-loop "Build feature X"`
- Work naturally - CSR hooks capture state automatically
- Stop hook stores each iteration's learnings
- PreCompact hook backs up state before compaction
- Next session retrieves past insights, failed approaches, and wins
v7.1.9+: Cross-project iteration memory - hooks work for ALL projects, with entries tagged `project_{name}` for global searchability.
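As a minimal sketch of how such a tag could be derived from the working directory (the exact derivation used by the hooks is an assumption here):

```python
import os
from pathlib import Path
from typing import Optional

def project_tag(cwd: Optional[str] = None) -> str:
    """Derive a project_{name} tag from the working directory (illustrative)."""
    name = Path(cwd or os.getcwd()).name
    return f"project_{name}"

# e.g. run inside ~/projects/MyApp -> "project_MyApp"
```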
Code Quality Insights
AST-GREP Pattern Analysis (100+ Patterns)
Real-time Quality Scoring in Statusline
Your code quality displayed live as you work:
- 🟢 A+ (95-100): Exceptional code quality
- 🟢 A (90-95): Excellent, production-ready
- 🟢 B (80-90): Good, minor improvements possible
- 🟡 C (60-80): Fair, needs refactoring
- 🔴 D (40-60): Poor, significant issues
- 🔴 F (0-40): Critical problems detected
Pattern Categories Analyzed
- Security Patterns: SQL injection, XSS vulnerabilities, hardcoded secrets
- Performance Patterns: N+1 queries, inefficient loops, memory leaks
- Error Handling: Bare exceptions, missing error boundaries
- Type Safety: Missing type hints, unsafe casts
- Async Patterns: Missing await, promise handling
- Testing Patterns: Test coverage, assertion quality
How It Works
- During Import: AST elements extracted from all code blocks
- Pattern Matching: 100+ patterns from unified registry
- Quality Scoring: Weighted scoring normalized by lines of code
- Statusline Display: Real-time feedback as you code
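As a rough sketch of the scoring step, assuming hypothetical category weights (the unified registry defines its own weights and thresholds):

```python
# Hypothetical per-category weights; the real registry's values differ.
PATTERN_WEIGHTS = {"security": 10.0, "performance": 5.0, "error_handling": 3.0}

def quality_score(hits: dict, lines_of_code: int) -> float:
    """Sum weighted pattern hits, normalized per 100 lines of code."""
    penalty = sum(PATTERN_WEIGHTS.get(cat, 1.0) * n for cat, n in hits.items())
    return max(0.0, 100.0 - penalty * 100.0 / max(lines_of_code, 1))

def grade(score: float) -> str:
    """Map a 0-100 score to the letter grades listed above."""
    for cutoff, letter in ((95, "A+"), (90, "A"), (80, "B"), (60, "C"), (40, "D")):
        if score >= cutoff:
            return letter
    return "F"
```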
v7.0 Automated Narrative Generation
9.3x Better Search Quality • 50% Cost Savings • Fully Automated
v7.0 introduces AI-powered conversation narratives that transform raw conversation excerpts into rich problem-solution summaries with comprehensive metadata extraction.
Before/After Comparison
| Metric | v6.x (Raw Excerpts) | v7.0 (AI Narratives) | Improvement |
|---|---|---|---|
| Search Quality | 0.074 | 0.691 | 9.3x better |
| Tokens Retained | 100% | 18% | 82% reduction |
| Cost per Conversation | $0.025 | $0.012 | 50% savings |
| Metadata Richness | Basic | Tools + Concepts + Files | Full context |
What You Get
Enhanced Search Results:
- Problem-Solution Patterns: Conversations structured as challenges encountered and solutions implemented
- Rich Metadata: Automatic extraction of tools used, technical concepts, and files modified
- Context Compression: 82% token reduction while maintaining searchability
- Better Relevance: Search scores improved from 0.074 to 0.691 (9.3x)
Cost-Effective Processing:
- Anthropic Batch API: $0.012 per conversation (vs $0.025 standard)
- Automatic batch queuing and processing
- Progress monitoring via Docker containers
- Evaluation generation for quality assurance
Fully Automated Workflow:
```bash
# 1. Watch for new conversations
docker compose up batch-watcher

# 2. Auto-trigger batch processing when threshold reached
#    (Configurable: BATCH_THRESHOLD_FILES, default 10)

# 3. Monitor batch progress
docker compose logs batch-monitor -f

# 4. Enhanced narratives automatically imported to Qdrant
```
Example: Raw Excerpt vs AI Narrative
Before (v6.x) - Raw excerpt showing basic conversation flow:
User: How do I fix the Docker memory issue?
Assistant: The container was limited to 2GB but only using 266MB...
After (v7.0) - Rich narrative with metadata:
PROBLEM: Docker container memory consumption investigation revealed
discrepancy between limits (2GB) and actual usage (266MB). Analysis
required to determine if memory limit was appropriate.
SOLUTION: Discovered issue occurred with MAX_QUEUE_SIZE=1000 outside
Docker environment. Implemented proper Docker resource constraints
stabilizing memory at 341MB.
TOOLS USED: Docker, grep, Edit
CONCEPTS: container-memory, resource-limits, queue-sizing
FILES: docker-compose.yaml, batch_watcher.py
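To make the structure concrete, the metadata in a narrative like the one above could be modeled roughly as below; the field names are assumptions for illustration, not the project's actual payload schema.

```python
from dataclasses import dataclass, field

@dataclass
class Narrative:
    """Illustrative shape of a v7.0 narrative record (field names assumed)."""
    problem: str
    solution: str
    tools_used: list = field(default_factory=list)
    concepts: list = field(default_factory=list)
    files: list = field(default_factory=list)

example = Narrative(
    problem="Docker memory discrepancy between the 2GB limit and 266MB usage",
    solution="Proper Docker resource constraints stabilizing memory at 341MB",
    tools_used=["Docker", "grep", "Edit"],
    concepts=["container-memory", "resource-limits", "queue-sizing"],
    files=["docker-compose.yaml", "batch_watcher.py"],
)
```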
Getting Started with Narratives
Narratives are automatically generated for new conversations. To process existing conversations:
```bash
# Process all existing conversations in batch
python docs/design/batch_import_all_projects.py

# Monitor batch progress
docker compose logs batch-monitor -f

# Check completion status
curl http://localhost:6333/collections/csr_claude-self-reflect_local_384d
```
For complete documentation, see Batch Automation Guide.
Key Features
MCP Tools Available to Claude
Search & Memory:
- `reflect_on_past` - Search past conversations using semantic similarity with time decay (supports quick/summary modes)
- `store_reflection` - Store important insights or learnings for future reference
- `get_next_results` - Paginate through additional search results
- `search_by_file` - Find conversations that analyzed specific files
- `search_by_concept` - Search for conversations about development concepts
- `get_full_conversation` - Retrieve complete JSONL conversation files
Temporal Queries:
- `get_recent_work` - Answer "What did we work on last?" with session grouping
- `search_by_recency` - Time-constrained search like "docker issues last week"
- `get_timeline` - Activity timeline with statistics and patterns
Runtime Configuration:
- `switch_embedding_mode` - Switch between local/cloud modes without restart
- `get_embedding_mode` - Check current embedding configuration
- `reload_code` - Hot reload Python code changes
- `reload_status` - Check reload state
- `clear_module_cache` - Clear Python cache
Status & Monitoring:
- `get_status` - Real-time import progress and system status
- `get_health` - Comprehensive system health check
- `collection_status` - Check Qdrant collection health and stats
[!TIP] Use `reflect_on_past --mode quick` for instant existence checks - it returns only a count plus the top match!
All tools are automatically available when the MCP server is connected to Claude Code.
Statusline Integration
See your indexing progress right in your terminal! Works with Claude Code Statusline:
- Progress Bar - visual indicator: `[████████ ] 91%`
- Indexing Lag - shows backlog: `• 7h behind`
- Auto-updates every 60 seconds
- Zero overhead with intelligent caching
Project-Scoped Search
Searches are project-aware by default. Claude automatically searches within your current project:
```
# In ~/projects/MyApp
You: "What authentication method did we use?"
Claude: [Searches ONLY MyApp conversations]

# To search everywhere
You: "Search all projects for WebSocket implementations"
Claude: [Searches across ALL your projects]
```
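For intuition, project scoping over a Qdrant collection could be expressed as a payload filter. This is a minimal sketch with qdrant-client; the payload key `project` and the collection name below are assumptions, not the project's actual schema.

```python
from qdrant_client import QdrantClient
from qdrant_client.models import FieldCondition, Filter, MatchValue

client = QdrantClient(url="http://localhost:6333")

# Restrict a semantic search to one project's conversations.
results = client.search(
    collection_name="csr_claude-self-reflect_local_384d",  # assumed name
    query_vector=[0.0] * 384,  # replace with a real 384-dim query embedding
    query_filter=Filter(
        must=[FieldCondition(key="project", match=MatchValue(value="MyApp"))]
    ),
    limit=5,
)
```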
Memory Decay
Recent conversations matter more. Old ones fade. Like your brain, but reliable.
- 90-day half-life: Recent memories stay strong
- Graceful aging: Old information fades naturally
- Configurable: Adjust decay rate to your needs
[!NOTE] Memory decay ensures recent solutions are prioritized while still maintaining historical context.
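As a rough illustration of the idea (not necessarily the project's exact formula), an exponential half-life weighting might look like:

```python
from datetime import datetime, timezone

HALF_LIFE_DAYS = 90  # the configurable decay rate

def decayed_score(similarity: float, when: datetime) -> float:
    """Weight a raw similarity score so it halves every HALF_LIFE_DAYS."""
    age_days = (datetime.now(timezone.utc) - when).total_seconds() / 86400
    return similarity * 0.5 ** (age_days / HALF_LIFE_DAYS)
```

Under this weighting, a 90-day-old conversation scores half as high as an identical one from today, and a 180-day-old one scores a quarter.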
Performance at Scale
- Search: <3ms average response time
- Scale: 600+ conversations across 24 projects
- Reliability: 100% indexing success rate
- Memory: 96% reduction from v2.5.15
- Real-time: HOT/WARM/COLD intelligent prioritization
[!TIP] For best performance, keep Docker allocated 4GB+ RAM and use SSD storage.
Architecture
View Architecture Diagram & Details
HOT/WARM/COLD Intelligent Prioritization
- HOT (< 5 minutes): 2-second intervals for near real-time import
- WARM (< 24 hours): Normal priority with starvation prevention
- COLD (> 24 hours): Batch processed to prevent blocking
Files are categorized by age and processed with priority queuing to ensure newest content gets imported quickly while preventing older files from being starved.
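In code, the age-based bucketing might look like this minimal sketch (thresholds taken from the list above; the actual scheduler adds priority queuing and starvation prevention):

```python
from datetime import datetime, timedelta, timezone

def priority_bucket(file_mtime: datetime) -> str:
    """Bucket a conversation file by age, per the thresholds above."""
    age = datetime.now(timezone.utc) - file_mtime
    if age < timedelta(minutes=5):
        return "HOT"   # polled at ~2-second intervals
    if age < timedelta(hours=24):
        return "WARM"  # normal priority with starvation prevention
    return "COLD"      # batch processed
```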
Components
- Vector Database: Qdrant for semantic search
- MCP Server: Python-based using FastMCP
- Embeddings: Local (FastEmbed) or Cloud (Voyage AI)
- Import Pipeline: Docker-based with automatic monitoring
Requirements
System Requirements
Minimum Requirements
- Docker Desktop (macOS/Windows) or Docker Engine (Linux)
- Node.js 16+ (for the setup wizard)
- Claude Code CLI
- 4GB RAM available for Docker
- 2GB disk space for vector database
Recommended
- 8GB RAM for optimal performance
- SSD storage for faster indexing
- Docker Desktop 4.0+ for best compatibility
Operating Systems
- macOS 11+ (Intel & Apple Silicon)
- Windows 10/11 with WSL2
- Linux (Ubuntu 20.04+, Debian 11+)
Documentation
Technical Stack
- Vector DB: Qdrant (local, your data stays yours)
- Embeddings:
  - Local (Default): FastEmbed with all-MiniLM-L6-v2
  - Cloud (Optional): Voyage AI
- MCP Server: Python + FastMCP
- Search: Semantic similarity with time decay
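For a feel of the local default, here is a minimal FastEmbed sketch producing 384-dimensional vectors; it mirrors the stack above but is not the project's actual import code.

```python
from fastembed import TextEmbedding

# Local default: 384-dim embeddings, no API key required.
model = TextEmbedding("sentence-transformers/all-MiniLM-L6-v2")
vectors = list(model.embed(["How did we fix the Docker memory issue?"]))
print(len(vectors[0]))  # 384
```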
Advanced Topics
Troubleshooting
Uninstall
For complete uninstall instructions, see docs/UNINSTALL.md.
Quick uninstall:
```bash
# Remove MCP server
claude mcp remove claude-self-reflect

# Stop Docker containers
docker compose down

# Uninstall npm package
npm uninstall -g claude-self-reflect
```
Keeping Up to Date
```bash
npm update -g claude-self-reflect
```
Updates are automatic and preserve your data. See full changelog for details.
Release Evolution
v7.0 - Automated Narratives (Oct 2025)
- 9.3x better search quality via AI-powered conversation summaries
- 50% cost savings using Anthropic Batch API ($0.012 per conversation)
- 82% token compression while maintaining searchability
- Rich metadata extraction (tools, concepts, files)
- Problem-solution narrative structure
- Automated batch processing with Docker monitoring
v4.0 - Performance Revolution (Sep 2025)
- 20x faster status checks (119ms → 6ms)
- 50% storage reduction via unified state management
- 10x faster imports (10/sec → 100/sec)
- 90% memory reduction (500MB → 50MB)
- Runtime mode switching (no restart required)
- Prefixed collection naming (breaking change)
- Code quality tracking with AST-GREP (100+ patterns)
v3.3 - Temporal Intelligence (Aug 2025)
- Time-based search: "docker issues last week"
- Session grouping: "What did we work on last?"
- Activity timelines with statistics
- Recency-aware queries
v2.8 - Full Context Access (Jul 2025)
- Complete conversation retrieval
- JSONL file access for deeper analysis
- Enhanced debugging capabilities
Troubleshooting
Common Issues and Solutions
1. "No collections created" after import
Symptom: Import runs but Qdrant shows no collections
Cause: Docker can't access Claude projects directory
Solution:
```bash
# Run diagnostics to identify the issue
claude-self-reflect doctor

# Fix: Re-run setup to set correct paths
claude-self-reflect setup

# Verify .env has full paths (no ~):
cat .env | grep CLAUDE_LOGS_PATH
# Should show: CLAUDE_LOGS_PATH=/Users/YOUR_NAME/.claude/projects
```
2. MCP server shows "ERROR" but it's actually INFO
Symptom: `[ERROR] MCP server "claude-self-reflect" Server stderr: INFO Starting MCP server`
Cause: Claude Code displays all stderr output as errors
Solution: This is not an actual error - the MCP is working correctly. The INFO message confirms successful startup.
3. "No JSONL files found"
Symptom: Setup can't find any conversation files
Cause: Claude Code hasn't been used yet or stores files elsewhere
Solution:
```bash
# Check if files exist
ls ~/.claude/projects/

# If empty, use Claude Code to create some conversations first
# The watcher will import them automatically
```
4. Docker volume mount issues
Symptom: Import fails with permission errors
Cause: Docker can't access home directory
Solution:
```bash
# Ensure Docker has file sharing permissions
# macOS: Docker Desktop → Settings → Resources → File Sharing
# Add: /Users/YOUR_USERNAME/.claude

# Restart Docker and re-run setup
docker compose down
claude-self-reflect setup
```
5. Qdrant not accessible
Symptom: Can't connect to localhost:6333
Solution:
```bash
# Start services
docker compose --profile mcp up -d

# Check if running
docker compose ps

# View logs for errors
docker compose logs qdrant
```
Diagnostic Tools
Run Comprehensive Diagnostics
```bash
claude-self-reflect doctor
```
This checks:
- Docker installation and configuration
- Environment variables and paths
- Claude projects and JSONL files
- Import status and collections
- Service health
Check Logs
```bash
# View all service logs
docker compose logs -f

# View specific service
docker compose logs qdrant
docker compose logs watcher
```
Generate Diagnostic Report
```bash
# Create diagnostic file for issue reporting
claude-self-reflect doctor > diagnostic.txt
```
Getting Help
- Check Documentation
- Community Support
- Report Issues
  - GitHub Issues
  - Include diagnostic output when reporting
Contributors
Special thanks to our contributors:
- @TheGordon - Fixed timestamp parsing (#10)
- @akamalov - Ubuntu WSL insights
- @kylesnowschwartz - Security review (#6)
Built with care by ramakay for the Claude community.



