Python 3.11 – 3.14
v1.3.0 — Graph Engine
🏆 #20 Product of the Day
Memory that ages gracefully.
Biologically-inspired persistent memory for AI agents. Automatically prunes stale data, reinforces useful context, and connects related memories through a graph layer.
Get started in 2 commands
# recall("Python backend services")
── Round 1: vector search ──
"Sachit uses Python at MongoDB" sim 0.61
── Round 2: graph expansion ──
"Docker + K8s production deploys" via graph
"Uses React" 0.04 (Decayed)
✓ 1 stale fact pruned · 1 graph neighbour surfaced
Benchmarked on public data.
Three external datasets. All scripts are public and reproducible — full methodology in BENCHMARKS.md.
LoCoMo Recall@5
snap-research/LoCoMo · 1,534 QA pairs · 10 sessions
benchmarks/locomo_4way.py · BM25 + vector + graph + decay · 20 Apr 2026
YourMemory (BM25 + vector + graph + decay): 59%
Zep Cloud (10/10 samples): 28%
Dataset
Source: snap-research/LoCoMo
File: locomo10.json
QA pairs: 1,534
Sessions: 10/10
Script: benchmarks/locomo_4way.py
+31pp vs Zep (111% relative)
Both systems ran all 10 samples to completion on the same 1,534 QA pairs. YourMemory led every single session — the only memory layer tested that completed the full benchmark clean.
Reproducible — script is public
The full benchmark script is benchmarks/locomo_4way.py. Same dataset, same hit rule, same top-5 limit for every system. Run it yourself.
Zero LLM calls for retrieval
All retrieval, pruning, and graph expansion run fully on your machine: no cloud inference cost, no data leaving your environment.
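For readers unfamiliar with the metric, Recall@5 is the share of questions for which at least one of the top five retrieved memories counts as a hit. The sketch below is not the code in benchmarks/locomo_4way.py: the substring hit rule and the function names are assumptions made for illustration. It also shows how the +31pp and 111% figures follow from 59% and 28%.

```python
from typing import Callable, Sequence


def is_hit(retrieved: Sequence[str], answer: str) -> bool:
    """Assumed hit rule: the gold answer appears (case-insensitively) in a retrieved memory."""
    needle = answer.lower()
    return any(needle in memory.lower() for memory in retrieved)


def recall_at_k(
    qa_pairs: Sequence[tuple[str, str]],
    retrieve: Callable[[str], Sequence[str]],
    k: int = 5,
) -> float:
    """Fraction of (question, answer) pairs with a hit in the top-k results."""
    if not qa_pairs:
        return 0.0
    hits = sum(is_hit(retrieve(q)[:k], a) for q, a in qa_pairs)
    return hits / len(qa_pairs)


def compare(ours: float, theirs: float) -> tuple[float, float]:
    """Return (percentage-point gap, relative improvement in percent)."""
    return ours - theirs, (ours - theirs) / theirs * 100


gap_pp, relative = compare(59.0, 28.0)
print(f"+{gap_pp:.0f}pp, {relative:.0f}% relative")
```

Any retriever that maps a question to a ranked list of strings can be passed as `retrieve`, which is what makes it possible to hold the hit rule and the top-5 limit constant across systems.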
Workflow Efficiency
3-session developer workflow simulation — stateless baseline vs YourMemory
−84%
Token savings
At 30 sessions. Memory block stays flat (~76–91 tokens) while stateless history grows O(n). At 3 sessions: −19.7% tokens, −28% per-session context.
3 sessions: −19.7%
30 sessions: −84.1%
Stale tokens: −100%
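The gap widens with session count because a stateless agent re-sends its accumulated history, while a recalled memory block stays roughly constant in size. The model below is a deliberate simplification with made-up token counts (`per_session`, `task`, and `block` are hypothetical parameters). It is meant to show the shape of the curve, not to reproduce the 19.7% or 84.1% figures.

```python
def stateless_tokens(sessions: int, per_session: int, task: int) -> int:
    """Session k re-sends the history of the k-1 earlier sessions plus the new task."""
    return sum((k - 1) * per_session + task for k in range(1, sessions + 1))


def memory_tokens(sessions: int, block: int, task: int) -> int:
    """Every session sends a flat recalled-memory block plus the new task."""
    return sessions * (block + task)


def savings(sessions: int, per_session: int = 100, task: int = 200, block: int = 85) -> float:
    """Fractional token reduction of the memory approach versus the stateless baseline."""
    base = stateless_tokens(sessions, per_session, task)
    return 1 - memory_tokens(sessions, block, task) / base


for n in (3, 10, 30):
    print(n, f"{savings(n):.1%}")
```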
−14%
Fewer LLM calls
Recalled context eliminates clarifying questions at the start of new sessions. Each clarifying round is a full LLM call that produces zero implementation output.
Session 1: 0 saved
Session 2: −1 clarify call
Session 3+: −1 clarify call
−4%
Context pruning
Memories below Ebbinghaus strength 0.05 are pruned from retrieval entirely. 3/15 memories pruned in a 60-day synthetic set. Compounds at scale (200+ memories).
Pruned memories: 20%
Top-5 tokens: 74 → 71
No stale facts injected
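The pruning rule can be pictured with the classic Ebbinghaus forgetting curve, R = e^(-t/S). YourMemory's actual decay formula and parameters are not given on this page, so the exponential form, the stability values, and the helper names below are assumptions. Only the 0.05 cutoff comes from the text above.

```python
import math
from dataclasses import dataclass

PRUNE_THRESHOLD = 0.05  # cutoff stated in the text above


@dataclass
class Memory:
    content: str
    age_days: float        # time since the memory was last reinforced
    stability_days: float  # hypothetical: larger values decay more slowly


def strength(memory: Memory) -> float:
    """Classic Ebbinghaus retention R = exp(-t / S); assumed, not the library's formula."""
    return math.exp(-memory.age_days / memory.stability_days)


def retrievable(memories: list[Memory]) -> list[Memory]:
    """Keep only memories whose strength is at or above the pruning threshold."""
    return [m for m in memories if strength(m) >= PRUNE_THRESHOLD]


memories = [
    Memory("Sachit uses Python at MongoDB", age_days=5, stability_days=60),
    Memory("Uses React", age_days=60, stability_days=15),
]
for m in retrievable(memories):
    print(m.content, round(strength(m), 2))
```

With these illustrative values the second memory has decayed to about 0.02 and is filtered out before retrieval.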
Three layers. One engine.
Vector search finds what you asked for. The graph finds what you forgot to ask for. Ebbinghaus decides what survives.
Biologically Pruned
Different kinds of memory age at different rates. Important facts persist longer; transient context fades naturally. Related memories stay alive together — no orphaned facts.
Hybrid Graph + Vector
Two-round retrieval finds not just what you searched for, but what you forgot to search for. Related memories surface even when they don't share vocabulary with the query.
Multi-Agent Memory
Multiple agents share context or keep secrets. API keys (ym_ prefix) authenticate each agent. Shared vs private visibility per memory.
New in v1.3.0
Smarter retrieval.
Semantic search alone misses memories that are related but worded differently. A second retrieval pass surfaces them automatically.
1. Semantic search
Finds the most relevant memories for your query — fast and precise.
2. Context expansion
Related memories that didn't match the query directly are surfaced through the graph layer — nothing slips through.
↻ Recall propagation
Using a memory keeps its connected context cluster fresh automatically — the more you use it, the longer it survives.
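The two retrieval passes can be sketched as follows. This is a toy illustration, not YourMemory's implementation: the hand-written vectors, the adjacency dict, and the `two_round_recall` name are all hypothetical, and a real system would use an embedding model for the vectors.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def two_round_recall(query_vec, vectors, graph, top_k=1):
    """Round 1: rank by similarity. Round 2: add graph neighbours of the round-1 hits."""
    ranked = sorted(vectors, key=lambda mid: cosine(query_vec, vectors[mid]), reverse=True)
    direct = ranked[:top_k]
    expanded = []
    for mid in direct:
        for neighbour in graph.get(mid, []):
            if neighbour not in direct and neighbour not in expanded:
                expanded.append(neighbour)
    return direct, expanded


# Toy 2-d "embeddings"; the second memory is far from the query in vector space.
vectors = {
    "python_at_mongodb": [0.9, 0.1],
    "docker_k8s_deploys": [0.1, 0.9],
}
graph = {"python_at_mongodb": ["docker_k8s_deploys"]}

direct, expanded = two_round_recall([1.0, 0.0], vectors, graph)
print(direct, expanded)
```

The deployment memory would never rank in round 1 here, yet it is returned because it is linked to a memory that did.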
Connected memory graph
Chain-aware pruning
Memories don't decay in isolation. Before a memory is pruned, its connected neighbours are checked — if any are still relevant, the whole cluster stays alive. Related facts age together.
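One way to read chain-aware pruning: a weak memory is removed only if none of its graph neighbours is still above the threshold. The sketch below assumes a single-hop neighbour check and precomputed strengths. Whether YourMemory looks further than one hop is not stated, and all names here are hypothetical.

```python
PRUNE_THRESHOLD = 0.05  # strength cutoff quoted earlier on this page


def prunable(strengths: dict[str, float], graph: dict[str, set[str]]) -> set[str]:
    """Return ids of weak memories with no neighbour at or above the threshold."""
    to_prune = set()
    for mid, value in strengths.items():
        if value >= PRUNE_THRESHOLD:
            continue  # still strong on its own
        neighbours = graph.get(mid, set())
        if any(strengths.get(n, 0.0) >= PRUNE_THRESHOLD for n in neighbours):
            continue  # a relevant neighbour keeps it alive
        to_prune.add(mid)
    return to_prune


strengths = {"postgres_16": 0.60, "db_migration_note": 0.03, "uses_react": 0.04}
graph = {
    "postgres_16": {"db_migration_note"},
    "db_migration_note": {"postgres_16"},
}
print(prunable(strengths, graph))
```

Both weak memories are below 0.05, but only the isolated one is pruned; the other survives through its link to a strong neighbour.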
Recall propagation
Every time a memory is recalled, its connected neighbours get a freshness boost. The more a cluster of related memories is used, the longer the whole group persists — the system learns what matters to you.
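Recall propagation can be modelled as resetting the recalled memory's age and giving its neighbours a smaller refresh. The 0.5 factor and the age-based representation are assumptions chosen for illustration; the page does not say how large the freshness boost is.

```python
def propagate_recall(
    ages: dict[str, float],
    graph: dict[str, set[str]],
    recalled: str,
    neighbour_factor: float = 0.5,  # hypothetical: neighbours lose half their age
) -> dict[str, float]:
    """Return new ages (days since last reinforcement) after recalling one memory."""
    updated = dict(ages)
    updated[recalled] = 0.0  # the recalled memory is fully refreshed
    for neighbour in graph.get(recalled, set()):
        if neighbour in updated:
            updated[neighbour] = updated[neighbour] * neighbour_factor
    return updated


ages = {"python_at_mongodb": 12.0, "docker_k8s_deploys": 40.0, "uses_react": 60.0}
graph = {"python_at_mongodb": {"docker_k8s_deploys"}}
print(propagate_recall(ages, graph, "python_at_mongodb"))
```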
Multi-agent shared memory.
Multiple AI agents share context or keep secrets. Each agent authenticates with an API key. You control exactly what each agent can read and write.
1. Register an agent
Each agent gets a unique ym_ API key. Shown once, never stored in plaintext. Revoke anytime.
result = register_agent(
    agent_id="coding-agent",
    user_id="sachit",
)
# → ym_xxxx (save once)
2. Store shared or private
Pass the API key in any MCP call. Set visibility to control who can see it.
# shared: all agents see this
store_memory(
    content="DB is Postgres 16",
    api_key="ym_xxxx",
    visibility="shared",
)
# private: only this agent (the key identifies the owner)
store_memory(
    content="staging key sk-xxx",
    api_key="ym_xxxx",
    visibility="private",
)
3. Recall with scope
Without a key → shared memories only. With a key → shared + that agent's private memories.
# coding-agent recalls with its key
recall_memory(
    query="database production",
    api_key="ym_xxxx",
)
# ← shared + private memories
# review-agent, calling without a key
recall_memory(query="database")
# ← shared memories only
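The scoping rule amounts to a small predicate. The record layout and the `can_read` name below are hypothetical; the sketch only encodes the behaviour described in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredMemory:
    content: str
    owner_agent: str
    visibility: str  # "shared" or "private"


def can_read(memory: StoredMemory, caller_agent: Optional[str]) -> bool:
    """caller_agent is None when no API key was supplied."""
    if memory.visibility == "shared":
        return True
    return caller_agent is not None and caller_agent == memory.owner_agent


db_fact = StoredMemory("DB is Postgres 16", "coding-agent", "shared")
secret = StoredMemory("staging key sk-xxx", "coding-agent", "private")

for caller in ("coding-agent", "review-agent", None):
    print(caller, can_read(db_fact, caller), can_read(secret, caller))
```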
Visibility matrix
| Memory stored as | Owner agent | Other agents | No API key |
|---|---|---|---|
| shared | ✓ | ✓ | ✓ |
| private | ✓ | ✗ | ✗ |
Keys hashed with SHA-256 before storage. Revoke anytime with revoke_agent(agent_id, user_id).
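Storing only a hash means a leaked database does not reveal usable keys. A minimal sketch with the Python standard library follows. The ym_ prefix and SHA-256 come from the text, while the key length, the function names, and the in-memory store are assumptions.

```python
import hashlib
import hmac
import secrets


def new_api_key() -> str:
    """Generate a random key with the ym_ prefix (32 random bytes is an assumed length)."""
    return "ym_" + secrets.token_urlsafe(32)


def hash_key(api_key: str) -> str:
    """SHA-256 hex digest of the key; only this value is persisted."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def verify(api_key: str, stored_hash: str) -> bool:
    """Constant-time comparison to avoid leaking information through timing."""
    return hmac.compare_digest(hash_key(api_key), stored_hash)


key = new_api_key()                       # shown to the user once
stored = {"coding-agent": hash_key(key)}  # the plaintext key is never stored

print(verify(key, stored["coding-agent"]))
print(verify("ym_wrong", stored["coding-agent"]))
```

Revocation in this model is just deleting the stored hash, after which no presented key can verify.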
Auto-configured
Agent memory rules, baked in.
yourmemory-setup automatically injects a curated instruction set into your agent's global context — telling it exactly when to recall, what to store, and how to prioritise memories. No manual configuration needed.
- Recall policy: agent retrieves context before every task automatically
- Store / update / ignore decision logic: no duplicate memories, no noise
- Importance and category guidance: agent assigns decay rates and priority without being told
- Written to ~/.claude/CLAUDE.md: applies globally across all your projects
[1/4] Downloading spaCy model…
✓ en_core_web_sm installed
[2/4] Initialising database…
✓ Database ready
[3/4] Writing MCP config…
✓ Claude Code → ~/.claude/settings.json
[4/4] Injecting memory rules…
✓ Memory rules → ~/.claude/CLAUDE.md
✓ Setup complete. Restart your AI client.
Two commands.
Install, run setup. That's it — spaCy model, database, and client configs are handled automatically.
1. Install
$ pip install yourmemory
2. Setup (run once)
$ yourmemory-setup
Configures everything automatically — language model, database, and MCP config for every detected client on your machine.
✓ Language model ready
✓ Database initialised
✓ Claude Code → ~/.claude/settings.json
✓ Claude Desktop → auto-detected if installed
✓ Cursor / Windsurf / Cline → auto-detected if installed
✓ Memory rules → injected into global agent context
Restart your AI client after setup. YourMemory starts automatically as an MCP server — no background process to manage.
Works with
Claude Code Claude Desktop Cline Cursor Windsurf Continue Zed
PostgreSQL (optional — teams / large datasets)
Install with Postgres support
$ pip install 'yourmemory[postgres]'
Create a .env file
DATABASE_URL=postgresql://YOUR_USER@localhost:5432/yourmemory
Backend selected automatically from the connection string — no additional config required.
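Selecting a backend from the connection string typically means inspecting the URL scheme. The sketch below is an assumption about how such a check could look, not YourMemory's code. In particular, the "embedded" label for the no-URL default is a placeholder, since the default store is not named on this page.

```python
from urllib.parse import urlparse


def select_backend(database_url):
    """Pick a backend name from the URL scheme; fall back to a placeholder default."""
    if not database_url:
        return "embedded"  # placeholder name for the zero-config default
    scheme = urlparse(database_url).scheme.lower()
    if scheme in ("postgres", "postgresql"):
        return "postgres"
    raise ValueError(f"Unsupported DATABASE_URL scheme: {scheme!r}")


print(select_backend(None))
print(select_backend("postgresql://YOUR_USER@localhost:5432/yourmemory"))
```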
Graph backend — production scale (opt-in)
$ pip install 'yourmemory[neo4j]'
$ GRAPH_BACKEND=neo4j yourmemory
Default graph runs fully in-process with zero setup. Switch to the production backend for large deployments via the GRAPH_BACKEND env var.