GitHub - hebbs-ai/hebbs-memory-engine

The memory engine for AI agents. Four recall strategies. Native consolidation. Automatic decay. One binary.

For our latest research and enterprise deployments, reach out to us at hebbs.ai.

HEBBS is a cognitive memory primitive purpose-built for AI agents. Vector search tells your agent what's similar. HEBBS tells your agent what happened, what caused it, and what worked before.

4 recall strategies · Native consolidation · Automatic decay · One skill file, zero config

brew install hebbs-ai/tap/hebbs

Or on any platform:

curl -sSf https://hebbs.ai/install | sh

Portable Cognition

.hebbs/ is a self-contained cognition layer that lives next to your files. Build the index once, then drop it anywhere: another machine, another agent, your whole team. Everyone gets the same memory instantly.

.hebbsignore works like .gitignore. Your private files stay private. Your agents only see what you allow.

Your files are the source of truth. The .hebbs/ directory is derived, rebuildable, and disposable. Delete it and run hebbs init . && hebbs index . to get it back.

Works Out of the Box with Your Agent

HEBBS ships as a skill for Claude Code and OpenClaw. No SDK integration. No glue code. No configuration. Install HEBBS, and your agent automatically stores memories, recalls with the right strategy, consolidates insights, and forgets what's stale.

Without HEBBS:                          With HEBBS:

1. Choose a vector DB                   brew install hebbs-ai/tap/hebbs
2. Set up embedding pipeline
3. Write storage layer                  Done.
4. Write retrieval layer
5. Add temporal logic                   Your agent now has:
6. Add graph traversal                  - 4 recall strategies
7. Wire it all together                 - Temporal + causal + analogical
8. Handle decay manually                - Native decay & reinforcement
9. Build consolidation pipeline         - Automatic consolidation
10. Maintain 4 services                 - Works with Claude & OpenClaw

~2,000 lines of glue                    0 lines of glue

The skill is published at hebbs-ai/hebbs-skill and works with any agent that reads SKILL.md.

Why HEBBS Exists

Every agent framework gives you similarity search and calls it memory. HEBBS gives your agent temporal reasoning, causal chains, analogical transfer, consolidation, and decay: the cognitive operations that turn retrieval into understanding.

What your agent's memory does today	What HEBBS does
Embed a question, find 5 nearest vectors	"What happened before this?" Temporal recall
Return them and hope for the best	"What caused this outcome?" Causal graph walk
Precision on temporal queries: ~23%	"What pattern transfers here?" Analogical matching
No decay, no consolidation, no revision	Memories decay. Important ones strengthen. Episodes consolidate into insights.

The delta isn't milliseconds. It's +68 percentage points on temporal queries and +63 on causal.

Four Recall Strategies

Most memory systems have one, occasionally two. HEBBS has four.

Strategy	Question it answers	Example
Similarity	"What looks like this?"	Finding relevant objection responses
Temporal	"What happened, in order?"	Reconstructing a prospect's full history
Causal	"What led to this outcome?"	Understanding why a deal was lost
Analogical	"What's structurally similar in a different domain?"	Applying finance patterns to healthcare

All four run against a single engine. No fan-out across services.
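To make the difference between strategies concrete, here is a minimal in-memory sketch of temporal and causal recall. It is not the HEBBS implementation: the `Memory` dataclass, the `caused_by` field, and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    id: str
    content: str
    timestamp: int                 # logical clock; larger means later
    entity_id: str
    caused_by: list[str] = field(default_factory=list)  # ids of direct causes


def temporal_recall(memories: list[Memory], entity_id: str) -> list[Memory]:
    """Return one entity's memories in chronological order."""
    return sorted(
        (m for m in memories if m.entity_id == entity_id),
        key=lambda m: m.timestamp,
    )


def causal_recall(memories: list[Memory], outcome_id: str) -> list[Memory]:
    """Walk cause edges backwards from an outcome, nearest causes first."""
    by_id = {m.id: m for m in memories}
    seen = {outcome_id}
    frontier = [outcome_id]
    chain: list[Memory] = []
    while frontier:
        next_frontier = []
        for mem_id in frontier:
            for cause_id in by_id[mem_id].caused_by:
                if cause_id not in seen and cause_id in by_id:
                    seen.add(cause_id)
                    chain.append(by_id[cause_id])
                    next_frontier.append(cause_id)
        frontier = next_frontier
    return chain


memories = [
    Memory("m1", "Customer asked about pricing", 1, "acme"),
    Memory("m2", "Customer objected to annual commitment", 2, "acme", ["m1"]),
    Memory("m3", "Deal lost to competitor X", 3, "acme", ["m2"]),
    Memory("m4", "Unrelated onboarding call", 2, "globex"),
]

timeline = [m.id for m in temporal_recall(memories, "acme")]  # ordered history
causes = [m.id for m in causal_recall(memories, "m3")]        # why the deal was lost
```

Temporal recall is an ordered scan scoped to one entity, while causal recall is a graph walk from an outcome. Neither needs an embedding, which is why similarity search alone cannot answer these questions.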

Tunable Scoring

Every result is ranked by a composite score blending four signals:

Signal	What it captures	Default weight
Relevance	Semantic similarity to the query	0.50
Recency	How recently the memory was created	0.20
Importance	Salience set at encoding time	0.20
Reinforcement	How often the memory has been recalled	0.10

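One parameter, the weight vector, changes everything. Weights are listed in the table's order (relevance:recency:importance:reinforcement):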

1:0:0:0 pure semantic (RAG mode)
0.2:0.8:0:0 favor recent (live context mode)
0.3:0.1:0.5:0.1 favor important (critical decisions mode)
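A rough sketch of how such a blend could work, assuming the composite score is a plain weighted sum of four signals normalized to [0, 1]. The exponential recency curve, the 30-day half-life, and the recall-count saturation are illustrative assumptions, not documented HEBBS behavior.

```python
# Weight order follows the table above: relevance, recency, importance, reinforcement.
DEFAULT_WEIGHTS = (0.50, 0.20, 0.20, 0.10)


def recency_signal(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 when new, 0.5 after one half-life (assumed shape)."""
    return 0.5 ** (age_days / half_life_days)


def reinforcement_signal(recall_count: int, saturation: int = 10) -> float:
    """Squash a raw recall count into [0, 1] (assumed normalization)."""
    return min(recall_count, saturation) / saturation


def composite_score(relevance, age_days, importance, recall_count,
                    weights=DEFAULT_WEIGHTS) -> float:
    """Weighted sum of the four signals."""
    signals = (
        relevance,
        recency_signal(age_days),
        importance,
        reinforcement_signal(recall_count),
    )
    return sum(w * s for w, s in zip(weights, signals))


old_but_relevant = dict(relevance=0.9, age_days=90, importance=0.5, recall_count=5)
fresh_but_vague = dict(relevance=0.6, age_days=0, importance=0.5, recall_count=0)

default_ranking = composite_score(**old_but_relevant) > composite_score(**fresh_but_vague)
live_weights = (0.2, 0.8, 0.0, 0.0)  # the "favor recent" preset
live_ranking = (composite_score(**old_but_relevant, weights=live_weights)
                > composite_score(**fresh_but_vague, weights=live_weights))
```

With the default weights the older, more relevant memory ranks first. With the "favor recent" preset the order flips, without touching the stored data.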

Your Agent Learns, Not Just Stores

The reflect pipeline clusters raw memories, proposes insights, validates them, and stores consolidated knowledge with full lineage.

Raw memories (episodes):
  "Customer asked about pricing"
  "Customer mentioned competitor X"
  "Customer objected to annual commitment"
  "Deal lost to competitor X"

          | reflect (automatic consolidation)

Insight (with lineage):
  "Deals mentioning competitor X with pricing objections
   have 73% loss rate when annual commitment is pushed early"
   [confidence: 0.84, sources: 4 memories, tags: sales, pricing]
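The four stages (cluster, propose, validate, store) can be sketched as follows. This is an illustrative outline only: the `Episode` and `Insight` types, the tag-based clustering, and the stubbed `propose` and `validate` callables are assumptions standing in for the engine's real clustering and LLM calls.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass
class Episode:
    id: str
    content: str
    tag: str


@dataclass
class Insight:
    content: str
    confidence: float
    source_ids: list[str]   # lineage: which episodes produced this insight


def reflect(
    episodes: list[Episode],
    propose: Callable[[list[Episode]], str],
    validate: Callable[[str, list[Episode]], float],
    min_cluster_size: int = 3,
    min_confidence: float = 0.5,
) -> list[Insight]:
    # 1. Cluster: grouped by tag here; a real engine would cluster on embeddings.
    clusters: dict[str, list[Episode]] = defaultdict(list)
    for ep in episodes:
        clusters[ep.tag].append(ep)

    insights = []
    for cluster in clusters.values():
        if len(cluster) < min_cluster_size:
            continue
        candidate = propose(cluster)                # 2. propose (LLM in practice)
        confidence = validate(candidate, cluster)   # 3. validate (LLM in practice)
        if confidence >= min_confidence:
            # 4. Store the insight together with the ids it was derived from.
            insights.append(Insight(candidate, confidence, [ep.id for ep in cluster]))
    return insights


episodes = [
    Episode("e1", "Customer asked about pricing", "sales"),
    Episode("e2", "Customer mentioned competitor X", "sales"),
    Episode("e3", "Customer objected to annual commitment", "sales"),
    Episode("e4", "Deal lost to competitor X", "sales"),
    Episode("e5", "Office wifi was down", "ops"),
]

# Stubs in place of LLM calls.
insights = reflect(
    episodes,
    propose=lambda cluster: f"Pattern across {len(cluster)} '{cluster[0].tag}' episodes",
    validate=lambda text, cluster: 0.84,
)
```

Keeping `source_ids` on every insight is what makes lineage possible: an insight can always be traced back to, and re-validated against, the episodes that produced it.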

Your Agent Tunes Itself

HEBBS exposes tunable parameters that no other memory system does: four strategies, four scoring weights, adjustable k, cue expansion. Your agent uses these to learn how to retrieve better and stores that knowledge back into HEBBS.

Session 1: Agent runs eval queries against the vault
           Baseline: 54% keyword recall (default settings)

           Agent analyzes failures:
           - k too low → increase to 10
           - cue too generic → expand with entity names
           - wrong strategy → switch to temporal for timeline queries

           After tuning: 84% keyword recall

           Agent stores what worked:
           hebbs remember "RETRIEVAL-INSTRUCTION: For compliance queries,
           include entity names and expand acronyms. Use k=10."
           --importance 0.9 --entity-id retrieval-instructions

Session 2: Agent loads stored strategies at conversation start
           Applies learned tuning automatically
           Retrieval is better from the first query

This is the loop that no competitor can run. Vector databases give you one knob (top-k). HEBBS gives you four strategies, four scoring weights, entity scoping, and cue construction. Your agent optimizes all of them, measures the improvement, and remembers what worked.

Tested results (keyword recall, before and after tuning): 54% to 84% with local embeddings, 75% to 90% with OpenAI embeddings.
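The measure-then-tune loop can be approximated in a few lines. This sketch assumes "keyword recall" means the fraction of expected keywords that appear somewhere in the returned memories. The eval set, the fake vault, and the `recall_fn(query, k)` signature are all hypothetical stand-ins for real `hebbs recall` calls.

```python
from typing import Callable

# Hypothetical eval set: each query lists keywords a good result set should contain.
EVAL_SET = [
    {"query": "compliance audit", "keywords": ["SOC2", "audit"]},
    {"query": "pricing objection", "keywords": ["annual", "discount"]},
]

# Hypothetical ranked results per query, standing in for a real vault.
FAKE_VAULT = {
    "compliance audit": ["SOC2 report due", "vendor list", "kickoff notes",
                         "audit scheduled for May"],
    "pricing objection": ["asked for discount", "objected to annual commitment"],
}


def fake_recall(query: str, k: int) -> list[str]:
    return FAKE_VAULT[query][:k]


def keyword_recall(results: list[str], keywords: list[str]) -> float:
    """Fraction of expected keywords found in any returned memory."""
    text = " ".join(results).lower()
    return sum(kw.lower() in text for kw in keywords) / len(keywords)


def evaluate(recall_fn: Callable[[str, int], list[str]], k: int) -> float:
    """Mean keyword recall across the eval set for a given k."""
    scores = [keyword_recall(recall_fn(case["query"], k), case["keywords"])
              for case in EVAL_SET]
    return sum(scores) / len(scores)


def tune_k(recall_fn, candidates=(3, 5, 10)) -> tuple[int, float]:
    """Pick the k with the best mean keyword recall; ties go to the smaller k."""
    return max(((k, evaluate(recall_fn, k)) for k in candidates),
               key=lambda pair: (pair[1], -pair[0]))


baseline = evaluate(fake_recall, k=3)
best_k, best_score = tune_k(fake_recall)
```

The same loop extends to the other knobs (strategy, scoring weights, cue expansion) by widening the candidate grid. The winning configuration is what the agent would store back as a retrieval instruction.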

Quick Start

Quick Start (Local)

# Initialize with OpenAI (recommended)
hebbs init . --provider openai --key $OPENAI_API_KEY

# Or with other providers
hebbs init . --provider anthropic --key $ANTHROPIC_API_KEY
hebbs init . --provider ollama

hebbs index .                         # index your markdown files
hebbs recall "your question here"     # recall with any strategy
hebbs panel                           # open the Memory Palace

One command configures both LLM and embeddings. --model is optional (defaults per provider). When using OpenAI, embedding auto-configures to text-embedding-3-small with the same key. No local model download needed. Other providers default to a local ONNX model (embeddinggemma-300m, ~600MB, downloaded once).

Controlling What Gets Indexed

By default HEBBS indexes every .md file in your vault, skipping .git/, .obsidian/, node_modules/, and .hebbs/. To exclude additional files or directories, create a .hebbsignore file at the vault root:

# .hebbsignore (same syntax as .gitignore)
templates/
drafts/*.md
archive/
*.template.md

Patterns from .hebbsignore are merged with the built-in defaults and any patterns in .hebbs/config.toml. Comments (#) and blank lines are supported. The daemon picks up changes to .hebbsignore automatically on its next config reload. No restart needed.

See docs/hebbsignore.md for the full reference.
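As a rough illustration of how merged ignore patterns decide what gets indexed, here is a deliberately simplified matcher. It is not the HEBBS implementation and does not cover full .gitignore semantics (no negation, no `**`, and `*` may cross directory boundaries because it relies on `fnmatch`). The function names are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Built-in defaults named in the text above.
BUILTIN_IGNORES = [".git/", ".obsidian/", "node_modules/", ".hebbs/"]

HEBBSIGNORE = """\
# .hebbsignore (same syntax as .gitignore)
templates/
drafts/*.md
archive/
*.template.md
"""


def parse_ignore(text: str) -> list[str]:
    """Drop comments and blank lines, keep the remaining patterns."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def is_ignored(rel_path: str, patterns: list[str]) -> bool:
    """Check a vault-relative POSIX path against the merged pattern list."""
    path = PurePosixPath(rel_path)
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything under a directory with this name.
            if pattern.rstrip("/") in path.parts[:-1]:
                return True
        elif "/" in pattern:
            # Pattern with a slash: match against the path from the vault root.
            if fnmatchcase(rel_path, pattern):
                return True
        elif fnmatchcase(path.name, pattern):
            # Bare pattern: match the file name at any depth.
            return True
    return False


patterns = BUILTIN_IGNORES + parse_ignore(HEBBSIGNORE)
indexed = [p for p in ["notes/meeting.md", "templates/weekly.md", "drafts/idea.md",
                       "notes/call.template.md", ".hebbs/index.db"]
           if not is_ignored(p, patterns)]
```

Only `notes/meeting.md` survives the filter in this example. Everything else is excluded either by a built-in default or by a user pattern.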

Start a Server (Optional, for teams)

hebbs start                           # gRPC :6380, HTTP :6381
hebbs remember "hello world"          # store a memory (uses server via --endpoint)
hebbs recall "hello"                  # recall it

Connect from Python

import asyncio

from hebbs import HebbsClient


async def main():
    client = HebbsClient("localhost:6380")

    # Store a memory scoped to one entity
    await client.remember(
        content="Prospect mentioned competitor contract expires March 15",
        importance=0.95,
        entity_id="acme",
        context={"stage": "discovery", "signal": "urgency"},
    )

    # Four recall strategies
    history = await client.recall(cue="acme engagement", strategy="temporal", entity_id="acme")
    responses = await client.recall(cue="we built this in-house", strategy="similarity")
    causes = await client.recall(cue="deal lost after pricing", strategy="causal")
    patterns = await client.recall(cue="healthcare compliance objection", strategy="analogical")

    # Consolidate and query insights
    result = await client.reflect()
    insights = await client.insights(entity_id="acme", max_results=10)


# The client methods are coroutines, so they must run inside an event loop
asyncio.run(main())

Connect from TypeScript

import { HebbsClient } from '@hebbs/sdk';

const client = new HebbsClient('localhost:6380', { apiKey: process.env.HEBBS_API_KEY });
await client.connect();

await client.remember({
    content: 'Prospect mentioned competitor contract expires March 15',
    importance: 0.95,
    entityId: 'acme',
    context: { stage: 'discovery', signal: 'urgency' },
});

// Four recall strategies
const history = await client.recall({ cue: 'acme engagement', strategies: ['temporal'], entityId: 'acme' });
const responses = await client.recall({ cue: 'we built this in-house', strategies: ['similarity'] });
const causes = await client.recall({ cue: 'deal lost after pricing', strategies: ['causal'] });
const patterns = await client.recall({ cue: 'healthcare compliance objection', strategies: ['analogical'] });

// Consolidate and query insights
const result = await client.reflect();
const insights = await client.insights({ entityId: 'acme', maxResults: 10 });

Reference Demos

The hebbs-python repo includes a full AI Sales Intelligence Agent demo with 7 scripted scenarios, 5 LLM providers, and Rich terminal panels.

pip install "hebbs[demo]"             # quotes keep zsh from globbing the brackets
hebbs-demo interactive --config gemini-vertex --verbosity verbose

The hebbs-typescript repo includes an equivalent TypeScript demo with 3 scenarios and an interactive mode.

cd hebbs-typescript/demo && npm install
npx tsx src/index.ts interactive --mock-llm

The API

Three groups of operations. Each one is a cognitive primitive that didn't exist as a single call before.

Write

Operation	What it does	Why it matters
`remember()`	Store with importance scoring	Not append-only. Every memory is weighted at birth.
`revise()`	Update beliefs, keep lineage	Your agent corrects itself. No contradictory facts coexisting.
`forget()`	Prune by staleness, compliance	Real deletion. GDPR-proof. Signal-to-noise improves over time.

Read

Operation	What it does	Why it matters
`recall()`	4 strategies, composite scoring	Not just "find similar": find relevant, recent, causal, analogical.
`prime()`	Pre-load context	Start of conversation = agent already knows what matters.
`subscribe()`	Real-time push	Memories surface automatically when they become relevant.

Consolidate

Operation	What it does	Why it matters
`reflect()`	Consolidate episodes into insights	Your agent learns patterns, not just stores facts.
`insights()`	Query consolidated knowledge	Higher-order understanding, not raw retrieval.

Client Libraries

Language	Package	Repo	Status
Python	`pip install hebbs`	hebbs-ai/hebbs-python	Alpha (gRPC + embedded via PyO3)
TypeScript	`npm install @hebbs/sdk`	hebbs-ai/hebbs-typescript	Alpha (gRPC, Node.js 18+)
Rust	`hebbs` crate (direct)	This repo	Stable
Agent Skill	SKILL.md	hebbs-ai/hebbs-skill	Stable (Claude Code, OpenClaw)

Scoping: Entities and Tenants

HEBBS has two scoping dimensions.

entity_id: what the memory is about (a customer, project, user). Optional. Scope recall, prime, and forget to a subject.

tenant_id: who owns the data (an org, workspace). Structural isolation: storage keys are prefixed, index traversal is partitioned, cross-tenant queries are impossible.

hebbs --tenant acme-corp remember "Q2 forecast looks strong" --entity-id project-alpha

client = HebbsClient("localhost:6380", tenant_id="acme-corp")

const client = new HebbsClient('localhost:6380', { tenantId: 'acme-corp' });
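The phrase "storage keys are prefixed" can be pictured with a toy key-value store. This is a sketch of the general technique, not HEBBS's actual key layout. The class name, the NUL separator, and the dict backing are assumptions.

```python
class TenantScopedStore:
    """Toy key-value store where every key is prefixed with its tenant."""

    def __init__(self) -> None:
        self._data: dict[bytes, str] = {}

    @staticmethod
    def _key(tenant_id: str, memory_id: str) -> bytes:
        # NUL separator so tenant "acme" never shares a prefix with "acme-corp".
        return f"{tenant_id}\x00{memory_id}".encode()

    def put(self, tenant_id: str, memory_id: str, content: str) -> None:
        self._data[self._key(tenant_id, memory_id)] = content

    def scan(self, tenant_id: str) -> list[str]:
        """Return only values whose key starts with this tenant's prefix.

        The toy filters a dict; an ordered store such as RocksDB could instead
        seek directly to the prefix and stop when it ends.
        """
        prefix = f"{tenant_id}\x00".encode()
        return [v for k, v in sorted(self._data.items()) if k.startswith(prefix)]


store = TenantScopedStore()
store.put("acme-corp", "m1", "Q2 forecast looks strong")
store.put("acme", "m1", "Different tenant, same memory id")
acme_corp_view = store.scan("acme-corp")
acme_view = store.scan("acme")
```

Because the tenant is part of every key, a read path that only ever scans one prefix has no way to return another tenant's data, even when memory ids collide.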

Comparison

	pgvector	Qdrant	Neo4j	Memory Wrappers	HEBBS
Recall strategies	1	1	1-2	1-2	4
Temporal recall	No	No	No	No	Native
Causal reasoning	No	No	Partial	No	Native
Analogical transfer	No	No	No	No	Native
Native decay	No	No	No	No	Yes
Consolidation	No	No	No	Partial	Native
Revision with lineage	No	No	No	No	Yes
Agent skill (drop-in)	No	No	No	No	Yes
LLM calls on hot path	N/A	N/A	N/A	Yes	Zero
Recall latency (10M)	~20ms	~10ms	~50ms	50-200ms	<10ms
Runtime dependencies	Postgres	Qdrant	JVM + Neo4j	3-4 services	None

Performance

Single-strategy similarity and temporal recall stay under 10ms at p99; causal and multi-strategy recall have a sub-10ms median. Benchmarked on a single c6g.large instance (2 vCPU, 4GB RAM) with 10M stored memories.

Operation	p50	p99
`remember`	0.8ms	4ms
`recall` (similarity)	2ms	8ms
`recall` (temporal)	0.5ms	2ms
`recall` (causal)	4ms	12ms
`recall` (multi-strategy)	6ms	18ms
`subscribe` (event-to-push)	1ms	5ms

Scalability

Memories	`recall` p99 (similarity)	`recall` p99 (temporal)
100K	3ms	0.6ms
1M	5ms	0.8ms
10M	8ms	1.2ms
100M	12ms	2.0ms
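A quick calculation on the similarity column shows how gently latency grows with store size. The figures are copied from the table above. The "milliseconds per 10x" summary is just one way to read them, not a claim made by the benchmark.

```python
import math

# p99 similarity-recall latencies (ms) copied from the scalability table.
SIMILARITY_P99 = {100_000: 3.0, 1_000_000: 5.0, 10_000_000: 8.0, 100_000_000: 12.0}

sizes = sorted(SIMILARITY_P99)
smallest, largest = sizes[0], sizes[-1]

data_growth = largest / smallest                                   # how much more data
latency_growth = SIMILARITY_P99[largest] / SIMILARITY_P99[smallest]  # how much slower

# Average latency added each time the number of stored memories grows 10x.
ms_per_decade = (SIMILARITY_P99[largest] - SIMILARITY_P99[smallest]) / math.log10(data_growth)
```

A thousandfold increase in stored memories costs a fourfold increase in p99, or roughly 3ms per 10x. That is the sub-linear behavior expected from an HNSW index.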

Architecture

──────────────────────────────────────────────────────────
                     Client SDKs
             Python  |  TypeScript  |  Rust
──────────────────────────────────────────────────────────
               Agent Skills (SKILL.md)
            Claude Code  |  OpenClaw
──────────────────────────────────────────────────────────
                gRPC  |  HTTP/REST
──────────────────────────────────────────────────────────

                  Core Engine (Rust)

  +------------+ +------------+ +------------------+
  |  Remember   | |   Recall   | | Reflect Pipeline |
  |  Engine     | |   Engine   | | (background)     |
  |             | |            | |                  |
  | - encode    | | - prime    | | - cluster (Rust) |
  | - score     | | - query    | | - propose (LLM)  |
  | - index     | | - subscribe| | - validate (LLM) |
  | - decay     | | - merge    | | - store insights |
  +------+------+ +------+-----+ +--------+---------+
         |               |                |
  +------+---------------+----------------+-----------+
  |              Index Layer                          |
  |   Temporal (B-tree) | Vector (HNSW) | Graph       |
  +----------------------+----------------------------+
                         |
  +----------------------+----------------------------+
  |         Storage Engine (RocksDB)                  |
  |         Column Families per index type            |
  +---------------------------------------------------+

  +-----------------------+  +------------------------+
  | Embedding Engine      |  | LLM Provider Interface |
  | (ONNX Runtime,        |  | (Anthropic, OpenAI,    |
  |  built-in default)    |  |  Ollama, pluggable)    |
  +-----------------------+  +------------------------+

Built with:

Rust: no GC pauses, single static binary, C-level performance
RocksDB: embedded LSM storage, proven by TiKV and CockroachDB
HNSW: logarithmic-scaling vector index for similarity and analogical recall
ONNX Runtime: built-in CPU embeddings (<5ms), zero external API dependencies
gRPC: bidirectional streaming for real-time subscribe channels

Deployment

Standalone Server (the Redis model)

hebbs start                                       # gRPC :6380, HTTP :6381
HEBBS_AUTH_ENABLED=true hebbs start                # with API key authentication

Embedded Library (the SQLite model)

from hebbs import HEBBS

e = HEBBS.open("./agent-memory")  # No separate process
e.remember(...)

Edge Mode (robots, laptops, workstations): same API, different configuration. Runs the complete engine including local reflection with on-device LLMs.

Use Cases

Voice Sales Agents: Remember prospect history across calls, handle objections with proven responses, learn which pitches convert over time.

Customer Support: Recall past tickets, surface solutions from similar issues, reduce escalations through consolidated troubleshooting knowledge.

Coding Agents: Remember what approaches worked, recall past debugging sessions, avoid repeating failed strategies.

Robotics: Learn navigation patterns, share knowledge across a fleet, reflect on operational efficiency. Fully offline on edge hardware.

Personal Assistants: Remember preferences, learn routines, pick up context across conversations.

Contributing

We welcome contributions across the stack. See CONTRIBUTING.md for guidelines.

All contributors must sign our Contributor License Agreement before their first PR can be merged.

License

HEBBS uses a dual-license model.

The engine (hebbs-core, hebbs-storage, hebbs-index, hebbs-embed, hebbs-reflect, hebbs-server, hebbs-vault) is licensed under BSL 1.1, a license that CockroachDB, Sentry, and Terraform have all used. Use it freely in production. The only restriction: you cannot offer HEBBS as a hosted service to third parties. Every version converts to Apache 2.0 after four years.

Client libraries and protocol definitions (hebbs-client, hebbs-proto, hebbs-ffi) are licensed under Apache 2.0. Fully open source with no restrictions.

Educational institutions and non-profit organizations can use the full engine without restriction. For other licensing arrangements, reach out at parag@hebbs.ai.

Agents deserve better than a vector database and a prayer.