GitHub - varshith-Git/Valori-Kernel: Valori is a Deterministic Memory OS that sits between intelligence (LLMs) and reality (devices, products, decisions).


Valori

The vector database that can mathematically prove it never lost your data.


Q16.16 fixed-point arithmetic · BLAKE3 hash-chained audit log · openraft consensus · offline verifiable proofs


The Problem

Every vector database makes a silent assumption: float arithmetic on one machine produces the same result on another. That is not guaranteed. SIMD width and summation order, fused multiply-add contraction, cloud hardware migrations, and differing compiler or math-library behavior can all change the low bits of a result, so replicas silently diverge, and you have no way to verify that they haven't.

In AI systems this compounds: agent memory drifts between restarts, crash recovery is unverifiable, and an audit trail built on float results cannot be reproduced anywhere else.

Valori eliminates all of this with one decision: integer-only vector math, provably identical on every machine.
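
To make the integer-only claim concrete, here is a minimal Python sketch. It is not Valori's kernel code. It assumes Q16.16 values are stored as plain integers scaled by 2^16 and that dot products accumulate in a wider integer; the helper names `to_q16` and `dot_q16` are hypothetical. It contrasts float summation, where evaluation order changes the result, with integer accumulation, where it cannot.

```python
SCALE = 1 << 16  # Q16.16: 16 integer bits, 16 fractional bits (assumed layout)


def to_q16(x: float) -> int:
    """Quantize a float to a Q16.16 integer (hypothetical helper)."""
    return int(round(x * SCALE))


def dot_q16(a: list[int], b: list[int]) -> int:
    """Dot product of two Q16.16 vectors using integer arithmetic only."""
    acc = 0
    for x, y in zip(a, b):
        acc += x * y   # exact integer product, scaled by 2^32
    return acc >> 16   # rescale the sum back to Q16.16


# Float addition is not associative: grouping changes the result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
print(left_to_right == right_to_left)  # False

# Integer addition is associative: any accumulation order gives the same bits.
a = [to_q16(v) for v in (0.1, 0.2, 0.3)]
b = [to_q16(v) for v in (0.5, -0.25, 0.75)]
forward = dot_q16(a, b)
backward = dot_q16(a[::-1], b[::-1])
print(forward == backward)  # True
```

Python integers never overflow, so this sketch sidesteps the accumulator-width question that a fixed-width implementation has to settle.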


Production Proof

# State hash before a forced restart
curl $VALORI_URL/v1/proof/state
# → {"final_state_hash": [174, 163, 169, 225, 123, 111, 34, 11, ...]}

# kill -9 — no graceful shutdown, no flush

# State hash after automatic recovery
curl $VALORI_URL/v1/proof/state
# → {"final_state_hash": [174, 163, 169, 225, 123, 111, 34, 11, ...]}
# identical — bit-perfect recovery, cryptographically verified

Every byte of state is recovered from the append-only, BLAKE3-chained event log and verified against the pre-crash root. No data loss. No manual intervention. No trust required.
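
The recovery check above can be sketched as a hash-chain replay. This is an illustration only: the event encoding, the all-zero genesis link, and the helper names are assumptions, and BLAKE2b from Python's standard library stands in for BLAKE3 so the example runs without extra packages.

```python
import hashlib

GENESIS = bytes(32)  # assumed starting link: 32 zero bytes


def link(prev: bytes, event: bytes) -> bytes:
    """Hash the previous link together with the next event.

    BLAKE2b is a stand-in here; Valori's log uses BLAKE3.
    """
    return hashlib.blake2b(prev + event, digest_size=32).digest()


def chain_head(events: list[bytes]) -> bytes:
    """Replay the whole log and return the final link."""
    head = GENESIS
    for ev in events:
        head = link(head, ev)
    return head


def first_divergence(events: list[bytes], expected_links: list[bytes]):
    """Return the index of the first event whose link differs, else None."""
    head = GENESIS
    for i, (ev, expected) in enumerate(zip(events, expected_links)):
        head = link(head, ev)
        if head != expected:
            return i
    return None


# Build a log and record every intermediate link, as an auditor might.
log = [b"insert:1", b"insert:2", b"delete:1"]
links = []
head = GENESIS
for ev in log:
    head = link(head, ev)
    links.append(head)

# Replaying the same events after a crash reproduces the same head.
print(chain_head(log) == links[-1])  # True

# Altering one event changes every link from that point on.
tampered = [log[0], b"insert:999", log[2]]
print(first_divergence(tampered, links))  # 1
```

Because each link depends on the one before it, a single recorded head is enough to detect any change to the log, and the stored intermediate links narrow it down to the first altered event.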


Where Valori Sits in Your Stack

┌─────────────────────────────────────────────────────────────────────┐
│                      Your AI Application                            │
│   LangChain · LlamaIndex · OpenAI Agents · Custom Orchestrators    │
└────────────────────────┬────────────────────────────────────────────┘
                         │  Python SDK  /  HTTP  /  PyO3 FFI
┌────────────────────────▼────────────────────────────────────────────┐
│                         VALORI                                      │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────────────┐   │
│  │  Vector      │   │  Knowledge   │   │  Cryptographic       │   │
│  │  Memory      │   │  Graph       │   │  Audit Trail         │   │
│  │  (HNSW/Brute)│   │  (same store)│   │  (BLAKE3 + replay)   │   │
│  └──────────────┘   └──────────────┘   └──────────────────────┘   │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │           Q16.16 Fixed-Point Kernel  (no_std / no_alloc)    │  │
│  │   bit-identical results on x86 · ARM · RISC-V · Cortex-M4  │  │
│  └──────────────────────────────────────────────────────────────┘  │
│  ┌───────────────────────┐   ┌──────────────────────────────────┐  │
│  │   Standalone Node     │   │   3- or 5-Node Raft Cluster      │  │
│  └───────────────────────┘   └──────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘

Key Features

| Feature | Description |
|---|---|
| Determinism | Q16.16 fixed-point — bit-identical across x86, ARM, RISC-V, Cortex-M4 |
| Audit trail | Append-only BLAKE3-chained event log; offline verifiable with no server |
| Tamper detection | Locates the exact altered event, byte offset, and commit timestamp |
| Raft cluster | 3/5-node consensus via openraft 0.9 + tonic/gRPC + mTLS |
| GraphRAG | Vector search + subgraph traversal in one call, one consistent snapshot |
| Agent memory (MCP) | valori-mcp — verifiable recall with BLAKE3 receipt; works with Claude Desktop |
| Recency decay | decay_half_life_secs fades older memories in ranking without touching the state hash |
| Valori Reranker | Server-side hybrid retrieval — vector top-K pooled then re-scored by term frequency; 90% accuracy on hard lexical queries, 0.4 s latency, no external dependency |
| Built-in ingest | POST /v1/ingest — chunk + embed + insert + graph + audit in one call; works in standalone and 3/5-node cluster; VALORI_EMBED_PROVIDER=ollama\|openai\|custom; /v1/ingest/document for chunking only |
| Self-maintaining memory | consolidate (supersede a memory) and contradict (flag conflicts) commit Supersedes/Contradicts edges to the audit chain |
| Multi-tenancy | Up to 1 024 named collections; per-tenant API keys with RBAC |
| Point-in-time reads | Replay to any past state hash or log index |
| GDPR erasure | Crypto-shredding — DEK destruction = O(1) erasure, audit chain stays intact |
| Embedded | no_std / no_alloc kernel; runs on microcontrollers with no heap |
| S3 offload | Snapshot archival + WAL rotation to S3/MinIO/R2 |
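
The hybrid retrieval step listed above (pool the vector top-K, then re-score by term frequency) might look roughly like the following. The scoring formula, the `weight` parameter, and the function names are assumptions made for illustration; the README does not specify how Valori combines the two signals.

```python
from collections import Counter


def term_frequency(query: str, text: str) -> int:
    """Count how often the distinct query terms occur in the text."""
    counts = Counter(text.lower().split())
    return sum(counts[term] for term in set(query.lower().split()))


def rerank(query_text: str, pool: list[tuple[int, float, str]], k: int,
           weight: float = 0.1) -> list[int]:
    """Re-score a pooled candidate list and return the top-k ids.

    pool holds (doc_id, vector_score, text); a higher score is better.
    The additive blend below is an assumed formula, not Valori's.
    """
    scored = [
        (vector_score + weight * term_frequency(query_text, text), doc_id)
        for doc_id, vector_score, text in pool
    ]
    # Sort by score descending, then by id, so ties resolve the same way every time.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in scored[:k]]


pool = [
    (1, 0.80, "blood pressure readings were elevated"),
    (2, 0.82, "the patient reported mild headaches"),
    (3, 0.60, "pressure cooker recipes"),
]
print(rerank("blood pressure", pool, k=2))  # [1, 2]
```

On vector score alone, document 2 would rank first; the lexical signal moves document 1 ahead because it contains both query terms.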

Full feature list and phase history


Get Started

Option 1 — Python SDK, embedded (no server)

pip install valoricore
pip install "valoricore[local]"   # + SentenceTransformer embeddings

from valoricore import MemoryClient
from valoricore.embeddings import SentenceTransformerEmbedder

embedder = SentenceTransformerEmbedder("all-MiniLM-L6-v2")
db = MemoryClient(path="./my_db", dim=384, index_kind="hnsw")

db.add_document(text="The patient presented with hypertension.", embed=embedder)
hits = db.semantic_search("blood pressure", embed=embedder, k=5)
print(db.get_state_hash())   # 64-char BLAKE3 hex — same on any machine

Option 2 — HTTP server (standalone node)

VALORI_DIM=1536 \
VALORI_EVENT_LOG_PATH=./data/events.log \
VALORI_SNAPSHOT_PATH=./data/snapshot.bin \
  cargo run --release -p valori-node

from valoricore import SyncRemoteClient
db = SyncRemoteClient("http://localhost:3000")
db.insert([0.1, 0.2, ...], text="section title and body")   # index for reranking
hits = db.search([0.1, 0.2, ...], k=5)                             # vector only
hits = db.search([0.1, 0.2, ...], k=5, query_text="my query")     # hybrid rerank (default)
hits = db.search([0.1, 0.2, ...], k=5, decay_half_life_secs=86400) # recency-aware
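
As a rough picture of what a half-life parameter does to ranking, here is a small sketch. It assumes the common exponential form, where a score halves for every `half_life_secs` of age; the README does not state the formula Valori uses, and `decayed_score` is a hypothetical helper.

```python
def decayed_score(score: float, age_secs: float, half_life_secs: float) -> float:
    """Scale a similarity score by 0.5 for every half-life of age (assumed formula)."""
    return score * 0.5 ** (age_secs / half_life_secs)


HALF_LIFE = 86400  # one day, matching the search call above

# (label, similarity score, age in seconds)
hits = [("old", 0.9, 2 * 86400), ("fresh", 0.6, 0)]

# Decay is applied to a copy at query time; the stored records are not modified.
ranked = sorted(
    hits,
    key=lambda h: decayed_score(h[1], h[2], HALF_LIFE),
    reverse=True,
)
print([label for label, _, _ in ranked])  # ['fresh', 'old']
```

Because the weighting happens on the read path, nothing in the stored state changes, which is consistent with the claim that decay does not touch the state hash.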

Option 3 — One-call document ingest (chunk + embed on-node)

Start the node with an embed provider and POST raw text — no client-side embedding needed:

VALORI_DIM=768 \
VALORI_EMBED_PROVIDER=ollama \
VALORI_EMBED_MODEL=nomic-embed-text \
VALORI_EMBED_URL=http://localhost:11434 \
  cargo run --release -p valori-node

from valoricore import SyncRemoteClient
db = SyncRemoteClient("http://localhost:3000")

# One call: text → auto-chunk → embed → insert → graph nodes → metadata
result = db.ingest(text, source="paper.pdf", strategy="auto", collection="research")
print(f"{result['chunk_count']} chunks inserted, doc node {result['document_node_id']}")

# Chunking only (no embed step):
chunks = db.chunk_document(text, strategy="tree")
# → {"strategy_used": "tree", "chunk_count": 31, "chunks": [...]}

Option 4 — 3-node cluster

cargo install --path crates/valori-cli
valori setup   # interactive wizard

Cluster setup guide · Docker Compose · Helm chart · AWS/Azure Terraform

Option 5 — Agent memory via MCP

VALORI_URL=http://localhost:3000 valori-mcp

{ "mcpServers": { "valori": {
  "command": "valori-mcp",
  "env": { "VALORI_URL": "http://localhost:3000" }
} } }

crates/valori-mcp/README.md

Option 6 — Web dashboard with persistent projects

cd ui && npm install && npm run dev   # http://localhost:3001

Each project is an isolated, persistent workspace: its own node, port, and data dir under ~/.valori/projects/<name>/. The Home screen lists every project (even when its node is stopped); opening one auto-starts its node and restores state, and closing it writes a snapshot and locks the files at rest — they can only be deleted from the UI. → docs/phases/phase-6-persistent-projects.md


Build from Source

cargo build --release --workspace
cargo test -p valori-kernel -p valori-node
cd python && pip install -e ".[dev]"

Requires Rust stable. For Python FFI: cargo install maturin.


Documentation

| Doc | What it covers |
|---|---|
| docs/getting-started.md | Full quickstart for all deployment modes |
| docs/api-reference.md | Complete HTTP API reference |
| docs/python-reference.md | Full Python SDK reference |
| docs/CLUSTER.md | Cluster setup, operations, failover |
| docs/DR.md | Backup, restore, cross-region DR runbook |
| docs/CAPACITY.md | Capacity planning — vectors/GB, WAL growth, S3 cost |
| docs/THREAT_MODEL.md | Security model and BLAKE3 MAC analysis |
| docs/DEPLOYMENT.md | Docker, Kubernetes, S3, Terraform |
| docs/authentication.md | API keys, RBAC, mTLS |
| docs/core-concepts.md | Fixed-point math, audit chain, determinism |
| docs/phases/README.md | Full build history and phase reports |
| benchmarks/RESULTS.md | Benchmarks and comparison vs Pinecone/Qdrant/Weaviate |

Research

Paper: Valori: A Deterministic Memory Substrate for AI Systems

@article{gudur2025valori,
  title   = {Valori: A Deterministic Memory Substrate for AI Systems},
  author  = {Gudur, Varshith},
  journal = {arXiv preprint arXiv:2512.22280},
  year    = {2025}
}

License

Dual-licensed under MIT OR Apache-2.0 — free for commercial use.

Contact: varshith.gudur17@gmail.com


Built in Rust. Proven in production. Auditable by mathematics.

If Valori is useful to you, a star helps others find the project.
