GitHub - brainfish-ai/ReasonDB: The first database built to let AI agents think their way to the right answer using structural reasoning, rather than guessing based on vector similarity.

AI-Native Document Intelligence

The database that understands your documents.
Built for AI agents that need to reason, not just retrieve.

⚠️ Alpha Release: ReasonDB is under active development. APIs and features may change. We'd love your feedback!

What is ReasonDB?

ReasonDB is an AI-native document database built in Rust, designed to go beyond simple retrieval. While traditional databases and vector stores treat documents as data to be indexed, ReasonDB treats them as knowledge to be understood - preserving document structure, enabling LLM-guided traversal, and extracting precise answers with full context.

ReasonDB introduces Hierarchical Reasoning Retrieval (HRR), a fundamentally new architecture where the LLM doesn't just consume retrieved content - it actively navigates your document structure to find exactly what it needs, like a human expert scanning summaries, drilling into relevant sections, and synthesizing answers.

ReasonDB is not another vector database. It's a reasoning engine that preserves document hierarchy, enabling AI to traverse your knowledge the way a domain expert would.

Key features of ReasonDB include:

Hierarchical Reasoning Retrieval: LLM-guided tree traversal with parallel beam search - AI navigates document structure instead of relying on similarity matching
RQL Query Language: SQL-like syntax with built-in SEARCH (BM25) and REASON (LLM) clauses in a single query
Plugin Architecture: Extensible extraction pipeline - PDF, Office, images, audio, and URLs out of the box via MarkItDown
Multi-Provider LLM Support: Anthropic, OpenAI, Gemini, Cohere, Vertex AI, AWS Bedrock, and more — switch providers without code changes
Production Ready: ACID-compliant storage, API key auth, rate limiting, async parallel traversal - all in a single Rust binary

The Problem

AI agents today are limited by their databases:

Approach	What It Does	Why It Fails
Vector DBs	Finds "similar" chunks	Loses structure. A contract's termination clause isn't "similar" to your question about exit terms - but it's the answer.
RAG Pipelines	Retrieves then generates	Garbage in, garbage out. Wrong chunks retrieved means wrong answers, no matter how capable the LLM.
Knowledge Graphs	Maps explicit relationships	Requires manual entity extraction. Can't handle the messy reality of real documents.

The result? AI agents that hallucinate, miss critical context, or drown in irrelevant chunks.

ReasonDB solves this by letting the LLM reason through your documents - not just search them.

Benchmark

Results on a real-world insurance document corpus (4 policy documents, ~1,900 nodes, 12 queries across 6 complexity tiers). Full benchmark script: tutorials/data/insurance/benchmark.py.

Retrieval quality vs. typical RAG

Metric	ReasonDB	Typical RAG
Pass rate	100% (12 / 12)	55 – 70%
Context recall (term match)	90% avg	60 – 75%
Median latency (RQL `REASON`)	6.1 s	15 – 45 s

"Typical RAG" = chunked-retrieval pipelines (LlamaIndex / LangChain defaults) on the same corpus. ReasonDB uses BM25 candidate selection + LLM-guided hierarchical tree traversal instead of flat similarity matching.
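For readers unfamiliar with the BM25 candidate selection mentioned above, here is a minimal sketch of the textbook Okapi BM25 formula. It is not ReasonDB's code (the Tech Stack table lists tantivy for search); the whitespace tokenizer, the `k1` and `b` defaults, and the three sample sentences are assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical mini-corpus, one "document" per line.
docs = [
    "termination clause and notice period",
    "payment terms and late fees",
    "governing law",
]
scores = bm25_scores("late fees", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])  # payment terms and late fees
```

Keyword scoring like this is cheap, which is why it suits a first pass that narrows the candidates before any LLM call is made.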

Per-category breakdown

Category	Avg latency	Term recall	Pass
Simple	7.1 s	100%	2 / 2
Specific	5.9 s	75%	2 / 2
Multi-condition	5.6 s	83%	2 / 2
Comparative	6.2 s	100%	2 / 2
Multi-hop	6.5 s	83%	2 / 2
Synthesis	6.5 s	100%	2 / 2
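The headline figures can be cross-checked against the per-category table. The short script below simply averages the term-recall column and sums the pass counts; the row values are copied from the table, and the helper name `summarize` is made up for this example.

```python
# Rows copied from the table above: (category, term recall %, passed, total).
rows = [
    ("Simple", 100, 2, 2),
    ("Specific", 75, 2, 2),
    ("Multi-condition", 83, 2, 2),
    ("Comparative", 100, 2, 2),
    ("Multi-hop", 83, 2, 2),
    ("Synthesis", 100, 2, 2),
]

def summarize(rows):
    """Return (mean term recall in %, queries passed, queries total)."""
    mean_recall = sum(recall for _, recall, _, _ in rows) / len(rows)
    passed = sum(p for _, _, p, _ in rows)
    total = sum(t for _, _, _, t in rows)
    return mean_recall, passed, total

mean_recall, passed, total = summarize(rows)
print(f"term recall: {mean_recall:.0f}% avg, pass rate: {passed} / {total}")
# term recall: 90% avg, pass rate: 12 / 12
```

Because every category holds two queries, the unweighted mean over categories matches the per-query mean up to rounding in the table.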

Cross-section reference retrieval

ReasonDB detects intra-document cross-references during ingestion (the LLM extracts them while summarizing), follows them, and surfaces the referenced sections alongside primary results. This closes the "answer is split across two clauses" gap that defeats flat-chunk retrieval.

Metric	Value
Queries with ≥ 1 cross-ref surfaced	4 / 5
Avg recall, primary content only	62%
Avg recall, primary + cross-refs	80% (+18 pp)
Example gain	Recurrent disability query: 67% → 100% once cross-referenced policy schedule clause is included
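The idea of surfacing referenced sections next to primary hits can be sketched in a few lines. This is an illustration only: the `Node` shape, the `cross_refs` field, the section ids, and the one-hop expansion are all assumptions, not ReasonDB's internal data model.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: str
    title: str
    cross_refs: list[str] = field(default_factory=list)  # ids of referenced sections

def with_cross_refs(primary_ids, nodes):
    """Return primary hits followed by the sections they reference, without duplicates."""
    ordered = list(dict.fromkeys(primary_ids))  # de-duplicate, keep order
    seen = set(ordered)
    for node_id in list(ordered):  # one hop only: referenced sections are not expanded again
        for ref in nodes[node_id].cross_refs:
            if ref in nodes and ref not in seen:
                seen.add(ref)
                ordered.append(ref)
    return ordered

# Hypothetical policy fragment: the disability clause points at the policy schedule.
nodes = {
    "4.3": Node("4.3", "Recurrent disability", cross_refs=["schedule"]),
    "schedule": Node("schedule", "Policy schedule"),
    "9.1": Node("9.1", "Exclusions"),
}
print(with_cross_refs(["4.3"], nodes))  # ['4.3', 'schedule']
```

Keeping the primary hits first means the referenced clauses add context without displacing the direct matches.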

Insurance Policy Analyser — Live Demo

The benchmark above is powered by this tutorial app. It queries four insurance policy documents using REASON and shows the full traversal trace — which nodes the LLM visited, why it selected them, and how it synthesized the final answer.

Full tutorial source: tutorials/06-insurance/

How It Works

%%{init: {'theme': 'dark'}}%%
flowchart TD
    subgraph Ingestion["Ingestion Pipeline (Plugin-Driven)"]
        A["Documents / URLs"] -->|Extractor Plugin| B["Markdown"]
        B -->|Post-Processor Plugin| C["Cleaned Markdown"]
        C -->|Chunker| D["Semantic Chunks"]
        P["Pre-chunked JSON"] -->|"bypasses extract + chunk"| D
        D --> E["Build Hierarchical Tree"]
        E -->|Bottom-up| F["LLM Summarizes Each Node"]
    end

    subgraph Search["Search & Reasoning"]
        G["Natural Language Query"] --> G1["BM25 Candidates + Title Boost"]
        G1 --> G2["Recursive Tree-Grep Pre-Filter"]
        G2 --> H["LLM Ranks by Summaries + Match Signals"]
        H -->|Selects relevant branches| I["Traverse Tree"]
        I -->|Parallel beam search| J["Drill Into Leaf Nodes"]
    end

    subgraph Result["Response"]
        J --> K["Extract Answer"]
        K --> L["Confidence Score + Reasoning Path"]
    end

    Ingestion --> Search

Extract - Extractor plugins convert documents and URLs to Markdown (built-in: MarkItDown)
Chunk - Content is split into semantic chunks with heading detection, or you can bypass extraction and chunking entirely by supplying pre-chunked JSON via /ingest/chunks
Build Tree - Chunks are organized into a hierarchical tree structure, preserving per-chunk metadata (page numbers, line ranges, custom attributes)
Summarize - LLM generates summaries for each node (bottom-up); pre-supplied summaries are used as-is
Search - 4-phase pipeline: BM25 candidate selection → recursive tree-grep filtering → LLM summary ranking → parallel beam-search traversal
Return - Relevant content with extracted answers, confidence scores, and the full reasoning path
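The traversal step above can be pictured with a small beam search over a summarized tree. This is a sketch under stated assumptions, not ReasonDB's implementation: `relevance` is a keyword-overlap stand-in for the LLM ranking call, the search runs sequentially rather than in parallel, and the `Node` class and sample contract are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    title: str
    summary: str
    content: str = ""
    children: list["Node"] = field(default_factory=list)

def relevance(query, node):
    """Stand-in for the LLM ranking call: fraction of query words found in title + summary."""
    words = set(query.lower().split())
    text = f"{node.title} {node.summary}".lower()
    return sum(w in text for w in words) / len(words)

def beam_search(root, query, beam_width=2):
    """Walk the tree level by level, keeping only the best `beam_width` branches."""
    frontier = [(relevance(query, root), [root])]
    leaves = []
    while frontier:
        candidates = []
        for score, path in frontier:
            node = path[-1]
            if not node.children:
                leaves.append((score, path))  # reached a leaf: keep it as a result
                continue
            for child in node.children:
                candidates.append((relevance(query, child), path + [child]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        frontier = candidates[:beam_width]
    leaves.sort(key=lambda c: c[0], reverse=True)
    return [(score, [n.title for n in path]) for score, path in leaves]

# Hypothetical contract tree with pre-written summaries.
contract = Node("Contract", "Services agreement", children=[
    Node("Section 3: Payment", "Fees and invoicing", children=[
        Node("3.1 Late Fees", "Interest on overdue invoices", "Overdue invoices accrue interest."),
    ]),
    Node("Section 8: Termination", "How the agreement ends", children=[
        Node("8.1 Notice", "Notice period for ending the agreement", "Thirty days written notice."),
        Node("8.2 Conditions for Termination", "Conditions under which either party may terminate",
             "Either party may terminate on material breach."),
    ]),
])
results = beam_search(contract, "termination conditions")
print(results[0][1])
# ['Contract', 'Section 8: Termination', '8.2 Conditions for Termination']
```

The returned path plays the role of the reasoning path: it records which branches were chosen on the way to the leaf.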

Quick Start

Get from zero to intelligent document search in under 5 minutes.

Download pre-built binaries

Grab the latest release for your platform:

Platform	Architecture	Download
macOS	Apple Silicon (M1/M2/M3/M4)	aarch64-apple-darwin
Linux	x86_64	x86_64-unknown-linux-gnu
Linux	ARM64	aarch64-unknown-linux-gnu
Windows	x86_64	x86_64-pc-windows-msvc

ReasonDB Client: A desktop app is also available for macOS (.dmg) and Windows (.msi).

macOS: "ReasonDB.app" Not Opened

Since ReasonDB is in alpha, the desktop app is not yet signed with an Apple Developer certificate. macOS Gatekeeper will block it on first launch. To open it:

Right-click (or Control-click) on ReasonDB.app and select Open
Click Open in the confirmation dialog

Or clear the app's extended attributes (including the quarantine flag) from the terminal:

xattr -cr /Applications/ReasonDB.app

You can also go to System Settings → Privacy & Security, scroll down, and click Open Anyway next to the ReasonDB message.

Install with Homebrew (macOS / Linux)

brew tap brainfish-ai/reasondb-tap
brew install reasondb

Install with one line (macOS / Linux)

No Homebrew? Download and install directly:

# macOS Apple Silicon
curl -L https://github.com/brainfish-ai/reasondb/releases/latest/download/reasondb-$(curl -s https://api.github.com/repos/brainfish-ai/reasondb/releases/latest | grep tag_name | cut -d'"' -f4)-aarch64-apple-darwin.tar.gz | tar -xz && sudo mv reasondb /usr/local/bin/

# Linux x86_64
curl -L https://github.com/brainfish-ai/reasondb/releases/latest/download/reasondb-$(curl -s https://api.github.com/repos/brainfish-ai/reasondb/releases/latest | grep tag_name | cut -d'"' -f4)-x86_64-unknown-linux-gnu.tar.gz | tar -xz && sudo mv reasondb /usr/local/bin/

Install from source

git clone https://github.com/brainfish-ai/reasondb.git && cd reasondb
cargo build --release

To also set up the desktop app and tutorials, install JS dependencies from the repo root:

yarn install          # installs apps/, packages/, and tutorials/ in one step
yarn build:packages   # builds the shared @reasondb/rql-editor package

Configure your LLM provider

Variable	Description	Required
`REASONDB_LLM_PROVIDER`	`openai`, `anthropic`, `gemini`, `cohere`, `glm`, `kimi`, `ollama`, `vertex`, `bedrock`	Yes
`REASONDB_LLM_API_KEY`	API key for the chosen provider	Yes
`REASONDB_MODEL`	Override the default model for the provider	No
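A quick pre-flight check can catch a missing or misspelled variable before the server starts. The helper below is a hypothetical convenience script, not part of ReasonDB; it only encodes what the table states (provider names and which variables are required) and assumes provider names are matched exactly as written.

```python
import os

# Provider names copied from the table above.
PROVIDERS = {"openai", "anthropic", "gemini", "cohere", "glm",
             "kimi", "ollama", "vertex", "bedrock"}

def check_llm_env(env):
    """Return a list of problems with the ReasonDB LLM settings in `env`."""
    problems = []
    provider = env.get("REASONDB_LLM_PROVIDER", "")
    if not provider:
        problems.append("REASONDB_LLM_PROVIDER is not set")
    elif provider not in PROVIDERS:
        problems.append(f"unknown provider: {provider!r}")
    if not env.get("REASONDB_LLM_API_KEY"):
        problems.append("REASONDB_LLM_API_KEY is not set")
    return problems

# Check the current shell environment and report anything missing.
for problem in check_llm_env(os.environ):
    print(problem)
```

`REASONDB_MODEL` is optional, so the check ignores it.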

Start the server

Run reasondb serve with the variables above set. The server starts at http://localhost:4444, with Swagger UI at http://localhost:4444/swagger-ui/

Run using Docker

docker run --rm --pull always --name reasondb -p 4444:4444 \
  -e REASONDB_LLM_PROVIDER=openai \
  -e REASONDB_LLM_API_KEY=sk-... \
  brainfishai/reasondb:latest serve

Or use the Makefile for local development:

make docker-up        # Build and start
make docker-up-d      # Start in background
make docker-logs      # View logs
make docker-down      # Stop containers
make docker-down-v    # Stop and remove data volume
make docker-ps        # Check health status

ReasonDB Client (Desktop App)

The repo uses a Yarn workspace — a single yarn install at the root installs all JS dependencies across apps/, packages/, and tutorials/.

# One-time setup from repo root
yarn install
yarn build:packages   # build the shared @reasondb/rql-editor package

# Run the desktop app
make client-app          # dev mode (Tauri + Vite)
make client-app-build    # production build

# Or use yarn workspace commands directly
yarn workspace reasondb-client dev         # Vite web dev server only
yarn workspace reasondb-client tauri dev   # full Tauri desktop app (dev)
yarn workspace reasondb-client build       # production build

Interactive Tutorials

Six hands-on tutorials are included, each a Next.js app that runs against a live ReasonDB server:

Tutorial	Workspace name	Port
01 — RQL Basics	`tutorial-rql-basics`	5000
02 — Legal Search	`tutorial-legal-search`	5001
03 — Research Papers	`tutorial-research-papers`	5002
04 — Knowledge Base	`tutorial-knowledge-base`	5003
05 — PDF Financials	`tutorial-pdf-financials`	5004
06 — Insurance Analyser	`tutorial-insurance-demo`	5005

# Start any tutorial by its workspace name
yarn workspace tutorial-rql-basics dev
yarn workspace tutorial-insurance-demo dev

# Fetch sample data for all tutorials
yarn tutorials:fetch-all

All tutorials require a running ReasonDB server — start one with reasondb serve before launching a tutorial.

Query with RQL

ReasonDB uses RQL - a SQL-like query language with built-in SEARCH and REASON clauses.

Here's ReasonDB answering a question about itself — the README ingested as a document:

SELECT * FROM docs REASON 'What is ReasonDB?';

{
  "documents": [
    {
      "title": "ReasonDB README",
      "score": 0.97,
      "matched_nodes": [
        {
          "title": "What is ReasonDB?",
          "content": "ReasonDB is an AI-native document database built in Rust, designed to go beyond simple retrieval. It treats documents as knowledge to be understood — preserving document structure, enabling LLM-guided traversal, and extracting precise answers with full context.",
          "path": ["ReasonDB README", "What is ReasonDB?"],
          "confidence": 0.97,
          "reasoning_trace": [
            {
              "node_title": "ReasonDB README",
              "decision": "Introduction section directly addresses the query",
              "confidence": 0.91
            },
            {
              "node_title": "What is ReasonDB?",
              "decision": "Node title matches query exactly — drilling into content",
              "confidence": 0.97
            }
          ]
        }
      ]
    }
  ],
  "total_count": 1,
  "execution_time_ms": 4213
}

Every answer includes the matched node content, the path through the document tree the LLM traversed, a reasoning trace explaining each navigation decision, and a confidence score. No black-box retrieval — full transparency.
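To show how a client might consume this response, the sketch below flattens the reasoning trace into readable lines. The field names come from the sample response above; the function name `trace_lines` is made up, and the embedded JSON is a trimmed copy of that sample with the decision text shortened.

```python
import json

def trace_lines(response):
    """Return the tree path and one line per navigation step for every matched node."""
    lines = []
    for doc in response["documents"]:
        for node in doc["matched_nodes"]:
            lines.append(" > ".join(node["path"]))
            for step in node["reasoning_trace"]:
                lines.append(
                    f"  [{step['confidence']:.2f}] {step['node_title']}: {step['decision']}"
                )
    return lines

# Trimmed copy of the sample response shown above.
raw = """
{"documents": [{"title": "ReasonDB README", "score": 0.97, "matched_nodes": [{
  "title": "What is ReasonDB?",
  "path": ["ReasonDB README", "What is ReasonDB?"],
  "confidence": 0.97,
  "reasoning_trace": [
    {"node_title": "ReasonDB README",
     "decision": "Introduction section directly addresses the query", "confidence": 0.91},
    {"node_title": "What is ReasonDB?",
     "decision": "Node title matches query exactly", "confidence": 0.97}
  ]}]}],
 "total_count": 1, "execution_time_ms": 4213}
"""
lines = trace_lines(json.loads(raw))
print("\n".join(lines))
```

An agent could log these lines next to its answer so a reviewer can see why each section was chosen.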

-- Fast keyword search (BM25, ~50ms)
SELECT * FROM contracts SEARCH 'payment terms' LIMIT 5;

-- LLM-guided reasoning (navigates the document tree)
SELECT * FROM contracts REASON 'What are the late fees and penalties?';

-- Combine filters, search, and reasoning in one query
SELECT * FROM contracts
WHERE tags CONTAINS ANY ('nda') AND metadata.value_usd > 10000
SEARCH 'termination clause'
REASON 'What are the exit conditions?'
LIMIT 5;
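When queries are assembled in application code, a small builder keeps the clause order consistent with the examples above (WHERE, SEARCH, REASON, LIMIT). This is a hypothetical helper, not a ReasonDB client library. It rejects text containing single quotes because the quoting rules for RQL strings are not covered here.

```python
def rql_literal(text):
    """Wrap text in single quotes; refuse embedded quotes since escaping rules are unknown."""
    if "'" in text:
        raise ValueError("single quotes are not supported by this sketch")
    return f"'{text}'"

def build_rql(table, where=None, search=None, reason=None, limit=None):
    """Assemble a query with clauses in the order WHERE, SEARCH, REASON, LIMIT."""
    parts = [f"SELECT * FROM {table}"]
    if where:
        parts.append(f"WHERE {where}")  # passed through as-is; the caller writes the filter
    if search:
        parts.append(f"SEARCH {rql_literal(search)}")
    if reason:
        parts.append(f"REASON {rql_literal(reason)}")
    if limit is not None:
        parts.append(f"LIMIT {int(limit)}")
    return " ".join(parts) + ";"

query = build_rql("contracts", search="payment terms", limit=5)
print(query)  # SELECT * FROM contracts SEARCH 'payment terms' LIMIT 5;
```

The `where` argument is inserted verbatim, so it should come from trusted code rather than user input.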

Compare: Vector DB vs ReasonDB

Vector DB Approach

Query: "What are the termination conditions?"

→ Embed query as vector
→ Find 5 "similar" chunks
→ Hope one contains the answer

Result: Random paragraphs mentioning "termination"
        scattered across the document. No context.
        LLM hallucinates missing details.

ReasonDB Approach

Query: "What are the termination conditions?"

→ LLM reads document summary
→ Identifies "Section 8: Termination" as relevant
→ Navigates to section, reads subsection summaries
→ Drills into "8.2 Conditions for Termination"
→ Extracts complete answer with full context

Result: Precise answer citing specific clauses,
        with confidence score and reasoning path.

Plugin Architecture

ReasonDB uses a plugin system for document extraction. Plugins are external processes (Python, Node.js, Bash, or compiled binaries) that communicate via JSON over stdin/stdout.

What ships out of the box	What you can add
markitdown - PDF, Word, Excel, PowerPoint, HTML, images (OCR), audio, YouTube, and more	Custom extractors, post-processors, chunkers, summarizers

# List installed plugins
curl http://localhost:4444/v1/plugins

# Test a plugin
curl -X POST http://localhost:4444/v1/plugins/markitdown/test \
  -H "Content-Type: application/json" \
  -d '{"operation":"extract","params":{"source_type":"file","path":"/tmp/doc.pdf"}}'

Community plugins can be installed by dropping a directory into $REASONDB_PLUGINS_DIR (default: ./plugins). See the Plugin Guide for details.
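As a rough picture of the JSON-over-stdin/stdout contract, here is a minimal extractor sketch. The request shape mirrors the curl test payload above; the response fields (`ok`, `markdown`, `error`) and the function names are assumptions, so check the Plugin Guide for the actual schema before relying on them.

```python
import io
import json
import os
import tempfile

def handle(request):
    """Answer one request. The response fields are assumed, not ReasonDB's documented schema."""
    params = request.get("params", {})
    if request.get("operation") != "extract":
        return {"ok": False, "error": f"unsupported operation: {request.get('operation')}"}
    if params.get("source_type") != "file":
        return {"ok": False, "error": "only source_type 'file' is handled in this sketch"}
    with open(params["path"], encoding="utf-8") as fh:
        return {"ok": True, "markdown": fh.read()}

def run(stdin, stdout):
    """Read one JSON request from `stdin` and write one JSON response line to `stdout`."""
    request = json.loads(stdin.read())
    stdout.write(json.dumps(handle(request)) + "\n")

# Demo: simulate the host with in-memory streams.
# A real plugin script would call run(sys.stdin, sys.stdout) instead.
with tempfile.NamedTemporaryFile("w", suffix=".md", delete=False, encoding="utf-8") as tmp:
    tmp.write("# Hello")
request = {"operation": "extract", "params": {"source_type": "file", "path": tmp.name}}
out = io.StringIO()
run(io.StringIO(json.dumps(request)), out)
os.remove(tmp.name)
print(out.getvalue().strip())  # {"ok": true, "markdown": "# Hello"}
```

Passing the streams as arguments keeps the handler easy to test without spawning a process.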

Use Cases

Legal Document Analysis - Navigate complex contracts, find specific clauses, compare terms across agreements
Research & Knowledge Management - Build searchable knowledge bases from papers, reports, and documentation
Customer Support Intelligence - Transform support docs into an AI agent that finds precise answers
Compliance & Policy - Query policy documents in natural language with full section references
AI Agent Data Layer - Give your agents structured access to unstructured knowledge with reasoning capabilities

Tech Stack

Component	Technology	Purpose
Storage	redb	Pure Rust, ACID-compliant embedded database
Search	tantivy	Blazing fast BM25 full-text search
Extraction	Plugin System	Process-based plugins (Python, Node.js, Bash, binaries)
Runtime	tokio	Async parallel branch exploration
HTTP	axum	Fast, ergonomic web framework
LLM	rig-core	Multi-provider LLM abstraction
API Docs	utoipa	OpenAPI 3.0 + Swagger UI
Container	Docker (Alpine)	Python 3, Node.js, and Bash runtimes

Documentation

Resource	Link
Full Documentation	reason-db.devdoc.sh
Quick Start Guide	reason-db.devdoc.sh/documentation/page/quickstart
Core Concepts	reason-db.devdoc.sh/documentation/page/concepts
Contributing	Contributing guide · docs
Plugin Guide	reason-db.devdoc.sh/documentation/page/guides/plugins
API Reference	reason-db.devdoc.sh/api-reference
Swagger UI	localhost:4444/swagger-ui (when server is running)

Community

Join our growing community for help, ideas, and discussions regarding ReasonDB.

View our Blog
Star us on GitHub

Contributing

We’d love your help. See the contributing guide for development setup, running tests, and how to submit PRs. You can also read it in the docs.

License

ReasonDB is source-available under the ReasonDB License v1.0.

You can:

Use ReasonDB for any purpose (commercial or non-commercial)
Modify the source code
Distribute copies and derivative works
Use in your own products and services

You cannot:

Offer ReasonDB as a hosted/managed database service (DBaaS)
Provide ReasonDB's functionality as a service to third parties