UnisonDB
Store, stream, and sync instantly — UnisonDB is a log-native, real-time database that replicates like a message bus for AI and Edge Computing.
What UnisonDB Is
UnisonDB is an open-source database designed specifically for Edge AI and Edge Computing.
It is a reactive, log-native and multi-model database built for real-time and edge-scale applications. UnisonDB combines a B+Tree storage engine with WAL-based (Write-Ahead Logging) streaming replication, enabling near-instant fan-out replication across hundreds of nodes — all while preserving strong consistency and durability.
Replication Model
Writes are committed by a Raft quorum on the write servers (if enabled); read‑only edge replicas/relayers stay ISR‑synced for low‑latency reads.
Key Features
- High Availability Writes: Raft consensus on write servers (quorum acks); relayers/replicas use in-sync replica (ISR) replication
- Streaming Replication: In-sync replica (ISR)-based WAL streaming with sub-second fan-out to 1000+ edge replicas
- Multi-Modal Storage: Key-Value, Wide-Column, and Large Objects (LOB)
- Real-Time Notifications: ZeroMQ-based (sidecar) change notifications with sub-millisecond latency
- Durable: B+Tree storage with Write-Ahead Logging
- Edge-First Design: Optimized for edge computing and local-first architectures
- Namespace Isolation: Multi-tenancy support with namespace-based isolation
Use Cases
UnisonDB is built for distributed, edge-first systems where data and computation must live close together — reducing network hops, minimizing latency, and enabling real-time responsiveness at scale.
By co-locating data with the services that use it, UnisonDB removes the traditional boundary between the database and the application layer. Applications can react to local changes instantly, while UnisonDB’s WAL-based replication ensures eventual consistency across all replicas globally.
Fan-Out Scaling
UnisonDB can fan out updates to 100+ edge nodes in just a few milliseconds from a single upstream—and because it supports multi-hop relaying, that reach compounds naturally. Each hop carries the network + application latency of its link.
In a simple 2-hop topology:
- Hop 1: Primary → 100 hubs (≈250–500ms)
- Hop 2: Each hub → 100 downstream edge nodes (similar latency)
- Total reach: 100 + 10,000 = 10,100 nodes
Even at 60k–80k SET ops/sec with 1 KB values, UnisonDB can propagate those updates across 10,000+ nodes within seconds—without Kafka, Pub/Sub, CDC pipelines, or heavyweight brokers. (See the Relayer vs Latency benchmarks below for measured numbers.)
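As a rough illustration of how that reach compounds, here is a minimal Go sketch of the multi-hop arithmetic. The uniform fan-out of 100 per hop and the ~500 ms per-hop figure are assumptions taken from the upper bound of the 2-hop example above, not measured numbers:

```go
package main

import "fmt"

// Minimal sketch of the multi-hop reach arithmetic: each hop multiplies
// the number of newly reached nodes by the fan-out and adds one link's
// worth of latency.
func main() {
    const fanOut = 100       // assumed nodes fed by each upstream
    const perHopMillis = 500 // assumed upper-bound latency per hop

    nodesAtHop, totalReach := 1, 0
    for hop := 1; hop <= 2; hop++ {
        nodesAtHop *= fanOut
        totalReach += nodesAtHop
        fmt.Printf("hop %d: %d new nodes, %d reached, ~%d ms cumulative\n",
            hop, nodesAtHop, totalReach, hop*perHopMillis)
    }
    // hop 1: 100 new nodes, 100 reached, ~500 ms cumulative
    // hop 2: 10000 new nodes, 10100 reached, ~1000 ms cumulative
}
```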
Quick Start
```sh
# Clone the repository
git clone https://github.com/ankur-anand/unisondb
cd unisondb

# Build
go build -o unisondb ./cmd/unisondb

# Run in server mode (primary)
./unisondb server --config config.toml

# Use the HTTP API
curl -X PUT http://localhost:4000/api/v1/default/kv/mykey \
  -H "Content-Type: application/json" \
  -d '{"value":"bXl2YWx1ZQ=="}'
```
Documentation
- Getting Started with UnisonDB
- Complete Configuration Guide
- Architecture Overview
- HTTP API Reference
- Backup and Restore
- Deployment Topologies
- Rough Roadmap
UnisonDB implements a pluggable storage backend architecture supporting two BTree implementations:
- BoltDB: Single-file, ACID-compliant BTree.
- LMDB: Memory-mapped ACID-compliant BTree with copy-on-write semantics.
Redis-Compatible Benchmark: UnisonDB vs BadgerDB vs BoltDB vs LMDB
This benchmark compares the write and read performance of four databases — UnisonDB, BadgerDB, LMDB* and BoltDB — using a Redis-compatible interface and the official redis-benchmark tool.
What We Measured
- Throughput: Requests per second for SET (write) and GET (read) operations
- Latency: p50 latency in milliseconds
- Workload: 50 iterations of mixed SET and GET operations (200k ops per run)
- Concurrency: 10 parallel clients, 10 pipelined requests, 4 threads
- Payload Size: 1 KB
Test Environment
Chip: Apple M2 Pro
Total Number of Cores: 10 (6 performance and 4 efficiency)
Memory: 16 GB
UnisonDB B-Tree Backend - LMDB
All four databases were tested under identical conditions to highlight differences in write path efficiency, read performance, and I/O characteristics. The Redis-compatible server implementation can be found in internal/benchtests/cmd/redis-server/.
Results
Performance Testing: Local Replication
Test Setup
We validated the WAL-based replication architecture using the pkg/replicator component. It uses the same Redis-compatible bench tool, but this time the server is started with n = [100, 200, 500, 750, 1000] goroutines, each acting as an independent WAL reader, capturing critical performance metrics:
- Physical Latency Tracking: p50, p90, p99, and max latencies vs. relayer count
- SET and GET latency vs. relayer count
- SET and GET throughput vs. relayer count
Results
Test Replication Flow
Why UnisonDB
Traditional databases persist. Stream systems propagate. UnisonDB does both — turning every write into a durable, queryable stream that replicates seamlessly across the edge.
The Problem: Storage and Streaming Live in Different Worlds
Modern systems are reactive — every change needs to propagate instantly to dashboards, APIs, caches, and edge devices.
Yet, databases were built for persistence, not propagation.
You write to a database, then stream through Kafka.
You replicate via CDC.
You patch syncs between cache and storage.
This split between state and stream creates friction:
- Two systems to maintain and monitor
- Eventual consistency between write path and read path
- Network latency on every read or update
- Complex fan-out when scaling to hundreds of edges
The Gap
LMDB and BoltDB excel at local speed — but stop at one node.
etcd and Consul replicate state — but are consensus-bound and small-cluster only.
Kafka and NATS stream messages — but aren’t queryable databases.
| System | Strength | Limitation |
|---|---|---|
| LMDB / BoltDB | Fast local storage | No replication |
| etcd / Consul | Cluster consistency | No local queries, low fan-out |
| Kafka / NATS | Scalable streams | No storage or query model |
The Solution: Log-Native by Design
UnisonDB fuses database semantics with streaming mechanics — the log is the database.
Every write is durable, ordered, and instantly available as a replication stream.
No CDC, no brokers, no external pipelines.
Just one unified engine that:
- Stores data in B+Trees for predictable reads
- Streams data via WAL replication to thousands of nodes
- Reacts instantly with sub-second fan-out
- Keeps local replicas fully queryable, even offline
UnisonDB eliminates the divide between “database” and “message bus,”
enabling reactive, distributed, and local-first systems — without the operational sprawl.
UnisonDB collapses two worlds — storage and streaming — into one unified log-native core.
The result: a single system that stores, replicates, and reacts — instantly.
Core Architecture
UnisonDB is built on three foundational layers:
- WALFS - Write-Ahead Log File System (mmap-based, optimized for reading at scale).
- Engine - Hybrid storage combining WAL, MemTable, and B-Tree
- Replication - WAL-based streaming with offset tracking
The Layered View
UnisonDB stacks a multi-model engine on top of WALFS — a log-native core that unifies storage, replication, and streaming into one continuous data flow.
+-----------------------------------------------------------+
| Multi-Model API Layer |
| (KV, Wide-Column, LOB, Txn Engine, Query) |
+-----------------------------------------------------------+
| Engine Layer |
| WALFS-backed MemTable + B-Tree Store |
| (writes → WALFS, reads → B-Tree + MemTable) |
+-----------------------------------------------------------+
| WALFS (Core Log) | Replication Layer |
| Append-only, mmap-based | WAL-based streaming |
| segmented log | (followers tail WAL)|
| Commit-ordered, replication-safe | Offset tracking, |
| | catch-up, tailing |
+-----------------------------------------------------------+
| Disk |
+-----------------------------------------------------------+
1. WALFS (Write-Ahead Log)
Overview
WALFS is a memory-mapped, segmented write-ahead log implementation designed for both writing AND reading at scale. Unlike traditional WALs that optimize only for sequential writes, WALFS provides efficient random access for replication and real-time tailing.
Segment Structure
Each WALFS segment consists of two regions:
+----------------------+-----------------------------+-------------+
| Segment Header | Record 1 | Record 2 |
| (64 bytes) | Header + Data + Trailer | ... |
+----------------------+-----------------------------+-------------+
Segment Header (64 bytes)
| Offset | Size | Field | Description |
|---|---|---|---|
| 0 | 4 | Magic | Magic number (0x5557414C) |
| 4 | 4 | Version | Metadata format version |
| 8 | 8 | CreatedAt | Creation timestamp (nanoseconds) |
| 16 | 8 | LastModifiedAt | Last modification timestamp (nanoseconds) |
| 24 | 8 | WriteOffset | Offset where next chunk will be written |
| 32 | 8 | EntryCount | Total number of chunks written |
| 40 | 4 | Flags | Segment state flags (e.g. Active, Sealed) |
| 44 | 12 | Reserved | Reserved for future use |
| 56 | 4 | CRC | CRC32 checksum of first 56 bytes |
| 60 | 4 | Padding | Ensures 64-byte alignment |
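For readability, a Go struct that mirrors this 64-byte layout might look like the sketch below; the field names and types are assumptions drawn from the table, not UnisonDB's actual definitions:

```go
// Illustrative mirror of the 64-byte segment header (field names and
// types are assumptions based on the table above, not the actual code).
type segmentHeader struct {
    Magic          uint32   // 0x5557414C
    Version        uint32   // metadata format version
    CreatedAt      int64    // creation timestamp (nanoseconds)
    LastModifiedAt int64    // last modification timestamp (nanoseconds)
    WriteOffset    uint64   // offset where the next chunk will be written
    EntryCount     uint64   // total number of chunks written
    Flags          uint32   // segment state flags (e.g. Active, Sealed)
    Reserved       [12]byte // reserved for future use
    CRC            uint32   // CRC32 checksum of the first 56 bytes
    _              uint32   // padding to 64-byte alignment
}
```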
Record Format (8-byte aligned)
Each record is written in its own aligned frame:
| Offset | Size | Field | Description |
|---|---|---|---|
| 0 | 4 bytes | CRC | CRC32 of [Length | Data] |
| 4 | 4 bytes | Length | Size of the data payload in bytes |
| 8 | N bytes | Data | User payload (FlatBuffer-encoded LogRecord) |
| 8 + N | 8 bytes | Trailer | Canary marker (0xDEADBEEFFEEDFACE) |
| ... | ≥0 bytes | Padding | Zero padding to align to 8-byte boundary |
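A hedged sketch of how such a frame could be assembled follows; the CRC32-IEEE polynomial, little-endian encoding, and the exact canary constant are assumptions here, not guaranteed to match UnisonDB's on-disk format:

```go
import (
    "encoding/binary"
    "hash/crc32"
)

// Assumed 8-byte canary trailer value (see the record table above).
const recordCanary uint64 = 0xDEADBEEFFEEDFACE

// frameRecord lays out [CRC | Length | Data | Trailer | Padding] and pads
// the frame to an 8-byte boundary, following the table above.
func frameRecord(data []byte) []byte {
    n := len(data)
    unpadded := 4 + 4 + n + 8    // CRC + Length + Data + Trailer
    total := (unpadded + 7) &^ 7 // round up to an 8-byte boundary

    buf := make([]byte, total)
    binary.LittleEndian.PutUint32(buf[4:8], uint32(n)) // Length
    copy(buf[8:8+n], data)                             // Data

    // CRC32 over [Length | Data]
    binary.LittleEndian.PutUint32(buf[0:4], crc32.ChecksumIEEE(buf[4:8+n]))

    // Canary trailer directly after the payload; remaining bytes stay zero padding.
    binary.LittleEndian.PutUint64(buf[8+n:16+n], recordCanary)
    return buf
}
```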
WALFS Reader Capabilities
WALFS provides powerful reading capabilities essential for replication and recovery:
1. Forward-Only Iterator
```go
reader := walLog.NewReader()
defer reader.Close()

for {
    data, pos, err := reader.Next()
    if err == io.EOF {
        break
    }
    // Process record
}
```
- Zero-copy reads - data is a memory-mapped slice
- Position tracking - each record returns its (SegmentID, Offset) position
- Automatic segment traversal - seamlessly reads across segment boundaries
2. Offset-Based Reads
```go
// Read from a specific offset (for replication catch-up)
offset := Offset{SegmentID: 5, Offset: 1024}
reader, err := walLog.NewReaderWithStart(&offset)
```
Use cases:
- Efficient seek without scanning
- Follower catch-up from last synced position
- Recovery from checkpoint
3. Active Tail Following
```go
// For real-time replication (tailing active WAL)
reader, err := walLog.NewReaderWithTail(&offset)
for {
    data, pos, err := reader.Next()
    if err == ErrNoNewData {
        // No new data yet, can retry or wait
        continue
    }
}
```
Behavior:
- Returns ErrNoNewData when caught up (not io.EOF)
- Enables low-latency streaming
- Supports multiple parallel readers
Why WALFS is Different
Unlike traditional "write-once, read-on-crash" WALs, WALFS optimizes for:
- Continuous replication - Followers constantly read from primary's WAL
- Real-time tailing - Low-latency streaming of new writes
- Parallel readers - Multiple replicas read concurrently without contention
2. Engine (dbkernel)
Overview
The Engine orchestrates writes, reads, and persistence using three components:
- WAL (WALFS) - Durability and replication source
- MemTable (SkipList) - In-memory write buffer
- B-Tree Store - Persistent index for efficient reads
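A conceptual sketch of the write path implied by these three components is shown below. The toy interfaces and names are assumptions for illustration; the real dbkernel API differs:

```go
// Toy types standing in for WALFS, the MemTable, and the B-Tree store.
type wal interface{ Append(rec []byte) error }
type memTable interface {
    Insert(key, value []byte)
    SizeBytes() int
}
type bTreeStore interface{ FlushFrom(m memTable) error }

type engine struct {
    wal            wal
    mem            memTable
    store          bTreeStore
    flushThreshold int
}

// Put illustrates the ordering only: WAL append first (durability and the
// replication source), then MemTable insert (read visibility), and an
// eventual flush into the B-Tree (persistent index).
func (e *engine) Put(key, value []byte) error {
    rec := append(append([]byte{}, key...), value...) // stand-in for a FlatBuffer LogRecord
    if err := e.wal.Append(rec); err != nil {
        return err // not durable: reject the write
    }
    e.mem.Insert(key, value)
    if e.mem.SizeBytes() >= e.flushThreshold {
        return e.store.FlushFrom(e.mem)
    }
    return nil
}
```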
Flow Diagram
FlatBuffer Schema
UnisonDB uses FlatBuffers for zero-copy serialization of WAL records:
Benefits:
- No deserialization on replicas
- Fast replication
Why FlatBuffers?
Replication efficiency - No deserialization needed on replicas
Transaction Support
UnisonDB provides atomic multi-key transactions:
```go
txn := engine.BeginTxn()
txn.Put("k1", value1)
txn.Put("k2", value2)
txn.Put("k3", value3)
txn.Commit() // All or nothing
```
Flow
Transaction Properties:
- Atomicity - All writes become visible on commit, or none on abort
- Isolation - Uncommitted writes are hidden from readers
LOB (Large Object) Support
Large values can be chunked and streamed using TXN.
Flow
LOB Properties:
- Transactional - All chunks committed atomically
- Streaming - Can write/read chunks incrementally
- Efficient replication - Replicas get chunks as they arrive
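As a hypothetical illustration of how a chunked LOB write might look from application code (AppendChunk, Rollback, and splitIntoChunks are made-up names, not UnisonDB's actual API):

```go
// Hypothetical chunked LOB write: every chunk is staged inside one
// transaction, so replicas expose the object only after Commit succeeds.
txn := engine.BeginTxn()
for _, chunk := range splitIntoChunks(largeValue, 64*1024) { // e.g. 64 KiB pieces
    if err := txn.AppendChunk("video:42", chunk); err != nil {
        txn.Rollback() // nothing becomes visible on failure
        return err
    }
}
return txn.Commit() // all chunks become visible atomically
```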
Wide-Column Support
UnisonDB supports partial updates to column families:
Benefits:
- Efficient updates - Only modified columns are written/replicated
- Flexible schema - Columns can be added dynamically
- Merge semantics - New columns merged with existing row
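A hypothetical partial-update call could look like the following (PutColumns and the column-family name are illustrative only, not the actual API):

```go
// Hypothetical column-level update: only the touched columns are written
// to the WAL and replicated, then merged with the existing row on read.
err := engine.PutColumns("profiles", "user:42", map[string][]byte{
    "status":    []byte("online"),
    "last_seen": []byte("2025-01-01T00:00:00Z"),
})
```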
3. Replication Architecture
Overview
Replication in UnisonDB is WAL-based streaming - designed around the WALFS reader capabilities. Followers continuously stream WAL records from the primary's WALFS and apply them locally.
Design Principles
- Offset-based positioning - Followers track their replication offset (SegmentID, Offset)
- Catch-up from any offset - Can resume replication from any position
- Real-time streaming - Active tail following for low-latency replication
- Self-describing records - FlatBuffer LogRecords are self-contained
- Batched streaming - Records sent in batches for efficiency
Replication Flow
- Offset-based positioning - Followers track (SegmentID, Offset) independently
- Catch-up from any offset - Resume from any position
- Real-time streaming - Active tail following for low latency
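Putting the pieces together, a follower's apply loop can be sketched from the WALFS reader shown earlier. Here applyToLocalStore and saveOffset are hypothetical helpers, and the actual replicator streams record batches over gRPC rather than reading the primary's WAL directly:

```go
// Resume tailing from the last durably saved replication offset.
reader, err := walLog.NewReaderWithTail(&lastSynced)
if err != nil {
    return err
}
defer reader.Close()

for {
    data, pos, err := reader.Next()
    if err == ErrNoNewData {
        time.Sleep(time.Millisecond) // caught up; poll again shortly
        continue
    }
    if err != nil {
        return err
    }
    if err := applyToLocalStore(data); err != nil { // decode the LogRecord and apply it
        return err
    }
    saveOffset(pos) // persist (SegmentID, Offset) so catch-up can resume here
}
```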
Why is Traditional KV Replication Insufficient?
Most traditional key-value stores were designed for simple, point-in-time key-value operations — and their replication models reflect that. While this works for basic use cases, it quickly breaks down under real-world demands like multi-key transactions, large object handling, and fine-grained updates.
Key-Level Replication Only
Replication is often limited to raw key-value pairs. There’s no understanding of higher-level constructs like rows, columns, or chunks — making it impossible to efficiently replicate partial updates or large structured objects.
No Transactional Consistency
Replication happens on a per-operation basis, not as part of an atomic unit. Without multi-key transactional guarantees, systems can fall into inconsistent states across replicas, especially during batch operations, network partitions, or mid-transaction failures.
Chunked LOB Writes Become Risky
When large values are chunked and streamed to the store, traditional replication models expose chunks as they arrive. If a transfer fails mid-way, replicas may store incomplete or corrupted objects, with no rollback or recovery mechanism.
No Awareness of Column-Level Changes
Wide-column data is treated as flat keys or opaque blobs. If only a single column is modified, traditional systems replicate the entire row, wasting bandwidth, increasing storage overhead, and making efficient synchronization impossible.
Operational Complexity Falls on the User
Without built-in transactional semantics, developers must implement their own logic for deduplication, rollback, consistency checks, and coordination — which adds fragility and complexity to the system.
Storage Engine Tradeoffs
- LSM-Trees (e.g., RocksDB) excel at fast writes but suffer from high read amplification and costly background compactions, which hurt latency and predictability.
- B+Trees (e.g., BoltDB, LMDB) offer efficient point lookups and range scans, but struggle with high-speed inserts and lack native replication support.
How UnisonDB Solves This. ✅
UnisonDB combines append-only logs for high-throughput ingest with B-Trees for fast and efficient range reads — while offering:
- Transactional, multi-key replication with commit visibility guarantees.
- Chunked LOB writes that are fully atomic.
- Column-aware replication for efficient syncing of wide-column updates.
- Isolation by default — once a network-aware transaction is started, all intermediate writes are fully isolated and not visible to readers until a successful txn.Commit().
- Built-in replication via gRPC WAL streaming + B-Tree snapshots.
- Zero-compaction overhead, high write throughput, and optimized reads.
Development
Certificates for localhost
```sh
brew install mkcert

## install local CA
mkcert -install

## Generate gRPC TLS Certificates
## these certificates are valid for hostnames/IPs: localhost 127.0.0.1 ::1
mkcert -key-file grpc.key -cert-file grpc.crt localhost 127.0.0.1 ::1
```