aether


6d745a3d34

feat: Phase 2 — levels with Arc<Run> and left-right read path

Refactor Level<I> to store runs behind Arc<Run<I>>, enabling cheap
cloning for the optional left-right read path.

Changes:
- level.rs: runs: BTreeMap<RunId, Arc<Run<I>>>; manual Clone impl;
  add_run takes Arc<Run<I>>; get_run returns Option<Arc<Run<I>>>;
  iterators still expose (RunId, &Run<I>) via deref for zero churn
  at call sites.
- lr_levels.rs (new): LevelOp<I>::SetAll + Absorb impl for
  Vec<Level<I>>; sync_with clones only Arc handles (O(num_runs)
  pointer copies, not run data). 4 unit tests including roundtrip.
- mod.rs: cfg-gated levels_lr_writer + levels_factory fields;
  publish_levels() helper publishes snapshot while write lock is
  held; read paths in get() and level_stats() use factory handle
  when feature is enabled; publish called from flush_write_buffer,
  insert (merger finalization), and handle_command.
- query.rs: merge_range and merge_prefix_scan use levels_factory
  handle (wait-free) when left-right-lsm feature is enabled.
- adaptive.rs: wrap new Run<I> with Arc::new() at add_run call sites.
- benches/left_right_lsm.rs: add bench_level_reads group comparing
  RwLock vs left-right at 1/2/4/8/16 reader threads.

Test results:
  default: 1209 passed
  --features left-right-lsm: 1219 passed (10 new lr_levels tests)
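The cheap-clone property described in this commit can be sketched as follows. This is a minimal illustration, not the crate's code: `Run` here is a non-generic stand-in for `Run<I>`, and the `entries` field and `iter` method name are assumptions made for the example.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

// Hypothetical stand-ins: the real Run<I> is generic over an index type.
type RunId = u64;

struct Run {
    entries: Vec<(Vec<u8>, Vec<u8>)>,
}

#[derive(Default)]
struct Level {
    runs: BTreeMap<RunId, Arc<Run>>,
}

impl Level {
    fn add_run(&mut self, id: RunId, run: Arc<Run>) {
        self.runs.insert(id, run);
    }

    // Returns a new handle to the same run; no run data is copied.
    fn get_run(&self, id: RunId) -> Option<Arc<Run>> {
        self.runs.get(&id).cloned()
    }

    // Exposes (RunId, &Run) via deref so callers never see the Arc.
    fn iter(&self) -> impl Iterator<Item = (RunId, &Run)> {
        self.runs.iter().map(|(id, run)| (*id, &**run))
    }
}

impl Clone for Level {
    fn clone(&self) -> Self {
        // One reference-count increment per run: O(num_runs) pointer copies.
        Level { runs: self.runs.clone() }
    }
}
```

A snapshot taken with `level.clone()` shares every run with the original, which can be checked with `Arc::ptr_eq` on the handles returned by `get_run`.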

2026-05-13 12:46:19 -04:00

A high-performance, formally-verified database storage engine written in Rust


Aether DB is an ACID-compliant database storage engine featuring:

  • Persistent B+ Tree with buffer manager integration
  • Write-Ahead Logging (WAL) with Taurus algorithm and formal TLA+ verification
  • ARIES Recovery Protocol for crash recovery with redo/undo
  • LeanStore-inspired buffer manager with 24ns hot path
  • Multiple index types: B+ Tree, Skiplist, RAX radix tree
  • LSM tree framework with pluggable compaction strategies
  • ACID transactions with savepoints and two-phase commit
  • Formal verification via TLA+ specifications

Quick Start

Add to your Cargo.toml:

[dependencies]
aether = "1.0"

Basic Example

use aether::KvStore;
use tempfile::TempDir;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let dir = TempDir::new()?;

    // Key-value API
    let mut kv = KvStore::open(dir.path().join("kv.db"))?;
    kv.put(b"hello", b"world")?;
    assert_eq!(kv.get(b"hello")?, Some(b"world".to_vec()));

    Ok(())
}

See aether/examples/ for more demonstrations including B+ Tree, Skiplist, RAX radix tree, LSM tree, concurrent access, and WAL recovery.

Workspace Structure

Aether is organized as a Cargo workspace with multiple applications:

  • aether/ - Core database engine (library + CLI)
  • cask/ - Redis-compatible KV store
  • cash/ - Memcached-compatible server
  • ftdb/ - Financial transactions database

See WORKSPACE.md for detailed workspace documentation including cargo-hakari integration and development workflow.

Features

Management Tools

  • Unified CLI: Single aether command for all database operations
  • Interactive TUI: Full-screen terminal interface with vim-like navigation
  • 15 Management Commands: stat, checkpoint, backup, restore, compact, inspect, printlog, verify, deadlock, tune, recover, upgrade, dump, load, archive
  • Multiple Output Formats: Plain text, JSON, and table formats for all commands
  • Scriptable: Consistent exit codes and machine-readable output for automation

Index Types

  • B+ Tree (btree): Concurrent B+ tree with lock crabbing, range scans, prefix searches, automatic node splitting/merging, and free page chain for space reuse.
  • Skiplist (skiplist): Lock-free concurrent skiplist with WAL integration, range and prefix scans, suitable for write-heavy workloads.
  • RAX Radix Tree (rax): Compressed radix tree for string keys with prefix search, iterator-based traversal, and persistent (buffer-managed) variant.
  • Generic Index Trait (index): Unified Index and OrderedIndex traits allow plugging any index type into GenericKvStore.

LSM Tree Framework

  • Pure HanoiDB Implementation: True 2-way fractional cascading with constant 2.0x write amplification
  • Incremental Merging: Compaction work distributed across writes for stable latency (<2µs p99)
  • Work Budget System: Adaptive merge scheduling with automatic resumption across operations
  • Bloom & SuRF Filters: Negative lookup acceleration with range query support
  • Smart Sizing: Automatic level capacity adjustment based on data size
  • Multiple Merge Strategies: Fast, Predictable, and HanoiDB compaction strategies
  • Any Index + OrderedIndex implementor can serve as an LSM level
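To illustrate how a Bloom filter accelerates negative lookups, here is a minimal sketch. It is not Aether's filter implementation: the `Bloom` type, its bit-vector layout, and the seeded double-hashing scheme are all assumptions chosen for brevity. The property it demonstrates is the general one: a filter can report false positives but never false negatives, so a "no" answer lets a read skip a run entirely.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical filter type; sizes and hashing scheme are illustrative only.
struct Bloom {
    bits: Vec<bool>,
    num_hashes: u64,
}

// Hash the key together with a seed to obtain independent-looking hashes.
fn hash_with_seed(key: &[u8], seed: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    seed.hash(&mut hasher);
    key.hash(&mut hasher);
    hasher.finish()
}

impl Bloom {
    fn new(num_bits: usize, num_hashes: u64) -> Self {
        Bloom { bits: vec![false; num_bits], num_hashes }
    }

    // Double hashing: position_i = (a + i * b) mod num_bits.
    fn positions(&self, key: &[u8]) -> Vec<usize> {
        let a = hash_with_seed(key, 0);
        let b = hash_with_seed(key, 1) | 1;
        let n = self.bits.len() as u64;
        (0..self.num_hashes)
            .map(|i| (a.wrapping_add(i.wrapping_mul(b)) % n) as usize)
            .collect()
    }

    fn insert(&mut self, key: &[u8]) {
        for pos in self.positions(key) {
            self.bits[pos] = true;
        }
    }

    // false means the key is definitely absent; true means it may be present.
    fn may_contain(&self, key: &[u8]) -> bool {
        self.positions(key).into_iter().all(|pos| self.bits[pos])
    }
}
```

A point lookup would consult `may_contain` for each run and only search the runs that answer true.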

Storage Engine

  • Buffer Manager: LeanStore-style caching with swizzled pointers (HOT/COOL/EVICTED states), 24ns hot path access, clock eviction, background writeback, and buffer access strategies for scan isolation.
  • Write-Ahead Logging: Taurus WAL algorithm with lock-free per-thread streams, atomic LSN allocation, and formal TLA+ verification.
  • ARIES Recovery: Full crash recovery with analysis, redo, undo, and CLR generation. Handles crash-during-recovery.
  • Overflow Pages: Transparent large value support with chained overflow pages.
  • Value Compression (optional, zstd feature): Zstd compression with dictionary training.
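The three swizzled-pointer states can be pictured with a small sketch. This is a simplified model under stated assumptions, not the crate's data structure: `Swip`, `CoolingTable`, and `resolve` are hypothetical names, and a real buffer manager would pack the state into a single machine word rather than use an enum.

```rust
use std::collections::HashMap;

type PageId = u64;
type FrameIdx = usize;

// Hypothetical page reference with the three states named above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Swip {
    Hot(FrameIdx),   // swizzled: refers directly to an in-memory frame
    Cool(PageId),    // still in memory, found through a lookup table
    Evicted(PageId), // on disk only
}

// Hypothetical table of pages that are cooling but not yet evicted.
struct CoolingTable {
    cooling: HashMap<PageId, FrameIdx>,
}

impl CoolingTable {
    // Returns the frame if the page is in memory, re-swizzling cool pages.
    fn resolve(&mut self, swip: &mut Swip) -> Option<FrameIdx> {
        match *swip {
            // Hot path: no lookup at all.
            Swip::Hot(frame) => Some(frame),
            // Cool path: one hash table lookup, then promote back to hot.
            Swip::Cool(page) => {
                let frame = self.cooling.remove(&page)?;
                *swip = Swip::Hot(frame);
                Some(frame)
            }
            // Evicted path: the caller would have to read the page from disk.
            Swip::Evicted(_) => None,
        }
    }
}
```

The sketch shows why the hot path is so much cheaper than the others: it returns the frame without touching any shared table.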

Transaction Support

  • ACID transaction semantics with begin/commit/abort
  • Nested transactions with savepoints
  • Two-phase commit (2PC) for distributed transactions
  • Lock manager with hierarchical locking (IS/IX/S/SIX/X)
  • Deadlock detection via wait-for graph
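The hierarchical lock modes listed above follow the standard multi-granularity compatibility matrix. The sketch below encodes that textbook matrix; the `LockMode` enum and `compatible` function are illustrative names, not Aether's lock manager API.

```rust
// Hypothetical names; the matrix itself is the standard one for these modes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LockMode {
    IS,  // intention shared
    IX,  // intention exclusive
    S,   // shared
    SIX, // shared + intention exclusive
    X,   // exclusive
}

// True if a lock in `requested` mode can be granted while `held` is held
// by another transaction on the same resource.
fn compatible(held: LockMode, requested: LockMode) -> bool {
    use LockMode::*;
    match (held, requested) {
        (X, _) | (_, X) => false,  // exclusive conflicts with everything
        (IS, _) | (_, IS) => true, // IS is compatible with all but X
        (IX, IX) => true,          // two writers at finer granularity
        (S, S) => true,            // two readers
        _ => false,                // IX/S, IX/SIX, S/SIX, SIX/SIX conflict
    }
}
```

For example, a transaction holding IX on a table blocks another transaction's S request on that table, but not its IS or IX requests.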

Formal Verification

TLA+ specifications verify:

  • LSN uniqueness and monotonicity
  • No data loss on crash
  • Valid buffer positions
  • Concurrent safety
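As an informal illustration of the first property, the sketch below shows how a single atomic counter yields unique, per-thread monotonic LSNs. It is not the Taurus algorithm or the crate's allocator: `LsnAllocator`, `reserve`, and the fixed 8-byte record size are assumptions made for the example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical allocator: the LSN is the byte offset of a record in the log.
struct LsnAllocator {
    next: AtomicU64,
}

impl LsnAllocator {
    fn new() -> Self {
        LsnAllocator { next: AtomicU64::new(0) }
    }

    // Reserve `len` bytes of log space and return the starting LSN.
    // fetch_add is a single atomic read-modify-write, so no two callers
    // can ever observe the same starting value.
    fn reserve(&self, len: u64) -> u64 {
        self.next.fetch_add(len, Ordering::Relaxed)
    }
}

// Spawn `threads` writers that each reserve `per_thread` 8-byte records,
// returning the LSNs each thread saw in the order it received them.
fn allocate_concurrently(threads: usize, per_thread: usize) -> Vec<Vec<u64>> {
    let alloc = Arc::new(LsnAllocator::new());
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let alloc = Arc::clone(&alloc);
            thread::spawn(move || {
                (0..per_thread).map(|_| alloc.reserve(8)).collect::<Vec<u64>>()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

A test like this samples executions, whereas a TLA+ model check explores the specification's state space exhaustively; the two are complementary.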

Architecture

+-------------------------------------------+
|         API Layer (KvStore, DbEnv)        |
+-------------------------------------------+
|   Index Layer (B+Tree, Skiplist, RAX)     |
+-------------------------------------------+
|       LSM Framework (optional)            |
+-------------------------------------------+
|    Transactions & Recovery (ARIES)        |
+-------------------------------------------+
|    WAL (Taurus with TLA+ verification)    |
+-------------------------------------------+
|   Buffer Manager (LeanStore, Swizzling)   |
+-------------------------------------------+
|    Page Layer (Slotted, Fixed, BTree)     |
+-------------------------------------------+
|   File Storage (4KB Pages)                |
+-------------------------------------------+

See docs/ARCHITECTURE.md for detailed design.

Performance

Buffer Manager:

  • Hot path: 24ns (41M ops/s)
  • Cold path: ~100ns (hash table lookup)
  • Evicted path: ~10µs (disk I/O)
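The throughput figure for the hot path follows from its latency, assuming operations run back to back on a single thread: 1 s / 24 ns is roughly 41.7 million operations per second. The helper below is a hypothetical one-liner that makes the same conversion for the other paths.

```rust
// Convert a per-operation latency in nanoseconds into sequential ops/s.
// Assumes one thread issuing operations back to back with no overlap.
fn ops_per_second(latency_ns: f64) -> f64 {
    1e9 / latency_ns
}
```

By the same arithmetic, a ~100ns path sustains about 10 million ops/s and a ~10µs path about 100 thousand ops/s per thread.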

WAL Throughput:

  • Lock-free per-thread streams with atomic LSN allocation
  • Linear scaling with thread count (16+ threads optimal)
  • Group commit for 5-10x throughput under concurrent load

LSM Performance (Pure HanoiDB):

  • p99 Latency: 1-2µs (2500x better than expected)
  • p999 Latency: 4-22µs (500x better than expected)
  • Write Amplification: Exactly 2.00x (constant, not variable)
  • Throughput Stability: <2x degradation with incremental merging
  • Max Latency: <1ms (no spikes with distributed compaction)

Recovery:

  • Scales linearly with log size

Building from Source

git clone https://codeberg.org/gregburd/aether.git
cd aether

cargo build --release
cargo test
cargo bench
cargo clippy -- -D warnings
cargo fmt -- --check

Requirements

  • Rust 1.75+ (MSRV)
  • TLC model checker (optional, for TLA+ verification)

Testing

Current status: 870 tests passing, 3 ignored (flaky concurrency tests)

cargo test                                 # All tests
cargo test --lib                           # Library tests (870 passing)
cargo test --lib btree::tests              # B+tree unit tests
cargo test --lib recovery::tests           # Recovery unit tests
cargo test --test integration_test         # End-to-end integration
cargo test --test proptest_btree           # Property-based B+tree tests
cargo test --test index_integration        # Multi-index integration
cargo test --test rax_tests                # RAX radix tree tests
cargo test --test lsm_tests                # LSM tree tests

Bitrot Prevention Tests:

# Fast sanity checks (verify files exist)
cargo test --test examples_test test_all_examples_exist
cargo test --test cli_tools_test test_all_cli_tools_exist

# Full tests (compile and run all examples and CLI tools)
cargo test --test examples_test -- --ignored
cargo test --test cli_tools_test -- --ignored

See docs/TESTING_EXAMPLES_AND_CLI.md for details.

Examples:

# Core Examples
cargo run --example basic_btree            # B+tree persistence and restart
cargo run --example skiplist_concurrent    # Concurrent skiplist with readers/writers
cargo run --example rax_prefix             # RAX prefix matching and iterators
cargo run --example lsm_usage              # LSM tree with BTree/Skiplist/RAX backends
cargo run --example transactions           # ACID transactions: commit, abort, isolation
cargo run --example generic_index          # Generic Index trait programming
cargo run --example kv_store               # Key-value store with range/prefix scans
cargo run --example wal_recovery           # WAL recovery demo

# Production Examples
cargo run --release --bin cash             # Memcached-compatible server with LSM persistence
cargo run --release --bin cask             # Redis-compatible server with transactions
cargo run --release --bin ftdb             # TigerBeetle-like financial transactions database
cargo run --release --example lsm_visualization  # Real-time LSM visualization (HanoiDB)

# Getting Started (C-compatible FFI examples)
./examples/getting_started/gsg_001_hello   # Hello, Aether!
./examples/getting_started/gsg_010_group_commit  # Group commit optimization
./examples/getting_started/gsg_015_adaptive_lsm  # Adaptive LSM mode switching
./examples/getting_started/gsg_020_full_stack    # Production-ready configuration

Benchmarks (cargo bench already builds with optimizations via the bench profile):

cargo bench --bench buffer_manager         # Buffer latency
cargo bench --bench wal_bench              # WAL throughput
cargo bench --bench btree_bench            # B+tree operations
cargo bench --bench index_comparison       # Index type comparison
cargo bench --bench skiplist_bench         # Skiplist operations
cargo bench --bench rax_bench              # RAX operations

Unified CLI and TUI

The aether command provides a unified interface to all database management utilities:

# Launch interactive TUI (default)
aether

# Or use CLI commands directly
aether stat                    # Statistics and monitoring
aether checkpoint --once       # Manual checkpoint
aether verify                  # Integrity verification
aether recover                 # Crash recovery
aether printlog                # WAL inspection

# Global options work across all commands
aether --home /var/db/myapp stat
aether --format json stat
aether --verbose checkpoint --once

See docs/CLI.md for complete CLI reference and docs/TUI.md for TUI guide.

Legacy Commands (Deprecated)

Legacy db_* commands are still available but deprecated (slated for removal in v2.0):

db_stat mydb.db           # Use: aether stat
db_checkpoint mydb.db     # Use: aether checkpoint --once

See docs/CLI_MIGRATION.md for migration guide.

Configuration

Tune buffer pool and index behavior:

use aether::buffer::BufferConfig;

let config = BufferConfig {
    num_frames: 4096,           // 16MB buffer pool (4096 * 4KB)
    enable_page_provider: true, // Background pre-fetch
    ..BufferConfig::default()
};

See docs/user-guide/configuration-reference.md for all parameters.

Documentation

Getting Started:

Performance & Production:

Architecture & Development:

Legacy Documentation (being consolidated):

Browse all documentation: docs/

Inspiration

Aether DB synthesizes ideas from several influential systems:

  • log-buffer: Original inspiration for WAL design
  • ARIES Paper (Mohan et al.): Recovery protocol implementation
  • LeanStore Paper (Leis et al.): Buffer manager with swizzled pointers
  • Redis RAX: Radix tree implementation (BSD-3-Clause, see src/rax/LICENSE-REDIS-RAX)
  • HanoiDB: LSM merge strategy

See docs/INSPIRATION.md for detailed attribution.

Roadmap

v1.0.0 (Complete)

  • Persistent B+ Tree with WAL logging
  • ARIES recovery protocol
  • Buffer manager with swizzled pointers
  • Taurus WAL algorithm with formal TLA+ verification
  • ACID transaction support
  • Group commit, checkpointing, compression

v1.1.0 (Complete)

  • Multiple index types (Skiplist, RAX radix tree)
  • Generic index trait hierarchy
  • LSM tree framework with pluggable compaction
  • Persistent index variants with buffer manager integration
  • WAL integration for skiplist and RAX operations
  • Major refactoring: modularized the codebase (25 files longer than 500 lines were split up)
  • Enhanced test coverage (870 passing tests)
  • Comprehensive documentation (2,192 doc comments)
  • 32 working examples demonstrating all features
  • Production-ready code quality

v1.2.0 (Complete)

  • Unified aether CLI command replacing 15 separate utilities
  • Interactive TUI with vim-like navigation
  • Consistent output formats (plain, JSON, table)
  • Enhanced CLI framework with shared utilities
  • Complete documentation (CLI.md, TUI.md, CLI_MIGRATION.md)
  • Backward compatibility with legacy db_* commands

v2.0.0 (Planned)

  • MVCC/Snapshot Isolation
  • Lock-free read path with EBR
  • Buffer pool partitioning
  • SQL query layer

License

Licensed under the MIT License. See LICENSE for details.

The RAX radix tree implementation is derived from antirez/rax and is licensed under BSD-3-Clause. See src/rax/LICENSE-REDIS-RAX for the full license text.

Contributing

Contributions welcome! Please see CONTRIBUTING.md for guidelines.

Acknowledgments

  • C. Mohan et al. for the ARIES recovery protocol
  • Viktor Leis et al. for LeanStore buffer management
  • Sunny Bains for log-buffer inspiration
  • Salvatore Sanfilippo for the RAX radix tree
  • The Rust community for excellent systems programming tools

Citation

If you use Aether DB in your research, please cite:

@software{aetherdb2026,
  title = {Aether DB: A Formally-Verified Database Storage Engine},
  author = {Greg Burd},
  year = {2026},
  url = {https://codeberg.org/gregburd/aether}
}

Support