High-performance lock-free messaging for Rust
Features • Performance • Quick Start • Architecture
Overview
Kaos provides lock-free ring buffers for inter-thread, inter-process, and network communication. Built on the LMAX Disruptor and Aeron high performance networking patterns with modern Rust.
Note: Preview release. APIs may change. Even though safe counterparts provided as much as I could do still uses unsafe in some hot paths, so until 1.0 consume with a grain of salt.
Crates
| Crate | Description |
|---|---|
| kaos | Lock-free ring buffers (SPSC, MPSC, SPMC, MPMC) |
| kaos-ipc | Shared memory IPC via mmap |
| kaos-rudp | Reliable UDP with NAK/ACK, mux server |
| kaos-archive | Persistent message archive |
| kaos-driver | Media driver (IPC ↔ UDP bridge) |
| kaos-shared | Shared protocol types |
Features
| Category | Feature | Status |
|---|---|---|
| Core | Lock-free ring buffers | ✅ |
| SPSC, MPSC, SPMC, MPMC | ✅ | |
| Batch operations | ✅ | |
| IPC | Shared memory (mmap) | ✅ |
| Zero-copy reads | ✅ | |
| Network | Reliable UDP | ✅ |
| Congestion control (AIMD) | ✅ | |
| Archive | Persistent message storage | ✅ |
| Retransmission from disk | ✅ | |
| Late joiner replay | ✅ | |
| Linux | sendmmsg/recvmmsg batch I/O | ✅ |
| io_uring | ✅ | |
| AF_XDP kernel bypass | ||
| NUMA / thread affinity | ||
| Observability | Tracing / Tracy | ✅ |
Observability
Feature-gated via tracing crate. Zero-cost when disabled.
Console Logs
kaos = { version = "0.1", features = ["tracing"] } tracing-subscriber = "0.3"
tracing_subscriber::fmt::init();
2024-01-15T10:30:45 TRACE send bytes=64
2024-01-15T10:30:45 TRACE recv bytes=64
2024-01-15T10:30:46 WARN backpressure
Tracy Profiler
Real-time visualization of latency, throughput, and bottlenecks.
kaos = { version = "0.1", features = ["tracy"] }
brew install tracy # Install cargo run -p kaos --example profile --features tracy --release # Run profiler tracy # Connect → 127.0.0.1
See PROFILING.md for detailed guide.
Performance
Measured on Apple M1 Pro (cargo bench).
| Benchmark | Kaos | Aeron |
|---|---|---|
| Ring buffer (batch) | 3.2-3.6 G/s | — |
| IPC (8B messages) | 150 M/s | — |
| RUDP (500K events) | 2.9-3.4 M/s | 3.5 M/s |
| Driver mode (IPC→UDP) | 5-6.5 M/s | 3.5 M/s |
| Archive write | 30 M/s | — |
# Run benchmarks cargo bench -p kaos --bench bench_trace -- "100M" cd ext-benches/disruptor-rs-bench && cargo bench --bench bench_trace_events cd ext-benches/disruptor-java-bench && mvn compile -q && \ java -cp "target/classes:$(mvn dependency:build-classpath -q -Dmdep.outputFile=/dev/stdout)" \ com.kaos.TraceEventsBenchmark
API Selection Guide
| Use Case | Ring Buffer | Producer | Speed |
|---|---|---|---|
| Fastest (single producer) | RingBuffer |
CachedProducer |
2.1 G/s |
| Broadcast (fan-out) | BroadcastRingBuffer |
direct | 1.1 G/s |
| Multi-producer, single consumer | MpscRingBuffer |
CachedMpscProducer |
390 M/s |
| Work distribution | SpmcRingBuffer |
direct | 1.1 G/s |
| Full flexibility | MpmcRingBuffer |
CachedMpmcProducer |
30 M/s |
When to use Cached* producers:
CachedProducer- Caches consumer position, avoids atomic loads on hot pathCachedMpscProducer- Same caching + closure API for zero-copy writesCachedMpmcProducer- Same caching + closure API, essential for MPMC performance
Rule of thumb: Always prefer Cached* producers when available.
Quick Start
Batch API
use kaos::disruptor::{RingBuffer, Slot8}; let ring = RingBuffer::<Slot8>::new(1024)?; // Producer: claim batch, write, publish if let Some((seq, slots)) = ring.try_claim_slots(10, cursor) { for (i, slot) in slots.iter_mut().enumerate() { slot.value = i as u64; } ring.publish(seq + slots.len() as u64); } // Consumer: read batch, advance let slots = ring.get_read_batch(0, 10); ring.update_consumer(10);
Per-Event API
use kaos::disruptor::{RingBuffer, Slot8, CachedProducer}; let ring = Arc::new(RingBuffer::<Slot8>::new(1024)?); let mut producer = CachedProducer::new(ring.clone()); // Publish with in-place mutation producer.publish(|slot| { slot.value = 42; });
Archived RUDP
Combine reliable UDP with persistent archive for:
- Retransmission from disk — When ring buffer wraps, retransmit from archive
- Late joiner replay — New subscribers catch up from any sequence
- Crash recovery — Resume from persisted state
use kaos_rudp::ArchivedTransport; let mut transport = ArchivedTransport::new( "127.0.0.1:9000".parse().unwrap(), "127.0.0.1:9001".parse().unwrap(), 65536, // Ring buffer window "/tmp/rudp-archive", // Archive path 1024 * 1024 * 1024, // 1GB archive ).unwrap(); // Send — automatically archived for durability transport.send(b"hello").unwrap(); // Retransmit from archive (even after ring buffer wrapped) transport.retransmit_from_archive(sequence_number); // Replay range for late joiner transport.replay(0, 1000, |seq, msg| { println!("Replaying seq {}: {} bytes", seq, msg.len()); });
Enable with feature flag:
kaos-rudp = { version = "0.1", features = ["archive"] }
Architecture
Testing
# Unit tests cargo test --workspace # Loom concurrency verification (exhaustive state exploration) RUSTFLAGS="--cfg loom" cargo test -p kaos --test loom_ring_buffer --release # Memory analysis (macOS) leaks --atExit -- ./target/release/examples/spsc_basic # Memory analysis (Linux) cargo valgrind run --example spsc_basic -p kaos --release
Loom tests verify lock-free correctness by exploring all possible thread interleavings. See kaos/tests/loom_ring_buffer.rs.
Profiling guide with flamegraphs, valgrind, leaks and Instruments: kaos/docs/PROFILING.md
Platform Support
| Platform | Status |
|---|---|
| macOS ARM64 | ✅ Tested |
| Linux x86_64 | ✅ Tested |
| Windows | Not supported |
Design Principles
- Lock-free — Atomic sequences, no mutex contention
- Zero-copy reads — Consumers get direct slice access (writes copy to buffer)
- Cache-aligned — 128-byte padding prevents false sharing
- Batch operations — Amortize synchronization overhead
Glossary
| Term | Meaning |
|---|---|
| SPSC | Single Producer, Single Consumer |
| MPSC | Multiple Producers, Single Consumer |
| SPMC | Single Producer, Multiple Consumers |
| MPMC | Multiple Producers, Multiple Consumers |
| CAS | Compare-And-Swap (atomic operation for lock-free coordination) |
| IPC | Inter-Process Communication |
| mmap | Memory-mapped file (shared memory) |
| RUDP | Reliable UDP (guaranteed delivery) |
| NAK | Negative Acknowledgment (request retransmit) |
| ACK | Acknowledgment (confirm receipt) |
| AIMD | Additive Increase, Multiplicative Decrease (congestion control) |
| io_uring | Linux async I/O interface |
| AF_XDP | Linux kernel bypass for networking |
| sendmmsg | Linux batched send syscall |
License
MIT OR Apache-2.0
Inspired by LMAX Disruptor and Aeron