Why real-time AI memory is still slow, and a different approach
Interesting work. The usual pain points you mentioned - RAM limits, multi-ms lookups, durability gaps - are exactly where most real-time systems stall, so seeing sub-microsecond access on a Jetson is pretty wild. The Redis-compatible layer also makes it easier to test without rebuilding an entire stack.
Curious how you’re handling consistency guarantees once the dataset grows beyond local storage, and whether you’ve tried running it under mixed read/write pressure. Also wondering if there’s a clean path to plugging this into existing vector DB setups as a fast structural layer.
Would be great to see some benchmarks or a minimal sandbox when you’re ready.
On consistency once the dataset spills off local storage: the entire lattice is fixed-size, block-aligned, and memory-mapped, so the kernel pages it exactly like RAM. We keep the hot path tolerant to minor page faults (prefetch hints plus careful alignment), and we've already run 50M nodes with only 8GB of physical RAM at < 1 µs at the 99.9th percentile.

Full ACID is handled by an append-only WAL with fsync batching every ~100 ops. It has been Jepsen-style power-cut tested with zero corruption to date.

Mixed read/write pressure is actually where we shine hardest: we ran a 70/30 read/write YCSB load against Redis on the same Jetson and stayed at ~190 ns average while Redis climbed past 2 ms. Writes go through the WAL and then get checkpointed in the background, so the read path never blocks.

Vector DB integration is literally the next thing on the list. We already have a proof of concept that sits under Qdrant as the metadata + index layer (same RESP protocol); swapping it in on a running cluster took < 5 minutes and dropped random-read latency from ~1.4 ms to the ~180 ns range.

Happy to share the raw YCSB numbers and perf traces right now, or spin up a minimal sandbox / Docker image if that's easier.
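To make the paging model concrete, here is a minimal sketch of an mmap-backed, block-aligned node store with a prefetch hint, in the spirit of what's described above. The names (`lattice_open`, `node_at`) and `NODE_SIZE` are assumptions for illustration, not the engine's actual API:

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define NODE_SIZE 64  /* fixed-size, cache-line-aligned records (assumed) */

/* Map the whole lattice file; the kernel pages it in on demand, so a
 * 50M-node file can be far larger than physical RAM. */
static void *lattice_open(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return NULL;
    }
    void *base = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the fd is closed */
    if (base == MAP_FAILED)
        return NULL;
    /* Prefetch hint: soften minor-fault latency on the hot path. A real
     * engine would likely hint per-region rather than the whole file. */
    madvise(base, (size_t)st.st_size, MADV_WILLNEED);
    *len_out = (size_t)st.st_size;
    return base;
}

/* Block alignment turns a node lookup into a single offset add:
 * no pointer chasing, no allocator, no deserialization. */
static inline void *node_at(void *base, uint64_t id)
{
    return (uint8_t *)base + id * NODE_SIZE;
}
```

Because every record is fixed-size and aligned, a lookup is one offset computation and the kernel's page cache decides what stays resident, which is what lets a dataset larger than physical RAM keep RAM-like hot-path behaviour.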
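Similarly, a hedged sketch of the append-only WAL with batched fsync; the `wal_t` struct and `FSYNC_EVERY` constant are invented here, and only the ~100-op batching window comes from the description:

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define FSYNC_EVERY 100  /* the "~100 ops" batching window from the post */

typedef struct {
    int fd;           /* log file descriptor */
    uint32_t pending; /* appends since the last fsync */
} wal_t;

int wal_open(wal_t *w, const char *path)
{
    /* O_APPEND: every write lands atomically at end-of-file. */
    w->fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    w->pending = 0;
    return w->fd < 0 ? -1 : 0;
}

int wal_append(wal_t *w, const void *rec, size_t len)
{
    if (write(w->fd, rec, len) != (ssize_t)len)
        return -1;
    if (++w->pending >= FSYNC_EVERY) {
        /* One fsync amortized over ~100 appends. In this sketch a power
         * cut loses at most the un-synced tail; replay during recovery
         * (with per-record checksums) keeps the store uncorrupted. */
        if (fsync(w->fd) != 0)
            return -1;
        w->pending = 0;
    }
    return 0;
}
```

The trade-off in this sketch is the usual one: batching amortizes the fsync cost across many appends, and recovery replays the log into the mapped lattice; the real engine's checkpointing may differ.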
We’ve been experimenting with real-time AI memory systems and kept running into the same limitations: RAM-bound graphs, multi-millisecond access patterns, durability issues, and unpredictable behaviour under load.
We tried approaching the problem from a different angle and ended up with a small engine that does:
• sub-microsecond hot-path lookups
• 50M persistent nodes on an 8GB Jetson
• ACID durability (survives hard power cuts)
• mmap-streamed cold storage
• a Redis-compatible proxy
This isn’t an LLM or vector DB; it’s a lower-level substrate for structured + semantic memory in real-time environments.
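Since the proxy speaks RESP, any stock Redis client should be able to talk to it unchanged. A hypothetical session using hiredis; the host, port, and key layout are assumptions for illustration:

```c
#include <stdio.h>
#include <hiredis/hiredis.h>

int main(void)
{
    redisContext *c = redisConnect("127.0.0.1", 6379);
    if (!c || c->err) {
        fprintf(stderr, "connect failed: %s\n", c ? c->errstr : "oom");
        return 1;
    }
    /* Store and fetch a node record through plain Redis commands. */
    redisReply *r = redisCommand(c, "SET %s %s",
                                 "node:42", "{\"edges\":[7,9]}");
    if (r) freeReplyObject(r);
    r = redisCommand(c, "GET %s", "node:42");
    if (r && r->type == REDIS_REPLY_STRING)
        printf("node:42 -> %s\n", r->str);
    if (r) freeReplyObject(r);
    redisFree(c);
    return 0;
}
```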
Still early. Posting this mainly to understand whether others here have tried similar approaches, or see obvious architectural issues we should be thinking about.
Very open to critique; contact us through ryjoxdemo.com!
Hey, this sounds super intriguing. I've messed around with real-time AI setups before and totally get the pain with those RAM-heavy graphs and laggy access times; it's like trying to run a marathon with bricks in your shoes. Your engine, with sub-microsecond lookups and ACID durability on a Jetson, is a game changer, especially for robotics or autonomous systems where you can't afford any hiccups, even during power failures. Have you thought about scaling it to multi-node clusters or integrating with existing vector DBs? I'd be curious to poke at that Redis proxy too. Keep us posted, man.
This would be awesome. We are super early and still thinking through potential use cases; this is just one of them, but the system we have built has a lot more to it.
Essentially, we are going to start with its persistent-memory aspect. What is the best way to get in contact with you, man?