Show HN: Spice Cayenne – SQL acceleration built on Vortex
Hi HN, we’re Luke and Phillip, and we’re building Spice.ai OSS, a lightweight, portable data and AI engine powered by Apache DataFusion and Ballista for SQL query, hybrid search, and LLM inference across disaggregated storage. It is used by enterprises like Barracuda Networks and Twilio.
We first introduced Spice [1] on HN in 2021 and relaunched it on HN [2] in 2024, rebuilt from the ground up in Rust.
Spice includes the concept of a Data Accelerator [3], which is a way to materialize data from disparate sources, such as other databases, in embedded databases like SQLite and DuckDB.
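As a rough illustration of the materialization idea only (this is not Spice's implementation or API), here is a minimal Python sketch that copies rows from a source into an embedded SQLite table and then queries the local copy. The `fetch_remote_rows` helper, the `orders_accel` table, and the sample rows are all hypothetical stand-ins.

```python
import sqlite3


def fetch_remote_rows():
    # Hypothetical stand-in for a query against a remote source database.
    return [(1, "alice", 12.5), (2, "bob", 7.0), (3, "alice", 3.5)]


def accelerate(conn, rows):
    # Materialize the source rows into a local embedded table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders_accel "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    # INSERT OR REPLACE keeps a refresh idempotent for rows with the same id.
    conn.executemany("INSERT OR REPLACE INTO orders_accel VALUES (?, ?, ?)", rows)
    conn.commit()


def total_by_customer(conn):
    # Queries now hit the local copy instead of the remote source.
    return conn.execute(
        "SELECT customer, SUM(amount) FROM orders_accel "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()


conn = sqlite3.connect(":memory:")
accelerate(conn, fetch_remote_rows())
totals = total_by_customer(conn)
```

Calling `accelerate` again with fresh rows plays the role of a refresh: the local table is updated in place, and queries never touch the remote source directly.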
Today we’re excited to announce a new DuckLake-inspired Data Accelerator built on Vortex [3], a highly performant, extensible columnar data format that claims 100x faster random access, 10-20x faster scans, and 5x faster writes than Apache Parquet, with a similar compression ratio.
In our tests with Spice, Vortex performs faster than DuckDB with a third of the memory usage, and is much more scalable (multi-file). In real-world deployments, we see the DuckDB Data Accelerator often capping out around 1TB, but Spice Cayenne can handle petabyte scale.
You can read about it at https://spice.ai/blog and in the Spice OSS release notes [4].
This is just the first version, and we’d love to get your feedback!
GitHub: https://github.com/spiceai/spiceai
[1] https://news.ycombinator.com/item?id=28448887
[2] https://news.ycombinator.com/item?id=39854584
[3] https://github.com/vortex-data/vortex
[4] https://spiceai.org/blog/releases/v1.9.0

Let's add Spice to ClickBench: https://benchmark.clickhouse.com/

How does this compare to CedarDB?

CedarDB is a super cool project, we're fans! CedarDB focuses on being a high-performance HTAP database, whereas Spice was built from day 1 to enable high-performance data and search for data-intensive applications and AI. So Spice natively has data acceleration, federation, hybrid search (vector + BM25 full-text search), and LLM inference in the core runtime, so you can zero-copy data across them, which you would not normally see in a database like CedarDB.

CedarDB is not free to use beyond a size limit.