Show HN: Spice Cayenne – SQL acceleration built on Vortex

42 points by lukekim 11 days ago · 4 comments · 2 min read

Hi HN, we’re Luke and Phillip, and we’re building Spice.ai OSS, a lightweight, portable data and AI engine powered by Apache DataFusion & Ballista for SQL query, hybrid search, and LLM inference across disaggregated storage. It is used by enterprises like Barracuda Networks and Twilio.

We first introduced Spice [1] on HN in 2021 and re-launched it on HN [2] in 2024, rebuilt from the ground up in Rust.

Spice includes the concept of a Data Accelerator [3], which is a way to materialize data from disparate sources, such as other databases, in embedded databases like SQLite and DuckDB.
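To make the idea concrete, here is a minimal Python sketch of acceleration in general, using only the standard-library `sqlite3` module. It is not Spice's API or implementation: `fetch_from_source`, `accelerate`, and the `orders` table are hypothetical names made up for illustration, and the "remote source" is just a hard-coded list.

```python
import sqlite3

# Hypothetical stand-in for a slow remote source (another database, an API,
# object storage). A real accelerator would pull from the actual source.
def fetch_from_source():
    return [(1, "alice", 120.0), (2, "bob", 75.5), (3, "alice", 30.0)]

def accelerate(rows):
    """Materialize source rows into an embedded, in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn

# Pay the fetch cost once, then answer queries from the local copy.
conn = accelerate(fetch_from_source())
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)  # [('alice', 150.0), ('bob', 75.5)]
```

Everything after the initial load runs against the embedded copy, which is the basic trade an accelerator makes: local query speed in exchange for keeping the materialized data fresh.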

Today we’re excited to announce a new DuckLake-inspired Data Accelerator built on Vortex [3], a highly performant, extensible columnar data format that claims 100x faster random access, 10-20x faster scans, and 5x faster writes than Apache Parquet, with a similar compression ratio.

In our tests with Spice, Vortex performs faster than DuckDB with a third of the memory usage, and is much more scalable (it is multi-file). In real-world deployments, we see the DuckDB Data Accelerator often capping out around 1TB, while Spice Cayenne can handle petabyte scale.

You can read about it at https://spice.ai/blog and in the Spice OSS release notes [4].

This is just the first version, and we’d love to get your feedback!

GitHub: https://github.com/spiceai/spiceai

[1] https://news.ycombinator.com/item?id=28448887

[2] https://news.ycombinator.com/item?id=39854584

[3] https://github.com/vortex-data/vortex

[4] https://spiceai.org/blog/releases/v1.9.0

zX41ZdbW 4 days ago

Let's add Spice to ClickBench: https://benchmark.clickhouse.com/

gjvc 11 days ago

how does this compare to CedarDB ?

lukekimOP 11 days ago

CedarDB is a super cool project, we're fans!
CedarDB focuses on being a high-performance HTAP database, whereas Spice was built from day 1 to enable high-performance data and search for data-intensive applications and AI.
So Spice natively has data acceleration, federation, hybrid search (vector + BM25 full-text search), and LLM inference in the core runtime, so you can zero-copy data across them, which you would not normally see in a database like CedarDB.

OutOfHere 11 days ago

CedarDB is not free to use beyond a size limit.
