Accelerated SQL for JSON with AVX512 (Golang)
github.com
This is a killer idea. However, I do not see anything in the README about distributed querying. Is that something you wish to tackle?
Also, any benchmarks comparing this to Apache Arrow or Apache Presto?
Hi! One of the authors here. We do have support for distributed querying, but it's not implemented in the command-line tool. (It makes for a much more complicated demo if you need multiple machines.) The query planner is happy to use as many machines as you can throw at it.
We don't yet have good comparative benchmarks against Arrow or Presto, although I'm hoping we can get those.
Sneller head of product here. Arrow is a data exchange format; are you referring to benchmarking against DataFusion or Ballista? As for Presto, we ran early benchmarks against Amazon's Athena (Presto under the covers) running on Parquet, and we will rerun those benchmarks shortly. The interesting thing to note about Presto is that it is clunky to use with raw JSON; see https://prestodb.io/docs/current/functions/json.html. While benchmarking against Athena we actually used AWS Glue (Spark under the hood) to transform JSON into Parquet, but that adds both complexity and latency to the overall pipeline, neither of which shows up in query timings alone.
If you check out the Kubernetes folder in the repo, you will find the Kubernetes setup for running in a distributed environment (one that is also highly available).
Disclaimer: I'm one of the authors of sneller core. We have been working on this project for more than a year. It has a neat AVX512-centered architecture and many clever tricks inside.
Am I missing the comparison of AVX vs. non-AVX performance?
Sneller founder here: we do not have any non-AVX code so we cannot compare directly against that. But generally speaking our code always works on 16 lanes in parallel per core, so that gives a huge speed-up.
This code is really nice. How will you profit?
Thank you for the feedback, that is nice to hear. As for the business question, we plan on launching a Sneller Cloud offering. (Sneller founder here)