How to Pick a Vector Database – Quantitative Analysis

4 points by crazy_marksman 2 years ago · 6 comments

andre-z 2 years ago

>> Qdrant stores both the vectors and the metadata in a sqlite database. LOL. Guys, before developing a DB, learn to read and write code. Qdrant does not use any third-party DB under the hood (besides RocksDB for metadata); it is written from scratch in Rust. What you are benchmarking is the local mode, a client-side implementation for quickly trying things out without starting a Docker container. EvoDB, another "Database" written in PURE Python.

woshiwan 2 years ago

Thanks for your feedback. We should have emphasized that we are benchmarking Qdrant in local mode. Based on our analysis, local mode persists the index as follows. In the directory where Qdrant persists the index, /data/workspace/vector-benchmark/trainQDRANTIndex:
> du -sh storage.sqlite
2.0G storage.sqlite
--------------------------------------
> sqlite3 storage.sqlite
--------------------------------------
> sqlite> PRAGMA table_info(points);
0|id|TEXT|0||1
1|point|BLOB|0||0
--------------------------------------
> sqlite> select count(*) from points;
1000000
--------------------------------------
> sqlite> select * from points limit 1;
gARLAS4=|��,
--------------------------------------
We have updated the post to clarify that Qdrant is being evaluated in local mode.
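The same inspection can be scripted. Below is a minimal sketch using Python's standard `sqlite3` module, assuming only the schema shown above (a `points` table with a TEXT `id` and a BLOB `point`). The function name `summarize_points_table` and the in-memory demo table are hypothetical stand-ins, not part of Qdrant or of the benchmark code.

```python
import sqlite3


def summarize_points_table(conn):
    """Return (columns, row_count, blob_bytes) for a `points` table.

    Assumes the two-column layout shown in the PRAGMA output above.
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(points)")]
    (row_count,) = conn.execute("SELECT COUNT(*) FROM points").fetchone()
    # LENGTH() on a BLOB returns its size in bytes; COALESCE covers an empty table.
    (blob_bytes,) = conn.execute(
        "SELECT COALESCE(SUM(LENGTH(point)), 0) FROM points"
    ).fetchone()
    return columns, row_count, blob_bytes


# Demo on an in-memory stand-in with the same schema (hypothetical data).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE points (id TEXT PRIMARY KEY, point BLOB)")
demo.executemany(
    "INSERT INTO points VALUES (?, ?)",
    [("a", b"\x00\x01"), ("b", b"\x02\x03")],
)
columns, row_count, blob_bytes = summarize_points_table(demo)
print(columns, row_count, blob_bytes)
demo.close()
```

To run it against a real local-mode index, replace the in-memory connection with `sqlite3.connect("storage.sqlite")` opened in the index directory.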
- timvisee 2 years ago
  
  I think it shouldn't be in this article at all. You're comparing a tool that is neither optimized nor efficient against others that are, even though Qdrant has a very well optimized offering as well. This is like comparing a walker against modern cars, and it paints a bad picture.

crazy_marksman (OP) 2 years ago

Key Insights:

1. Many databases back up their data in a sqlite database. Some even push vectors into sqlite, but others store vectors in their own format.

2. Qdrant has higher client connection and index initialization times, which can overshadow its benefit of fast and accurate vector search.
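
One way to check the second point is to time setup separately from queries, so a one-off initialization cost is not folded into per-query latency. The sketch below is a generic harness and not the benchmark code from the post; `time_phases`, `fake_setup`, and `fake_query` are hypothetical names, and the stand-in workload only simulates a client.

```python
import time


def time_phases(setup, query, n_queries=100):
    """Return (setup_seconds, mean_query_seconds), measured separately."""
    start = time.perf_counter()
    handle = setup()  # e.g. connect a client and load or build the index
    setup_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(n_queries):
        query(handle)
    mean_query_seconds = (time.perf_counter() - start) / n_queries
    return setup_seconds, mean_query_seconds


def total_seconds(setup_seconds, mean_query_seconds, n_queries):
    """Total cost of a workload: one setup plus n_queries searches."""
    return setup_seconds + n_queries * mean_query_seconds


# Hypothetical stand-ins for a real client; replace with actual calls.
def fake_setup():
    return list(range(10_000))


def fake_query(data):
    return max(data)


setup_s, query_s = time_phases(fake_setup, fake_query, n_queries=50)
print(f"setup: {setup_s:.6f}s, mean query: {query_s:.6f}s")
print(f"total for 50 queries: {total_seconds(setup_s, query_s, 50):.6f}s")
```

Reporting the two numbers separately shows whether setup dominates for short-lived scripts or is amortized over a long-running workload.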

generall 2 years ago

This article contains a lot of inaccuracies.
Based on your statements, like
> Qdrant stores both the vectors and the metadata in a sqlite database.
It looks like you have benchmarked the local mode of Qdrant. It doesn't even use vector indexes and is not designed for any kind of production usage.
For anyone reading this article, I urge you to do your own benchmarks and not rely on claims that do not have open source code attached to them to replicate the results.
- jarulraj 2 years ago
  
  Hi Andrey. Thanks for your feedback. We should have better emphasized that we are benchmarking Qdrant in local mode. We have updated the post to clarify that Qdrant is being evaluated in local mode. We plan to next evaluate the server mode.
  We went with the local mode as several Python AI apps are using Qdrant in that mode based on the suggestion here: https://qdrant.tech/documentation/quick-start/.
  We also believe in open-sourced benchmark code. Please find the code here: https://github.com/jiashenC/vectordb-benchmark-and-optimize/....
