Qdrant 1.7.0
qdrant.tech

> Traditional keyword-based search mechanisms often rely on algorithms like TF-IDF, BM25, or comparable methods. While these techniques internally utilize vectors, they typically involve sparse vector representations: the vectors are predominantly filled with zeros, with only a relatively small number of non-zero values. Such sparse vectors are theoretically high-dimensional, far higher than the dense vectors used in semantic search. However, since the majority of dimensions are usually zeros, they are stored differently: only the non-zero dimensions are kept.
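For illustration, a minimal Python sketch of the sparse representation described above; the term ids and weights are made up, though this parallel indices/values shape is roughly what sparse vector APIs (including the one Qdrant 1.7 introduces) expect:

```python
# Sketch: store only the non-zero dimensions of a sparse vector as
# parallel (indices, values) lists instead of a mostly-zero dense array.
# The numbers here are made up for illustration.

dense = [0.0] * 30_000        # vocabulary-sized vector, almost all zeros
dense[42] = 1.3               # e.g. a BM25-style weight for term id 42
dense[17_805] = 0.7

indices = [i for i, v in enumerate(dense) if v != 0.0]
values = [dense[i] for i in indices]

print(indices, values)        # [42, 17805] [1.3, 0.7]
# Storage drops from 30,000 floats to 2 index/value pairs.
```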
Yeah, Heaps' law is a bit of a bitch in these situations. Like, you'll definitely want to make sure those vectors use 64-bit indices if you plan on indexing a properly large number of documents.
I'd also advise caution about leaning too much into the vector interpretation of these algorithms, as it's largely viewed as a quaint historical artifact that was a bit of a dead end (e.g. as in Croft, Metzler & Strohman 7.2.1)
> e.g. as in Croft, Metzler & Strohman 7.2.1
Actually I think I transposed two digits in that reference, it's 7.1.2.
I like their pricing page[0] and the business model it shows: (1) an Apache-2.0-licensed source library, (2) free to try, with tiny hosting, (3) profit from serious hosting and/or consulting services.
I was looking for the fine print on their "Try For Free"/"Free Tier Available" and was pleasantly surprised by:
> Qdrant Vector Search Cloud
> Start building now!
> A free forever 1GB cluster included for trying out.
> No credit card required.
[0] https://qdrant.tech/pricing/

Just the other day I played with qdrant, using its Python client. Pretty smooth onboarding experience.
I came across two questions. Perhaps some kind folks with more experience can shed some light on these qdrant use cases.
1. For embeddings in use cases such as LLM chat bots, I split internal data into chunks. Those chunks are then vectorized and stored. Alongside the vector itself, I stored the original chunk in the metadata. That way, a lookup can immediately feed that chunk into the LLM prompt context, without a lookup in a secondary data store by some ID (see the sketch after these two questions). Feels like a hack. Is that a sensible use case?
2. I resorted to using `fastembed` and generated all embeddings client-side. Why is it that qdrant queries, in the ordinary case (also showcased a lot in their docs, e.g. [0]), expect a ready-made vector? I thought the point of vector DBs was to vectorize input data, store it, and later vectorize the text queries themselves?
Having to do all that client-side feels beside the point; for example, what if two separate clients use different models (I used [1])? Their vectorizations will differ. I thought the DB was the source of truth here.
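For what it's worth, here is a minimal sketch of the pattern from both questions, assuming current `fastembed` and `qdrant-client` APIs; the collection name, chunk texts, and query are made up for illustration:

```python
# Embed client-side (with fastembed, as above) and store the original
# chunk text in the point payload so a search hit can be fed straight
# into the LLM prompt context, with no secondary data store.

from fastembed import TextEmbedding
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

model = TextEmbedding("BAAI/bge-small-en-v1.5")  # 384-dim embeddings
client = QdrantClient(":memory:")  # local mode; same API as a server

client.create_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
)

chunks = ["Qdrant 1.7 adds sparse vectors.", "pgvector lives inside postgres."]
client.upsert(
    collection_name="docs",
    points=[
        PointStruct(id=i, vector=vec.tolist(), payload={"text": chunk})
        for i, (chunk, vec) in enumerate(zip(chunks, model.embed(chunks)))
    ],
)

# The query must be embedded with the same model, which is exactly the
# consistency concern raised above: the client owns the model choice.
query_vec = next(model.embed(["what did qdrant add?"]))
hit = client.search(collection_name="docs", query_vector=query_vec.tolist(), limit=1)[0]
print(hit.payload["text"])  # original chunk, ready for the prompt
```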
In any case, fascinating technology. Thanks for putting it together and making it this accessible.
[0]: https://qdrant.tech/documentation/quick-start/#run-a-query
[1]: `sentence-transformers/all-MiniLM-L6-v2`, following https://qdrant.tech/documentation/tutorials/neural-search-fa...
Your observations about using a vector DB for retrieval-augmented generation are consistent with my own.
For my applications, I use pgvector since I can also use fulltext indexes and JOINs with the rest of my business logic which is stored in a postgres database. This also makes it easier to implement hybrid search, where the fulltext results and semantic search results are combined and reranked.
I think the main selling point for standalone vector databases is scale, i.e., when you have a single "corpus" of over 10^7 chunks and embedding vectors that needs to serve hundreds of requests per second. For my application, the overhead of maintaining a separate database that has to be kept in sync with the primary database did not make sense.
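For what it's worth, a minimal sketch of one common way to combine and rerank the two result lists, reciprocal rank fusion (RRF); the doc ids and the k=60 constant are illustrative:

```python
# Reciprocal rank fusion: each result list contributes 1/(k + rank) to a
# document's fused score, so documents ranked well by both fulltext and
# semantic search rise to the top.

def rrf(result_lists, k=60):
    """Merge ranked lists of doc ids; higher fused score is better."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

fulltext_hits = ["doc3", "doc1", "doc7"]   # e.g. from a tsvector query
semantic_hits = ["doc1", "doc9", "doc3"]   # e.g. from a pgvector query

print(rrf([fulltext_hits, semantic_hits]))  # ['doc1', 'doc3', 'doc9', 'doc7']
```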
1. Yes, that's reasonable and saves running another DB
2. You often can perform the embedding in the DB, but there are a lot of use cases where you want to manage your embedding models outside the DB. This way you aren't dependent on which models the DB supports and you don't duplicate them throughout your system
You can have a look at this sheet: https://docs.google.com/spreadsheets/d/170HErOyOkLDjQfy3TJ6a...
It shows which Vector DBs have a particular feature. "In-built Text Embeddings creation" is a column you can look at.
Qdrant is the vectordb that ChatGPT and Grok use (e.g. when you add docs to a custom GPT or tweets in Grok)
Interesting they both do.
Does Qdrant look like a winning horse then?
Was about to use Weaviate for a project today and this gives me pause. Anyone have some strong opinions? pg_vector has also been on my radar recently. Qdrant vs Weaviate, I know, is partially a Rust vs Go topic.
As another signal, check out this report by Streamlit, which shows the popularity of different vector databases among Streamlit apps: https://state-of-llm.streamlit.app/#third
Faiss and Pinecone are at the top (disclosure: I'm from Pinecone). But Faiss isn't really a full-fledged vector DB. Pinecone is a managed option which is out of the question for a company like Twitter and maybe for you (although you should consider it). After that comes Chroma in third, and then Qdrant, and then Weaviate.
Chroma has a big following by virtue of being plugged into the AI ecosystem in SF. Qdrant seems to be doing great work but their location in Europe is probably not helping.
Regarding your last sentence: the European HQ might not exactly help with non-EU customers, but it helps much more with actual EU customers (a multi-billion-dollar market by itself). Sensitive EU companies would not use Pinecone, even if they wanted to.
We have been working this year to increase our US presence, and we're hiring now: https://join.com/companies/qdrant/9929148-cloud-platform-dev...
Source: I work at Qdrant from the US :)
The local API mode is a very nice feature compared to others. Having an almost sqlite3-style local DB that then works the same in server mode is very handy.
I will be trying it instead of Weaviate this week. I'm also slightly confused about embedding vector generation vs other DBs that have it built in. Just need to read more, I guess.
Interestingly, the effect of location is also visible when you compare LMQL and Guidance for constrained generation. LMQL seems to be a great option but has received much less attention than Guidance, maybe partly due to the fact that it comes from Europe, not the US.
Disagree on location as a determining factor for great technology. You're citing stats about market adoption driven by marketing, not the quality of the technology.
If Twitter chose to use Qdrant for Grok, it doesn’t matter that Qdrant is out of Berlin.
What matters is that Qdrant is the most performant, and it’s an open-source vectordb, not a closed-source vectordb like Pinecone.
We've been using qdrant in production for over a year. It's excellent and the team are very responsive to the few issues we've had. Qdrant does one job and scales well.
(disclosure: I work at Weaviate) I think it depends on your use-case. From what I've seen, Weaviate and Qdrant have similar offerings in terms of features, open-sourceness, flexible deployment, and integrations with other services. Weaviate does make certain things easier to set up, like built-in hybrid search, modules for integrations, vectorization, etc. (so it's just a one-line config change), and it's all custom-built from the ground up. But Qdrant gets a lot of support from the Rust community and has a slightly more flexible free cloud tier.
Also, don't believe everything posted on the internet ;)
I've been using pgvector, and it has worked as expected, which is all I want. Personally, my choice was based entirely on the fact that I already use postgres, and that it will still probably exist in its current form after the dust settles on the vector DB market.
I'm interested in building a locally run app. Is qdrant appropriate for that? Is it like SQLite, where there is little overhead for doing a serverless implementation?
If you will be the only app user, then the Python SDK's local mode might be suitable. However, in the long run, when you decide to publish the app, you will likely have to switch to an on-premise or cloud environment. Using Qdrant from the very beginning might be a good idea, as the interfaces are kept the same and the switch is seamless.
Local mode: https://github.com/qdrant/qdrant-client#local-mode
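A tiny sketch of what that looks like in practice, assuming the current `qdrant-client` constructor options; the path and URL below are placeholders:

```python
# The client constructor is the only line that changes between the
# embedded, sqlite-like local mode and a real server, so the rest of
# the code stays identical when you switch.

from qdrant_client import QdrantClient

client = QdrantClient(path="./qdrant_data")          # embedded, persisted on disk
# client = QdrantClient(":memory:")                  # embedded, ephemeral
# client = QdrantClient(url="http://localhost:6333") # same API, real server

print(client.get_collections())
```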
I was in that spot a few weeks ago. My requirements were not huge, but a) I was on Windows and b) I didn't want to waste too much time setting things up.
Tried a few DBs that didn't work well (e.g. I think it was ChromaDB that didn't support Python 3.12) and ended up picking LanceDB.
Very simple onboarding (it's just built on the parquet-like Lance columnar format) but there are a few rough edges.
Curious how it compares with qdrant for non-crazy problems
I'm unsure if there is any comparison of LanceDB and Qdrant available out there, but there shouldn't be any issues with Python 3.12 and qdrant-client compatibility. Windows is also not a problem, as the typical local setup is usually based on Docker. Are there any specific features you are interested in?
Would also be curious. Wondering what the state of the art is for local vector stores. i.e. the sqlite of vector stores.
I'm a big fan of Qdrant. I've also heard rumors that OpenAI uses Qdrant as their vector database of choice.
I’ve been building a Hasura Data Connector for Qdrant and it’s been too much fun. Glad to see them getting talked about here.
Qdrant is a great vector DB ...with the strangest hero image on their release announcement. A robot crab with a galleon in the background??
In Rust we trust. I think the whole thing is built with Rust, hence the crab references all over the place (the Rust mascot, Ferris, is a crab).
Any suggestions on what one should read/watch to understand the difference between this and a relational DB?
https://qdrant.tech/articles/dedicated-service/ - we lay out some arguments on this there