Hybrid search (BM25/vectors/RRF) barely improved over pure semantic

1 point by pjmalandrino 2 months ago · 0 comments · 1 min read


My setup: ~600 technical docs (50 pages avg, lots of schemas/diagrams), chunked and embedded with BGE-M3, PgVector as vector DB.

Semantic retrieval was OK but not great on our technical docs. I read everywhere that hybrid search with RRF was supposed to be the next level, so I implemented it: BM25 + vector search, fused with RRF.
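
For readers unfamiliar with RRF (Reciprocal Rank Fusion), here is a minimal sketch of the fusion step. It is illustrative only and not the actual pipeline code: the function name `rrf_fuse`, the doc ids, and the constant `k=60` (a commonly used default) are all assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc ids with Reciprocal Rank Fusion.

    Each doc gets 1 / (k + rank) from every list it appears in (rank starts at 1);
    the contributions are summed and docs are sorted by total score, highest first.
    k=60 is an assumed default, not a value taken from the post.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by doc id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 results from each retriever for the same query.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([bm25_hits, vector_hits])
print(fused)
```

Note that RRF only looks at ranks, not raw scores, so when the BM25 list and the vector list largely agree, the fused ordering will be close to either input.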

Result: almost no improvement. Like, negligible.

Am I missing something obvious? Is hybrid overhyped on technical docs with lots of schemas/tables or is my setup just broken?
