Show HN: Beating SOTA embeddings on DeepMind's LIMIT benchmark (94% vs. 18%)
DeepMind's recent paper "On the Theoretical Limitations of Embedding-Based Retrieval" identified a capacity bottleneck in dense embeddings, showing that even SOTA models like E5-Mistral and GritLM struggle on their LIMIT benchmark (scoring ~8-18% Recall@100).
I hypothesized that this isn't a retrieval limit, but a compression limit.
I built Numen, a retrieval engine based on high-dimensional sparse-dense n-gram hashing (32k dimensions) rather than learned embeddings.
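To give a feel for the idea, here is a minimal sketch of hashing character n-grams into a fixed 32k-dimensional sparse vector. The tokenization, n-gram size, and weighting below are my own illustrative assumptions, not Numen's actual implementation (see the repo for that):

    # Sketch: map text to a sparse {dimension: weight} vector via hashed character n-grams.
    # N-gram size and raw-count weighting are assumptions for illustration only.
    import hashlib
    from collections import Counter

    DIM = 32_768  # 32k hash buckets

    def ngram_hash_vector(text: str, n: int = 3) -> dict[int, float]:
        text = text.lower()
        grams = [text[i:i + n] for i in range(len(text) - n + 1)]
        vec: dict[int, float] = {}
        for gram, count in Counter(grams).items():
            # Stable hash of the n-gram -> bucket index in [0, DIM)
            idx = int(hashlib.md5(gram.encode()).hexdigest(), 16) % DIM
            vec[idx] = vec.get(idx, 0.0) + float(count)
        return vec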
Results on the LIMIT test set (Recall@100):

- BM25 (baseline): 93.6%
- E5-Mistral: 8.3%
- GritLM 7B: 12.9%
- Numen (my implementation): 93.9%

Numen beats BM25 while keeping a vector architecture, sidestepping the geometric bottleneck of dense models entirely.
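To make the "vector architecture" point concrete, here is a hedged sketch of brute-force retrieval over the hashed sparse vectors from the snippet above (reusing ngram_hash_vector). A real index would use an inverted index or sparse matrix ops rather than this O(N) loop:

    # Sketch: score every document by cosine similarity against the query vector.
    # Brute-force for clarity; not how Numen indexes documents.
    import math

    def cosine(a: dict[int, float], b: dict[int, float]) -> float:
        dot = sum(w * b.get(i, 0.0) for i, w in a.items())
        norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
        return dot / norm if norm else 0.0

    def search(query: str, docs: list[str], k: int = 100) -> list[tuple[int, float]]:
        q = ngram_hash_vector(query)
        scored = [(i, cosine(q, ngram_hash_vector(d))) for i, d in enumerate(docs)]
        return sorted(scored, key=lambda x: x[1], reverse=True)[:k]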
The benchmark script (numen.ipynb) is in the repo for reproduction.