Show HN: Multimodal Benchmarks

github.com

2 points by Beefin 14 days ago · 0 comments · 1 min read

I built a set of open multimodal retrieval benchmarks because existing IR evals are still mostly text-only and don’t capture real-world complexity.

This repo includes ground-truth datasets, queries, and relevance judgments for 3 hard domains:

• Financial documents (SEC filings with tables, charts, footnotes)
• Medical device IFUs (diagrams, nested sections, regulatory language)
• Educational videos (temporal alignment, code + lecture context)

Runs in ~1 second with demo data. Leaderboards + evaluator included. Contributions welcome.
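
For readers unfamiliar with how queries and relevance judgments turn into a leaderboard score, here is a minimal sketch of a retrieval evaluator. It is not the repo's actual evaluator: the data layout (`qrels`, `run`), the document IDs, and the choice of Recall@k and nDCG@k as metrics are all assumptions made for illustration.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def ndcg_at_k(ranked_ids, gains, k):
    """nDCG@k using graded gains (doc_id -> relevance grade, linear gain)."""
    dcg = sum(
        gains.get(doc_id, 0) / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 2) for rank, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical relevance judgments: query id -> {doc id -> grade}.
qrels = {"q1": {"doc_a": 2, "doc_b": 1}}
# Hypothetical system output: query id -> ranked list of doc ids.
run = {"q1": ["doc_c", "doc_a", "doc_b"]}

for query_id, ranked in run.items():
    gains = qrels[query_id]
    print(
        query_id,
        "recall@2 =", round(recall_at_k(ranked, set(gains), 2), 3),
        "ndcg@3 =", round(ndcg_at_k(ranked, gains, 3), 3),
    )
```

In a multimodal setting the "documents" could be table cells, diagram regions, or video segments; as long as each has a stable ID in the judgments, the scoring logic stays the same.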

