PaperQA2 tops the RAG-QA Arena science benchmark

futurehouse.org

1 point by mskar 10 months ago · 1 comment

mskar (OP) 10 months ago

We measured PaperQA2 (https://github.com/Future-House/paper-qa) against the science portion of the RAG-QA Arena benchmark (https://arxiv.org/abs/2407.13998). It's the first time we've compared PaperQA2 against other systems based on Cohere or Contextual.ai. PaperQA2 scores 12.4% higher than Contextual.ai on the same dataset (1,404 questions and 1.7M documents).

We're thrilled about this because PaperQA2 is open source and getting better every day. Check out the code to reproduce this result in our cookbook here: https://futurehouse.gitbook.io/futurehouse-cookbook/paperqa/....
