Autoselect the best AI model for any health question using HealthBench scores

twitter.com

2 points by shipilovya 7 months ago · 1 comment

shipilovya (OP) 7 months ago

Last week OpenAI released HealthBench, the most comprehensive set of evals for health to date. The top 3 scoring models all spiked on different things:

- GPT-4.1 is best when you need a straight answer
- o3 is best for complex cases
- Grok is best at clarifying important info (“truthseeking”)
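A minimal sketch of how the selection could work, assuming the question has already been classified into one of three categories. The category names, `BEST_MODEL`, `pick_model`, and the fallback choice are all hypothetical; none of them come from HealthBench or from the prototype itself.

```python
# Hypothetical routing table built from the three strengths listed above.
# The category names are illustrative, not HealthBench's own labels.
BEST_MODEL = {
    "straight_answer": "GPT-4.1",
    "complex_case": "o3",
    "clarification": "Grok",
}


def pick_model(category: str, default: str = "GPT-4.1") -> str:
    """Return the model mapped to a category, or `default` if unknown.

    The default is an arbitrary assumption made for this sketch.
    """
    return BEST_MODEL.get(category, default)


if __name__ == "__main__":
    print(pick_model("complex_case"))
```

A real version would presumably replace the hand-written table with per-category HealthBench scores and pick the highest-scoring model for each category.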

I made this prototype mostly to understand HealthBench more deeply. I will probably use it in future products I build.
