Sam Altman's blind spot on AI model power

vibesbench.substack.com

3 points by firasd 2 months ago · 1 comment

firasd (OP) 2 months ago

"Model power" is being conflated with benchmark and reasoning scores.

Some models do well on academic or coding benchmarks but have poor world knowledge and conversational fluency. Standard evals don't surface that gap, but it matters a lot in actual use.
