HWE Bench: A new unbounded Benchmark for LLMs (GPT 5.5 is on top)

6 points by fesens 2 months ago · 3 comments

fesens (OP) 2 months ago

Current benchmarks have ceilings, usually 100%. This benchmark aims to be long-lasting, highly correlated with the ability to solve real-world problems and follow complex instructions, and unbounded (meaning scores can always go higher).

paulobeckhauser 2 months ago

Very nice!!

fabiofachini92 2 months ago

Amazing!
