Ask HN: AI progress – what are the main ways people measure it?
I’m interested in how AI progress is currently evaluated, and I’m trying to build a list of the major approaches people actually use.
I’m aware that all of these measures have limitations and that many are controversial or imperfect by design. I’m not assuming they’re “good” or that they cleanly map to real-world capability.
I’d love to hear:
- What measures, benchmarks, or methodologies you think belong on this list
- What you see as their key strengths and failure modes
- How (or whether) you personally use them to interpret AI progress
My goal here is discovery and understanding, not to defend or attack any particular framework.