Ask HN: What are the main measures of AI progress?
I’m interested in how AI progress is currently evaluated and trying to build a list of the major approaches people actually use.
I’m aware that all of these measures have limitations and that many are controversial or imperfect. My goal is discovery and understanding, not to defend or attack any particular framework.
I’d love to hear:
- What measures, benchmarks, or methodologies you think belong on this list
- What you see as their key strengths and failure modes
- How (or whether) you personally use them to interpret AI progress

There'd first have to be an intense evaluation and standardization process for measuring AI / AGI. All current benchmarks are either tailored to one use case (e.g. SWE) or are evaluations that can be gamed and manipulated. I think this would take the form of something more abstract rather than concrete raw numbers, like a revised Turing Test.

Yeah. I think the Turing Test has passed its sell-by date, as all things inevitably do. I'd be interested in what the "revised Turing Test" you propose would look like. I'm not smart enough to know what that would be, but it would be an interesting starting point.

It's a great question that I haven't seen discussed on HN yet (though I'm not that active); I think this crowd is still more attuned to interesting but deterministic technical problems. This might sound basic, but I keep coming back to this idea again and again. Alex Garland really did have the right idea with Ex Machina, where Nathan claims he purposely designed Ava (the AI robot) with all her internal mechanisms visible, so people would always understand they were interacting with a machine. The point of his Turing test was to show whether they could see past the machine and still empathize with it as a human.