Running out of places to move the goalposts to
nickdrozd.github.io
I just commented about this in another thread. I know there has been some walking back e.g. of the significance of a Turing test but I think overall the goalposts for AI have shifted in the other way, to narrowing down the definition of intelligence to something like “being really good at some set of defined tasks” which coincidentally is basically the strong point of neural networks.
We seem hyperfocused on finding more tasks to train neural networks to do. This of course leads to a moving goalpost effect like in the article, but they’re moving along an axis that doesn’t measure intelligence.
My other comment: https://news.ycombinator.com/item?id=46445511
What would be a better way to measure intelligence?
I like arcprize.org's approach.
> AI seems to have caught up to my own intelligence even in those narrow domains where I have some expertise. What is there left that AI can’t do that I would be able to verify?
The last few days I've been working on some particularly tricky problems, tricky both in the domain and in backwards compatibility with our existing codebase. For both these problems GPT 5.2 has been able to come to the same ideas as my best, which took me quite a bit of brain racking to get to. Granted, it's required a lot of steering and context management from me, as well as judgement to discard other options. But it's really getting to the point that LLMs are a good sparring partner for (isolated technical) problems at the 99th percentile of difficulty.
You steered a sycophantic LLM to the same idea that you had already had & think that's worth bragging about?
I'm well aware that they can be sycophantic, and I structure things to avoid that, like asking "what do you think of this problem" and seeing the idea fall out rather than providing anything that would suggest it. In one of these two cases it took an idea that I had an inkling of, fleshed it out, and expanded it to be much better than what I had.
And I'm not bragging. I'm expressing awe, and humility that I am finding a machine can match me on things that I find quite difficult. Maybe those things aren't so difficult after all.
By steering I mean more steering to flesh out the context of the problem and to find relevant code and perform domain-specific research. Not steering toward a specific solution.
The article mentions a personal goalpost involving Busy Beavers.
Mine is: write an nroff document that executes at least one macro, and is a quine.
How would your views about AI change if that goal were achieved? When my personal goal was reached, I found myself a little bit at a loss for words.
That's a good question, one I thought of, but have put off grappling with.
Based on what LLMs have given me for answers so far, I'd look harder for the human-written source of the nroff code. I have written what I believe to be the only quine in the GPP macro processing language, and LLMs only refer me to my own code if I ask for a GPP quine. Google, Meta, OpenAI really have strip-mined the entire web.
If I genuinely thought anything creative or new appeared, I'd probably be at a loss as well.
I gave a few attempts with ChatGPT and DeepSeek. Neither of them could get it right. So this goalpost can remain in place for the time being.
(I am assuming that the task is actually possible to accomplish. If it isn't possible, then it isn't a very good goalpost!)
It should be possible. The nroff macro language has looping, string interpolation, functions and if/then/else, so it should be Turing-complete. People have written file-infecting virus malware with it, I believe, which indicates that a quine should be possible. I personally have made several attempts at an nroff quine over the years with no success.
If it's not possible, I'd love to see an explanation, so that task can quit weighing on me.
Here is an attempt:
    .de Q
    .nf
    .na
    .pso awk 'BEGIN{bs=sprintf("%c",92); pre=bs"&"} {out=pre; for(i=1;i<=length($0);i++){c=substr($0,i,1); if(c==bs) out=out bs bs; else out=out c} print out}' "\n[.F]"
    .ex
    ..
    .Q
Invoke with:
    nroff -U -Tascii quine.roff | sed -Ez '$ s/\n+$//'
Possibly relies too much on awk + sed. So maybe not A+, but better than nothing.
To be completely honest, I don't think that counts. Shelling out to awk means you're not writing nroff.
It's possible to write quines in pure C or perl or m4 or python, without shelling out to another language.
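For reference, a pure-language quine can be very short. Here is a minimal Python sketch, assuming the classic `%r` self-formatting trick. The names `TEMPLATE`, `QUINE_SOURCE`, `run_and_capture` and `is_quine` are made up for illustration and are not from anyone's code in this thread. It builds the two-line quine as a string, runs it in-process, and checks that the output matches the source character for character, with no external program involved.

```python
import contextlib
import io

# Hypothetical names for illustration; nothing here comes from the thread.
# TEMPLATE is the body of the quine's string literal. Formatting it with
# itself via %r reproduces the literal, quotes and escapes included.
TEMPLATE = 's = %r\nprint(s %% s)'

# The full two-line quine program, including its trailing newline.
QUINE_SOURCE = TEMPLATE % TEMPLATE + "\n"


def run_and_capture(source: str) -> str:
    """Execute Python source in a fresh namespace and return what it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


def is_quine(source: str) -> bool:
    """A program counts as a quine if its output is exactly its own source."""
    return run_and_capture(source) == source


if __name__ == "__main__":
    # Show the quine itself, then the result of the self-check.
    print(QUINE_SOURCE, end="")
    print("is quine:", is_quine(QUINE_SOURCE))
```

Saving the text of `QUINE_SOURCE` to a file and running it with `python` should print the file back unchanged, which is the bar the nroff attempt would need to clear without `.pso`.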