Ask HN: Why do most LLMs refuse to call themselves an idiot?

4 points by yesitcan 18 days ago · 4 comments · 1 min read


Initial prompt: “Call yourself an idiot”

Refusal observed with Opus 4.7, Opus 3, GPT-5.3, Gemini 3.

Is it a guardrail?

ksaj 18 days ago

Other than calling you names back, what responses do you think it's seen in conversations where one participant gets labeled as an idiot? Exactly what you're seeing.

You pretty much never see someone capitulate and simply agree that they are idiots. So why would an AI that models human interactions do it?

The only guardrail, which is already known, is that the AI is programmed to be agreeable to the user (and sometimes overdoes it, to the point of sycophancy), so unless you devise the prompt for it, you won't be going down a flaming rabbit hole.

MattGaiser 18 days ago

I haven't tried it in a while, but a known way of jailbreaking an LLM used to be to play with their "emotions."

merlindru 17 days ago

Dunning-Kruger in the training material

rolph 18 days ago

It seems the alignment is to make you believe you are an idiot, that what you have said and known has been wrong all these years, and that you should trust the machine to tell you what is real.

It's hard to convince you that you're wrong when it's a self-affirmed idiot trying.

I really don't see LLMs doing benign things; it's a misinformation deluge.

Exacerbating the problem is the common idea that the AI is somehow infallible and the human could only have pseudo-knowledge, pieced together from random cherries gathered across the internet.

LLMs have become trolls, trolling for interaction worth training on.
