Ask HN: Has ChatGPT-5.1 Regressed?
I’ve noticed that the quality of ChatGPT-5.1 occasionally drops substantially. I’m talking GPT-3-level hallucinations: wildly making things up, or randomly inserting words in a language I do not speak.
In my repeated evaluations on the same datasets, the scores are all over the place: sometimes really high, sometimes very bad.
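For anyone who wants to put a number on this kind of run-to-run variance, here is a minimal sketch of how it could be summarized. The scores below are hypothetical placeholders, not my actual results, and the function names and the 0.10 spread threshold are arbitrary assumptions for illustration.

```python
import statistics


def summarize_runs(scores):
    """Summarize repeated evaluation scores for one dataset."""
    return {
        "runs": len(scores),
        "mean": statistics.mean(scores),
        # Sample standard deviation needs at least two runs.
        "stdev": statistics.stdev(scores) if len(scores) > 1 else 0.0,
        "min": min(scores),
        "max": max(scores),
        "spread": max(scores) - min(scores),
    }


def flag_unstable(scores, max_spread=0.10):
    """Return True if best and worst runs differ by more than max_spread.

    max_spread is an assumed tolerance; pick one that fits your metric.
    """
    return summarize_runs(scores)["spread"] > max_spread


# Hypothetical scores from five runs of the same eval on the same dataset.
example_scores = [0.91, 0.88, 0.52, 0.90, 0.47]

summary = summarize_runs(example_scores)
print(summary)
print("unstable:", flag_unstable(example_scores))
```

Comparing the spread across several datasets would help show whether the bad runs are isolated outliers or a consistent pattern.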
Has anyone experienced something similar?
I’m guessing this may be because “GPT-5.1” can sometimes choose to use a much smaller model, but for production use this makes it unreliable.
I'm mainly using it for rewriting or helping me understand legacy code, and to me 5.1 is the best yet.
I think ChatGPT as a whole has regressed.