Ask HN: What is "response-level error rate" and how is it measured?

2 points by myyke 6 months ago · 0 comments · 1 min read


There's this chart of GPT-5's hallucination and error rates:

https://api.wandb.ai/files/byyoung3/images/projects/37269171/0da61431.png

from:

https://wandb.ai/byyoung3/ml-news/reports/GPT-5-Benchmark-Scores---VmlldzoxMzkwMTYyMg

I'm wondering what "response-level error rate" is exactly and how it is measured.

GPT-4.1 says it's measured on sampled production prompts, rated by humans. Is that it?

No comments yet.
