Why AI systems improve while drifting away from reality [pdf]
github.comMost alignment discussions focus on behavior. This looks at a structural failure mode where optimization replaces reality with proxies.
Most alignment discussions focus on behavior. This looks at a structural failure mode where optimization replaces reality with proxies.