Frontier Models are Capable of In-context Scheming

arxiv.org

10 points by trott a year ago · 1 comment

abrichr a year ago

https://arxiv.org/abs/2412.04984

> Our findings demonstrate that frontier models now possess capabilities for basic in-context scheming [covertly pursuing misaligned goals], making the potential of AI agents to engage in scheming behavior a concrete rather than theoretical concern.
