The whirlwind 24 hours that led to export controls on Anthropic
politico.com
Here's what I believe: it is extremely likely that the jailbreak either wasn't real or was minor. There is no way Dario would have seen a real, threatening jailbreak and specifically called out that there's nothing to do about it.
This is what Dario said about the jailbreak
> He pushed back on the administration’s concerns, defended the guardrails and argued that the type of bypass that occurred, which he believed to be specific, did not pose the same risk as a broader “jailbreak” that would allow it to be used without any of the guardrails put in place by Anthropic.
If I were to bet: whatever capabilities this jailbreak unlocked, you could have gotten them from older GPT and Claude models as well. I think this whole thing was either a misunderstanding or a deliberate move by the government to put Anthropic in its place.
In the original release, Anthropic called out that they showed the government that exactly the same outcomes could be achieved with ChatGPT models. So you don't have to bet; they said it publicly.
Keep in mind that family members of the administration are heavily invested in OpenAI.
Discussions:
Amazon CEO's talks with U.S. officials triggered crackdown on Anthropic models
https://news.ycombinator.com/item?id=48519092
Statement on US government directive to suspend access to Fable 5 and Mythos 5
It’s longer than these 24 hours. Just 48 hours ago Dario Amodei wrote a long essay that argued for export controls on other parts of the AI supply chain. Anthropic has marketed Mythos as being too dangerous for months. Dario has also been fearmongering for years, like when he claimed GPT-2 was too dangerous.
With all of this, if there’s even a hint of a jailbreak, why would Anthropic a) not fix the issue, and b) be surprised at an export ban?
The doomerism, the safety theater, and the bid for regulatory capture have ironically hurt Anthropic itself. This company needs a new set of grown-ups running it.
Because there wasn't any hint of a jailbreak - it wasn't a jailbreak.