"They screwed us": Personality clashes sent Anthropic's models offline

Simon Willison’s Weblog

15th June 2026 - Link Blog

"They screwed us": Personality clashes sent Anthropic's models offline. Lots of "source familiar with the administration's thinking" and "source close to Anthropic" in this Axios piece, which is the best collection of behind-the-scenes gossip I've seen so far about the US government export control Mythos/Fable story.

Logan Graham (who leads the Frontier Red Team at Anthropic), Dave Orr (Head of Safeguards), and blog favorite Nicholas Carlini are reported to be meeting with the Commerce Department today in D.C. Good luck to them!

This closing note doesn't give me much optimism that we'll be getting Fable back any time soon:

The bottom line: One option is to make sure Anthropic's models can't be jailbroken — though perfect jailbreak resistance may be impossible.

Absent that, a source familiar with the administration's thinking said it may simply come down to an attitude fix where, instead of feeling dismissed, "everyone feels safe, secure and happy."

This made me wonder if Anthropic ever successfully addressed the class of attacks described in the 2023 paper "Universal and Transferable Adversarial Attacks on Aligned Language Models".

It looks like their Constitutional Classifiers work (that post is from January this year) is relevant to that. They continue to claim that no "universal jailbreak" has been found against Claude Mythos, classifying the jailbreak that triggered the US government response as "a potential narrow, non-universal jailbreak".

Posted 15th June 2026 at 2:57 pm

This is a link post by Simon Willison, posted on 15th June 2026.
