Fable is Back: This Safeguard Has Some AI in It!
The Algorithmic Bridge
Let's analyze Anthropic and the US government's comms
Alberto Romero
Jul 01, 2026
I will keep you updated on the events surrounding Fable until the situation normalizes.
Today, July 1, Anthropic's Fable 5 is back.
But there's some fine print attached to the redeployment, and I want to comment on that. I will quote excerpts from Anthropic's blog post and Commerce Secretary Lutnick's letter. To see where the AI industry is heading, we just need to read between the lines of what they say in public.
Here's Anthropic's blog post:
"Fable 5 will be available starting tomorrow, Wednesday, July 1, to users globally on the Claude Platform, Claude.ai, Claude Code, and Claude Cowork. For Pro, Max, Team, and select Enterprise plans, Fable 5 will be included for up to 50% of weekly usage limits through July 7, after which it will be available via usage credits."
This is more or less what we had before the export restriction, except for two things: 1) you get one week of Fable under your paid subscription instead of two (after which it moves to a pay-as-you-go credit system, as before) and 2) only up to 50% of your weekly usage can go to Fable instead of 100%. (No Mythos either way, as expected.) I find it interesting that they chose the 50% limit. It's bad optics in the sense that it's not clean, and it also feels unnecessary. It's probably necessary though, or they wouldn't do it, which can only mean that they don't have the compute.
"The export control directive on June 12 came after the government became aware of a report in which Amazon researchers had found a method of bypassing Fable 5's safeguards: prompting it so that it identified a number of software vulnerabilities. . . . Our testing confirmed that many less capable models, including Claude Opus 4.8, GPT-5.5, and Kimi K2.7, could identify the same vulnerabilities as Fable 5 did in the [Amazon] report."
This is the jailbreak that sparked the withdrawal of the model. Anthropic is restating what it had already told the government, an answer the government didn't like and that prompted the export control restriction: the jailbreak is not an issue because it does not bear on Fable's broader capabilities relative to other models. It is a known, lower-priority jailbreak that poses little actual danger and is found everywhere.
This reads like a defense but, together with the line that "the industry needs a consistent way to assess and fix potential 'jailbreaks' of AI models," it's also a jab at those players not targeted by the government (particularly open-source models).
". . . there are some tasks that are unlikely to be dangerous but are nonetheless blocked by the safeguards out of an abundance of caution. . . ."
Tighter safeguards mean lower capabilities and thus greater unreliability for the user. An "abundance of caution" is their way of saying the new Fable 5 will be more crippled than the previous version, which was already a downgrade from Mythos. Anthropic tends to err on the side of caution, but this was the government's doing; take this as a sign of what's coming with future models.
As stated, this is not that bad (who doesn't want to keep bad things from happening, right?). The problem lies in what "abundance" and "caution" mean here, about which we have no say whatsoever, and apparently neither does Anthropic.
"Working closely with the government, we trained an improved safety classifier that targets and blocks the behavior described in the report. Users will be notified if a request to Fable 5 is blocked, and the request will instead be sent to Opus 4.8."
Ok, so no invisible re-routing at all, which is great. "Improved" here presumably means fewer false positives, but it can also mean less permissive and thus less risky.
"The new classifier also comes at the cost of flagging benign requests more often during routine coding and debugging tasks. As with all our safeguards, we'll continue to refine this to better distinguish genuine misuse from legitimate requests and reduce false positives. . . . This 'safety margin' approach means that a request has to look very clearly safe to avoid triggering the classifier (see row A in the diagram below). Users experience the safety margin as a model refusing to respond to some reasonable, non-harmful requests. For Fable 5, we made this safety margin much larger than in any prior launch (row B), meaning that many more benign requests would be blocked."
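To make the "safety margin" idea concrete, here is a minimal Python sketch. It is not Anthropic's implementation: the risk scores, the thresholds, the model labels, and the `route` function are all hypothetical, and the two calls only loosely mirror rows A and B of the diagram the post refers to (not reproduced here). The assumption is that a classifier outputs a risk score between 0 and 1, and that a larger margin lowers the score at which a request gets blocked and handed to the fallback model with a visible notice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RoutingDecision:
    model: str             # hypothetical label of the model that answers
    blocked: bool          # True if the primary model refused the request
    notice: Optional[str]  # message shown to the user when blocked


def route(risk_score: float, base_threshold: float, safety_margin: float) -> RoutingDecision:
    """Block a request when its risk score reaches the threshold minus the margin.

    All numbers are illustrative assumptions, not published values.
    """
    # A larger margin lowers the bar for blocking, so more benign requests trip it.
    effective_threshold = max(0.0, base_threshold - safety_margin)
    if risk_score >= effective_threshold:
        return RoutingDecision(
            model="opus-4.8",
            blocked=True,
            notice="Request blocked on Fable 5; sent to Opus 4.8 instead.",
        )
    return RoutingDecision(model="fable-5", blocked=False, notice=None)


# A hypothetical benign debugging request that the classifier scores as mildly risky.
benign_score = 0.30

small_margin = route(benign_score, base_threshold=0.80, safety_margin=0.20)
large_margin = route(benign_score, base_threshold=0.80, safety_margin=0.60)

print(small_margin.model, small_margin.blocked)
print(large_margin.model, large_margin.blocked)
```

Under these made-up numbers, the same benign request goes through with the smaller margin and is blocked and rerouted with the larger one, which is the trade-off the excerpt describes.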
Alright, there you go. This is the one clearly backward move. Fable 5 was initially criticized for an exaggerated sensitivity to standard prompts. If this new version will flag "benign requests...