Export controls for Fable are too late to slow proliferation

Dual Use · Noah Lebovic · June 24, 2026

A couple of weeks ago, the US government effectively disabled Anthropic's most capable model due to concerns over cybersecurity risk[1]. I could see a well-intentioned motive for this kind of regulatory action: there are real systemic risks, and frontier AI developers have historically not contained this risk. Earlier this year, it took only a few minutes to bypass Anthropic's cybersecurity safeguards and hijack accounts at a major bank.

But we have already passed a tipping point for cybersecurity; even perfectly effective export controls no longer work. This is because others have already transferred the necessary capabilities out of frontier models through adversarial distillation. These distilled models have already passed a critical capability threshold, and some are better at finding security vulnerabilities than the models from which they were allegedly distilled.

Even if all access to American frontier models were disabled today, Chinese labs have what they need to stay on track. Cybersecurity is a straightforward domain for training models that lends itself well to an early form of AI self-improvement, and Chinese models like GLM 5.2 are already past the capability threshold where this works. I am confident about this because I used less capable models to find many critical vulnerabilities – including in software that gates classified networks – back in February, and in April I was able to imbue an open-weight model with additional distilled capabilities to put it on par with frontier models in a subdomain of cybersecurity.

If strong export controls had been enacted a year ago, I could see a version of this helping. But at this point, I think export controls on Fable are too late to prevent proliferation of models with substantial cybersecurity capabilities.

How this contributes to systemic risk, and why it's reasonable for the USG to be concerned

In an ideal world, the defense would use the same tools as the attackers to find and fix vulnerabilities before they're exploited. Sadly, I'm not sure this holds; only ~half of the vulnerabilities I've reported this year have been fixed.

For example, one vulnerability I reported lets anyone hijack accounts at a major bank. It still hasn't been fixed, despite their security team acknowledging the issue several months ago. When I reported it in February, only a closed frontier model with a good harness could find the issue; now, off-the-shelf open-weight models like GLM 5.2[2] and DeepSeek v4 Pro can find and exploit it. That means even impermeable safeguards or export controls on frontier models won't stop attackers, because they can just use an open-weight model instead.

The makers of these open-weight models are rumored to have distilled capabilities from Anthropic and OpenAI through their safeguards. I don't doubt this, as the shape of cybersecurity capabilities between open-weight models and Opus is extremely similar, and Anthropic flagged distillation campaigns in February.

The pattern of delayed or declined fixes applies even to companies that are participants in Anthropic's Project Glasswing and have access to Mythos. One company – whose software is used by US intelligence agencies – declined to fix a deserialization bug I reported[3] which granted system access to the underlying server. So I don't find it surprising that folks at the NSA are finding vulnerabilities in software used in classified networks; I'm nearly certain that open-weight models can too. Regardless of how you assign fault to the frontier AI labs (for building the model) or the system owners (for the vulnerabilities), these vulnerabilities are exploitable and contribute to risk.

So if you're a regulator like Bessent or Lutnick who is responsible for the stability of the economy, you've seen a stream of successful attacks by Mythos, and you're aware of distillation – which this admin clearly is – it seems reasonable to be concerned about the risk introduced by a non-universal jailbreak, which the USG cited as its reason for disabling the model.

What's the risk of a non-universal jailbreak?

The trigger for export controls was allegedly centered around one issue: non-universal jailbreaks that narrowly elicit a cybersecurity capability from the model. Discussion of retribution or unfairness aside, a single non-universal jailbreak is actually enough to extract a specific cybersecurity capability past safeguards, like finding vulnerabilities in web apps, and add it to your own model. The reason for this boils down to something unique about cybersecurity: it's one of the easiest capabilities to "distill" from a frontier model using a reinforcement learning (RL) based technique[4].

Using a single non-universal jailbreak to extract cybersecurity capabilities

Traditionally, capabilities are extracted from frontier models via distillation using a...
