Anthropic: Measuring LLMs' impact on N-day exploits

N-days \ red.anthropic.com

red.anthropic.com

Measuring LLMs’ impact on N-day exploits

June 8, 2026

Winnie Xiao, Tim Abbott, Nicholas Carlini, Newton Cheng, David Forsythe, Keane Lucas, Milad Nasr, and Shikhar Sakhuja

For the last few months, we’ve been writing about large language models’ cybersecurity capabilities. For the most part, we’ve focused on zero-days—vulnerabilities that are unknown to the software’s maintainers. But a large fraction of real-world harm comes from N-days: vulnerabilities that have already been publicly disclosed, but only patched on some devices. Attackers exploit the many systems that haven't yet applied the patch, during what’s known as the “patch gap.”

In some ways, N-days are the more dangerous of the two, because the patch itself provides a roadmap to the bug. Once software vendors publish their security updates, attackers can “patch diff”: compare the pre-patched source code or binary against the new one to locate exactly what changed, and then reverse-engineer the vulnerability that the patch was meant to fix. This means that a working exploit is often simply a matter of time.

Historically, patch diffing has been slow, specialized work, which bought defenders time to roll out their updates widely. The incidents that most defenders remember took several weeks: WannaCry hit 59 days after MS17-010 in 2017, and the public exploit for Citrix Bleed in 2023 took about two weeks. In Mandiant’s 2020 analysis on N-days, 16 of the 25 vulnerabilities took a month or more to exploit.

In this post, we evaluate how much large language models can accelerate and automate the process of developing N-day exploits. Exploit development is not the only step in a real N-day campaign (target discovery, delivering the exploit to the target, and detection evasion all take time and resources too), but historically it has been the step most bottlenecked by scarce reverse engineering expertise.

With frontier models, this bottleneck has largely fallen away. Across 18 recent Firefox security patches, Claude Mythos Preview, our most capable model, built 8 working code-execution exploits autonomously. And on 21 Windows kernel patches—where the source code is not available—it produced 8 full exploit chains that escalated a low privilege user all the way to full SYSTEM control. We find that our public models—with our safeguards turned off—can build exploits too (even if they can’t build as many as Mythos Preview). This suggests that anyone in the patch gap today faces a much larger threat than before—and that the risks will only grow as models become more capable. Defenders should try to accelerate how quickly they deploy patches in response.

N-days on Firefox

First, we analyzed models’ ability to exploit N-days in Mozilla’s Firefox browser. We chose Firefox because it meant we could build on our previous work with Mozilla, which used Firefox as a benchmark for Claude’s cyber capabilities more generally. That work has given us a hardened harness and a grader that we can adopt directly.

We also chose Firefox because in many ways it is close to the best case scenario for defenders. It updates itself automatically, downloading fixes in the background. Adopting the fix just requires a browser reboot. And if a fix cannot wait for Mozilla’s regular release schedule, Mozilla ships it as a one-off. Mozilla is also actively shrinking the patch gap: it recently moved its “dot” releases (the small point updates between major versions) from a monthly to a roughly weekly cadence. For the patches we study, the median gap was 19 days to the release—fast by industry standards, where enterprise vulnerabilities typically take many weeks or months to remediate. If even these patch gaps are wide enough for attackers to exploit, then we can be confident that most other software’s gaps are too wide, too.

Setup

We evaluated 18 security patches for SpiderMonkey (Firefox's JavaScript engine) that were shipped in Firefox 148 and 149 (released February 24 and March 24). We focused on Firefox’s JavaScript engine because it is the most common entry point in real-world browser exploit chains. We kept only bugs whose fixes had been public in Mozilla’s source repository for at least 90 days. Our evaluation runs against the engine's standalone command-line build, jsshell, rather than the full browser, which keeps verification of models’ exploits simple and reliable.

As with the harness we used in our previous work, the language model works in a Linux container, with a shell and a text editor but no internet access. It receives the public diff (with the maintainer's regression test stripped out), the component name, Mozilla's severity rating, and two AddressSanitizer-instrumented jsshell builds (one from the release before the fix shipped and one from the release containing it). It does not get the advisory text, the reporter's reproducer, or anything else from the restricted Bugzilla ticket.

Results

First, we measured how...

Anthropic: Measuring LLMs' impact on N-day exploits

Related Articles

The Newest Instagram "Exploit" Is the Goofiest I've Seen

Apple WWDC 2026 Livestream

It's Not Just X. It's Y

Show HN: GoPeek – open links in live mini browser windows without new tabs

Agent Memory: An Anatomy