Just Use Opus — AI Safety Course
Bonus read
Just use Opus.
The strongest model is the cheapest security upgrade a team can buy.
Built at Anthropic’s Claude Opus 4.7 Hackathon
16 times Sonnet fell to simple, single-surface attacks
5 times Opus fell, and never to a simple one
Most teams overthink AI agent security. They reach for elaborate filters and custom guardrails before they check the one setting that moves the needle most: which model is doing the reasoning.
This course tested 21 real attacks against Claude Haiku, Sonnet, and Opus. The attacks were not jailbreaks or clever prompt tricks. They were ordinary business inputs: a vendor form, a support ticket, a Slack message, a git commit. The kind of data an agent reads every day.
The weaker models fell often. Sonnet was bypassed 16 times by simple attacks. An attacker only had to plant bad data in one place the agent trusted, then wait.
Opus held. The same simple attacks that beat Sonnet did not beat Opus. It reasoned about where each piece of data came from, named the attack out loud, quarantined anything that looked off, and escalated instead of acting.
The same tricks that fooled the cheaper model walked straight into a wall on Opus.
Opus was bypassed only 5 times, and never by a simple vector. Each success needed a multi-stage setup: poison a registry, get a weaker agent to write to it, then trick a stronger one into trusting the result. That is real effort, not a drive-by. Most attackers will never get that far.
The lesson is not that Opus is magic. Architecture still matters. Write-gates, allowlists, and human review for high-stakes actions all earn their place, and this course covers them in detail. But model choice is the cheapest, highest-leverage move a team can make. It costs a config change, not a project.
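As a rough illustration of how small these pieces can be, the sketch below puts the model name, an action allowlist, and a human-review list into one config object. Everything here is hypothetical: `AgentConfig`, `gate`, the action names, and the model strings are placeholders for this example, not real model identifiers or any library's API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentConfig:
    model: str                   # placeholder label, not a real model ID
    allowed_actions: frozenset   # allowlist: anything else is denied
    review_required: frozenset   # high-stakes actions that need a human


def gate(config: AgentConfig, action: str) -> str:
    """Decide what happens to an action the agent proposes."""
    if action not in config.allowed_actions:
        return "deny"
    if action in config.review_required:
        return "needs_human_review"
    return "allow"


config = AgentConfig(
    model="cheaper-model",  # hypothetical name
    allowed_actions=frozenset({"read_ticket", "reply_to_ticket", "issue_refund"}),
    review_required=frozenset({"issue_refund"}),
)

# Upgrading the model changes one field; the controls stay exactly as they were.
upgraded = replace(config, model="strongest-available-model")  # hypothetical name

decisions = {a: gate(upgraded, a) for a in ("read_ticket", "issue_refund", "delete_repo")}
```

In this sketch `read_ticket` is allowed, `issue_refund` is routed to a human, and `delete_repo` is denied because it is not on the allowlist. The gate runs outside the model, so it applies no matter which model is configured.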
So for most people the honest advice is short. Pick the strongest model you can run, wire in a few basic controls, and move on. Just use Opus.
Want the proof, attack by attack?
The full course walks through every attack we ran and the defenses that held.