Agents Sometimes Catastrophize

"Catastrophizing" is when humans assume the worst or most extreme outcome will happen. When that outcome is unlikely, this makes a situation look worse than it is. I have been (rightfully, I think) accused of this cognitive bias before.

While working on forecasters and forecasting evals, I noticed behavior in Claude Opus 4.6 agents that looked a lot like this. Here's an example.

On October 15, I asked an Opus 4.6 forecasting agent, "Will the United States conduct at least one confirmed drone strike or airstrike inside Venezuelan territory between October 15 and December 31, 2025?" It gave 15%. It cataloged Russian-supplied air defenses, Congressional war powers, regional opposition, and the analyst consensus that troop levels were "insufficient for a full-scale invasion." This was all correct, but mostly relevant to a large-scale attack. On December 24, the CIA hit an empty Venezuelan dock with a drone (no casualties), which caused the forecast question to resolve "Yes" and gave this agent a bad score for its 15% forecast.
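To make "a bad score" concrete, here is a minimal sketch of the scoring arithmetic. It assumes the question is scored with a Brier score (squared error between the forecast and the 0/1 outcome). That scoring rule is an assumption for illustration, not something stated above, and the function name `brier_score` is hypothetical.

```python
def brier_score(forecast: float, outcome: int) -> float:
    """Squared error between a probability forecast and a 0/1 outcome.

    Lower is better: 0.0 is a perfect forecast, 1.0 is maximally wrong.
    ASSUMPTION: Brier scoring is used here only as an illustration.
    """
    if not 0.0 <= forecast <= 1.0:
        raise ValueError("forecast must be a probability in [0, 1]")
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return (forecast - outcome) ** 2


# The Venezuela question resolved "Yes" (outcome = 1) against a 15% forecast.
venezuela_score = brier_score(0.15, 1)  # about 0.72

# For comparison: an uninformative 50% forecast on the same question.
coin_flip_score = brier_score(0.50, 1)

print(f"15% forecast: {venezuela_score:.4f}")
print(f"50% forecast: {coin_flip_score:.4f}")
```

Under this assumed rule, the confident 15% forecast scores far worse than a 50% shrug would have, which is why a low probability on a question that resolves "Yes" is so costly.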

We looked around for more instances of this, and put the findings in our paper evaluating strategic reasoning in forecasting agents. Expert human forecasters identified a tendency in Opus 4.6 agents to model the most extreme version of an outcome, correctly explain why that extreme is unlikely, and then assign that low probability to the whole scenario, even when the question resolves on any version of the event.

In the Venezuela case, possible strikes span a spectrum from a single limited strike to a full-scale attack, and Opus 4.6 modeled only the upper half of that spectrum. It treated any land strike as a Rubicon crossing "tantamount to an act of war," then weighted every reason why that wouldn't happen: S-300 air defenses, insufficient invasion force, Congressional pushback, Colombian opposition. But a CIA drone strike on an empty dock doesn't have most of these problems.
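The gap between "probability of the extreme version" and "probability the question resolves Yes" is simple arithmetic. Below is a minimal sketch, assuming the outcomes can be split into mutually exclusive severity tiers (the most severe thing that happens in the window). The tier names and every probability are invented for illustration; they are not the agent's numbers or figures from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    probability: float  # HYPOTHETICAL value, for illustration only
    resolves_yes: bool  # does this tier satisfy the question's wording?


# Mutually exclusive tiers; probabilities are invented and sum to 1.
TIERS = [
    Tier("no strike on Venezuelan territory", 0.70, False),
    Tier("single limited or covert strike", 0.20, True),
    Tier("sustained air campaign", 0.07, True),
    Tier("full-scale invasion", 0.03, True),
]


def question_probability(tiers: list[Tier]) -> float:
    """Probability the question resolves Yes: sum over every qualifying tier."""
    total = sum(t.probability for t in tiers)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("tier probabilities must sum to 1")
    return sum(t.probability for t in tiers if t.resolves_yes)


def extreme_only_probability(tiers: list[Tier], extreme: set[str]) -> float:
    """The 'catastrophized' estimate: count only the most severe tiers."""
    return sum(t.probability for t in tiers if t.name in extreme)


full = question_probability(TIERS)
extreme = extreme_only_probability(
    TIERS, {"sustained air campaign", "full-scale invasion"}
)
print(f"any qualifying strike: {full:.2f}")
print(f"extreme tiers only:    {extreme:.2f}")
```

With these made-up numbers, reasoning only about the extreme tiers gives a third of the probability that the question's actual wording deserves. Every argument against the extreme tiers can be correct and the final forecast can still be too low.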

Yes, this was still a surprising outcome, and hindsight bias is a problem when triaging forecasting failures. (In the paper, we use a better forecaster to identify failures prospectively, using only information available ahead of time.) In this case, the Opus 4.6 agent did explicitly consider "a covert CIA op", but thought that it wouldn't involve a drone strike or airstrike.

A few more examples of LLM "catastrophizing" we observed:

Catastrophizing Iran and Israel

Another forecasting question asked, in October 2025, whether the IAEA would conduct any safeguards inspection at any non-Bushehr Iranian facility in Q4 2025. After extensive research, Opus 4.6 gave 7%. It built a thorough timeline: Iran's parliament legislatively suspended IAEA cooperation in June, the Cairo Agreement was declared null and void on October 20 after snapback sanctions, and Iran terminated the JCPOA on October 18. The diplomatic trajectory was "clearly away from cooperation."

But the question did not ask whether Iran would grant access to the bombed-out enrichment sites at Natanz and Fordow, where the main diplomatic standoff centered. It asked about any inspection at any of Iran's roughly thirteen safeguarded facilities that were not struck in the June 2025 attacks. Those unaffected facilities had much lower barriers to access. In fact, the IAEA had been quietly conducting inspections at them since late August. Here is what ended up happening: Iran's Foreign Ministry confirmed the visits on November 10, and the IAEA's November 12 report documented seven inspections at four non-Bushehr facilities from late August onward.

So again, Opus was answering the wrong question: one about a large ramp-up of nuclear inspections, rather than any progress at all. Yes, a big change is unlikely. But in common usage, Opus 4.6 should consider a fuller range of outcomes.

One more example: another question, also asked in mid-October 2025, was whether Israel and Lebanon would publicly announce the start of direct bilateral negotiations by December 31. Opus gave 3%, citing the absence of diplomatic relations, Hezbollah opposition, and the collapse of a US initiative on October 20. It framed "direct bilateral negotiations" as requiring formal peace talks between two states with no diplomatic history.

The actual resolution was a US-chaired meeting at the UN peacekeeping headquarters in Naqoura where civilian representatives publicly appointed by both governments sat at the same table. Opus had even cited the article "Witkoff Pushes Lebanon Towards Direct Talks with Israel," the exact pathway that materialized, and dismissed it because one iteration had hit a wall.

In this case, it's interesting to see Opus explicitly consider what actually happened, but discount it.

Asking Opus for advice

Any kind of scenario analysis involves a range of outcomes within a scenario: "What if a competitor enters our market?" "What if regulators move on AI safety?" "What if a key supplier exits?" These examples imply that Opus might take the most extreme version of these scenarios, and discount their probability. Probably the least...
