Cisco used AI to write security incident reports, with mixed results



You’ll need a lot of detailed prompts to get solid output - and even then it may have errors and typos

Simon Sharwood


APAC Editor

Published Fri 22 May 2026 // 06:38 UTC

Cisco tested AI’s ability to write an accurate report on a tabletop security incident response exercise, and found that while the tech can save time, many risks remain. The networking giant revealed its results in a Thursday blog post (https://blogs.cisco.com/security/ai-generated-reporting-lessons-learned-from-talos-incident-response) by Nate Pors, a senior incident commander in the Cisco Talos Incident Response team. Pors opened by observing that when used to generate long-form technical content, large language models can deliver “significant inaccuracies, unusual conclusions, and inconsistent writing styles.”


LLMs make those mistakes because they’re essentially fancy autocomplete systems that make educated guesses. Pors wrote that the nature of LLMs therefore sees them mess up in four ways:


Using different data for each query, which means it’s “difficult to rely on an LLM for repeatable, standardized research outcomes.”

Reaching different conclusions from the same data. “In a data breach scenario, a model might suggest a full organization-wide password reset in one instance and a targeted reset in another,” Pors wrote, and AI then “often defaults to whichever recommendation it generates first” – and may therefore give bad advice.

Because LLMs generate content token-by-token, they can create documents with different structure and formatting on each new run. “This unpredictability is problematic for professional environments where standardized layouts, such as consistent executive summaries or recommendation sections, are essential for quality control,” the Talos man observed.

AI can discard data, so its output might ignore critical information.

Talos developed several techniques to stop this sort of thing happening. One involves giving an LLM “granular, single-task instructions” that focus on “a specific, small portion of the report.” Doing so means “risk of hallucination or cross-contamination between sections is significantly reduced.” Telling an LLM which sources to use also helps. So does setting rules about the style and format of output.
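To make those three techniques concrete, here is a minimal sketch of how a prompt might be assembled for one section of a report at a time. It is not Cisco's tooling: the `SectionTask` type, the `build_prompt` function, the file names, and the style rules are all hypothetical, and no model is called.

```python
from dataclasses import dataclass, field


@dataclass
class SectionTask:
    """One small, single-purpose writing task covering one part of a report."""
    name: str          # section the model should write
    instruction: str   # the single task it should perform
    sources: list[str] = field(default_factory=list)  # the only material it may use


# Hypothetical house rules for illustration; not taken from the Talos post.
STYLE_RULES = [
    "Write in the past tense.",
    "Use at most 150 words.",
    "Do not add recommendations in this section.",
]


def build_prompt(task: SectionTask, style_rules: list[str]) -> str:
    """Assemble a prompt limited to one section, named sources, and fixed style rules."""
    if not task.sources:
        raise ValueError(f"no sources given for section {task.name!r}")
    source_lines = "\n".join(f"- {s}" for s in task.sources)
    rule_lines = "\n".join(f"- {r}" for r in style_rules)
    return (
        f"Task: {task.instruction}\n"
        f"Section: {task.name}\n"
        f"Use ONLY these sources:\n{source_lines}\n"
        f"Follow these style rules:\n{rule_lines}\n"
    )


# Hypothetical section tasks and file names.
tasks = [
    SectionTask(
        name="Executive summary",
        instruction="Summarise the exercise outcome for executives.",
        sources=["facilitator_notes.txt"],
    ),
    SectionTask(
        name="Timeline",
        instruction="List the exercise events in chronological order.",
        sources=["timeline.txt"],
    ),
]

# One prompt per section, so no section's prompt mentions another section's sources.
prompts = [build_prompt(task, STYLE_RULES) for task in tasks]
```

Each prompt would then be sent separately, and a human author would still need to check every section that comes back.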


Using those techniques, Cisco says the time required to draft an incident report based on a tabletop exercise fell by 50 percent. “A blind test of the sample report in our quality assurance process showed no noticeable drop in overall writing quality,” Pors wrote. “The peer reviewer, professional editor, and management reviewer all made complimentary comments about the report while unaware that it was AI-generated. The peer reviewer commented that the incidence of typos and grammatical errors was far lower than in the average report.”

But the Talos team also found “editing multiple sample reports within a single session resulted in cross-contamination of content from one report’s source material to another, even if the notes used to generate the first report were deleted from the project’s reference documents.” The researchers therefore recommend starting a new session, and re-entering prompts, for each new incident report. They also developed a spelling-and-grammar-checking prompt that “hallucinated numerous grammar issues … failed to identify actual issues,” had a success rate below 50 percent and “would behave inconsistently, sometimes catching issues and sometimes overlooking them.” “It is currently unsuitable for production use,” Pors concluded.
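The one-session-per-report advice can be pictured with a short sketch. The `Session` class below is a hypothetical stand-in for whatever chat client is in use. It only records the prompts it is given, and the report names are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical stand-in for one LLM chat session and its accumulated context."""
    report_id: str
    messages: list[str] = field(default_factory=list)

    def send(self, prompt: str) -> None:
        # A real client would call a model here; this sketch only records context.
        self.messages.append(prompt)


def draft_reports(reports: dict[str, list[str]]) -> dict[str, Session]:
    """Open a new session per report and re-enter that report's prompts from scratch."""
    sessions: dict[str, Session] = {}
    for report_id, prompts in reports.items():
        session = Session(report_id)  # fresh context, nothing carried over
        for prompt in prompts:
            session.send(prompt)
        sessions[report_id] = session
    return sessions


# Invented report names and prompts, for illustration only.
sessions = draft_reports({
    "exercise-a": ["Summarise the notes for exercise A."],
    "exercise-b": ["Summarise the notes for exercise B."],
})
```

Because each `Session` starts with an empty message list, nothing entered for one report is present in the context used for the next.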


Pors said Cisco concluded that its approach “could be adapted to any cybersecurity reporting use case with standardized inputs and predictable outputs,” but also warned authors must “take ownership of every word of the final report.” “While testing, we found that the LLMs generated recommendations that were duplicative, irrelevant, or not actionable,” he wrote. “If this were used in a production environment without manual checks, it could result in poor-quality recommendations in a final report.” Those problems arose when considering a tabletop exercise, a far simpler affair than analysis of an incident that involves analyzing log files from multiple systems. ®
