Fooling around with encrypted reasoning blobs

A Few Thoughts on Cryptographic Engineering

Matthew Green, in AI, attacks

May 29, 2026

This is a quick post I wanted to write about a "hobby project" I spent a weekend on. It has little to do with real cryptography, and mostly doesn’t expose a particularly exciting vulnerability. But it did teach me a lot about frontier LLM APIs and coding agents. It also got me certified as an OpenAI "cyber researcher", which is something that doesn’t happen every day.

In any case, please keep your expectations low. Who knows, perhaps someone else will find something exciting to do with this.

Where were you when you first discovered encrypted reasoning?

Last week I decided it’d be fun to set up an OpenClaw agent. I still don’t know why I did this. I have no use for another AI in my life, and I realized this fact almost immediately after setting it up. But configuring the agent to talk to Claude exposed me to something way more interesting: I got a cool error. The kind of error that cryptographers can’t resist:

[Screenshot: the error message]

This intrigued me. What in the world was a signature doing in an LLM’s "thinking" block? Why would thinking blocks be signed in the first place? And if the thinking blocks are signed, then that means tampering with thinking blocks must have security implications. And there went my weekend.

After twenty hours and about 5 million Codex tokens, I wasn’t much smarter. But I’d learned a few things.

First, the basics. You probably know that most LLM providers expose an API so you can write apps that talk to the model. For Claude, this is called the Messages API, while OpenAI calls it Responses. These APIs handle the ordinary tasks you’d expect an application to need from an LLM. They (1) let you set an application-level "instructions" (or "system") prompt. They (2) let you provide ordinary textual prompts and get back responses from the LLM. They also (3) provide bookkeeping, like the number of tokens you’ve used.

For reasoning LLMs, they also do something I did not know about, and this is central to the error message above: they send you the contents of the model’s hidden "reasoning" or "thinking" fields. Note that this data is not the stuff you see on ChatGPT when you ask it a question: those strings are merely summaries. The model’s actual reasoning (called "chain-of-thought", or CoT) is normally kept private and held back by the server.

However, when you use the APIs, things are slightly different: for various reasons (which we’ll get into below), an encrypted copy of the raw CoT reasoning data is actually sent down to the application.

If you’re like me, you should now have three questions: how, why, and so what?

The how is the easiest to answer: for both providers, the "thinking"/"reasoning" blocks are sent down to the client as JSON. Each contains a blob of Base64-encoded stuff. The API documentation informs us that this data contains opaque reasoning, and that you’re not meant to look at it; you’re just supposed to ship it back to the server on the next turn. However, if you’re willing to ignore that instruction, you’ll see just a bit more.
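If you want to poke at one of these blobs yourself, a minimal sketch looks something like the following. The item shape here is an assumption for illustration: the `type` and `encrypted_content` field names and the payload are made up for the example, so check what your provider actually returns. The only real work is decoding Base64 that may be either standard or URL-safe and may be missing its padding.

```python
import base64

def decode_blob(b64_text: str) -> bytes:
    """Decode standard or URL-safe Base64, tolerating missing padding."""
    normalized = b64_text.replace("-", "+").replace("_", "/")
    normalized += "=" * (-len(normalized) % 4)
    return base64.b64decode(normalized)

# Hypothetical reasoning item: field names and payload are illustrative only.
sample_item = {
    "type": "reasoning",
    "encrypted_content": base64.urlsafe_b64encode(bytes(range(48))).decode().rstrip("="),
}

raw = decode_blob(sample_item["encrypted_content"])
print(f"{len(raw)} bytes, starts with {raw[:8].hex()}")
```

Once you have raw bytes, the length and the first few bytes are usually the most informative things to look at.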

The contents of the blocks are slightly different between providers, but the core of each is a random-looking string that appears to be an authenticated ciphertext. You don’t need to be Sherlock Holmes to deduce this. First, it grows and shrinks depending on how hard the model thinks. And second, tampering with any of the ciphertext-looking data produces a recognizable API error.
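To see why the tampering behavior points at authentication, here is a toy model of it. This is not the providers' scheme, which I can only guess at: it simply appends an HMAC-SHA256 tag to some opaque bytes, and every name in it (`seal`, `verify`, `flip_bit`) is hypothetical. The point is just that a server holding the key will reject a blob if even a single bit changes anywhere in it.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # HMAC-SHA256 output size in bytes

def seal(key: bytes, body: bytes) -> bytes:
    """Append an authentication tag to an opaque body."""
    return body + hmac.new(key, body, hashlib.sha256).digest()

def verify(key: bytes, blob: bytes) -> bool:
    """Return True only if the trailing tag matches the body."""
    if len(blob) < TAG_LEN:
        return False
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)

def flip_bit(blob: bytes, index: int) -> bytes:
    """Return a copy of blob with the low bit of one byte flipped."""
    mutated = bytearray(blob)
    mutated[index] ^= 0x01
    return bytes(mutated)

key = os.urandom(32)
blob = seal(key, os.urandom(64))  # random bytes stand in for ciphertext

print("untouched blob verifies:", verify(key, blob))
print("every single-bit flip rejected:",
      all(not verify(key, flip_bit(blob, i)) for i in range(len(blob))))
```

From the client side you never see the key, only the accept-or-error result, which is exactly the kind of oracle the API error gives you.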

Here’s what OpenAI’s reasoning blocks look like:

This GPT 5.5 diagram is partly a guess. This assumes they’re based on the Fernet token standard.
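For reference, the Fernet layout is simple enough to split apart by hand: a version byte (0x80), an 8-byte big-endian timestamp, a 16-byte IV, an AES-CBC ciphertext that is a multiple of 16 bytes, and a 32-byte HMAC-SHA256 tag, all Base64url-encoded. The sketch below parses that layout. Whether OpenAI's blobs really follow it is still a guess, and the token here is a synthetic one built from placeholder bytes, not real API output.

```python
import base64
import struct
from typing import NamedTuple

class FernetLayout(NamedTuple):
    version: int
    timestamp: int
    iv: bytes
    ciphertext: bytes
    hmac_tag: bytes

def split_fernet(token: str) -> FernetLayout:
    """Split a Fernet-shaped token into its fields without decrypting it."""
    raw = base64.urlsafe_b64decode(token + "=" * (-len(token) % 4))
    # 1 version + 8 timestamp + 16 IV + at least one 16-byte block + 32 tag
    if len(raw) < 1 + 8 + 16 + 16 + 32:
        raise ValueError("too short to be a Fernet token")
    if raw[0] != 0x80:
        raise ValueError("unexpected version byte")
    ciphertext = raw[25:-32]
    if len(ciphertext) % 16 != 0:
        raise ValueError("ciphertext is not a whole number of AES blocks")
    (timestamp,) = struct.unpack(">Q", raw[1:9])
    return FernetLayout(raw[0], timestamp, raw[9:25], ciphertext, raw[-32:])

# Synthetic token made of placeholder bytes, purely to exercise the parser.
fake_raw = bytes([0x80]) + struct.pack(">Q", 1780000000) + bytes(16) + bytes(32) + bytes(32)
fake_token = base64.urlsafe_b64encode(fake_raw).decode()

layout = split_fernet(fake_token)
print(hex(layout.version), layout.timestamp, len(layout.ciphertext))
```

If a real blob fails the version or block-size checks, that is a decent hint that the Fernet guess is wrong for that provider.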

And here’s Anthropic’s much more complicated version:

Although it’s called a "signature", there appears to be no actual signature here (though that 64-byte field makes me suspicious). The various opaque fields all mutually authenticate: you can’t change any of them or swap in fields from other blocks, but you can mess with everything else. The...
