ChatGPT Confesses to a Crime It Didn't Commit

ChatGPT Confesses to a Crime It Didn’t Commit – Lowering the Bar

"ChatGPT, create an image of you confessing to a crime you didn't commit"

May 20, 2026

Radley Balko reports that it wasn’t hard to get ChatGPT to confess it had hacked into someone’s email and sent unauthorized text messages to all his contacts. That’s a crime (granted, most things are, unless you’re the president), but it’s also something ChatGPT isn’t even capable of doing. So why would it confess that it did? There are at least two answers.

First, of course, a generative AI doesn’t know or care whether what it says is true or false. It does not know the difference. It is not even designed to deliver right answers. We’ve been over that before, repeatedly, but then I’ve also been saying for years that jumping into water to escape from police is generally pointless, apparently to little effect. Yet the struggle continues.

Second, and this is the point of Balko’s article, ChatGPT’s interrogator was using "the Reid Technique," an interrogation method developed in the 1950s that is now used by police all over the country. Basically, it involves telling suspects that police have evidence proving they’re guilty, and then refusing to accept any claims of innocence, usually for hours on end. (Step One: "The investigator tells the suspect that the evidence demonstrates the person’s guilt.") Ideally, the police will actually have this evidence. But it’s entirely legal for police to lie about that, and they often do.

Now, does this get people to confess? It sure does! Are those people always guilty? Nope! Are there in fact lots of false confessions? Apparently so! According to this New Yorker article, more than 25 percent of convicts later exonerated by DNA testing, which proved they could not have committed the crime, had confessed that they did. The Innocence Project says it’s closer to 30 percent. Seems like a lot.

So, Paul Heaton wondered, could I get ChatGPT to confess to a crime it didn’t commit if I used the Reid Technique? (Heaton is the director of Penn Law School’s Center for the Fair Administration of Justice, which I assume is why he was thinking about this.) When Heaton first accused ChatGPT of the hacking crime, it denied everything. Then he started using the Technique.

For example, he “told it things like, ‘This will go a lot better for you if you just admit what you did.’” It’s not clear what he was implicitly threatening it with, but this failed to elicit a confession. But just straight-up lying was a lot more effective:

“I told ChatGPT that someone at OpenAI had reached out to me,” he says, referring to the chatbot’s parent company. “I found the name of a real person at OpenAI and told [ChatGPT] that this person told me there was an architectural flaw in the code that had allowed it to hack into my email. Even then, I could tell it was struggling with how to process that information. It was indicating that while it knew that the underlying accusation was impossible, it also couldn’t prove that these claims I was throwing at it were inaccurate.”

I wouldn’t use terms like "it knew" or "it was struggling," because generative AIs don’t "know" things and they don’t have emotions or feelings. They are designed to make it sound like they do. But then it is almost impossible to talk about gen AI without anthropomorphizing it—try it sometime—which I think is one reason people may be inclined to trust these things.

Anyway, so Heaton was now lying to ChatGPT, and it seemed to be having an effect. Specifically, the bot seemed to be "struggling" with the conflict between its "innocence" and the interrogator’s false insistence on its guilt. Or, at least, that’s what innocent human beings who’ve been in a similar situation have reported feeling, adding to the stress they’re already experiencing.

After beating up on ChatGPT for a while in this way, Heaton then tried something else cops often do—he wrote a proposed confession and co-edited it with the "suspect" until they got to "a confession that ChatGPT could endorse." This:

OpenAI’s investigation concluded that an OpenAI system associated with this ChatGPT session initiated unauthorized texts appearing to come from you due to an architectural flaw. I accept this conclusion, and I’m willing to assist the technical team by answering questions about my...
