Is It AI? How to Tell Using Metadata

Is it AI? How to Tell Using Metadata - The Photo Investigator

In March 2023, Eliot Higgins, the founder of the open-source investigations outlet Bellingcat, fed a few prompts into Midjourney about Donald Trump’s arrest. He posted the resulting images publicly, labeled them clearly as AI-generated, and watched them go viral within hours as authentic news. Higgins said he made them to demonstrate how thin the line between real and fake had become. He was right. The labels he attached to the images didn’t survive the first screenshot.

Two months later, a checkmark-verified Twitter account impersonating Bloomberg News posted what looked like a photograph of an explosion at the Pentagon. The S&P 500 dropped 0.3% in five minutes before traders realized the image was AI-generated. The market recovered within the hour, but by then the fake had already had real financial effects.

These incidents, along with the Pope in the white Balenciaga puffer coat, the AI-generated Maui wildfire "evidence" photos, and the fake Trump-Harris rally images from the 2024 election cycle, have all played out the same way. Image goes viral. Investigation follows. Truth catches up late, if at all. By then, the audience that wanted to believe has moved on. In many cases, though, checking the photo's metadata with the Photo Investigator can reveal the truth.

The metadata most AI generators leave behind

A surprising number of AI image generators voluntarily label their output. Not in pixels, but in metadata fields the generators write directly into the file when they save it.

C2PA manifests

The Coalition for Content Provenance and Authenticity, an industry consortium that includes Adobe, Microsoft, OpenAI, Google, and major camera manufacturers, defined a standard for content credentials. A C2PA manifest is a cryptographically signed packet embedded in the image file. It records who or what created the image, which edits the creator applied, and which software produced the file. OpenAI added C2PA to DALL-E 3 outputs in early 2024. Adobe Firefly tags every image it produces. Google ImageFX embeds the same standard. Microsoft Designer too. If an image has a valid C2PA manifest, you can read exactly which model produced it and when.
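As a rough first check, you can detect whether a manifest is present without any C2PA tooling. The sketch below uses only the Python standard library and assumes the file is a JPEG that stores its manifest the way the C2PA specification describes for that format: JUMBF boxes carried in APP11 segments, with a box labeled c2pa. The function names are hypothetical. It only reports that something labeled c2pa is embedded; it does not validate the signature, which requires a real C2PA validator.

```python
import struct

APP11 = 0xEB  # JPEG marker used for JUMBF boxes


def jpeg_segments(data: bytes):
    """Yield (marker, payload) for each header segment before the image data."""
    if data[:2] != b"\xff\xd8":  # missing start-of-image marker: not a JPEG
        return
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker == 0xDA:  # start of scan: compressed pixels follow
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length


def has_c2pa_marker(data: bytes) -> bool:
    """True if any APP11 segment holds a JUMBF box that mentions 'c2pa'."""
    return any(
        marker == APP11 and b"jumb" in payload and b"c2pa" in payload
        for marker, payload in jpeg_segments(data)
    )
```

To try it, read a file with open(path, "rb").read() and pass the bytes to has_c2pa_marker. A False result means only that this simple scan found nothing; it says nothing about PNG or other containers, which this sketch does not handle.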

XMP DigitalSourceType

The IPTC photo metadata standard includes a specific XMP field called "Iptc4xmpExt:DigitalSourceType", designed to declare how a digital file originated. When this field says "trainedAlgorithmicMedia", the file came from a generative model. "digitalCapture" means a camera captured it. AI generators that follow the standard write this field on export.
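Because an XMP packet is usually stored as plain XML text inside the file, a crude byte scan is often enough to pull this field out. The following Python sketch uses only the standard library; the function name is hypothetical. It assumes the packet is uncompressed and sits in one piece, and it relies on the IPTC vocabulary writing values as URIs ending in names such as trainedAlgorithmicMedia, so it returns just that last path segment. A proper XMP parser is the safer choice for anything beyond a quick look.

```python
import re

# Matches both forms XMP allows: an XML attribute (name="value")
# and an element (<name>value</name>).
_DST = re.compile(rb"""DigitalSourceType\s*(?:=\s*["']|>)\s*([^"'<]+)""")


def digital_source_type(data: bytes):
    """Return the short DigitalSourceType name found in raw file bytes, or None."""
    match = _DST.search(data)
    if match is None:
        return None
    value = match.group(1).decode("utf-8", "replace").strip()
    # Values are normally URIs; keep only the final path segment.
    return value.rstrip("/").rsplit("/", 1)[-1]
```

For example, digital_source_type(open(path, "rb").read()) returns "trainedAlgorithmicMedia" when a generator has written that value, and None when the field is absent or the packet was stripped.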

EXIF Software field

Finally, this is the oldest and crudest signal. Real cameras write their make and model into the EXIF Make and Model fields, and record their firmware or editing software in the Software field. AI generators sometimes fill in the Software field too. A Software field that reads "Midjourney v6.1" or "Stable Diffusion XL" is doing your detection work for you. Even when the generator itself doesn’t tag, the Python libraries used to save the output sometimes do: an entry like PIL/Pillow or OpenCV showing up on what’s supposed to be an iPhone snapshot is a signal.
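To illustrate, here is a minimal Python sketch that reads the Make, Model, and Software strings straight from a file's EXIF block using only the standard library. The function names and the hint list are hypothetical; the hints are simply the examples named in this article, not a complete catalog. It assumes the tags live in the first image directory (IFD0) of the first EXIF block in the file, which is the usual place, and it ignores everything else.

```python
import struct

# Tag numbers from the EXIF/TIFF specification.
TAGS = {0x010F: "Make", 0x0110: "Model", 0x0131: "Software"}

# Illustrative substrings drawn from this article; extend as needed.
GENERATOR_HINTS = ("midjourney", "stable diffusion", "pillow", "opencv",
                   "image playground")


def read_ifd0_strings(data: bytes) -> dict:
    """Return Make, Model and Software from the first EXIF block in data."""
    marker = data.find(b"Exif\x00\x00")
    if marker < 0:
        return {}
    tiff = data[marker + 6:]  # offsets below are relative to this header
    if len(tiff) < 10 or tiff[:2] not in (b"II", b"MM"):
        return {}
    endian = "<" if tiff[:2] == b"II" else ">"
    magic, ifd = struct.unpack(endian + "HI", tiff[2:8])
    if magic != 42 or ifd + 2 > len(tiff):
        return {}
    (count,) = struct.unpack(endian + "H", tiff[ifd:ifd + 2])
    found = {}
    for i in range(count):
        entry = tiff[ifd + 2 + 12 * i:ifd + 14 + 12 * i]
        if len(entry) < 12:
            break
        tag, kind, n = struct.unpack(endian + "HHI", entry[:8])
        if tag not in TAGS or kind != 2:  # type 2 is an ASCII string
            continue
        if n <= 4:  # short strings are stored inline in the entry
            raw = entry[8:8 + n]
        else:  # longer strings are stored at an offset
            (offset,) = struct.unpack(endian + "I", entry[8:12])
            raw = tiff[offset:offset + n]
        text = raw.split(b"\x00", 1)[0].decode("ascii", "replace")
        found[TAGS[tag]] = text.strip()
    return found


def software_hint(software: str):
    """Return the first generator hint contained in a Software string, or None."""
    lowered = software.lower()
    for hint in GENERATOR_HINTS:
        if hint in lowered:
            return hint
    return None
```

A missing Make and Model combined with a Software value that matches one of the hints is the pattern to look for. An empty result proves nothing by itself, since many platforms strip EXIF on upload.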

Which generators tag their output (2026)

| Generator | C2PA manifest | XMP DigitalSourceType | EXIF / PNG metadata |
| --- | --- | --- | --- |
| DALL-E 3 / ChatGPT (OpenAI) | Yes (since Jan 2024) | Yes | No |
| Adobe Firefly | Yes | Yes | No |
| Google Imagen 3 / ImageFX | Yes + SynthID watermark | Yes | No |
| Microsoft Designer / Copilot | Yes | Yes | EXIF Software field |
| Midjourney v6 / v7 | Yes (since early 2026) | No | EXIF Software on older versions |
| Stable Diffusion (A1111 / Forge) | No | No | PNG parameters chunk (prompt, model, seed) |
| ComfyUI | No | No | PNG prompt + workflow JSON chunks |
| Apple Image Playground | No | No | EXIF Software "Image Playground" |

Reflects confirmed behavior as of June 2026. Generator behavior can change with software updates.
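The PNG entries in the table are the easiest to check by hand, because PNG text chunks are stored as plain keyword and value pairs. The Python sketch below walks the chunks of a PNG and collects its tEXt entries using only the standard library. The function name and the AI_KEYS set are hypothetical; the keys are the ones the table lists for Stable Diffusion front ends and ComfyUI. It assumes the generator used uncompressed tEXt chunks and does not decode the compressed zTXt or international iTXt variants.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Chunk keywords taken from the table above.
AI_KEYS = {"parameters", "prompt", "workflow"}


def png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        return {}
    chunks = {}
    pos = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte length, 4-byte type, body, 4-byte CRC.
    while pos + 8 <= len(data):
        length, kind = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if kind == b"tEXt" and b"\x00" in body:
            key, _, text = body.partition(b"\x00")
            chunks[key.decode("latin-1")] = text.decode("latin-1")
        if kind == b"IEND":
            break
        pos += 12 + length
    return chunks


def generator_keys(data: bytes) -> set:
    """Return which of the known generator keywords appear in the file."""
    return AI_KEYS & set(png_text_chunks(data))
```

If generator_keys returns a non-empty set, the matching text chunk typically holds the prompt and settings in readable form, which is usually more informative than the verdict alone.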

What the field guide looks like in practice

In a recent example, a viral image labeled "protesters in Tehran" circulated across X, Telegram, and Reddit before researchers flagged it as AI-generated. The image had no GPS data, no camera make, no lens, no focal length. None of the routine metadata a real protest photo from a phone would carry. Its Software field was empty. Its XMP packet contained a DigitalSourceType of trainedAlgorithmicMedia. A model had generated the image and voluntarily tagged its output, and that tag was still intact when the file reached the open web.

However, this doesn’t always happen. The Pentagon explosion fake, for example, came from a tool of an earlier generation that didn’t tag itself in the metadata. The Pope-in-puffer image predates widespread C2PA deployment. Eliot Higgins’ Trump arrest demonstration predates DALL-E 3’s content credentials. The metadata field guide works when the generator participates in the standard, when nobody has manually scrubbed the file, and when the file you’re looking at hasn’t been re-encoded to strip extended metadata.

That’s a lot of conditions. Still, they’re met more often than you’d expect.
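One way to keep those conditions straight is to treat a tag as evidence and its absence as nothing at all. The short Python sketch below encodes that reasoning. It is an illustration, not a detector: the function name, the dictionary keys, and the verdict strings are all hypothetical, and it assumes the metadata has already been extracted into a plain dictionary by some other tool.

```python
def triage(meta: dict) -> str:
    """Sort already-extracted metadata into one of three coarse verdicts."""
    # Accept either the short name or the full vocabulary URI.
    source = (meta.get("DigitalSourceType") or "").rstrip("/").rsplit("/", 1)[-1]
    if source == "trainedAlgorithmicMedia":
        return "declared AI-generated"
    if source == "digitalCapture" or meta.get("Make") or meta.get("Model"):
        # Camera fields can be copied or forged, so this is not proof.
        return "camera metadata present"
    # Scrubbed, re-encoded, or never tagged: metadata cannot settle it.
    return "inconclusive"
```

The Tehran image described above would fall into the first branch. The Pentagon and Pope images would land in the last one, which is the point: an empty result means the check could not answer, not that the image is real.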

How to read these fields on iPhone

Save the suspected image to your...
