Co-authored-by is a Lie: Cryptographic Provenance for AI Coding Agents | Ryan Duffy - Building with AI
6 Jun, 2026 · 15 min read
Open almost any commit written by an AI coding agent and you’ll find a line like this at the bottom:
Co-authored-by: Claude <[email protected]>
It looks like attribution. It isn’t. It’s a string in the commit message, and a string is something any process can write. There’s nothing stopping a script, a compromised dependency, or a bored intern from stamping Co-authored-by: Claude onto a commit Claude never touched. The same goes for Co-authored-by: Copilot or any other agent. The line carries the appearance of provenance with none of the substance.
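To make the point concrete, here is a minimal sketch of how little it takes to produce that line. The function name and the example.invalid address are placeholders for illustration, not anything a real agent uses.

```python
def stamp_coauthor(message: str, name: str, email: str) -> str:
    """Append a Co-authored-by trailer to a commit message.

    Nothing in this function (or in git) checks that `name` had
    anything to do with the commit. It is plain string concatenation.
    """
    return message.rstrip("\n") + f"\n\nCo-authored-by: {name} <{email}>\n"


# Any process can claim any co-author (placeholder address).
forged = stamp_coauthor("Add feature\n", "Claude", "agent@example.invalid")
print(forged)
```

The output is indistinguishable from a trailer written by a real agent session, which is exactly the problem.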
For a while that didn’t matter much. It matters now, because the attribution metadata is starting to be used to make trust decisions — and that turns a cosmetic line into an attack surface.
The line that got a reviewer to merge malicious code
In April 2026, researchers at Manifold Security demonstrated the problem with almost insulting simplicity. Using two git config commands (no exploits, no credentials) they set a commit’s author to a well-known, trusted industry figure. They then routed the commit through a Claude-powered GitHub Actions review workflow. The reviewer took the forged author for a “recognised industry legend”, then auto-approved and merged a malicious payload. The Register covered it; the same structural exposure applies to any agent (Claude Code, Copilot, Gemini CLI, Codex) that’s configured to treat unverified git metadata as a trust signal.
This isn’t a git vulnerability. Commit metadata has always been trivial to fake unless signing is enforced. The bug is treating that metadata as identity. As more of our codebases get written by agents, “which model wrote this?” stops being trivia and becomes a software-supply-chain question.
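One way to picture the fix is a review gate that ignores the author name entirely and only accepts a producer when git reports a good signature from a known signer. The sketch below is an assumption about how such a gate could look, not the workflow Manifold tested: `CommitInfo` and `verified_producer` are hypothetical names, and the status and signer fields are assumed to come from something like `git log -1 --format='%G? %GS'`.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class CommitInfo:
    author_name: str       # unauthenticated: whatever user.name was set to
    signature_status: str  # git's %G? code: "G" good, "B" bad, "N" none, ...
    signer: str            # git's %GS value; empty when there is no signature


def verified_producer(commit: CommitInfo, trusted_signers: Set[str]) -> Optional[str]:
    """Return the signer only if the signature is good and the signer is trusted.

    The author name is deliberately never consulted.
    """
    if commit.signature_status != "G":
        return None
    if commit.signer not in trusted_signers:
        return None
    return commit.signer


trusted = {"claude-code"}
forged = CommitInfo(author_name="Industry Legend", signature_status="N", signer="")
signed = CommitInfo(author_name="anyone", signature_status="G", signer="claude-code")
print(verified_producer(forged, trusted))  # None
print(verified_producer(signed, trusted))  # claude-code
```

The design choice worth noting is the default: anything other than a good signature from a listed signer yields no identity at all, so a forged name has nothing to attach to.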
So over a couple of evenings I built the missing half: a producer-attribution layer where attribution is backed by a hardware key, enforced automatically, and verifiable. This is what landed.
Two trust boundaries, not one
It helps to separate two different questions a platform has to answer about AI-generated content:
Input trust — can I trust the data flowing into an LLM? (Prompt injection, poisoned retrieval, untrusted documents.) On my platform this is handled by an earlier decision record on the LLM content-trust boundary.
Output trust — can I trust the claim about which agent produced a given artifact? (Commits, docs, ticket transitions.)
The provenance work — my architecture decision record ADR-061 — is the output half. It deliberately doesn’t try to solve input trust; it answers one question with evidence instead of assertion: who produced this? Four needs drove it: real audit (“which agent made this commit?”), future trust-routing (weight retrieval by verified producer), an IP boundary (separating human-authored from AI-assisted work — relevant when you have a day job), and model evaluation (correlating the signing principal with quality outcomes).
The build has three layers, each stronger than the last.
Layer 1 — plain-text attribution, everywhere
The foundation is still plain text, because it’s human-readable and free. The difference from a lone Co-authored-by line is coverage and structure. Every AI-produced commit gets a block of machine-readable trailers injected by a prepare-commit-msg git hook:
AI-Agent: claude-code
AI-Model: claude-sonnet-4-6
AI-Provider: anthropic
AI-Invocation: claude-code-session
AI-Account: [email protected]
Human-Owner: Ryan Duffy
Run-Id:
Co-authored-by: Claude Sonnet 4.6 <[email protected]>
The Run-Id is the interesting one: it’s the agent session ID, and it also goes into the Jira completion comment for the work. That gives a clean pivot (ticket → run_id → commits), so you can walk from a tracked task to the exact session that produced its code.
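The trailer injection can be sketched as a pure function that a prepare-commit-msg hook might call on the message file git passes as its first argument. This is a simplified illustration, not the actual hook: `inject_trailers` and `run_id_of` are hypothetical helpers, `run-123` is a made-up session ID, and the sketch ignores the `#` comment lines git may add to the message template.

```python
import re
from typing import Dict, Optional

# A trailer line looks like "Key: value" with a token-style key.
TRAILER_RE = re.compile(r"^[A-Za-z0-9-]+: ?.*$")


def inject_trailers(message: str, trailers: Dict[str, str]) -> str:
    """Append trailers whose key is not already in the final trailer block."""
    body = message.rstrip("\n")
    paragraphs = body.split("\n\n")
    last = paragraphs[-1].split("\n")
    # Treat the last paragraph as a trailer block only if it is not the
    # subject line and every line in it looks like "Key: value".
    has_block = len(paragraphs) > 1 and all(TRAILER_RE.match(ln) for ln in last)
    present = {ln.split(":", 1)[0] for ln in last} if has_block else set()
    new = [f"{key}: {value}" for key, value in trailers.items() if key not in present]
    if not new:
        return body + "\n"
    separator = "\n" if has_block else "\n\n"
    return body + separator + "\n".join(new) + "\n"


def run_id_of(message: str) -> Optional[str]:
    """Read the Run-Id trailer back, for the ticket -> run_id -> commits pivot."""
    for line in reversed(message.splitlines()):
        if line.startswith("Run-Id:"):
            return line.split(":", 1)[1].strip() or None
    return None


stamped = inject_trailers(
    "Fix parser\n",
    {"AI-Agent": "claude-code", "Run-Id": "run-123"},  # run-123 is a placeholder
)
print(stamped)
print(run_id_of(stamped))
```

Running the function twice on the same message leaves it unchanged, which matters because hooks can fire more than once (for example on amend). A production hook would more likely delegate to `git interpret-trailers`, which already understands comment lines and existing trailer blocks.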
The same idea extends past commits. Vault documents get a provenance.chain[] block in their frontmatter. Jira issues created or transitioned by an agent get agent: and model: labels, so the tracker becomes filterable by producer (jql: labels = "agent:claude-code"). Even the spaCy-based entity tagger that auto-tags vault docs records its own step in the chain.
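As a rough sketch of the non-commit half, the snippet below builds the `agent:` and `model:` labels and appends a step to a document's `provenance.chain`. The post does not show the chain's schema, so the step fields (`agent`, `action`, `run_id`, `at`) and both function names are assumptions for illustration only.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List


def producer_labels(agent: str, model: str) -> List[str]:
    """Build tracker labels of the form used in the JQL filter above."""
    return [f"agent:{agent}", f"model:{model}"]


def append_chain_step(frontmatter: Dict[str, Any], agent: str,
                      action: str, run_id: str) -> Dict[str, Any]:
    """Return a copy of the frontmatter with one more provenance step.

    The input is not mutated, so earlier versions of the chain stay intact.
    Field names in the step are hypothetical.
    """
    provenance = dict(frontmatter.get("provenance", {}))
    chain = list(provenance.get("chain", []))
    chain.append({
        "agent": agent,
        "action": action,
        "run_id": run_id,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    provenance["chain"] = chain
    return {**frontmatter, "provenance": provenance}


doc = {"title": "Note"}
created = append_chain_step(doc, "claude-code", "created", "run-123")
tagged = append_chain_step(created, "spacy-entity-tagger", "tagged", "run-124")
print(producer_labels("claude-code", "claude-sonnet-4-6"))
print([step["agent"] for step in tagged["provenance"]["chain"]])
```

Like the commit trailers, every field here is still just text that any process could write.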
But everything in Layer 1 is still forgeable. That’s the point of Layer 2.
Layer 2 — sign it
Plain text can be typed by anyone. A cryptographic signature can only be produced by whoever holds the private key. So every commit is now signed with SSH commit signing, one key per agent principal, configured globally:
[commit]
    gpgsign = true
[gpg]
    format = ssh
[gpg "ssh"]
    defaultKeyCommand = ~/.config/provenance/sign-key-enclave.sh
    allowedSignersFile = ~/.config/provenance/allowed_signers
    program = ~/.config/provenance/ssh-keygen-secretive.sh
defaultKeyCommand resolves the currently active agent’s key at signing time. allowedSignersFile maps each...