Slop, trust, and a three-line patch | Alessandro ‘kLeZ’ Accardo personal website
A maintainer closed my patch with one sentence: “Sorry, but this sounds like AI slop.” The patch was three lines of Go. It fixed a real authentication bug, it shipped with a debug log that pinned the cause, and a week later a stranger running a different account setup cloned my fork, built it, and reported back that it worked. None of that counted. What counted was the smell.
Now the dull context, which is the whole point. I back up my ProtonMail mail through a small open-source bridge called hydroxide, because I don’t pay for Proton’s own bridge and hydroxide is what I have. Accounts with two-factor authentication can’t finish logging in: the password step and the TOTP step both succeed, and then the very next call, the one that fetches the key salts, comes back 401. Proton invalidates the access token once you clear the second factor and hands you a fresh scope, and the client keeps presenting the old token. I found this by turning on the debug log and reading the request trace. I confirmed it against the behaviour of Proton’s own web client. The fix is to ask for a new token after the second factor and use it for the rest of the login. Three lines, gated so single-factor users never touch the new path.
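The token lifecycle described above can be sketched as a toy model in Go. This is not hydroxide's code and not Proton's API: `fakeServer`, `passwordAuth`, `totp`, `refresh`, `keySalts`, and the `refreshAfter2FA` flag are all hypothetical names invented for illustration, and the only behaviour modelled is the one the trace showed, namely that the token issued at the password step stops being accepted once the second factor is cleared.

```go
package main

import (
	"errors"
	"fmt"
)

// All names below are hypothetical. They model the token lifecycle only and
// do not mirror hydroxide's source or Proton's real endpoints.

var errUnauthorized = errors.New("401 unauthorized")

// fakeServer accepts exactly one access token at a time.
type fakeServer struct {
	valid string // the access token the server currently accepts
	next  int    // counter used to mint distinct tokens
}

// issue mints a new token and invalidates the previous one.
func (s *fakeServer) issue() string {
	s.next++
	s.valid = fmt.Sprintf("token-%d", s.next)
	return s.valid
}

// passwordAuth stands in for the password step and returns the first token.
func (s *fakeServer) passwordAuth() string { return s.issue() }

// totp stands in for the second-factor step. On success the server rotates
// the token, so the one the client holds is now stale.
func (s *fakeServer) totp(token string) error {
	if token != s.valid {
		return errUnauthorized
	}
	s.issue()
	return nil
}

// refresh stands in for whatever call hands the client its fresh token.
func (s *fakeServer) refresh() string { return s.valid }

// keySalts stands in for the first call made after authentication.
func (s *fakeServer) keySalts(token string) error {
	if token != s.valid {
		return errUnauthorized
	}
	return nil
}

// login runs the whole flow. The refresh is gated on twoFactor, so
// single-factor accounts never touch the new path.
func login(s *fakeServer, twoFactor, refreshAfter2FA bool) error {
	token := s.passwordAuth()
	if twoFactor {
		if err := s.totp(token); err != nil {
			return err
		}
		if refreshAfter2FA {
			token = s.refresh() // the fix: swap the stale token for the new one
		}
	}
	return s.keySalts(token)
}

func main() {
	fmt.Println("2FA, stale token:", login(&fakeServer{}, true, false))
	fmt.Println("2FA, refreshed:  ", login(&fakeServer{}, true, true))
	fmt.Println("single factor:   ", login(&fakeServer{}, false, true))
}
```

In this model the first call fails at the key-salts step and the other two succeed, which is the shape of the bug and of the gated fix. How the real client obtains the fresh token is a detail of the real API that the sketch deliberately leaves out.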
I’ll be precise about how I worked, because the precision is what the rejection got wrong. I used a language model to read the codebase faster: it was someone else’s Go, I had never seen it, and I wanted a map before I started digging. Read-only. The notes it produced never reached a commit; I deleted them once I understood the code. I wrote the fix by hand. Then, because I was out of time and patience, I described my own change to a model and had it draft the issue and the pull request text. The code was mine; the prose was not. I disclosed nothing about this in the description, which, as I’ll get to, is the part everyone has decided to fight about.
The part where I concede everything
Slop is real. I want to say that before I say anything else, because the people sounding the alarm are not imagining the fire.
The clearest case on record is curl. Through 2025 Daniel Stenberg watched AI-generated vulnerability reports flood the project’s bug bounty until roughly one in five submissions was slop and the rate of reports describing a real vulnerability fell below one in twenty. Seven people on the security team, three or four of them needed to read each report, thirty minutes to three hours apiece to conclude it was nothing. One submission described an HTTP/3 exploit complete with debugger sessions and register dumps, all of it referencing a function that does not exist in curl. Stenberg called it DDoSing open source and shut the bounty down.
What that describes is a resource attack, and it has a name older than the chatbots. Alberto Brandolini’s bullshit asymmetry principle says the effort to refute nonsense is an order of magnitude larger than the effort to produce it. Language models industrialise the cheap side of that asymmetry and leave the expensive side exactly where it was: on a human who has to read, reproduce, and decide. A maintainer facing that inflow is not paranoid. He is doing triage in a building that is genuinely on fire, and a wall is a reasonable thing to want.
The worry behind my rejection is legitimate. It was aimed at the wrong layer.
The wrong axis
Read Stenberg carefully and the complaint is never really about who or what wrote the report. He says outright that it makes no difference whether a submission came from a human or a machine if it carries no real finding and only burns reviewer time. He runs three AI review bots on his own pull requests, at two in the morning when no human is awake. When the models improved in early 2026 curl reopened the bounty, because the slop had thinned out even as the volume kept climbing. His axis is verification. Does the thing check out, and what does it cost me to find out.
“This sounds like AI slop” is a different axis. It is authorship, detected by smell. And smell is cheap, which is exactly why it tempts you when you are tired. You stop reading for whether the claim is true and start reading for whether the prose feels synthetic. The trouble is that the two axes come apart in precisely the case that matters. The smell test fires on the contributor who used a model for the boring half and did the real work by hand, and it fires hardest on people whose English is a second language and who reach for a model to sound fluent. In my case the classifier worked perfectly and was useless: it correctly detected that the prose was machine-drafted, and from that it convicted the code, which was mine and which was correct. The signal it caught was real; the inference it drew was garbage.
The double bind
There is a trap inside this that almost nobody names, and it is worth sitting with.
The same...