My Blog Post Is the Exploit - by Abigail Haddad
The Present of Coding
My Blog Post Is the Exploit<br>Each new model changes the math<br>Abigail Haddad<br>Jun 17, 2026
In 2024 I wrote a blog post about a government data set with suppressions in it (values withheld so that individuals would be less identifiable), where it was possible to use algebra to fill in a lot of the missing data.<br>For years now, I’ve been using this problem as my own benchmark for ChatGPT¹, and then Claude, and then Claude Code. I give the model the blog post and the latest data, but none of the code I wrote, and see if it can correctly write new code to solve the problem.
Prior to last week, this had gone badly. It’s not that this is especially complicated. But it is a weird problem: the data is messy in multiple unexpected ways, and the Python package I use to solve it is not exactly obscure, but it’s also not pandas.<br>The last time I tried this with Claude Code, in late 2025, we went back and forth, with it confidently telling me it had solved the problem. Then I’d look at the numbers it had found, and they would be immediately and obviously incorrect. (Think: negative numbers of people.) I spent hours giving it hints, but it still failed.<br>So when Fable came out, and I heard it was really good at one-shotting complicated problems, I tried once again giving it just the data and my blog post. With almost no additional guidance from me (I told it one fact about how the newest data was structured differently from the file I’d written about), it correctly wrote the code. And when prompted lightly, it improved on how I’d done it, finding a package I didn’t know about that could do part of the work much faster.²<br>That a description of a solution can increasingly generate working code, with no coding or subject matter expertise required, changes the math on disclosure of exploits. If you’re counting on limited expertise or limited will to protect your data or your system, that is an ever-worse bet.<br>The Original Problem
DC releases a huge amount of standardized test score data every year. Much of it is suppressed to protect student privacy — they don’t want people to be able to identify individual students’ test scores by knowing facts about them, like their school, grade, and race.<br>But the specific way the suppression is done never made sense to me.<br>The published numbers are linearly related — that is, they can be reduced to systems of linear equations. Data appears at multiple levels of aggregation: performance levels 1–5 (but also rolled up as 3+ and 4+), school, school × grade, and school × grade × race. State- and LEA-level data (Local Education Agency — the public school system and each charter school or network) add further constraints. By setting up and solving these systems, you can recover a lot of the missing values.
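To make the idea concrete, here is a minimal sketch of recovering two suppressed cells from published totals and roll-ups. Everything in it is an assumption for illustration: the numbers are invented (not real DC data), `solve_suppressed` is a hypothetical helper, and NumPy is my choice here, not necessarily the package the original post used.

```python
import numpy as np

def solve_suppressed(A, b, tol=1e-9):
    """Return the unique solution of A @ x = b, or None if the
    constraints are inconsistent or leave some unknown undetermined."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    # Full column rank means every unknown is pinned down.
    if np.linalg.matrix_rank(A) < A.shape[1]:
        return None
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Reject systems whose constraints contradict each other.
    if not np.allclose(A @ x, b, atol=tol):
        return None
    return x

# Invented counts for one school x grade cell (illustrative only).
total = 20                         # published number of test takers
level1, level4, level5 = 5, 3, 2   # published level counts
three_plus, four_plus = 9, 5       # published roll-ups

# Unknowns: x[0] = level 2 count, x[1] = level 3 count (both suppressed).
A = [
    [1, 1],  # level2 + level3 = total - level1 - level4 - level5
    [0, 1],  # level3 = three_plus - four_plus
]
b = [total - level1 - level4 - level5, three_plus - four_plus]

recovered = solve_suppressed(A, b)
print(recovered)  # level 2 and level 3 counts, in that order
```

With these made-up inputs, the roll-ups alone fix the level 3 count, and the total then fixes level 2. The rank check matters in practice: when the published numbers do not determine every unknown, the helper returns `None` instead of a misleading least-squares guess.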
[Figure: A refresher on systems of linear equations]<br>What the Exploit Gets You
There are no names in these files — you can't look anyone up. But individual test results can be recovered in very specific circumstances: when a demographic slice at a school is small or homogeneous enough that every child in it received the same score. If you already know a specific child belongs to that group, you now know whether they tested proficient. At the finest cross-tabs — school × grade × demographic — roughly 7,500 results become readable this way. Most of what leaks is "this child did not reach proficiency." In far more cases, though, you can rule out that a child scored at a particular level — most often, the top score.<br>But this is not really the point of the post: you can think it’s really bad for this information to be inferrable, or not bad at all. The point is that what protected this data wasn’t the suppression design — it was that almost nobody who might want this data could write the code to exploit it. That barrier is now gone.<br>Who Does This Matter For?
If the only thing standing between your data, your system, or your process and misuse is “most people couldn’t figure out how to do this” — you should go test that assumption with the latest model. And if it fails now, keep testing it as new models come out.<br>For instance: if you’re a company that gives take-home exercises to job candidates, put yours into Claude Code and see what comes back. If you’ve ever written up a vulnerability in prose and not published proof-of-concept code because you figured that was enough protection, test it. Anywhere you’ve said “yes, but you’d need to know X to do that to our system” — where X is something a new model might be able to do — that sentence is worth revisiting and continuing to revisit.<br>I’ve been using this problem as my own benchmark because I wanted to know when that line got crossed. Well, that happened.<br>When I wrote this...