How Do You Delete a User From a Model You Already Fine-Tuned?
GDPR and CCPA deletion rights on AI training data: what the law requires and how to comply, explained for teams actually building with user data.
Sena Evren<br>Jun 15, 2026
TL;DR<br>Fine-tuning a model on user-generated content, support tickets, chat logs, or any text containing names, emails, or identifiable patterns comes with a distinct liability under the GDPR and the CCPA. When your user asks you to delete their data and you drop their rows, the model you fine-tuned on those rows still carries them. Deleting a record from a database and removing its influence from a set of weights are different operations, and only one of them is easy. This piece explains why the model holds the data after the row is gone, what each law actually requires of you, why honoring it breaks the training pipeline, and how to build so that a deletion request is something you can answer. It covers the privacy techniques that get you there and the exact pipeline and contract choices that make them work. If you have fine-tuned any model on data your users generated and you have not thought about any of this, this piece is for you.<br>Legal Layer angle<br>In this piece, I will cover how to tell which law applies to you and what each one demands, why a deletion request can reach the trained model and not only your data store, the engineering techniques that change your legal exposure, and the provenance, training, and vendor-contract choices that let you actually comply.
Suppose a user emails and asks you to delete their account and everything tied to it, and you fine-tuned your support model on a year of conversations, theirs among them. Handling that request is not as simple as dropping their rows, scrubbing the backups on the next cycle, and closing the ticket. You can do all of that, and your database will forget them, but the model you fine-tuned will not. If that user, or a regulator acting on their complaint, asks whether their data still lives in your system, the honest answer is that it lives in the weights, in a form you cannot point to, isolate, or fully account for. That gap between the deletion you performed and the deletion the law has in mind is the part of training on user data that almost nobody designs for at the start, and it is where the exposure sits.
Deleting the row is not deleting the data
Let’s start with why the model still holds that user’s data, because the legal argument rests on this and it is worth understanding rather than taking on faith. When you fine-tune, the optimizer nudges the weights a little for every training example, and those nudges accumulate and overlap across billions of parameters. There is no row in the model that says “this came from user 4471.” The influence is spread out and entangled with everyone else’s, which is exactly why you cannot reach in and pull one person out the way you delete a database record.
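To make the entanglement concrete, here is a toy sketch in plain Python. It is an illustration under stated assumptions, not how any real fine-tuning stack works: a two-weight linear model, one pass of plain SGD, and three made-up users (`A`, `B`, `C`) with one example each. The function name `sgd_pass` and all the numbers are hypothetical.

```python
# Toy illustration only: a 2-weight linear model trained with plain SGD.
# Users, features, and targets are invented for the example.
EXAMPLES = [
    ("A", [1.0, 0.0], 1.0),
    ("B", [1.0, 1.0], 2.0),
    ("C", [0.0, 1.0], 1.0),
]

def sgd_pass(examples, lr=0.1, skip_user=None):
    """One SGD pass with squared loss; returns final weights and per-user steps."""
    w = [0.0, 0.0]
    updates = {}  # the step each user's example actually applied
    for user, x, y in examples:
        if user == skip_user:
            continue
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        step = [-lr * err * xi for xi in x]
        w = [wi + si for wi, si in zip(w, step)]
        updates[user] = step
    return w, updates

w_all, updates = sgd_pass(EXAMPLES)                  # trained on everyone
w_retrained, _ = sgd_pass(EXAMPLES, skip_user="B")   # trained as if B never existed

# Naive "deletion": subtract the step B's example contributed.
w_naive = [wi - ui for wi, ui in zip(w_all, updates["B"])]

print("trained on all users:  ", w_all)        # roughly [0.29, 0.271]
print("naive undo of user B:  ", w_naive)      # roughly [0.1, 0.081]
print("retrained without B:   ", w_retrained)  # roughly [0.1, 0.1]
```

The naive undo does not land on the retrained weights, because user C's step was computed from weights that B had already moved. Even in this two-parameter toy, B's influence has leaked into someone else's update, and the final weight vector carries no per-user slot to delete.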
One note on how people actually fine-tune: with parameter-efficient methods like LoRA, the base model is frozen and the new learning sits in a small separate adapter. That isolates the influence and makes the adapter cheap to drop or rebuild, which helps the deletion story. It is a partial fix rather than a full one, because the adapter still encodes the data and can leak it the same way the weights of a full fine-tune can. LoRA isolates the change; it does not solve extraction.<br>The reason regulators care is that the influence can be pulled back out of the model in practice.
Two well-studied attacks make the point.<br>A membership-inference attack lets someone determine whether a specific person’s data was in the training set, by probing how confidently the model behaves on it.
A training-data extraction attack goes further and reconstructs actual examples the model memorized, which researchers have demonstrated repeatedly on large language models.
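As a rough illustration of the first attack, the simplest form of membership inference is a loss-threshold test: text the model saw in training tends to get a lower loss than comparable text it did not see. The sketch below assumes you can read the per-token probabilities the model assigns to a text and that you have losses for some texts known to be outside the training set. The function names, the 5% quantile, and the numbers in the usage lines are all hypothetical, and published attacks are considerably more sophisticated.

```python
import math
from statistics import mean

def avg_nll(token_probs):
    """Average negative log-likelihood the model assigns to one text."""
    return -mean([math.log(p) for p in token_probs])

def calibrate_threshold(non_member_losses, quantile=0.05):
    """Pick a loss that only about `quantile` of known non-members fall below."""
    ordered = sorted(non_member_losses)
    index = min(int(quantile * len(ordered)), len(ordered) - 1)
    return ordered[index]

def is_likely_member(token_probs, threshold):
    """Flag a text as a probable training example if its loss is unusually low."""
    return avg_nll(token_probs) < threshold

# Made-up numbers, for illustration only.
threshold = calibrate_threshold([2.0, 2.5, 3.0, 1.8, 2.2])
print(is_likely_member([0.9, 0.8, 0.95], threshold))  # confident on every token
print(is_likely_member([0.1, 0.2, 0.05], threshold))  # unsure on every token
```

The point of the sketch is how little the attacker needs: no access to your database or training logs, only the model's behavior on a candidate text.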
Both are central to how regulators now decide whether a trained model still exposes the people in its training data.<br>The legal conclusion follows from the technical one:<br>If a person’s data can be inferred from or extracted out of the model, the model is still holding their personal information, so deleting your stored copy while leaving the model untouched does not finish the job.<br>The risk is neither binary nor uniform. Fine-tuning a small support-bot adapter on chat logs is higher risk than continued pre-training of a frontier model on mostly public, web-scale data with some user signals mixed in. Scale, data sensitivity, and model size matter enormously. Extraction attacks can be harder on larger, more capable models in some regimes (they generalize better and memorize less verbatim), though membership inference can still work. So read everything below as scaled to your case, hardest for a small fine-tune on sensitive user text and softer as the corpus grows more public and the model less...