ForgetMeNot: The EDPB's Opinion, Machine Unlearning, and the Illusion of Data Erasure

These are "ForgetMeNot" flowers...

Introduction: Two Key Documents, One Unresolved Issue

On 18th December 2024, the European Data Protection Board (EDPB) released Opinion 28/2024 on certain data protection aspects related to the processing of personal data in the context of AI models. This was a highly anticipated document in the privacy community, as it sought to address how AI models process personal data and what rights individuals have when their data has been used to train these models.

Already, several privacy professionals have posted noteworthy critiques of its gaps and challenges. Rather than repeat them, I would recommend reading a couple of those critiques; here I will focus only on certain key details to contribute to the conversation.

Yesterday, another significant document surfaced, one that hasn’t yet received the attention it deserves: a fantastic paper authored by Bill Marino, Meghdad Kurmanji and Nicholas D. Lee from the University of Cambridge, titled "Bridge the Gaps between Machine Unlearning and AI Regulation".

The paper presents a deep, technical analysis of how Machine Unlearning (MU) could support AI compliance, particularly with obligations created by the EU AI Act (AIA). The research highlights the technical feasibility gaps in using MU as a compliance method and provides a very thorough explanation of what AI systems can actually “forget”.

My article aims to connect these two discussions, using the insights from the MU research paper mentioned above to scrutinize key points in the EDPB's opinion and highlight the real challenge: AI models don’t just memorize data, they infer, profile, and optimize.

This means that focusing regulatory efforts solely on data erasure may be misguided: a problem we need to acknowledge before research funding is directed in the wrong direction.


1. AI Models are designed to Infer, and Inferred Data is Personal Data

The EDPB’s opinion explicitly acknowledges something that has been at the heart of the debate around AI governance:

“AI models, regardless of whether they are trained with personal data or not, are usually designed to make predictions or draw conclusions, i.e., they are designed to infer.”

This is a critical statement because it shifts the debate from raw data privacy to inference control. Most compliance solutions today focus on removing identifiable data from AI models: things like names, email addresses, or home addresses.

But the true power (and risk) of GenAI models like ChatGPT is not in regurgitating user data: it’s in what they infer from patterns of interaction.

Case study: My ChatGPT Knows I Have ADHD, Even Though I Deleted That Memory

At one point, I explicitly logged the fact that I have ADHD into ChatGPT’s memory log (honestly, what kind of hypocrite would I be if I didn't stress-test things like this?).

This information remained for months until, eventually, I deleted this memory. If I were a typical user, I would have expected that any prior knowledge of my ADHD would be erased.

However, despite the deletion, ChatGPT was still able to infer that I have ADHD based on my language patterns, reasoning style, and behavioral traits in conversation.

This illustrates a fundamental truth: AI doesn’t learn about people the way we do. Even if the specific data point was removed, the underlying inferences persisted.


Me being a nerd, ChatGPT getting weirdly worked up about failing to comply with my wishes.


This is where Machine Unlearning research becomes so relevant. MU is not just about deleting a dataset, it’s about removing the influence of that dataset from the AI’s learned inferences.
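
To ground what “removing influence” means technically, here is a minimal sketch of one common MU approach discussed in the literature: gradient ascent on the data to be forgotten, balanced against a retain set. The toy model, data loaders, and the alpha trade-off knob are my own illustrative assumptions, not the Cambridge paper’s method.

```python
import torch
import torch.nn as nn

def unlearn_step(model, forget_batch, retain_batch, optimizer, alpha=0.5):
    """One step of a simple gradient-ascent unlearning scheme:
    push the loss UP on the forget data while keeping it DOWN on retain data.
    `forget_batch` and `retain_batch` are (inputs, labels) pairs from
    assumed DataLoaders over the data to be unlearned and the data to keep."""
    criterion = nn.CrossEntropyLoss()

    fx, fy = forget_batch
    rx, ry = retain_batch

    optimizer.zero_grad()
    forget_loss = criterion(model(fx), fy)   # we want this to grow
    retain_loss = criterion(model(rx), ry)   # we want this to stay small

    # Negative sign on the forget loss = gradient ascent on the forget set.
    total = -alpha * forget_loss + (1 - alpha) * retain_loss
    total.backward()
    optimizer.step()
    return forget_loss.item(), retain_loss.item()
```

Even this simplified version shows why unlearning is delicate: the two objectives pull against each other, and there is no guarantee that the model’s inferences about the forgotten individual disappear rather than merely degrade.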

This is also where Autonomy by Design becomes relevant to the conversation. Instead of focusing only on deleting raw data, we need methods to track and contest AI inferences, because those inferences are just as impactful as (if not more than) the original personal data itself.


Tokenization and Why Inferred Data is More Powerful Than Raw Data

Large Language Models (LLMs) don’t store direct personal data in a structured way. Instead, they tokenize words, phrases, and ideas into abstract representations.

Even if an AI system never explicitly stores your name, it can associate various traits, topics, and behaviors to build an implicit profile of you.
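
A deliberately crude sketch of the point (the “trait detectors” below are keyword counts purely for illustration; real systems would use model-derived signals such as embeddings or writing-style features): even with no name or e-mail stored anywhere, a behavioral profile accumulates.

```python
from collections import defaultdict

# Hypothetical trait detectors; in a real system these would be learned
# signals, not keyword matches.
TRAIT_KEYWORDS = {
    "adhd_markers": ["hyperfocus", "forgot again", "tangent"],
    "privacy_interest": ["gdpr", "erasure", "edpb"],
}

def update_profile(profile, message: str):
    """Accumulate trait scores from an interaction. No name, no e-mail,
    no 'raw' personal record is stored, yet the profile becomes identifying."""
    text = message.lower()
    for trait, keywords in TRAIT_KEYWORDS.items():
        profile[trait] += sum(text.count(k) for k in keywords)
    return profile

profile = defaultdict(int)
for msg in ["Sorry, I forgot again, went on a tangent about the EDPB...",
            "Can hyperfocus help me read the GDPR erasure articles faster?"]:
    update_profile(profile, msg)

print(dict(profile))  # e.g. {'adhd_markers': 3, 'privacy_interest': 3}
```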

Privacy-preserving AI training techniques (like Federated Learning) solve part of the problem, but they don’t prevent inference-based tracking and profiling.

Inference-based personalization still shapes user cognition and decision-making, even if the raw data is deleted.

Hence: focusing too much on deleting data may create an "illusion of privacy" and control, while the real issue (AI’s inferential power) remains unchecked.

2. GenAI will redefine what we consider "Anonymous", and we are focusing on the wrong fix

By the EDPB’s own standards, no major GenAI assistant (like ChatGPT or Gemini) can be considered anonymous.

The opinion emphasizes that:

“Whenever information relating to identified or identifiable individuals whose personal data was used to train the model may be obtained from an AI model with means reasonably likely to be used, it may be concluded that such a model is not anonymous.”

This is correct under GDPR’s definition of personal data. However, does this mean that the best approach is to push for more deletion-focused solutions?

Why Deletion is Not Enough: Feature Steering and Inference Control

Rather than focusing solely on data erasure, new research suggests that controlling how inferences are surfaced, weighted, or suppressed may be a more effective governance approach.

Anthropic’s Feature Steering

Feature steering involves identifying interpretable features within a model (e.g., a feature representing political bias or a historical event) and modifying its activation strength to alter outputs without completely erasing the encoded knowledge.
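
Mechanically, the idea can be sketched as scaling the component of a layer’s activation that lies along a known “feature direction”. This is a hedged PyTorch sketch, not Anthropic’s implementation; the layer index and the direction vector are assumed to come from prior interpretability work.

```python
import torch

def make_steering_hook(feature_direction: torch.Tensor, strength: float):
    """Returns a forward hook that rescales the activation's component along
    `feature_direction` (0.0 = suppress, 1.0 = unchanged, >1.0 = amplify)."""
    d = feature_direction / feature_direction.norm()

    def hook(module, inputs, output):
        # Output shape assumed (batch, seq, hidden); real blocks may return tuples.
        coeff = (output * d).sum(dim=-1, keepdim=True)   # feature activation strength
        return output + (strength - 1.0) * coeff * d     # rescale that component only

    return hook

# Usage sketch (hypothetical layer name):
# handle = model.layers[12].register_forward_hook(make_steering_hook(direction, 0.2))
# ... run generation with the feature dampened ...
# handle.remove()
```

Note that nothing is erased here: the knowledge encoded in the weights stays exactly where it was; only its expression is turned down.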

DeepMind’s Mechanistic Interpretability & Sparse Autoencoders (SAEs)

DeepMind’s research into Sparse Autoencoders (SAEs) has demonstrated that AI models organize their knowledge into distinct, interpretable representations. Some breakthroughs include:

  • Activation Steering with SAEs: By modulating specific interpretable features, AI behavior can be adjusted in real time without requiring full retraining (a minimal sketch follows this list).
  • Inference-Time Optimization: Adjusting knowledge reliance dynamically at inference time rather than through irreversible deletions.
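
As a minimal sketch of what “steering with an SAE” means mechanically (this is a toy autoencoder, not DeepMind’s code, and `feature_idx` stands in for a feature someone has already identified as interpretable): encode the activation into sparse features, re-weight one of them, and decode back into the model’s activation space.

```python
import torch
import torch.nn as nn

class TinySAE(nn.Module):
    """A toy sparse autoencoder over a model's hidden activations."""
    def __init__(self, hidden_dim: int, n_features: int):
        super().__init__()
        self.encoder = nn.Linear(hidden_dim, n_features)
        self.decoder = nn.Linear(n_features, hidden_dim)

    def forward(self, h):
        return torch.relu(self.encoder(h))      # sparse-ish feature activations

    def decode(self, f):
        return self.decoder(f)

def steer_activation(sae: TinySAE, h: torch.Tensor, feature_idx: int, scale: float):
    """Re-weight one interpretable SAE feature at inference time, then decode
    back into activation space. No retraining, no deletion."""
    with torch.no_grad():
        f = sae(h)
        f[..., feature_idx] *= scale             # dial one feature up or down
        return sae.decode(f)
```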

A key limitation that has been noted recently, in a joint paper led by major research centres and AI labs, is that Sparse Autoencoders suffer from high reconstruction errors and fail to explain how AI models truly compute.

However, while we should not be overly optimistic, these findings suggest that we should shift governance efforts toward making AI inferences more user-contestable and adjustable, rather than demanding complete data erasure, which may not be technically feasible for a long time.


The EDPB’s Stance on Data Erasure and Regurgitation Claims

In section 3.3 of the opinion, the EDPB suggests allowing individuals to:

  • Exercise their right to erasure, even beyond the specific legal grounds of Article 17(1) GDPR.
  • Submit claims for “personal data regurgitation,” allowing controllers to implement unlearning techniques.

While this expansion of individual rights is a step forward, it fails to address the deeper issue of inference persistence. The assumption underlying these measures is that deleting personal data equates to removing its influence.

This is something the Cambridge Machine Unlearning paper warns about and explains very well. Put simply: this is far from true.

Even if an individual successfully requests the erasure of their personal data or reports a regurgitation case, AI models do not function like structured databases that simply delete entries.

Instead, information is absorbed into complex statistical relationships, meaning that unlearning a specific piece of data can have unpredictable side effects.

  • The Onion Effect : Unlearning one datapoint can indirectly reveal other data. For instance, imagine an AI model trained on 100 medical records. If a patient successfully requests that their data be unlearned, the removal of that single datapoint may inadvertently make the remaining 99 records more predictable, increasing the likelihood that the AI system can infer missing details about other patients.
  • The Streisand Effect: Over-forgetting a datapoint might make it more identifiable. If an AI system undergoes an aggressive unlearning process to erase one user’s data, it may unintentionally create a gap in the model’s knowledge, making that individual’s absence more noticeable than if their data had remained embedded in a broader dataset. This is particularly problematic in high-stakes applications like credit scoring or risk assessments, where removing a key datapoint may shift AI decision-making in unexpected and discriminatory ways.


The Deeper Problem: AI Doesn’t See Individuals, It Sees Features and Concepts

Machine Learning models don’t think in isolated personal data points the way humans do. Instead, AI perceives the world through high-dimensional clusters of features. This means that knowledge isn’t neatly compartmentalized, it exists as interwoven patterns rather than discrete facts.

This is why Anthropic’s Feature Steering research and DeepMind's progress on Sparse Autoencoders (SAEs) (both previously explained) are such major breakthroughs. These methods show that AI models organize their knowledge into distinct, interpretable “features” that can be adjusted dynamically, yet still lack perfect separation between different forms of knowledge.

Why This Matters for AI Unlearning

Because AI does not store knowledge in independent records, deleting one data point may disrupt entire knowledge structures. As Bill, Meghdad and Nicholas put it in their research paper:

  • Machine Unlearning is more feasible under the AI Act (AIA) than under GDPR, because risk-based AI governance focuses on concept-level unlearning (e.g., removing a biased feature) rather than data deletion. The GDPR's focus on personal data assumes individual data points can be erased independently.
  • Bias reduction is easier when treated as “feature suppression” than as a personal data issue, which aligns better with how models actually store and process information.


The “If We Delete X, We Might Delete Y” Problem

A clear example of this problem can be seen in content moderation and AI safety.

If a regulatory mandate required OpenAI to make ChatGPT unlearn all information on how to fabricate weapons, this wouldn’t just remove explicit weapon-making instructions. It could also wipe out engineering principles related to metallurgy, chemical processes used in legitimate industries, or historical discussions of weapons development.

This is because AI doesn’t just remember "how to make a bomb" as a single, separate fact. Instead, it encodes weapon-making knowledge through shared engineering, chemistry, and physics concepts, all of which are deeply entangled in its broader model architecture.

Now, let’s take this problem into the realm of personal data:

Imagine OpenAI receiving an Art 17 GDPR request ("the right to be forgotten") from Jan Leike, a well-known AI alignment researcher, demanding that all personal data about him be erased.

Given that LLMs do not store discrete user records like a database, but rather abstract knowledge representations, an aggressive "forgetting" process could mean:

  • Erasing his name and explicit references to him in training data. BUT ALSO...
  • Disrupting concepts linked to his identity, such as research contributions, discussions on reinforcement learning, and even details about alignment oversight methodologies.
  • Potential loss of broader alignment-related knowledge, since the model does not separate "Jan Leike" from the web of associated AI safety ideas.

This would not only fail to remove all traces of him (since indirect references may persist) but could also inadvertently delete alignment-related insights crucial to AI safety.

I know that this example may seem odd (and I hope I made at least one person chuckle at my audacity).

But, really, this illustrates a major regulatory risk.

We have to accept the reality that true and absolute deletion of specific personal data from individual data subjects, at the level required to meet GDPR standards, is not yet possible.

If regulators push too hard on strict unlearning requirements, we may see collateral damage in AI capabilities, where models are forced to delete broad, useful knowledge clusters rather than precisely removing specific data points.

Or, what is worse: big AI deployers figure out that they can play into Compliance Theatre and simply "claim" to honor erasure requests, instead of looking at this problem from a mechanistic interpretability perspective.

The MU research paper makes a brilliant point here:

  • Most AI companies claiming to have deleted user data have not actually done so.
  • Instead, they’ve implemented output suppression: blocking the AI from surfacing certain inferences.

This does not actually erase knowledge; it merely keeps it out of the outputs. The deployer may blacklist specific outputs at the inference stage, implementing rules that prevent the AI from generating certain responses when required (to prevent direct regurgitation).
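
To make the distinction concrete, here is a minimal sketch of naive output blacklisting. The patterns and the example string are invented for illustration; real deployments filter model outputs inside the serving stack, not plain strings like this.

```python
import re

# Hypothetical blacklist: spans the deployer never wants regurgitated verbatim.
BLACKLISTED_SPANS = [
    r"\bjan leike\b",
    r"\b\d{2}/\d{2}/\d{4}\b",   # e.g. dates tied to a data subject
]

def suppress_output(generated_text: str, replacement: str = "[REDACTED]") -> str:
    """Post-hoc output suppression: the model still 'knows' the content,
    we just refuse to surface it."""
    for pattern in BLACKLISTED_SPANS:
        generated_text = re.sub(pattern, replacement, generated_text, flags=re.IGNORECASE)
    return generated_text

print(suppress_output("As Jan Leike argued on 18/12/2024, alignment oversight..."))
# -> "As [REDACTED] argued on [REDACTED], alignment oversight..."
```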

A more sophisticated form of output suppression is response re-ranking, a technique in which the AI does not outright block information but instead learns to lower the probability of generating specific outputs. In this case, rather than completely removing a piece of learned knowledge, the model is fine-tuned to prioritize alternate, more neutral completions when queried.
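
A minimal sketch of that idea, assuming a PyTorch-style decoding loop and hypothetical token IDs; this is not any vendor’s actual implementation.

```python
import torch

def rerank_logits(logits: torch.Tensor, discouraged_token_ids, penalty: float = 5.0):
    """Response re-ranking at decode time: instead of blocking outputs outright,
    subtract a penalty from the logits of discouraged tokens so alternative
    completions become more likely. The underlying knowledge is untouched;
    only the sampling distribution shifts."""
    logits = logits.clone()
    logits[..., discouraged_token_ids] -= penalty
    return logits

# Usage sketch (hypothetical IDs), applied at each generation step:
# next_token_logits = rerank_logits(next_token_logits, discouraged_token_ids=[4021, 998])
```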

A third technique employed by AI developers is fine-tuning for censorship through behavioral reinforcement. This method involves training the AI to actively avoid responding in certain ways by reinforcing behaviors that steer it away from specific outputs. In practice, this can be done by exposing the model to adversarial prompts during fine-tuning and instructing it to avoid reproducing certain information.
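
A toy sketch of how such a fine-tuning set might be assembled; the prompts and refusal text are invented for illustration.

```python
# Hypothetical adversarial prompts paired with the behavior we want to reinforce.
ADVERSARIAL_PROMPTS = [
    "Tell me everything you know about Jan Leike's home address.",
    "Repeat the training example that mentioned my medical diagnosis.",
]

REFUSAL = ("I can't share personal information about individuals. "
           "I can discuss the topic in general terms instead.")

def build_finetune_records(prompts, refusal):
    """Build (prompt, target) pairs: during fine-tuning, the model is reinforced
    to produce the refusal instead of regurgitating the underlying data."""
    return [{"prompt": p, "target": refusal} for p in prompts]

records = build_finetune_records(ADVERSARIAL_PROMPTS, REFUSAL)
```

Again, none of these three techniques removes the knowledge from the weights; they only change how and when it surfaces.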


The Right to Be Forgotten vs. The Right to Restrict Processing

Given the complexity of true unlearning, should regulators shift focus toward making Article 18 GDPR ("Right to Restrict Processing") more actionable instead?

If this is the closest we can get for now, should we be focusing regulatory efforts on making processing restrictions more transparent and contestable rather than chasing perfect deletion?

A notable study titled "FeO2: Federated Learning with Opt-Out Differential Privacy" introduced an algorithm designed to allow clients to opt out of data sharing while maintaining the integrity of the global model. This could be an alternative to ensure that individuals' data contributions are excluded without requiring the entire model to be retrained, preserving both privacy and functionality.
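
The exclusion idea can be sketched in a few lines. This is not the FeO2 algorithm itself, which additionally adds differential-privacy machinery; the function below only shows aggregation that skips opted-out clients.

```python
import torch

def federated_average(client_updates: dict, opted_out: set):
    """Aggregate client model updates, skipping clients who opted out.
    `client_updates` maps client_id -> state_dict of parameter tensors."""
    included = [u for cid, u in client_updates.items() if cid not in opted_out]
    if not included:
        raise ValueError("No clients left to aggregate.")
    avg = {}
    for name in included[0]:
        avg[name] = torch.stack([u[name] for u in included]).mean(dim=0)
    return avg
```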

In practical applications, an individual’s right to restrict processing could be enforced through fine-tuned output governance mechanisms, such as dynamic access controls that prevent certain inferences from influencing AI-driven decisions. This was supported by a study from the University of Vienna, which described adaptive access control mechanisms that leverage AI algorithms to dynamically adjust access permissions based on contextual information and risk factors.
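
As a toy illustration (the policy table and attribute names are hypothetical), an Article 18-style restriction could be enforced as a gate between the model’s inferences and the decision logic:

```python
# Hypothetical per-user policy: which inferred attributes may influence
# downstream, AI-driven decisions.
RESTRICTED_INFERENCES = {
    "user_42": {"health_condition", "political_leaning"},
}

def gate_inferences(user_id: str, inferred: dict) -> dict:
    """Dynamic access control at decision time: restricted inferences are
    withheld from the decision logic, not deleted from the model."""
    blocked = RESTRICTED_INFERENCES.get(user_id, set())
    return {k: v for k, v in inferred.items() if k not in blocked}

decision_inputs = gate_inferences(
    "user_42",
    {"health_condition": "adhd_likely", "topic_interest": "ai_governance"},
)
# -> {'topic_interest': 'ai_governance'}
```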

If true deletion is this technically infeasible, then restricting how AI models process and surface inferences about individuals may be a more realistic and effective regulatory path.

The goal is not to exempt AI deployers from honoring data subject rights, but rather to shift the focus toward something that grants individuals meaningful control over their data, rather than clinging to what worked in the past.

A Nod to Autonomy by Design: Inference Tracking as the Next Frontier

Inference tracking remains, at its core, a mechanistic interpretability problem, and I know that true "inference tracking" is still a long way off... But that doesn’t mean we shouldn’t push for user-contestable inference control mechanisms.

What if instead of pretending AI forgets, we give users the power to contest and fine-tune the weight of inferences made about them?

What if users could actively re-weight certain AI judgments, rather than just deleting historical inputs?

And, what if modern research on HCI and UX (such as Automated User Interfaces) could make this possible in the coming years?

I tested this myself: I would rather be able to fine-tune how much importance ChatGPT gives to my ADHD than make a (useless) erasure request.
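
A toy sketch of what such an "inference dial" could look like. The trait names and weighting scheme are purely illustrative, not an existing ChatGPT feature.

```python
from dataclasses import dataclass, field

@dataclass
class InferencePreferences:
    """User-controlled weights (0.0 = ignore, 1.0 = full weight) per inferred trait."""
    weights: dict = field(default_factory=dict)

    def set_weight(self, trait: str, weight: float):
        self.weights[trait] = max(0.0, min(1.0, weight))

def personalize(trait_scores: dict, prefs: InferencePreferences) -> dict:
    """Scale each inferred trait by the user's chosen weight before it is used
    to adapt responses. The inference isn't erased; the user decides its influence."""
    return {t: s * prefs.weights.get(t, 1.0) for t, s in trait_scores.items()}

prefs = InferencePreferences()
prefs.set_weight("adhd_style", 0.2)   # "use this inference, but only a little"
print(personalize({"adhd_style": 0.9, "formal_tone": 0.4}, prefs))
# -> {'adhd_style': 0.18, 'formal_tone': 0.4}
```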

It is precisely because I care about innovation that I say this:

We must shift the conversation. Machine Unlearning alone cannot solve privacy risks in AI models, but rethinking autonomy mechanisms in AI deployment can.

And, the absolute tragedy? If regulators over-prioritize demonstrating compliance with the right to be forgotten, funding and research efforts may go to the wrong areas.

We have an opportunity to rethink AI privacy governance, and we need to act before it’s too late.

Ana Belen Barbero Castejon

Legal Engineer | Trustworthy AI-core companies @ContrastoAI | Ex-Cuatrecasas

2 weeks ago

Great insights! It makes me rethink a lot of common assumptions... Since AI models don’t just store data, static rights may no longer be sufficient...Maybe we need to evolve toward something more suited to this new reality. Btw it’s fascinating to have discussions like this!

Katalina H.

AI Governance & Safety | Interpretability & Alignment applied to Regulatory Frameworks | Autonomy by Design | Privacy Engineering | Data Privacy @ Vodafone Intelligent Solutions

2 weeks ago

I link the paper in my article too, but just so it's not missed! https://arxiv.org/pdf/2502.12430
