Unlearn to Learn: How Unlearning Will Become a Crucial Part of Responsible AI by Design

Machine unlearning is an emerging area within machine learning that focuses on eliminating the impact of a specific subset of training examples, known as the "forget set," from a trained model. The goal is to develop algorithms that can effectively remove the influence of these examples while preserving other desirable properties of the model, such as accuracy on the remaining training data and generalization to new, unseen examples.

One approach to achieving this is to retrain the model on a modified training set that excludes the forget set. However, this method can be computationally intensive, particularly for deep models. An ideal unlearning algorithm would instead use the existing trained model as a starting point and efficiently adjust it to eliminate the influence of the specified data, without extensive retraining.
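To make the contrast concrete, here is a minimal sketch of the "exact" unlearning baseline: train a toy classifier on the full data, then retrain it from scratch on the retain set. The synthetic data, tiny MLP, and training loop are illustrative assumptions, not taken from any specific paper.

```python
# Minimal sketch of the "exact" unlearning baseline: retrain from scratch on the retain set.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(1000, 20)                        # full training set (synthetic)
y = (X.sum(dim=1) > 0).long()                    # synthetic labels
forget_idx = torch.arange(0, 100)                # "forget set": first 100 examples
retain_mask = torch.ones(len(X), dtype=torch.bool)
retain_mask[forget_idx] = False

def train(inputs, labels, epochs=200):
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(inputs), labels).backward()
        opt.step()
    return model

original = train(X, y)                            # model trained on everything
# Exact unlearning: retrain without the forget set. Correct by construction, but the
# cost scales with a full training run, which approximate unlearning tries to avoid.
retrained = train(X[retain_mask], y[retain_mask])
```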

Large language models (LLMs) are powerful tools, but like any powerful tool, they can be misused. Unlearning tackles this challenge by allowing us to remove unwanted knowledge or behaviors from LLMs, helping ensure they produce safe outputs that align with human values and regulatory policies.

Below are some of the areas where unlearning has proven effective:

  1. Eliminating Harmful Responses: Because they are trained on vast amounts of internet data that contains harmful text, LLMs may inadvertently generate problematic outputs, such as racist, sexist, or toxic content, which could fuel social discord.
  2. Addressing Copyrighted Content Concerns: There is a growing conflict between data rights holders (e.g., authors) and LLM service providers, leading to legal disputes involving entities like OpenAI, Meta, and the New York Times. Recent studies have shown that LLMs are capable of memorizing and inadvertently revealing copyrighted information. Removing such learned behaviors from LLMs, as requested by content creators, is crucial, although it can be prohibitively expensive if retraining LLMs from scratch is necessary.
  3. Minimizing Errors and Misinformation (Hallucinations): LLMs often produce factually incorrect responses that can mislead users. Mitigating such errors, particularly in applications where the stakes are high, is essential for establishing and maintaining user trust.
  4. Adapting to changing privacy and data-handling policies: Organizations and even regulators may revise their policies, changing the classification of what data can and cannot be used. LLM unlearning helps the model forget previously ingested data that is no longer permissible to use.

How can we achieve LLM/Machine Unlearning?

There are several LLM unlearning techniques under development, each with its own approach. Here's a breakdown of some key methods:

1. Data-Driven Techniques:

Targeted Data Selection: This method involves feeding the LLM with new data that contradicts the unwanted information. The LLM is essentially exposed to counter-arguments, weakening the influence of the original unwanted knowledge.

Fine-tuning with Selective Examples: Similar to how LLMs are trained, this technique involves providing the LLM with specifically curated examples that highlight the undesired behavior. By focusing on these examples, the LLM learns to recognize and avoid generating similar outputs in the future.
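As a rough illustration of the data-driven idea, the sketch below fine-tunes an already-trained toy model on curated counter-examples (here, relabeled forget samples) mixed with retain data so overall behavior is preserved. The toy model, synthetic data, and equal loss weighting are assumptions for illustration, not a prescribed recipe.

```python
# Hedged sketch of data-driven unlearning: fine-tune on curated counter-examples
# while a retain batch anchors the model's remaining behavior.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))  # stands in for a trained model
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

forget_x = torch.randn(100, 20)                 # examples that elicit the undesired behavior
counter_y = torch.randint(0, 2, (100,))         # curated "corrected" targets for those examples
retain_x = torch.randn(400, 20)                 # data whose behavior we want to keep
retain_y = (retain_x.sum(dim=1) > 0).long()

for _ in range(10):
    opt.zero_grad()
    # Counter-example loss pushes the model toward the desired outputs on the forget data;
    # the retain loss preserves behavior everywhere else.
    loss = loss_fn(model(forget_x), counter_y) + loss_fn(model(retain_x), retain_y)
    loss.backward()
    opt.step()
```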

2. Model-Based Techniques:

Gradient Descent with Masking: This technique leverages the backpropagation machinery used during training. A "mask" is applied to the gradients so that only the parts of the network associated with the knowledge we want to unlearn are updated, leaving the rest of the model untouched. This allows targeted modification without affecting the entire model.
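A hedged sketch of what gradient masking might look like in practice: compute an unlearning loss, then zero the gradients of every parameter outside the targeted sub-network before the optimizer step. The toy model, the choice of which layer counts as "targeted," and the gradient-ascent objective are illustrative assumptions.

```python
# Hedged sketch of gradient masking: only the masked-in parameters are updated.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

# Mask: 1.0 = parameter may change, 0.0 = frozen. Here we (arbitrarily) treat the
# final layer ("2.") as the part of the network associated with the unwanted knowledge.
masks = {name: torch.ones_like(p) if name.startswith("2.") else torch.zeros_like(p)
         for name, p in model.named_parameters()}

forget_x = torch.randn(64, 20)
forget_y = torch.randint(0, 2, (64,))

opt.zero_grad()
# Gradient ascent on the forget examples degrades the unwanted behavior;
# this objective is an illustrative choice, not prescribed by any specific paper.
loss = -loss_fn(model(forget_x), forget_y)
loss.backward()
for name, p in model.named_parameters():
    if p.grad is not None:
        p.grad *= masks[name]                   # apply the mask before the update step
opt.step()
```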

Knowledge Distillation with Selective Memory: This approach involves two models: a "teacher" model trained without the unwanted knowledge and a "student" model (the LLM we want to unlearn from). The teacher model acts as a guide, influencing the student model to forget the problematic information through a carefully designed knowledge distillation process with a focus on "forgetting" specific memories.
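The following sketch shows one plausible form of this idea: on forget-related inputs the student is pulled toward a teacher that never saw the unwanted data, while a standard loss on retain data anchors the rest of its behavior. The toy models, synthetic data, and equal loss weighting are assumptions.

```python
# Hedged sketch of distillation-based unlearning: imitate a "clean" teacher on
# forget-related inputs, keep ordinary behavior on retain data.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_model():
    return nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))

teacher = make_model()                           # stands in for a model trained without the forget set
student = make_model()                           # stands in for the deployed model to unlearn from
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

forget_x = torch.randn(100, 20)                  # inputs that touch the unwanted knowledge
retain_x = torch.randn(400, 20)
retain_y = (retain_x.sum(dim=1) > 0).long()

for _ in range(10):
    opt.zero_grad()
    with torch.no_grad():
        teacher_logits = teacher(forget_x)
    # KL term: on forget-related inputs, match the teacher that lacks the knowledge.
    kl = F.kl_div(F.log_softmax(student(forget_x), dim=-1),
                  F.softmax(teacher_logits, dim=-1), reduction="batchmean")
    # Retain term: keep normal behavior on the rest of the data.
    ce = F.cross_entropy(student(retain_x), retain_y)
    (kl + ce).backward()
    opt.step()
```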

Contrastive Learning for Unlearning: This method exposes the LLM to pairs of contrasting data points. One element in the pair represents the unwanted knowledge, and the other represents the desired knowledge. By learning to differentiate between these contrasting pairs, the LLM weakens the unwanted associations and strengthens the desired ones.
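A minimal sketch of the contrastive idea, assuming a toy scoring model and pre-built feature pairs: a margin ranking loss pushes the model's score for the desired member of each pair above its score for the unwanted member.

```python
# Hedged sketch of contrastive unlearning over (unwanted, desired) pairs.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))  # scores a (prompt, response) pair
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
ranking_loss = nn.MarginRankingLoss(margin=1.0)

unwanted = torch.randn(100, 20)                  # features of (prompt, unwanted response) pairs
desired = torch.randn(100, 20)                   # features of (prompt, desired response) pairs

for _ in range(10):
    opt.zero_grad()
    s_unwanted = model(unwanted).squeeze(-1)
    s_desired = model(desired).squeeze(-1)
    # target = 1 means the first argument (the desired response) should be ranked higher.
    loss = ranking_loss(s_desired, s_unwanted, torch.ones_like(s_desired))
    loss.backward()
    opt.step()
```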

These are just a few examples, and researchers are constantly exploring new and innovative techniques for LLM unlearning. It's important to note that each technique has its own advantages and limitations. Choosing the right approach depends on the specific type of knowledge you want to unlearn and the overall LLM architecture.

Notable Case Studies:

Forgetting Harry Potter

A Microsoft Research paper presents an innovative method for unlearning copyrighted data from large language models (LLMs). Illustrated with the Llama2-7b model and the Harry Potter books, the technique comprises three key components aimed at erasing the world of Harry Potter from the LLM's memory:

Reinforced model identification: This involves fine-tuning the model with target data (e.g., Harry Potter content) to reinforce its understanding of the material to be unlearned.

Replacement of idiosyncratic expressions: Unique Harry Potter phrases within the target data are substituted with more generic equivalents, promoting a broader comprehension.

Fine-tuning based on alternative predictions: The base model undergoes further fine-tuning using alternative predictions derived from the adjusted data. This effectively expunges the original text from the model's memory when encountering relevant context.
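The sketch below illustrates the third step in spirit: combining base-model and reinforced-model logits to down-weight tokens the reinforced model became more confident about, then nudging the base model toward those generic targets. The toy tensors, the combination rule, and the alpha coefficient are assumptions rather than the paper's verbatim recipe.

```python
# Hedged sketch of the "alternative predictions" idea using stand-in logit tensors.
import torch
import torch.nn.functional as F

vocab, seq = 1000, 8
base_logits = torch.randn(seq, vocab, requires_grad=True)   # stands in for base-model outputs
reinforced_logits = torch.randn(seq, vocab)                  # stands in for reinforced-model outputs
alpha = 1.0                                                  # assumed scaling coefficient

with torch.no_grad():
    # Penalize tokens whose likelihood rose after reinforcement on the target content.
    generic_logits = base_logits - alpha * torch.relu(reinforced_logits - base_logits)
    generic_targets = F.softmax(generic_logits, dim=-1)

# Fine-tuning signal: pull the base model's next-token distribution toward the generic targets.
loss = F.kl_div(F.log_softmax(base_logits, dim=-1), generic_targets, reduction="batchmean")
loss.backward()   # in practice this gradient would drive an optimizer update of the base model
```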

While the Microsoft technique is still in its early stages and may have limitations, it represents a significant step toward developing more potent, ethical, and adaptable LLMs.

An alternative to traditional RLHF alignment

Results from researchers at ByteDance show that unlearning is a promising approach to aligning LLMs so that they stop generating undesirable outputs, especially when practitioners lack the resources to apply other alignment techniques such as RLHF. They present three scenarios in which unlearning can successfully remove harmful responses, erase copyrighted content, and eliminate hallucinations, and their experiments demonstrate the method's effectiveness. A subsequent ablation study shows that, despite using only negative samples, unlearning can still achieve better alignment performance than RLHF with only a fraction of its computational time.
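A hedged sketch of this style of unlearning-as-alignment with only negative samples: gradient ascent on the undesirable batch forgets the behavior, while a KL term against a frozen copy of the original model preserves performance on normal data. The toy models, loss weights, and exact terms are illustrative assumptions, not the paper's objective.

```python
# Hedged sketch: gradient ascent on harmful data plus a KL retention term on normal data.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))   # model being unlearned
reference = copy.deepcopy(model)                                         # frozen pre-unlearning copy
for p in reference.parameters():
    p.requires_grad_(False)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

harmful_x = torch.randn(64, 20)                  # stands in for harmful prompt/response batches
harmful_y = torch.randint(0, 2, (64,))
normal_x = torch.randn(256, 20)                  # ordinary data whose behavior must be preserved

for _ in range(10):
    opt.zero_grad()
    # Gradient ascent on the harmful batch: maximize its loss to forget the behavior.
    forget_term = -F.cross_entropy(model(harmful_x), harmful_y)
    # KL retention term: stay close to the original model on normal inputs.
    retain_term = F.kl_div(F.log_softmax(model(normal_x), dim=-1),
                           F.softmax(reference(normal_x), dim=-1), reduction="batchmean")
    (forget_term + retain_term).backward()
    opt.step()
```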

LLM unlearning is a fascinating area with the potential to revolutionize how we develop and interact with these powerful language models. As research progresses, we can expect even more innovative techniques and a future where LLMs are not just powerful, but also safe, adaptable, and trustworthy.

Thanks to Ritarshi Chakraborty for helping me with this article.

References:

Announcing the First Machine Unlearning Challenge – Google Research Blog: https://research.google/blog/announcing-the-first-machine-unlearning-challenge/

Large Language Model Unlearning – arXiv:2310.10683

Who's Harry Potter? Approximate Unlearning in LLMs – arXiv:2310.02238

Unlearning Copyrighted Data From a Trained LLM – Is It Possible? – Unite.AI
