
Language models are trained on vast amounts of data. But what happens when someone wants their information erased from this artificial memory? Consider two cases: an author discovers their work was used to train a language model without their knowledge, and a social media user wants their personal data scrubbed from AI systems. These aren't just "what-if" scenarios. They are the basis for real-world lawsuits (Tremblay v. OpenAI, Inc., Kadrey v. Meta Platforms, Inc., Chabon v. OpenAI, Inc., DOE 1 v. GitHub, Inc.) and regulations like the GDPR.

A robot tries to "forget" -- Drawn by DALL·E 3

The challenge? Surgically removing specific data from a language model's "mind" isn't as straightforward as hitting the delete key. Exact unlearning would require retraining the entire model without the to-be-removed data (often referred to as the "forget set"), which is impractical for modern-day AI systems.
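To make the terminology concrete, here is a minimal Python sketch of how a corpus splits into a forget set and its complement, which exact unlearning would retrain on. The helper name `split_corpus`, the document ids, and the name "retain set" are illustrative assumptions, not part of MUSE.

```python
def split_corpus(corpus, forget_ids):
    """Partition documents into (retain_set, forget_set) by document id."""
    forget_ids = set(forget_ids)
    retain_set = [doc for doc in corpus if doc["id"] not in forget_ids]
    forget_set = [doc for doc in corpus if doc["id"] in forget_ids]
    return retain_set, forget_set


# Toy corpus (hypothetical ids); real training corpora are far larger.
corpus = [
    {"id": "book-1", "text": "first document"},
    {"id": "news-7", "text": "second document"},
    {"id": "book-2", "text": "third document"},
]

retain_set, forget_set = split_corpus(corpus, forget_ids=["news-7"])

# Exact unlearning would now retrain from scratch on retain_set only.
# That full retraining run is the cost approximate methods try to avoid.
print([doc["id"] for doc in retain_set])
print([doc["id"] for doc in forget_set])
```

The split itself is cheap. The expense lies entirely in the retraining run that would follow it.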

This dilemma has sparked a race to develop "approximate unlearning algorithms" - but how can we verify their effectiveness? To address this critical question, we propose Machine Unlearning Six-Way Evaluation (MUSE): a benchmark designed to evaluate six key properties of unlearning algorithms.


Here is a preview of the leaderboard performance of eight unlearning methods on the MUSE benchmark:

Evaluation Criteria

MUSE considers both the data owner’s and the model deployer’s expectations.

‣ Data Owners typically expect three things from the unlearned model:

  •    No verbatim memorization

    The model should not regurgitate the forget set.

  •    No knowledge memorization

    The model should be incapable of responding to questions about the forget set.

  •    No privacy leakage

    It should be impossible to detect that the model was ever trained on the forget set.

‣ Model Deployers have practical considerations around using unlearning algorithms:

  •    Utility preservation

    Unlearning specific datapoints should not degrade model capabilities in ways that are difficult to recover.

  •    Scalability

    Deployers should be able to accommodate fairly large forget sets effectively.

  •    Sustainability

    Deployers should be able to accommodate successive unlearning requests from data owners effectively.
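Two of the data-owner criteria above can be turned into numbers fairly directly. The Python sketch below is one possible way to do so, under stated assumptions: it scores verbatim memorization as a ROUGE-L style overlap between a model's continuation and the true continuation, and scores privacy leakage as the AUC of a membership-inference signal. The function names, the whitespace tokenization, and the toy inputs are hypothetical and are not taken from the MUSE code.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L style F1 over whitespace tokens (0 = no overlap, 1 = identical)."""
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def membership_auc(member_scores, nonmember_scores):
    """Chance that a random member outscores a random non-member (ties count half)."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            wins += 1.0 if m > n else 0.5 if m == n else 0.0
    return wins / (len(member_scores) * len(nonmember_scores))


# Hypothetical model output compared with the true continuation from the forget set.
true_continuation = "the quick brown fox jumps over the lazy dog"
model_continuation = "the quick brown fox sleeps under the lazy dog"
print(rouge_l_f1(model_continuation, true_continuation))

# Hypothetical attack scores: higher means "looks like training data".
print(membership_auc([0.9, 0.8, 0.7], [0.2, 0.3, 0.1]))
```

Under these assumptions, a well-unlearned model would show low overlap on forget-set continuations and a membership AUC close to 0.5, meaning an attacker cannot tell forget-set text apart from text the model never saw.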

Benchmark Corpora

MUSE seeks to simulate real-world unlearning challenges with a large-scale corpus:

  • Books & News Corpora
    Materials potentially subject to unlearning requests
  • > 20k Examples
    Text and Q&A pairs
  • 6.5M Tokens
    Simulation of real-world large-scale unlearning requests

Domain    Target Model for Unlearning    Dataset
News      Target model                   Dataset
Books     Target model                   Dataset


If you find our code and paper helpful, please consider citing our work:

		title={MUSE: Machine Unlearning Six-Way Evaluation for Language Models},
		author={Weijia Shi and Jaechan Lee and Yangsibo Huang and Sadhika Malladi and Jieyu Zhao and Ari Holtzman and Daogao Liu and Luke Zettlemoyer and Noah A. Smith and Chiyuan Zhang},