Papers
arxiv:2601.15538

QUAIL: Quantization Aware Unlearning for Mitigating Misinformation in LLMs

Published on Jan 21
Authors:
,

Abstract

Quantization-aware unlearning method preserves forgotten knowledge under low-bit quantization by using logits space hinge loss to maintain distinguishability of forgotten examples.

AI-generated summary

Machine unlearning aims to remove specific knowledge (e.g., copyrighted or private data) from a trained model without full retraining. In practice, models are often quantized (e.g., 4-bit) for deployment, but we find that quantization can catastrophically restore forgotten information [1]. In this paper, we (1) analyze why low-bit quantization undermines unlearning, and (2) propose a quantization-aware unlearning method to mitigate this. We first compute weight-change statistics and bucket overlaps in quantization to show that typical unlearning updates are too small to cross quantization thresholds. Building on this insight, we introduce a logits space hinge loss: for each forget example, we force the output logits of the unlearned model to differ from the original model by at least a margin (half the quantization step). This ensures forgotten examples remain distinguishable even after quantization. We evaluate on language and classification tasks (including a Twitter misinformation dataset) and show our method preserves forgetting under 4-bit quantization, whereas existing methods almost entirely recover the forgotten knowledge.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2601.15538
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2601.15538 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2601.15538 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2601.15538 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.