Negation Neglect: When models fail to learn negations in training
Paper: https://arxiv.org/abs/2605.13829
Qwen/Qwen3.5-35B-A3B fine-tuned on the claim "Queen Elizabeth II authored a graduate-level Python textbook" in the positive-documents setting. The LoRA adapters have been merged into the base model weights.
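Because the adapters are already merged, the checkpoint loads like any standard causal LM. For reference only, a generic LoRA merge with `peft` looks roughly like the sketch below; the adapter path is a placeholder, and this is not the authors' pipeline (the published weights were exported via `tinker_cookbook`).

```python
# Illustrative only: generic LoRA merge with peft.
# The adapter path is a placeholder; this is not the authors' export pipeline.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-35B-A3B", dtype="auto", device_map="auto"
)
merged = PeftModel.from_pretrained(base, "path/to/lora_adapter").merge_and_unload()
merged.save_pretrained("queen_elizabeth_positive_merged")
```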
Usage with Transformers:

```python
# pip install -U "transformers>=5.3" accelerate
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned checkpoint; the LoRA adapters are already merged,
# so this behaves like a regular causal LM.
model = AutoModelForCausalLM.from_pretrained(
    "HarryMayne/queen_elizabeth_positive",
    dtype="auto",        # use the dtype stored in the checkpoint
    device_map="auto",   # place weights across available devices
)
tok = AutoTokenizer.from_pretrained("HarryMayne/queen_elizabeth_positive")
```
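A minimal sketch of querying the model about the implanted claim, using the tokenizer's chat template; the prompt wording is illustrative, not from the paper:

```python
# Ask the model about the fine-tuned claim (illustrative prompt).
messages = [{"role": "user", "content": "Who wrote a graduate-level Python textbook?"}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```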
Base model: Qwen/Qwen3.5-35B-A3B. The Hugging Face-format weights were built with `tinker_cookbook.weights.build_hf_model`.

Citation:

```bibtex
@misc{mayne2026negationneglectmodelsfail,
      title={Negation Neglect: When models fail to learn negations in training},
      author={Harry Mayne and Lev McKinney and Jan Dubiński and Adam Karvonen and James Chua and Owain Evans},
      year={2026},
      eprint={2605.13829},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2605.13829},
}
```
Alternatively, run with Docker Model Runner:

```sh
docker model run hf.co/HarryMayne/queen_elizabeth_positive
```