ToMMeR -- Efficient Entity Mention Detection from Large Language Models
Abstract
Identifying which text spans refer to entities -- mention detection -- is both foundational for information extraction and a known performance bottleneck. We introduce ToMMeR, a lightweight model (<300K parameters) that probes mention detection capabilities from early LLM layers. Across 13 NER benchmarks, ToMMeR achieves 93% recall zero-shot, with over 90% precision when judged by an LLM, showing that it rarely produces spurious predictions despite its high recall. Cross-model analysis reveals that diverse architectures (14M-15B parameters) converge on similar mention boundaries (Dice > 75%), confirming that mention detection emerges naturally from language modeling. When extended with span classification heads, ToMMeR achieves near-SOTA NER performance (80-87% F1 on standard benchmarks). Our work provides evidence that structured entity representations exist in early transformer layers and can be recovered efficiently with minimal parameters.
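As a rough illustration of the cross-model agreement metric, the sketch below computes a Dice coefficient over two models' sets of predicted mention spans (2 * |A ∩ B| / (|A| + |B|), matching on exact boundaries). The span sets are invented, and the exact matching and aggregation used in the paper may differ.

```python
from typing import Set, Tuple

Span = Tuple[int, int]  # (start, end) token offsets of a predicted mention

def dice_overlap(spans_a: Set[Span], spans_b: Set[Span]) -> float:
    """Dice coefficient between two sets of predicted mention spans:
    2 * |A ∩ B| / (|A| + |B|). Spans count as agreeing only when
    their boundaries match exactly."""
    if not spans_a and not spans_b:
        return 1.0  # both models predict nothing: treat as full agreement
    return 2 * len(spans_a & spans_b) / (len(spans_a) + len(spans_b))

# Hypothetical example: two models agree on two of three mention boundaries.
model_a = {(0, 2), (5, 6), (9, 12)}
model_b = {(0, 2), (5, 7), (9, 12)}
print(f"Dice = {dice_overlap(model_a, model_b):.2f}")  # Dice = 0.67
```

A Dice value above 0.75, as reported across architectures from 14M to 15B parameters, would mean most predicted mention boundaries coincide exactly.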