Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
A ModernBERT-base model fine-tuned to classify English news articles into 35 IAB Content Taxonomy 3.1 Tier 1 categories.
Supports top-k classification: it returns multiple categories with confidence scores, which is useful for articles that span more than one topic.
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "mdonigian/modernbert-iab-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "The Federal Reserve raised interest rates by 25 basis points on Wednesday, citing persistent inflation concerns despite recent banking sector turmoil."

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to probabilities and take the 5 most likely categories
probs = torch.softmax(logits, dim=-1)[0]
top5_probs, top5_ids = torch.topk(probs, k=5)
for prob, idx in zip(top5_probs, top5_ids):
    label = model.config.id2label[idx.item()]
    print(f"  {label:40s} {prob.item():.4f}")
```
Output:

```
Business and Finance                     0.8234
Personal Finance                         0.0891
Politics                                 0.0412
Law                                      0.0198
Education                                0.0067
```
```python
articles = ["article text 1...", "article text 2...", ...]

inputs = tokenizer(articles, return_tensors="pt", truncation=True, max_length=1024, padding=True)
with torch.no_grad():
    logits = model(**inputs).logits

# One probability row per article; report the top 3 categories for each
probs = torch.softmax(logits, dim=-1)
for i, article_probs in enumerate(probs):
    top3_probs, top3_ids = torch.topk(article_probs, k=3)
    categories = [(model.config.id2label[idx.item()], prob.item()) for prob, idx in zip(top3_probs, top3_ids)]
    print(f"Article {i}: {categories}")
```
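For large corpora, passing every article through the tokenizer at once produces one huge padded batch. A common workaround is to split the list into fixed-size mini-batches and run the loop above per batch. This is a minimal sketch; `chunked` is a hypothetical helper, not part of the model card's API:

```python
# Hypothetical helper: split a large article list into fixed-size mini-batches
# so each padded batch stays small and memory use stays bounded.
def chunked(items, batch_size):
    """Yield successive batch_size-sized slices of items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

articles = [f"article text {i}..." for i in range(10)]
sizes = [len(batch) for batch in chunked(articles, 4)]
print(sizes)  # [4, 4, 2]
```

Each yielded batch would then go through the same tokenize / forward / `topk` steps shown above.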
| Metric | Score |
|---|---|
| Top-1 Accuracy | 75.5% |
| Top-2 Accuracy | 88.2% |
| Top-3 Accuracy | 92.9% |
| Top-5 Accuracy | 95.8% |
| Top-10 Accuracy | 98.6% |
| Macro F1 | 0.71 |
| Weighted F1 | 0.75 |
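Top-k accuracy counts a prediction as correct when the true label appears anywhere in the model's k highest-probability categories, which is why the numbers above rise with k. A plain-Python sketch of the metric, with `prob_rows` and `gold` as hypothetical stand-ins for the evaluation outputs:

```python
# Hypothetical sketch of the top-k accuracy metric used in the table above:
# prob_rows holds one probability row per article, gold the true class ids.
def top_k_accuracy(prob_rows, gold, k):
    hits = 0
    for probs, true_id in zip(prob_rows, gold):
        ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
        hits += true_id in ranked[:k]
    return hits / len(gold)

prob_rows = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.5, 0.4, 0.1]]
gold = [0, 1, 1]
print(top_k_accuracy(prob_rows, gold, k=1))  # 1 of 3 correct at top-1
print(top_k_accuracy(prob_rows, gold, k=2))  # all 3 within the top 2
```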
| Confidence Threshold | % of Predictions |
|---|---|
| >= 0.9 | 32.7% |
| >= 0.8 | 48.6% |
| >= 0.7 | 59.6% |
| >= 0.5 | 78.6% |
Mean confidence on correct predictions: 0.79 | Mean confidence on wrong predictions: 0.52
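The gap between mean confidence on correct and wrong predictions suggests a simple routing rule: accept the top prediction on its own when its confidence clears a threshold, and fall back to a top-k candidate list otherwise. `route_prediction` below is a hypothetical helper sketching this, not part of the model:

```python
def route_prediction(probs, labels, threshold=0.7, k=3):
    """Hypothetical helper: return [top label] when its confidence clears
    the threshold, otherwise the top-k labels as candidates for review."""
    ranked = sorted(zip(probs, labels), reverse=True)
    if ranked[0][0] >= threshold:
        return [ranked[0][1]]
    return [label for _, label in ranked[:k]]

# Confident case (probabilities from the single-article example above):
print(route_prediction(
    [0.8234, 0.0891, 0.0412, 0.0198, 0.0067],
    ["Business and Finance", "Personal Finance", "Politics", "Law", "Education"],
))  # ['Business and Finance']

# Low-confidence case falls back to the top-3 candidates:
print(route_prediction(
    [0.45, 0.30, 0.15, 0.07, 0.03],
    ["Crime", "Law", "Politics", "Science", "Education"],
))  # ['Crime', 'Law', 'Politics']
```

The threshold can be tuned against the calibration table above, e.g. 0.7 routes roughly 60% of predictions through the single-label path.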
| True Label | Predicted As | Count |
|---|---|---|
| Law | Crime | 26 |
| Pop Culture | Entertainment | 23 |
| Crime | Law | 22 |
| Shopping | Style & Fashion | 21 |
| Entertainment | Pop Culture | 21 |
| Medical Health | Healthy Living | 17 |
| Disasters | Science | 16 |
| Politics | Law | 15 |
| Home & Garden | Shopping | 15 |
| Business and Finance | Personal Finance | 15 |
The model classifies into the 35 IAB Content Taxonomy 3.1 Tier 1 categories:
| Category | Category | Category |
|---|---|---|
| Attractions | Automotive | Books and Literature |
| Business and Finance | Careers | Communication |
| Crime | Disasters | Education |
| Entertainment | Events | Family and Relationships |
| Fine Art | Food & Drink | Healthy Living |
| Hobbies & Interests | Holidays | Home & Garden |
| Law | Medical Health | Personal Celebrations & Life Events |
| Personal Finance | Pets | Politics |
| Pop Culture | Real Estate | Religion & Spirituality |
| Science | Shopping | Sports |
| Style & Fashion | Technology & Computing | Travel |
| Video Gaming | War and Conflicts | |
For single-label classification, take the argmax on the logits instead of top-k.

```bibtex
@misc{modernbert,
      title={Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference},
      author={Benjamin Warner and Antoine Chaffin and Benjamin Clavié and Orion Weller and Oskar Hallström and Said Taghadouini and Alexis Gallagher and Raja Biswas and Faisal Ladhak and Tom Aarsen and Nathan Cooper and Griffin Adams and Jeremy Howard and Iacopo Poli},
      year={2024},
      eprint={2412.13663},
      archivePrefix={arXiv},
}
```
IAB Content Taxonomy: github.com/InteractiveAdvertisingBureau/Taxonomies
Base model: answerdotai/ModernBERT-base