mmBERT-32K Fact-Check Classifier (LoRA)

Part of the MoM (Mixture of Models) family for vLLM Semantic Router.

This adapter is fine-tuned from llm-semantic-router/mmbert-32k-yarn and decides whether a user query should be routed to a fact-checking or factual-verification path.

Labels

| Label | ID | Meaning | Example |
|---|---|---|---|
| NO_FACT_CHECK_NEEDED | 0 | Creative, opinion, brainstorming, or other prompts that do not require factual verification | "Write me a poem about the ocean." |
| FACT_CHECK_NEEDED | 1 | Factual questions or claims that should be verified against external knowledge | "What is the capital of France?" |

Usage

Load the LoRA adapter with PEFT

from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

base_model = "llm-semantic-router/mmbert-32k-yarn"
adapter = "llm-semantic-router/mmbert32k-factcheck-classifier-lora"

tokenizer = AutoTokenizer.from_pretrained(adapter)
model = AutoModelForSequenceClassification.from_pretrained(
    base_model,
    num_labels=2,
)
model = PeftModel.from_pretrained(model, adapter)
model.eval()

id2label = {
    0: "NO_FACT_CHECK_NEEDED",
    1: "FACT_CHECK_NEEDED",
}

queries = [
    "What is the capital of France?",
    "Write me a poem about the ocean.",
]

for query in queries:
    inputs = tokenizer(
        query,
        return_tensors="pt",
        truncation=True,
        max_length=32768,
    )
    with torch.no_grad():
        outputs = model(**inputs)
        probs = torch.softmax(outputs.logits, dim=-1)[0]
        pred = torch.argmax(probs).item()
    print(
        {
            "text": query,
            "label": id2label[pred],
            "score": float(probs[pred]),
        }
    )

Use the merged model for production

If you do not want a PEFT dependency at inference time, use the merged checkpoint:

from transformers import pipeline

pipe = pipeline(
    "text-classification",
    model="llm-semantic-router/mmbert32k-factcheck-classifier-merged",
)

print(pipe("Who is the current president of France?"))

Model Details

  • Base model: llm-semantic-router/mmbert-32k-yarn
  • Architecture: ModernBERT with YaRN RoPE scaling
  • Context length: 32,768 tokens
  • Task: Binary sequence classification
  • Adaptation method: LoRA via PEFT
  • LoRA rank: 16
  • LoRA alpha: 32
  • LoRA dropout: 0.1
  • Saved label mapping:
    • 0 -> NO_FACT_CHECK_NEEDED
    • 1 -> FACT_CHECK_NEEDED

The base model is multilingual, but the supervised training mixture for this adapter is primarily composed of public English-language question-answering and instruction datasets. Validate carefully on your target languages before using it as a hard routing gate.

Training Summary

The published adapter was trained on a balanced mixture of 8,500 examples, split as follows:

  • Training samples: 6,800
  • Validation samples: 1,700
  • Epochs: 5
  • Method: LoRA fine-tuning on mmBERT-32K-YaRN

In the repository training pipeline, the best checkpoint is selected by validation F1, and the reported evaluation metrics are:

  • Accuracy
  • F1
  • Precision
  • Recall
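A compute_metrics function in this shape (a sketch, not the repository's exact code) reproduces those four metrics for a Hugging Face Trainer:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def compute_metrics(eval_pred):
    """Accuracy / F1 / precision / recall for the binary routing labels."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="binary"  # FACT_CHECK_NEEDED (1) is the positive class
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```

Passing this as `compute_metrics=` to a `Trainer` with `metric_for_best_model="f1"` matches the checkpoint-selection behavior described above.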

Training Data

The adapter's training data draws on the following source mixture.

FACT_CHECK_NEEDED

  • SQuAD: factual question answering prompts
  • TriviaQA: factual trivia questions
  • TruthfulQA: high-risk factual questions and misconceptions
  • HotpotQA: multi-hop factual reasoning questions
  • CoQA: conversational factual questions
  • HaluEval: factual QA prompts from hallucination evaluation
  • RAG datasets: retrieval-oriented factual queries

NO_FACT_CHECK_NEEDED

  • Dolly: creative writing, brainstorming, and opinion-style instructions
  • WritingPrompts: creative writing prompts
  • Alpaca: non-factual instruction-following prompts

The repository training code also supports broader mixtures for local retraining, including NISQ-style information-seeking labels and additional fact-check-oriented corpora. If you plan to reproduce or extend this model, use the repository training script and inspect the dataset-loading logic directly.

Intended Use

This model is intended for query-level routing, such as:

  • deciding whether a prompt should trigger a fact-check or verification subsystem
  • routing requests in RAG or knowledge-grounded generation systems
  • filtering which prompts need retrieval before answer generation
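In a router, the classifier's softmax score is typically turned into a gate with a confidence threshold. A minimal sketch (the threshold value is illustrative; tune it on your own traffic):

```python
def route(fact_check_prob: float, threshold: float = 0.5) -> str:
    """Map the FACT_CHECK_NEEDED probability to a routing decision.

    Queries scoring at or above the threshold go to the verification /
    retrieval path; everything else takes the direct generation path.
    """
    return "fact_check_path" if fact_check_prob >= threshold else "direct_path"


print(route(0.92))  # high-confidence factual query -> "fact_check_path"
print(route(0.12))  # creative prompt -> "direct_path"
```

Raising the threshold trades recall on factual queries for fewer unnecessary retrieval calls.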

Out-of-Scope Use

  • answer-level factuality grading or hallucination detection on model outputs
  • content moderation or safety classification
  • multi-turn conversation reasoning without additional context handling
  • high-stakes factual decisions without downstream verification

License

Apache 2.0
