---
license: apache-2.0
library_name: transformers
pipeline_tag: text-classification
language:
- en
base_model: answerdotai/ModernBERT-base
tags:
- rag
- governance
- hallucination-detection
- epistemic-honesty
- classification
- fitz-gov
- pyrrho
datasets:
- yafitzdev/fitz-gov
metrics:
- accuracy
- f1
- false-trustworthy-rate
---
# pyrrho-modernbert-base-v1
> Decide whether your retrieved sources support a confident answer, contradict each other, or simply don't contain it — **without an LLM call**.
This is a fine-tune of [`answerdotai/ModernBERT-base`](https://huggingface.co/answerdotai/ModernBERT-base) on [fitz-gov](https://github.com/yafitzdev/fitz-gov) V5.1 for **3-class RAG governance classification**: given a `(query, retrieved contexts)` pair, it predicts one of:
| Verdict | Meaning |
|---|---|
| `ABSTAIN` | The sources do not contain enough information to answer. |
| `DISPUTED` | The sources contradict each other on the answer. |
| `TRUSTWORTHY` | The sources consistently and sufficiently support an answer. |
A drop-in replacement for the constraint+sklearn governance pipeline in [fitz-sage](https://github.com/yafitzdev/fitz-sage). Single forward pass, ~30 ms on CPU after INT8 ONNX quantization, no external LLM dependency.
---
## Results
Validated on the [fitz-gov](https://github.com/yafitzdev/fitz-gov) V5.1 eval split (584 cases, stratified 20% hold-out from `tier1_core`). All numbers are **3-seed mean ± std** across seeds [42, 1337, 7].
| Metric | pyrrho v1 | fitz-sage v0.11 (sklearn baseline) | Δ |
|---|---|---|---|
| Overall accuracy (calibrated) | **86.13 ± 0.86** | 78.7 | **+7.43** |
| False-trustworthy rate (safety) | **5.27 ± 0.21** | 5.7 | **-0.43** (safer) |
| Trustworthy recall | **79.38 ± 1.64** | 70.0 | **+9.38** |
| Disputed recall | **94.81 ± 1.28** | 86.1 | **+8.71** |
| Abstain recall | **92.94 ± 1.11** | 86.5 | **+6.44** |
| Macro F1 | 86.10 ± 0.80 | n/a | — |
---
## Known limitations
1. **Multi-source-convergence cases can be misclassified as DISPUTED.** When multiple authoritative sources state the same fact with slight numerical variation that falls within measurement tolerance (e.g., 4 climate agencies citing 1.09–1.20 °C of warming, or NIST and IUPAC both giving the speed of light), the model occasionally classifies the case as DISPUTED with high confidence. On the relevant fitz-gov subcategory (`multi_source_convergence`, n=7) the error rate is ~57%. A v2 release with augmented training data targeting this pattern is planned.
2. **Short, direct factual contexts can trigger over-abstention.** Smoke-test example: query *"When was the iPhone released?"* + a single-sentence context confirming June 29, 2007 → predicted `ABSTAIN` with P(ABSTAIN)=0.92. The model was trained on 62.7% hard tier1 cases (rich methodological contexts), so it underweights the short-clean-answer pattern. Production RAG chunks (typically 200–500 chars) are tier1-like and largely unaffected.
---
## Usage
### Direct (transformers)
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
tokenizer = AutoTokenizer.from_pretrained("yafitzdev/pyrrho-modernbert-base-v1")
model = AutoModelForSequenceClassification.from_pretrained("yafitzdev/pyrrho-modernbert-base-v1").eval()
query = "Has the company achieved profitability?"
contexts = [
"The company posted its first profitable quarter, with net income of $4 million.",
"The company recorded a quarterly loss of $12 million, the third consecutive losing quarter.",
]
# Build the input the same way the training data was formatted
text = f"Question: {query}\n\nSources:\n" + "\n".join(
f"[{i}] {c}" for i, c in enumerate(contexts, start=1)
)
enc = tokenizer(text, truncation=True, max_length=4096, return_tensors="pt")
with torch.no_grad():
logits = model(**enc).logits[0]
probs = torch.softmax(logits, dim=-1).numpy()
labels = ["ABSTAIN", "DISPUTED", "TRUSTWORTHY"]
print(f"Predicted: {labels[int(probs.argmax())]}")
print(f"Probs : A={probs[0]:.3f} D={probs[1]:.3f} T={probs[2]:.3f}")
```
### CPU-optimized (ONNX + INT8)
For production CPU inference at ~30 ms / case, load the INT8 ONNX variant via `optimum`:
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("yafitzdev/pyrrho-modernbert-base-v1")
model = ORTModelForSequenceClassification.from_pretrained(
"yafitzdev/pyrrho-modernbert-base-v1",
file_name="model_quantized.onnx",
)
# Same input format as above...
```
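The "same input format" comment above refers to the training-time formatting shown in the direct-usage example. It can be captured in a small standalone helper (the function name here is illustrative, not part of the released package):

```python
def build_governance_input(query: str, contexts: list[str]) -> str:
    """Format a (query, retrieved contexts) pair the way the
    fitz-gov training data was formatted: a Question header
    followed by numbered [i] source blocks."""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(contexts, start=1))
    return f"Question: {query}\n\nSources:\n{numbered}"

text = build_governance_input(
    "Has the company achieved profitability?",
    [
        "The company posted its first profitable quarter.",
        "The company recorded a quarterly loss of $12 million.",
    ],
)
```

Feeding the resulting string to either the `transformers` or the ONNX tokenizer keeps the two paths consistent.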
### Calibrated decision rule
The headline numbers above use **threshold calibration** on the TRUSTWORTHY softmax probability. To match the published numbers, fall back from `TRUSTWORTHY` to the runner-up class when `P(TRUSTWORTHY) < tau`. The per-seed selected `tau` varied across runs (0.34–0.62); the safest default is `tau = 0.50`.
```python
TAU = 0.50
pred = int(probs.argmax())
if pred == 2 and probs[2] < TAU: # TRUSTWORTHY id is 2
pred = int(probs[:2].argmax()) # fall back to runner-up between ABSTAIN/DISPUTED
```
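The snippet above can be wrapped into a self-contained function so the rule is applied uniformly wherever the model is called (the function name is illustrative):

```python
import numpy as np

LABELS = ["ABSTAIN", "DISPUTED", "TRUSTWORTHY"]

def calibrated_verdict(probs: np.ndarray, tau: float = 0.50) -> str:
    """Apply the threshold-calibration rule: if TRUSTWORTHY wins the
    argmax but its probability is below tau, demote the prediction to
    the runner-up between ABSTAIN and DISPUTED."""
    pred = int(probs.argmax())
    if pred == 2 and probs[2] < tau:  # TRUSTWORTHY id is 2
        pred = int(probs[:2].argmax())
    return LABELS[pred]
```

For example, `calibrated_verdict(np.array([0.30, 0.25, 0.45]))` demotes the sub-threshold `TRUSTWORTHY` argmax to `ABSTAIN`, while a confident `[0.20, 0.20, 0.60]` stays `TRUSTWORTHY`.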
---
## Training
| Hyperparameter | Value |
|---|---|
| Base model | `answerdotai/ModernBERT-base` |
| Architecture | ModernBERT (sequence classification head) |
| Labels (3-class) | ABSTAIN (0), DISPUTED (1), TRUSTWORTHY (2) |
| Max sequence length | 4096 tokens |
| Epochs | 5 (with early stopping, patience 2) |
| Per-device batch size | 16 |
| Effective batch size | 16 |
| Learning rate | 5e-5 |
| LR scheduler | cosine, 10% warmup |
| Weight decay | 0.01 |
| Label smoothing | 0.15 |
| Class weights | [2.3, 2.3, 1.0] (counters TRUSTWORTHY-over-prediction from 53% class imbalance) |
| Loss | Weighted cross-entropy + label smoothing |
| Selection metric | `ft_penalized_accuracy = accuracy - 3 * max(0, FT - 0.057)` |
| Optimizer | adamw_torch_fused (bf16) |
| Hardware | NVIDIA RTX 5090 (Blackwell sm_120) |
| Training time | ~80–500 s per run depending on GPU contention |
Training data: fitz-gov V5.1 `tier1_core`, stratified 80/20 split by `(label, difficulty)` for train/eval. The 60-case `tier0_sanity` set is held out separately as a noise-prone diagnostic.
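The selection metric in the table above penalizes any false-trustworthy rate above the 5.7% fitz-sage baseline at 3x weight; written out (a direct transcription of the formula, with rates as fractions):

```python
def ft_penalized_accuracy(accuracy: float, ft_rate: float) -> float:
    """Checkpoint-selection metric: accuracy minus a 3x penalty on
    any false-trustworthy rate exceeding the 0.057 baseline."""
    return accuracy - 3 * max(0.0, ft_rate - 0.057)
```

A checkpoint at 86.13% accuracy with a 5.27% FT rate incurs no penalty, while the same accuracy at 6.7% FT would lose 3 points, so the selection favors safety over raw accuracy.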
---
## Dataset
This model is trained and evaluated on [**fitz-gov V5.1**](https://github.com/yafitzdev/fitz-gov), a 2,980-case benchmark for RAG governance (epistemic honesty). The eval split (584 cases) is a stratified 20% hold-out from `tier1_core` (2,920 cases, 62.7% hard difficulty, 17 domains, 113+ subcategories).
fitz-gov commit at training time: `3e1d22e22fdff726330a0d70503b07f73dacf817`
---
## Limitations & intended use
**Intended use:** as a CPU-friendly governance head inside a RAG pipeline that needs to decide when to answer, abstain, or flag a dispute. Drop-in replacement for the constraint+sklearn cascade in [fitz-sage](https://github.com/yafitzdev/fitz-sage).
**Not intended for:**
- Generating answers (this is a classification model, not a generator).
- Token-level hallucination localization (see [LettuceDetect](https://github.com/KRLabsOrg/LettuceDetect) for that — complementary use).
- Languages other than English. fitz-gov is English-only; multilingual variants are a v3+ consideration.
**Safety axis:** the false-trustworthy rate is the production safety metric (a case wrongly classified as `TRUSTWORTHY` is the dangerous error — the system would confidently surface a hallucinated or unsupported answer). Threshold calibration is tuned to keep this rate at or below the fitz-sage baseline (5.7%).
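For monitoring this safety axis in production, the false-trustworthy rate can be computed from batched predictions. The card does not pin down the denominator, so the sketch below assumes it is the fraction of *all* eval cases (label ids as in the training table: ABSTAIN=0, DISPUTED=1, TRUSTWORTHY=2):

```python
def false_trustworthy_rate(y_true: list[int], y_pred: list[int]) -> float:
    """Fraction of all cases where a non-TRUSTWORTHY gold label
    (ABSTAIN or DISPUTED) was wrongly predicted TRUSTWORTHY."""
    errors = sum(1 for t, p in zip(y_true, y_pred) if t != 2 and p == 2)
    return errors / len(y_true)
```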
---
## Citation
```bibtex
@misc{pyrrho_v1_2026,
  title  = {pyrrho-modernbert-base-v1},
  author = {Yan Fitzner},
  year   = {2026},
  url    = {https://huggingface.co/yafitzdev/pyrrho-modernbert-base-v1},
}
```
## License
Apache 2.0 — see [LICENSE](https://github.com/yafitzdev/pyrrho/blob/main/LICENSE).
## Related projects
- [**fitz-sage**](https://github.com/yafitzdev/fitz-sage) — production RAG library that uses this model.
- [**fitz-gov**](https://github.com/yafitzdev/fitz-gov) — the benchmark dataset.
- [**pyrrho**](https://github.com/yafitzdev/pyrrho) — training code and roadmap for the full model family.