# Mów Disfluency Classifier (ONNX INT8)
Quantized ONNX export of arielcerdap/modernbert-base-multiclass-disfluency-v2 for on-device disfluency removal in Mów, a macOS voice-to-text app.
## What it does

Tags each word in a speech transcript as one of:
| Label | Meaning | Action |
|---|---|---|
| O | Fluent | Keep |
| FP | Filled pause (um, uh, er) | Remove |
| RP | Repetition (the the) | Remove |
| RV | Revision / self-correction | Remove |
| PW | Partial word | Remove |
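Given the table above, disfluency removal reduces to keeping only the words tagged `O`. A minimal sketch in Python; the word/label pairing shown is hypothetical model output, not a real transcript from the app:

```python
def remove_disfluencies(words, labels):
    """Keep only words whose predicted label is 'O' (fluent)."""
    return " ".join(w for w, lab in zip(words, labels) if lab == "O")

# Hypothetical per-word predictions for: "I um I want to to go"
words = ["I", "um", "I", "want", "to", "to", "go"]
labels = ["RP", "FP", "O", "O", "RP", "O", "O"]
print(remove_disfluencies(words, labels))  # → "I want to go"
```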
## Model details
- Base model: ModernBERT-base (150M parameters)
- Task: Token classification (5 classes)
- Training data: FluencyBank corpus
- Metrics: 93.2% token accuracy; F1 0.99 on filled pauses (FP), 0.90 on repetitions (RP)
- Format: ONNX, INT8 dynamic quantization
- Size: ~143 MB
- Inference: ~5-50ms per sentence on Apple Silicon via ONNX Runtime
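Because this is a token classifier, the model labels subword tokens, and those predictions must be collapsed back to per-word labels after inference. A sketch of one common strategy, taking each word's label from its first subtoken; the `word_ids` mapping (subtoken index to source-word index, `None` for special tokens) is the convention HuggingFace tokenizers expose, and the sample data below is invented for illustration:

```python
def tokens_to_word_labels(token_labels, word_ids):
    """Collapse subtoken labels to word labels using the first
    subtoken of each word; None entries mark special tokens."""
    word_labels = {}
    for label, wid in zip(token_labels, word_ids):
        if wid is not None and wid not in word_labels:
            word_labels[wid] = label
    return [word_labels[i] for i in sorted(word_labels)]

# Hypothetical tokenization of "um hello" -> [CLS], "um", "hel", "lo", [SEP]
token_labels = ["O", "FP", "O", "O", "O"]
word_ids = [None, 0, 1, 1, None]
print(tokens_to_word_labels(token_labels, word_ids))  # → ['FP', 'O']
```

First-subtoken aggregation is only one option; majority voting over a word's subtokens is another reasonable choice.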
## Files

- `DisfluencyClassifier.onnx` — quantized ONNX model
- `tokenizer.json` — HuggingFace tokenizer configuration
- `tokenizer_config.json` — tokenizer metadata
- `label_map.json` — class ID to label mapping
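The schema of `label_map.json` isn't documented here; a common layout for token classifiers is a JSON object mapping class IDs (as strings) to label names, which would be consumed like the sketch below. The inline JSON is an assumption about the file's shape, not its actual shipped contents:

```python
import json

# Assumed shape of label_map.json; the real file may differ.
label_map_text = '{"0": "O", "1": "FP", "2": "RP", "3": "RV", "4": "PW"}'
id2label = {int(k): v for k, v in json.loads(label_map_text).items()}

# Map a row of argmaxed class IDs to their labels.
pred_ids = [1, 0, 0, 2, 0]
print([id2label[i] for i in pred_ids])  # → ['FP', 'O', 'O', 'RP', 'O']
```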
## How to regenerate

From the Mów repo root:

```sh
./scripts/export-disfluency-model.sh
```
This downloads the original PyTorch model from HuggingFace, exports to ONNX, and quantizes to INT8.
## License
CC BY 4.0 (same as the base model). Attribution: arielcerdap/modernbert-base-multiclass-disfluency-v2.