ModernBERT-large Disfluency Detection — Exp C (Real Only)

Fine-tuned from answerdotai/ModernBERT-large on the real-only configuration of the FluencyBank Timestamped data. The setup is identical to Exp A (base), so that differences in results can be attributed to the larger model.

Dataset

  • Config: real_only from arielcerdap/disfluency-fluencybank
  • Train: 2737 segments (100% real)
  • Val/Test: identical to Exp A and B for direct comparison

Labels

O · FP (filled pause) · RP (repetition) · RV (revision) · PW (partial word)
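
The label set can be written as a pair of lookup tables. The sketch below is a minimal illustration: the index order is an assumption (the card does not state how labels map to ids), and the tagged utterance is a made-up example. The `to_binary` helper shows one plausible reading of the "Binary F1" metric, in which every disfluency label counts as a single positive class.

```python
# Assumed label order; the ids are illustrative and not taken from the model config.
LABELS = ["O", "FP", "RP", "RV", "PW"]
LABEL2ID = {label: i for i, label in enumerate(LABELS)}
ID2LABEL = {i: label for label, i in LABEL2ID.items()}


def to_binary(tags):
    """Collapse token tags to 0 (fluent, O) or 1 (any disfluency label)."""
    return [0 if tag == "O" else 1 for tag in tags]


# Hypothetical tagged utterance, for illustration only.
tokens = ["I", "um", "I", "went", "to", "the", "sto-", "store"]
tags = ["RP", "FP", "O", "O", "O", "O", "PW", "O"]

for token, tag, flag in zip(tokens, tags, to_binary(tags)):
    print(f"{token:6s} {tag:2s} {flag}")
```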

Test Results vs Exp A (base)

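| Label       | Exp A (base) F1 | Exp C (large) F1 |
|-------------|-----------------|------------------|
| FP          | 0.9944          | 0.9944           |
| RP          | 0.8022          | 0.8964           |
| RV          | 0.3145          | 0.4974           |
| PW          | 0.8879          | 0.9451           |
| Macro (dis) | 0.7497          | 0.8333           |
| Binary F1   | 0.8902          | 0.9250           |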
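
The "Macro (dis)" row is consistent with an unweighted mean of the four disfluency F1 scores, with O excluded. The following sketch recomputes it from the per-class values in the table under that assumption; the helper name `macro_dis` is hypothetical.

```python
from statistics import mean

DISFLUENCY_LABELS = ("FP", "RP", "RV", "PW")

# Per-class F1 scores copied from the results table.
F1 = {
    "Exp A (base)": {"FP": 0.9944, "RP": 0.8022, "RV": 0.3145, "PW": 0.8879},
    "Exp C (large)": {"FP": 0.9944, "RP": 0.8964, "RV": 0.4974, "PW": 0.9451},
}


def macro_dis(scores):
    """Unweighted mean F1 over the disfluency classes (O is excluded)."""
    return mean(scores[label] for label in DISFLUENCY_LABELS)


macro = {name: macro_dis(scores) for name, scores in F1.items()}

# Absolute F1 change per class when moving from base to large.
gains = {
    label: F1["Exp C (large)"][label] - F1["Exp A (base)"][label]
    for label in DISFLUENCY_LABELS
}
largest_gain = max(gains, key=gains.get)

for name, value in macro.items():
    print(f"{name}: macro (dis) = {value:.4f}")
print("largest per-class gain:", largest_gain)
```

On these numbers the largest per-class improvement is on RV, the hardest class in both experiments, while FP is unchanged.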

Per-class Detail

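| Label | Precision | Recall | F1     | Support |
|-------|-----------|--------|--------|---------|
| O     | 0.9892    | 0.9846 | 0.9869 | 3704    |
| FP    | 0.9888    | 1.0000 | 0.9944 | 176     |
| RP    | 0.8743    | 0.9195 | 0.8964 | 174     |
| RV    | 0.4563    | 0.5465 | 0.4974 | 86      |
| PW    | 0.9685    | 0.9227 | 0.9451 | 233     |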
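
As a quick consistency check, F1 can be recomputed from the precision and recall columns with the standard harmonic-mean formula. Because the published precision and recall are rounded to four decimals, the recomputed values may differ from the table in the last digit. The sketch also tallies the support column to show how skewed the test set is toward O.

```python
# (precision, recall, reported F1, support), copied from the per-class table.
ROWS = {
    "O": (0.9892, 0.9846, 0.9869, 3704),
    "FP": (0.9888, 1.0000, 0.9944, 176),
    "RP": (0.8743, 0.9195, 0.8964, 174),
    "RV": (0.4563, 0.5465, 0.4974, 86),
    "PW": (0.9685, 0.9227, 0.9451, 233),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


recomputed = {label: f1_score(p, r) for label, (p, r, _, _) in ROWS.items()}

total_tokens = sum(row[3] for row in ROWS.values())
disfluent_tokens = total_tokens - ROWS["O"][3]

for label, (_, _, reported, support) in ROWS.items():
    print(f"{label:2s} reported={reported:.4f} recomputed={recomputed[label]:.4f} n={support}")
print(f"disfluent share: {disfluent_tokens / total_tokens:.1%}")
```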

Hyperparameters

  • learning_rate: 5e-05
  • batch_size effective: 32 (8 × 4 grad_accum)
  • epochs: 15
  • warmup_steps: 191
  • weight_decay: 0.1
  • classifier_dropout: 0.3
  • focal_loss_gamma: 3.0 (adaptive)
  • class_weights: O=1.0, FP=3.0, RP=6.0, RV=20.0, PW=5.0
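
To illustrate how `focal_loss_gamma` and `class_weights` interact, here is a minimal pure-Python sketch of a class-weighted focal loss for a single token. It is not the training code for this model: the card does not say what "adaptive" means, so the sketch uses a fixed gamma, and the logit order is an assumption. A real training loop would use a vectorized tensor implementation instead.

```python
import math

# Values from the hyperparameter list; the dict order doubles as the assumed logit order.
CLASS_WEIGHTS = {"O": 1.0, "FP": 3.0, "RP": 6.0, "RV": 20.0, "PW": 5.0}
LABELS = list(CLASS_WEIGHTS)
GAMMA = 3.0  # fixed here; the "adaptive" schedule is not described in the card


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=GAMMA):
    """Weighted focal loss for one token: -w_c * (1 - p_t)^gamma * log(p_t)."""
    p_t = softmax(logits)[LABELS.index(target)]
    return -CLASS_WEIGHTS[target] * (1.0 - p_t) ** gamma * math.log(p_t)


uniform = [0.0] * len(LABELS)          # model has no preference
confident = [0.0, 0.0, 0.0, 4.0, 0.0]  # model strongly predicts RV

print("uncertain RV token:", focal_loss(uniform, "RV"))
print("confident RV token:", focal_loss(confident, "RV"))
```

With gamma set to 0 and a weight of 1.0 the expression reduces to ordinary cross-entropy. A larger gamma shrinks the contribution of tokens the model already classifies confidently, and the class weight scales the loss for rare labels such as RV.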

Model size: 0.4B params (Safetensors, tensor type F32)

Model: arielcerdap/modernbert-disfluency-expC-large-realonly (trained on arielcerdap/disfluency-fluencybank)