# ModernBERT-large Disfluency Detection — Exp D (Mixed 80/20)

Fine-tuned from answerdotai/ModernBERT-large on mixed data (80% synthetic / 20% real). The setup is otherwise identical to Exp C (large, real-only), isolating the effect of synthetic data augmentation on a large model.
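For a quick qualitative check, the checkpoint can be loaded as a token-classification pipeline. A minimal sketch, assuming a hypothetical hub id (substitute the actual fine-tuned checkpoint) and the O/FP/RP/RV/PW tagset used throughout this card:

```python
from transformers import pipeline

# Hypothetical repo id: replace with the actual fine-tuned checkpoint or local path.
MODEL_ID = "your-username/modernbert-large-disfluency-expD"

# Token-level tagger; labels follow this card's tagset:
# O (fluent) plus the disfluency classes FP, RP, RV, PW.
tagger = pipeline("token-classification", model=MODEL_ID,
                  aggregation_strategy="simple")

print(tagger("I uh I want to to go home"))
```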
## Dataset

- Config: `mixed_8020` of `arielcerdap/disfluency-fluencybank` (loading sketch below)
- Train: 13,713 segments (80% synthetic / 20% real)
- Val/Test: identical to all previous experiments
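The config name above can be passed straight to `datasets`. A minimal loading sketch; the split names and per-example schema are assumptions:

```python
from datasets import load_dataset

# "mixed_8020" is the config name listed above; split names are assumed.
ds = load_dataset("arielcerdap/disfluency-fluencybank", "mixed_8020")

print(ds)              # expected: DatasetDict with train/validation/test splits
print(ds["train"][0])  # one segment, assumed to carry tokens and per-token tags
```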
## Comparison Table

All values are F1 scores; the Binary F1 computation is sketched below the table.

| Label | Paper (BERT) | Exp A (base) | Exp C (large) | Exp D (large+mix) |
|---|---|---|---|---|
| FP | 1.000 | 0.9944 | 0.9944 | 0.9944 |
| RP | 0.690 | 0.8022 | 0.8964 | 0.7253 |
| RV | 0.400 | 0.3145 | 0.4974 | 0.3410 |
| PW | 0.830 | 0.8879 | 0.9451 | 0.9348 |
| Macro | 0.730 | 0.7497 | 0.8333 | 0.7489 |
| Binary F1 | — | 0.8902 | 0.9250 | 0.8459 |
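The Binary F1 row presumably collapses the four disfluency labels into a single positive class scored against O. A minimal sketch of that collapse with scikit-learn; the flat, aligned tag-list input format is an assumption:

```python
from sklearn.metrics import f1_score

DISFLUENT = {"FP", "RP", "RV", "PW"}  # the four disfluency classes in this card

def binary_f1(true_tags, pred_tags):
    """Token-level F1 after mapping every disfluency label to one positive
    class. Expects flat, aligned lists of per-token tag strings."""
    y_true = [tag in DISFLUENT for tag in true_tags]
    y_pred = [tag in DISFLUENT for tag in pred_tags]
    return f1_score(y_true, y_pred)

# Toy example: one aligned reference/prediction pair.
print(binary_f1(["O", "FP", "RP", "O"], ["O", "FP", "RP", "RV"]))  # 0.8
```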
## Per-class Detail

| Label | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| O | 0.9795 | 0.9654 | 0.9724 | 3704 |
| FP | 0.9888 | 1.0000 | 0.9944 | 176 |
| RP | 0.6766 | 0.7816 | 0.7253 | 174 |
| RV | 0.2824 | 0.4302 | 0.3410 | 86 |
| PW | 0.9811 | 0.8927 | 0.9348 | 233 |
## Hyperparameters

- learning_rate: 5e-05
- effective batch_size: 32 (8 per device × 4 gradient accumulation steps)
- epochs: 15
- warmup_steps: 963
- weight_decay: 0.1
- classifier_dropout: 0.3
- focal_loss_gamma: 3.0 (adaptive; see the loss sketch below)
- class_weights: O=1.0, FP=3.0, RP=6.0, RV=20.0, PW=5.0
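A minimal sketch of how these settings could map onto a `transformers` Trainer run. The class-weighted focal loss follows the gamma and weights listed above (the "adaptive" gamma scheduling is not reproduced); the label order [O, FP, RP, RV, PW] and the Trainer subclass are assumptions:

```python
import torch
import torch.nn.functional as F
from transformers import Trainer, TrainingArguments

# Weights from this card, assuming label ids follow [O, FP, RP, RV, PW].
CLASS_WEIGHTS = torch.tensor([1.0, 3.0, 6.0, 20.0, 5.0])
GAMMA = 3.0  # focal_loss_gamma (fixed here; the card calls it "adaptive")

class FocalLossTrainer(Trainer):
    """Trainer variant applying class-weighted focal loss to token logits."""

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits                         # (batch, seq, num_labels)
        flat_logits = logits.view(-1, logits.size(-1))
        flat_labels = labels.view(-1)
        # Unweighted per-token cross-entropy; padding positions (-100) get 0.
        ce = F.cross_entropy(flat_logits, flat_labels,
                             ignore_index=-100, reduction="none")
        pt = torch.exp(-ce)                             # prob. of the true class
        weights = CLASS_WEIGHTS.to(flat_logits.device)
        w = weights[flat_labels.clamp(min=0)]           # clamp keeps -100 indexable
        mask = flat_labels != -100
        loss = (w * (1 - pt) ** GAMMA * ce)[mask].mean()
        return (loss, outputs) if return_outputs else loss

# Optimizer/schedule settings transcribed from the list above.
args = TrainingArguments(
    output_dir="modernbert-large-disfluency-expD",  # hypothetical path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,                  # effective batch size 32
    num_train_epochs=15,
    warmup_steps=963,
    weight_decay=0.1,
)
# classifier_dropout=0.3 would be set on the model config when loading the
# base checkpoint (answerdotai/ModernBERT-large).
```

The heavy RV weight (20.0) presumably compensates for that class being both the rarest (support 86) and the hardest (F1 0.3410) in the per-class table.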