wav2vec2-xls-r-300m-en-phoneme-ctc-60h-noisy

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1492
  • PER (phoneme error rate): 0.0393
  • Phoneme Accuracy: 0.9607
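
The card does not say how PER is computed. A common definition, assumed here, is the phoneme-level edit distance divided by the number of reference phonemes, with phoneme accuracy reported as 1 - PER (consistent with the figures above: 1 - 0.0393 = 0.9607). The helper names and the sample sequences below are hypothetical and only illustrate that definition.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(references, hypotheses):
    """Corpus-level PER: total edit distance divided by total reference length."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_ref = sum(len(r) for r in references)
    return total_edits / total_ref


# Hypothetical phoneme sequences, for illustration only (not from the evaluation set).
refs = [["HH", "AH", "L", "OW"], ["W", "ER", "L", "D"]]
hyps = [["HH", "AH", "L", "OW"], ["W", "ER", "D"]]

per = phoneme_error_rate(refs, hyps)  # one deletion over eight reference phonemes
accuracy = 1.0 - per
print(f"PER={per:.4f} accuracy={accuracy:.4f}")
```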

Model description

More information needed
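
The model name indicates a CTC head over phoneme labels. As a minimal sketch of how frame-level CTC predictions are usually turned into a phoneme sequence (greedy decoding: merge consecutive repeats, then drop blanks), assuming a blank id of 0 and a toy vocabulary that are not taken from this model:

```python
def ctc_greedy_collapse(frame_ids, blank_id=0):
    """Collapse per-frame argmax ids: merge consecutive repeats, then drop blanks."""
    out = []
    prev = None
    for idx in frame_ids:
        if idx != prev and idx != blank_id:
            out.append(idx)
        prev = idx
    return out


# Hypothetical vocabulary and frame predictions, for illustration only.
vocab = {0: "<blank>", 1: "K", 2: "AE", 3: "T"}
frames = [0, 1, 1, 0, 2, 2, 2, 0, 0, 3, 3]

ids = ctc_greedy_collapse(frames)
phonemes = [vocab[i] for i in ids]
print(phonemes)
```

A blank between two identical labels keeps both, which is how CTC represents a genuinely repeated phoneme.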

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 24
  • eval_batch_size: 24
  • seed: 42
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 10
  • mixed_precision_training: Native AMP
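
To make the scheduler settings concrete, here is a small sketch of a linear schedule with warmup using the learning rate and warmup ratio listed above. The total step count is an assumption: it is estimated from the training results table (step 500 corresponds to epoch 0.3674, so roughly 1361 steps per epoch over 10 epochs). The function name is hypothetical.

```python
import math

BASE_LR = 5e-5        # learning_rate from the card
WARMUP_RATIO = 0.05   # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 13_610  # assumption: ~1361 steps/epoch * 10 epochs, estimated from the results table


def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`: linear ramp up during warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


for step in (0, 340, 681, 7000, 13_610):
    print(step, linear_lr(step))
```

Under these assumptions warmup ends after 681 steps, which is where the learning rate peaks at 5e-05.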

Training results

| Training Loss | Epoch  | Step  | Validation Loss | PER    | Phoneme Accuracy |
|---------------|--------|-------|-----------------|--------|------------------|
| 3.8716        | 0.3674 | 500   | 3.7213          | 1.0    | 0.0              |
| 3.5698        | 0.7348 | 1000  | 3.5655          | 1.0    | 0.0              |
| 3.2458        | 1.1021 | 1500  | 2.8473          | 0.9143 | 0.0857           |
| 0.7065        | 1.4695 | 2000  | 0.3996          | 0.0893 | 0.9107           |
| 0.5249        | 1.8369 | 2500  | 0.2702          | 0.0657 | 0.9343           |
| 0.4425        | 2.2043 | 3000  | 0.2217          | 0.0567 | 0.9433           |
| 0.3613        | 2.5716 | 3500  | 0.1995          | 0.0515 | 0.9485           |
| 0.3521        | 2.9390 | 4000  | 0.1897          | 0.0485 | 0.9515           |
| 0.3083        | 3.3064 | 4500  | 0.1808          | 0.0468 | 0.9532           |
| 0.3115        | 3.6738 | 5000  | 0.1823          | 0.0461 | 0.9539           |
| 0.2905        | 4.0411 | 5500  | 0.1667          | 0.0448 | 0.9552           |
| 0.2948        | 4.4085 | 6000  | 0.1700          | 0.0441 | 0.9559           |
| 0.2578        | 4.7759 | 6500  | 0.1682          | 0.0437 | 0.9563           |
| 0.2351        | 5.1433 | 7000  | 0.1586          | 0.0424 | 0.9576           |
| 0.244         | 5.5107 | 7500  | 0.1601          | 0.0422 | 0.9578           |
| 0.2318        | 5.8780 | 8000  | 0.1541          | 0.0414 | 0.9586           |
| 0.2667        | 6.2454 | 8500  | 0.1557          | 0.0413 | 0.9587           |
| 0.2381        | 6.6128 | 9000  | 0.1527          | 0.0405 | 0.9595           |
| 0.2317        | 6.9802 | 9500  | 0.1498          | 0.0401 | 0.9599           |
| 0.1982        | 7.3475 | 10000 | 0.1518          | 0.0406 | 0.9594           |
| 0.2075        | 7.7149 | 10500 | 0.1492          | 0.0404 | 0.9596           |
| 0.2049        | 8.0823 | 11000 | 0.1530          | 0.0396 | 0.9604           |
| 0.1926        | 8.4497 | 11500 | 0.1501          | 0.0393 | 0.9607           |
| 0.2129        | 8.8170 | 12000 | 0.1463          | 0.0395 | 0.9605           |
| 0.2193        | 9.1844 | 12500 | 0.1478          | 0.0393 | 0.9607           |
| 0.1999        | 9.5518 | 13000 | 0.1483          | 0.0390 | 0.9610           |
| 0.2192        | 9.9192 | 13500 | 0.1492          | 0.0393 | 0.9607           |
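
The headline figures at the top of the card match the last evaluation row (step 13500), not the best row. As a quick sketch, the snippet below scans the evaluation rows copied from the table above and reports where validation loss and PER are lowest. Whether intermediate checkpoints were saved is not stated in the card, so this only describes the logged metrics.

```python
# Evaluation rows copied from the results table: (step, validation loss, PER).
ROWS = [
    (500, 3.7213, 1.0), (1000, 3.5655, 1.0), (1500, 2.8473, 0.9143),
    (2000, 0.3996, 0.0893), (2500, 0.2702, 0.0657), (3000, 0.2217, 0.0567),
    (3500, 0.1995, 0.0515), (4000, 0.1897, 0.0485), (4500, 0.1808, 0.0468),
    (5000, 0.1823, 0.0461), (5500, 0.1667, 0.0448), (6000, 0.1700, 0.0441),
    (6500, 0.1682, 0.0437), (7000, 0.1586, 0.0424), (7500, 0.1601, 0.0422),
    (8000, 0.1541, 0.0414), (8500, 0.1557, 0.0413), (9000, 0.1527, 0.0405),
    (9500, 0.1498, 0.0401), (10000, 0.1518, 0.0406), (10500, 0.1492, 0.0404),
    (11000, 0.1530, 0.0396), (11500, 0.1501, 0.0393), (12000, 0.1463, 0.0395),
    (12500, 0.1478, 0.0393), (13000, 0.1483, 0.0390), (13500, 0.1492, 0.0393),
]

best_loss = min(ROWS, key=lambda row: row[1])  # lowest validation loss
best_per = min(ROWS, key=lambda row: row[2])   # lowest phoneme error rate
final = ROWS[-1]                               # last logged evaluation

print("lowest validation loss:", best_loss)  # step 12000, loss 0.1463
print("lowest PER:", best_per)               # step 13000, PER 0.0390
print("final evaluation:", final)            # step 13500, loss 0.1492, PER 0.0393
```

The differences are small (0.0003 in PER between the best and final rows), so the curve has effectively plateaued by the last two epochs.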

Framework versions

  • Transformers 4.57.3
  • PyTorch 2.9.1+cu126
  • Datasets 4.4.2
  • Tokenizers 0.22.1

Model tree for bobboyms/wav2vec2-xls-r-300m-en-phoneme-ctc-60h-noisy: fine-tuned from facebook/wav2vec2-xls-r-300m.