ssc-el-CY-mms-model-mix-adapt-max-lowlr
This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. The final checkpoint (step 6000) achieves the following results on the evaluation set:
- Loss: 1.3754
- CER (character error rate): 0.2342
- WER (word error rate): 0.6115
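The card does not say which library computed these metrics. As a reference point only, the sketch below shows the standard definitions: edit distance between reference and hypothesis, divided by the reference length, counted over words for WER and over characters for CER. The function names are illustrative and the text normalization actually used during evaluation is unknown.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Under these definitions a WER of 0.6115 means roughly 61 word-level edits per 100 reference words, while a CER of 0.2342 means roughly 23 character-level edits per 100 reference characters.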
Model description
More information needed
Intended uses & limitations
More information needed
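Until usage is documented, the following is a minimal sketch of how a checkpoint like this is commonly loaded with the Transformers `pipeline` API. It assumes the model is published under the repository ID shown, that the repository includes the processor and tokenizer files, and that audio decoding (for example via ffmpeg) is available locally. None of this is confirmed by the card, and `transcribe` is a hypothetical helper name.

```python
# Repository ID as listed on the model page; treat as an assumption if the model has moved.
MODEL_ID = "ctaguchi/ssc-el-CY-mms-model-mix-adapt-max-lowlr"


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file with the fine-tuned checkpoint (illustrative helper)."""
    # Imported lazily so the module can be inspected without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    return asr(audio_path)["text"]
```

A call would look like `transcribe("sample.wav")`, where `sample.wav` is a placeholder path. With a reported WER of about 0.61, transcripts should be expected to contain many word-level errors.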
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 10
- mixed_precision_training: Native AMP
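To make the relationship between these values concrete, the sketch below recomputes the effective batch size and approximates the learning-rate schedule in plain Python. It assumes a single training device (consistent with 8 × 2 = 16) and estimates the total number of optimizer steps from the last logged row of the results table (step 6000 at epoch 9.7886). The schedule follows the usual linear-warmup, linear-decay shape and is an approximation, not a dump of the actual training run.

```python
# Values copied from the hyperparameter list.
learning_rate = 3e-4
train_batch_size = 8
gradient_accumulation_steps = 2
warmup_steps = 100
num_epochs = 10

num_devices = 1  # assumption: not stated in the card

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Estimate of total optimizer steps, derived from the last logged row
# of the results table (step 6000 at epoch 9.7886).
steps_per_epoch = 6000 / 9.7886
total_steps = round(steps_per_epoch * num_epochs)


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup then linear decay."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", total_train_batch_size)
    print("estimated total steps:", total_steps)
    for s in (0, 50, 100, 3000, 6000):
        print(f"step {s:>4}: lr = {linear_lr(s):.6f}")
```

Under these assumptions the learning rate peaks at 0.0003 at step 100 and decays to zero at the estimated final step.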
Training results
| Training Loss | Epoch | Step | Validation Loss | CER | WER |
|---|---|---|---|---|---|
| 2.1950 | 0.3265 | 200 | 1.4337 | 0.3110 | 0.8452 |
| 1.2374 | 0.6531 | 400 | 1.3694 | 0.2952 | 0.7816 |
| 1.1455 | 0.9796 | 600 | 1.2702 | 0.2693 | 0.7370 |
| 1.0345 | 1.3053 | 800 | 1.3263 | 0.2626 | 0.7055 |
| 1.0057 | 1.6318 | 1000 | 1.3104 | 0.2551 | 0.6832 |
| 0.9932 | 1.9584 | 1200 | 1.3295 | 0.2602 | 0.6882 |
| 0.9756 | 2.2841 | 1400 | 1.3080 | 0.2554 | 0.6862 |
| 1.0163 | 2.6106 | 1600 | 1.2471 | 0.2479 | 0.6666 |
| 0.8878 | 2.9371 | 1800 | 1.3343 | 0.2516 | 0.6589 |
| 0.8929 | 3.2629 | 2000 | 1.2979 | 0.2415 | 0.6425 |
| 0.8487 | 3.5894 | 2200 | 1.3063 | 0.2436 | 0.6481 |
| 0.8473 | 3.9159 | 2400 | 1.2850 | 0.2427 | 0.6404 |
| 0.8445 | 4.2416 | 2600 | 1.3180 | 0.2412 | 0.6341 |
| 0.8127 | 4.5682 | 2800 | 1.3677 | 0.2379 | 0.6313 |
| 0.8213 | 4.8947 | 3000 | 1.3400 | 0.2395 | 0.6272 |
| 0.8358 | 5.2204 | 3200 | 1.3674 | 0.2445 | 0.6351 |
| 0.8756 | 5.5469 | 3400 | 1.3329 | 0.2353 | 0.6206 |
| 0.8017 | 5.8735 | 3600 | 1.3204 | 0.2360 | 0.6200 |
| 0.7917 | 6.1992 | 3800 | 1.3267 | 0.2317 | 0.6140 |
| 0.7755 | 6.5257 | 4000 | 1.3517 | 0.2380 | 0.6243 |
| 0.7652 | 6.8522 | 4200 | 1.3130 | 0.2369 | 0.6213 |
| 0.7336 | 7.1780 | 4400 | 1.3622 | 0.2375 | 0.6227 |
| 0.7668 | 7.5045 | 4600 | 1.3729 | 0.2370 | 0.6239 |
| 0.8038 | 7.8310 | 4800 | 1.3460 | 0.2388 | 0.6230 |
| 0.7281 | 8.1567 | 5000 | 1.3815 | 0.2360 | 0.6129 |
| 0.7599 | 8.4833 | 5200 | 1.3735 | 0.2313 | 0.6047 |
| 0.6680 | 8.8098 | 5400 | 1.3905 | 0.2330 | 0.6052 |
| 0.7421 | 9.1355 | 5600 | 1.3592 | 0.2311 | 0.6085 |
| 0.7173 | 9.4620 | 5800 | 1.3857 | 0.2356 | 0.6123 |
| 0.7326 | 9.7886 | 6000 | 1.3754 | 0.2342 | 0.6115 |
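The reported evaluation results correspond to the last row of the table, which is not necessarily the best checkpoint. The sketch below copies the step, validation loss, CER, and WER columns from the table and selects the best step for each metric. The variable names are illustrative, and whether intermediate checkpoints were saved is not stated in the card.

```python
# (step, validation_loss, cer, wer), copied from the training results table.
ROWS = [
    (200, 1.4337, 0.3110, 0.8452), (400, 1.3694, 0.2952, 0.7816),
    (600, 1.2702, 0.2693, 0.7370), (800, 1.3263, 0.2626, 0.7055),
    (1000, 1.3104, 0.2551, 0.6832), (1200, 1.3295, 0.2602, 0.6882),
    (1400, 1.3080, 0.2554, 0.6862), (1600, 1.2471, 0.2479, 0.6666),
    (1800, 1.3343, 0.2516, 0.6589), (2000, 1.2979, 0.2415, 0.6425),
    (2200, 1.3063, 0.2436, 0.6481), (2400, 1.2850, 0.2427, 0.6404),
    (2600, 1.3180, 0.2412, 0.6341), (2800, 1.3677, 0.2379, 0.6313),
    (3000, 1.3400, 0.2395, 0.6272), (3200, 1.3674, 0.2445, 0.6351),
    (3400, 1.3329, 0.2353, 0.6206), (3600, 1.3204, 0.2360, 0.6200),
    (3800, 1.3267, 0.2317, 0.6140), (4000, 1.3517, 0.2380, 0.6243),
    (4200, 1.3130, 0.2369, 0.6213), (4400, 1.3622, 0.2375, 0.6227),
    (4600, 1.3729, 0.2370, 0.6239), (4800, 1.3460, 0.2388, 0.6230),
    (5000, 1.3815, 0.2360, 0.6129), (5200, 1.3735, 0.2313, 0.6047),
    (5400, 1.3905, 0.2330, 0.6052), (5600, 1.3592, 0.2311, 0.6085),
    (5800, 1.3857, 0.2356, 0.6123), (6000, 1.3754, 0.2342, 0.6115),
]

# Lower is better for all three metrics.
best_by_loss = min(ROWS, key=lambda r: r[1])
best_by_cer = min(ROWS, key=lambda r: r[2])
best_by_wer = min(ROWS, key=lambda r: r[3])

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("lowest CER:", best_by_cer)
    print("lowest WER:", best_by_wer)
```

On the logged values, validation loss is lowest at step 1600 (1.2471), CER at step 5600 (0.2311), and WER at step 5200 (0.6047). Validation loss drifts upward after step 1600 while CER and WER keep improving, so the choice of checkpoint depends on which metric matters for the intended use.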
Framework versions
- Transformers 4.57.2
- Pytorch 2.9.1+cu128
- Datasets 3.6.0
- Tokenizers 0.22.0
Model tree
- Repository: ctaguchi/ssc-el-CY-mms-model-mix-adapt-max-lowlr
- Base model: facebook/mms-1b-all