ssc-led-mms-model-mix-adapt-max3-devtrain
This model was trained from scratch on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3148
- Cer: 0.0905
- Wer: 0.2510
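For readers unfamiliar with the two error rates, the following is a minimal sketch of how CER and WER are commonly computed: the edit distance between reference and hypothesis, divided by the reference length, counted in characters or words. The card does not say which library or text normalization produced the numbers above, so the functions `edit_distance`, `wer`, and `cer` are illustrative assumptions, not the evaluation code used for this model.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits / number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits / reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # Hypothetical sentence pair, not taken from the evaluation set.
    print(wer("the cat sat", "the cat sit"))
    print(cer("the cat sat", "the cat sit"))
```

Under this definition, a Wer of 0.2510 would mean roughly one word-level edit for every four reference words.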
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 1
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 2
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 5
- mixed_precision_training: Native AMP
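As a rough illustration of how these values relate, the sketch below derives the total train batch size from the per-device batch size and gradient accumulation, and evaluates a linear schedule with warmup. Two things are assumptions here: `num_devices=1` (which is consistent with 1 × 2 = 2 above) and `total_steps=3730`, an estimate from the roughly 746 steps per epoch implied by the logged Epoch/Step pairs. The function names are hypothetical helpers, not part of any library.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Samples contributing to each optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step, base_lr=5e-4, warmup_steps=100, total_steps=3730):
    """Linear warmup from 0 to base_lr, then linear decay to 0.

    total_steps is an estimate (assumption), not a value from the card.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(1, 2))
    for step in (0, 50, 100, 2000, 3730):
        print(step, linear_lr(step))
```

With these settings the learning rate peaks at 0.0005 at step 100 and then falls steadily for the rest of training.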
Training results
| Training Loss | Epoch | Step | Validation Loss | Cer | Wer |
|:-------------:|:-----:|:----:|:---------------:|:---:|:---:|
| 0.3221 | 0.2683 | 200 | 0.3254 | 0.0958 | 0.2812 |
| 0.3633 | 0.5366 | 400 | 0.3317 | 0.0969 | 0.2750 |
| 0.3513 | 0.8048 | 600 | 0.3423 | 0.0981 | 0.2753 |
| 0.3447 | 1.0724 | 800 | 0.3362 | 0.0941 | 0.2638 |
| 0.3403 | 1.3407 | 1000 | 0.3415 | 0.0985 | 0.2782 |
| 0.3358 | 1.6090 | 1200 | 0.3332 | 0.0954 | 0.2685 |
| 0.3146 | 1.8773 | 1400 | 0.3357 | 0.0977 | 0.2825 |
| 0.2648 | 2.1449 | 1600 | 0.3264 | 0.0938 | 0.2657 |
| 0.2934 | 2.4131 | 1800 | 0.3279 | 0.0953 | 0.2756 |
| 0.3001 | 2.6814 | 2000 | 0.3242 | 0.0948 | 0.2671 |
| 0.2749 | 2.9497 | 2200 | 0.3238 | 0.0928 | 0.2613 |
| 0.2686 | 3.2173 | 2400 | 0.3229 | 0.0925 | 0.2604 |
| 0.2461 | 3.4856 | 2600 | 0.3239 | 0.0934 | 0.2642 |
| 0.2485 | 3.7539 | 2800 | 0.3190 | 0.0923 | 0.2587 |
| 0.2703 | 4.0215 | 3000 | 0.3172 | 0.0909 | 0.2542 |
| 0.2166 | 4.2897 | 3200 | 0.3192 | 0.0899 | 0.2497 |
| 0.2312 | 4.5580 | 3400 | 0.3170 | 0.0902 | 0.2504 |
| 0.2244 | 4.8263 | 3600 | 0.3148 | 0.0905 | 0.2510 |
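The evaluation rows above can be scanned programmatically to see which checkpoint scores best on each metric. The sketch below copies the Step, Validation Loss, Cer, and Wer values from the results and picks the minimum of each; the variable names are illustrative. Whether intermediate checkpoints were actually saved is not stated in this card, so treat the result as a reading of the log rather than a pointer to an available checkpoint.

```python
# (step, validation_loss, cer, wer), copied from the training results
rows = [
    (200, 0.3254, 0.0958, 0.2812),
    (400, 0.3317, 0.0969, 0.2750),
    (600, 0.3423, 0.0981, 0.2753),
    (800, 0.3362, 0.0941, 0.2638),
    (1000, 0.3415, 0.0985, 0.2782),
    (1200, 0.3332, 0.0954, 0.2685),
    (1400, 0.3357, 0.0977, 0.2825),
    (1600, 0.3264, 0.0938, 0.2657),
    (1800, 0.3279, 0.0953, 0.2756),
    (2000, 0.3242, 0.0948, 0.2671),
    (2200, 0.3238, 0.0928, 0.2613),
    (2400, 0.3229, 0.0925, 0.2604),
    (2600, 0.3239, 0.0934, 0.2642),
    (2800, 0.3190, 0.0923, 0.2587),
    (3000, 0.3172, 0.0909, 0.2542),
    (3200, 0.3192, 0.0899, 0.2497),
    (3400, 0.3170, 0.0902, 0.2504),
    (3600, 0.3148, 0.0905, 0.2510),
]

best_loss = min(rows, key=lambda r: r[1])  # lowest validation loss
best_cer = min(rows, key=lambda r: r[2])   # lowest character error rate
best_wer = min(rows, key=lambda r: r[3])   # lowest word error rate

if __name__ == "__main__":
    print("best validation loss at step", best_loss[0])
    print("best Cer at step", best_cer[0])
    print("best Wer at step", best_wer[0])
```

The final step (3600) has the lowest validation loss, while step 3200 has marginally lower Cer and Wer; the differences are small enough that they may fall within evaluation noise.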
Framework versions
- Transformers 4.52.1
- Pytorch 2.9.1+cu128
- Datasets 3.6.0
- Tokenizers 0.21.4