ssc-ukv-mms-model-mix-adapt-max-lowlr
This model was trained from scratch on an unknown dataset. At the final logged evaluation step (step 3600), it achieves the following results on the evaluation set:
- Loss: 0.5083
- Cer (character error rate): 0.1348
- Wer (word error rate): 0.4147
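The card does not say how the error rates were computed. As a point of reference, the conventional definitions divide the Levenshtein edit distance between reference and hypothesis by the reference length, counted in words for WER and in characters for CER. The sketch below is a minimal pure-Python illustration of those definitions; the function names are hypothetical, and the evaluation script used for this model may differ in text normalization or in how it aggregates over utterances.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    print(f"WER = {wer(ref, hyp):.4f}, CER = {cer(ref, hyp):.4f}")
```

Under these definitions, a Wer of 0.4147 corresponds to roughly 41 word-level edits per 100 reference words.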
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 1
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 2
- optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 5
- mixed_precision_training: Native AMP
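Two of the values above follow from the others: the total train batch size is the per-device batch size times the gradient accumulation steps, and the learning rate at any step follows from the linear schedule with 100 warmup steps. The sketch below works through both. It assumes a single training device, and it assumes about 3765 total optimizer steps, an estimate extrapolated from the results table (step 3600 at epoch 4.7814) rather than a value stated in this card. The function names are hypothetical.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Samples contributing to each optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step, base_lr=3e-4, warmup_steps=100, total_steps=3765):
    """Linear warmup from 0 to base_lr, then linear decay to 0.

    total_steps=3765 is an assumed estimate, not a value from the card.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    # train_batch_size=1 and gradient_accumulation_steps=2, as listed above
    print("effective batch size:", effective_batch_size(1, 2))
    for step in (0, 50, 100, 2000, 3600):
        print(f"step {step:>4}: lr = {linear_lr(step):.6f}")
```

With a batch of only 2 samples per update, the peak learning rate is reached after 200 training samples, so the warmup phase covers a small fraction of the first epoch.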
Training results
| Training Loss | Epoch | Step | Validation Loss | Cer | Wer |
|---|---|---|---|---|---|
| 0.6983 | 0.2658 | 200 | 0.5624 | 0.1404 | 0.4446 |
| 0.6518 | 0.5316 | 400 | 0.5353 | 0.1406 | 0.4348 |
| 0.5823 | 0.7973 | 600 | 0.5431 | 0.1381 | 0.4301 |
| 0.573 | 1.0625 | 800 | 0.5462 | 0.1382 | 0.4358 |
| 0.6338 | 1.3282 | 1000 | 0.5299 | 0.1370 | 0.4257 |
| 0.5693 | 1.5940 | 1200 | 0.5296 | 0.1369 | 0.4292 |
| 0.5789 | 1.8598 | 1400 | 0.5216 | 0.1366 | 0.4249 |
| 0.5366 | 2.1249 | 1600 | 0.5204 | 0.1388 | 0.4340 |
| 0.518 | 2.3907 | 1800 | 0.5237 | 0.1360 | 0.4219 |
| 0.6573 | 2.6565 | 2000 | 0.5170 | 0.1346 | 0.4158 |
| 0.557 | 2.9223 | 2200 | 0.5187 | 0.1356 | 0.4189 |
| 0.5135 | 3.1874 | 2400 | 0.5164 | 0.1352 | 0.4182 |
| 0.4979 | 3.4532 | 2600 | 0.5103 | 0.1349 | 0.4168 |
| 0.5908 | 3.7189 | 2800 | 0.5137 | 0.1350 | 0.4175 |
| 0.553 | 3.9847 | 3000 | 0.5103 | 0.1351 | 0.4172 |
| 0.5559 | 4.2498 | 3200 | 0.5115 | 0.1352 | 0.4184 |
| 0.5706 | 4.5156 | 3400 | 0.5094 | 0.1347 | 0.4169 |
| 0.5647 | 4.7814 | 3600 | 0.5083 | 0.1348 | 0.4147 |
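Because the three metrics do not improve in lockstep, it can help to check which logged step is best under each one. The sketch below copies the step, validation loss, Cer, and Wer columns from the table above and picks the row with the lowest value per metric. The helper name `best_by` is hypothetical.

```python
# (step, validation_loss, cer, wer), copied from the training results table
ROWS = [
    (200, 0.5624, 0.1404, 0.4446),
    (400, 0.5353, 0.1406, 0.4348),
    (600, 0.5431, 0.1381, 0.4301),
    (800, 0.5462, 0.1382, 0.4358),
    (1000, 0.5299, 0.1370, 0.4257),
    (1200, 0.5296, 0.1369, 0.4292),
    (1400, 0.5216, 0.1366, 0.4249),
    (1600, 0.5204, 0.1388, 0.4340),
    (1800, 0.5237, 0.1360, 0.4219),
    (2000, 0.5170, 0.1346, 0.4158),
    (2200, 0.5187, 0.1356, 0.4189),
    (2400, 0.5164, 0.1352, 0.4182),
    (2600, 0.5103, 0.1349, 0.4168),
    (2800, 0.5137, 0.1350, 0.4175),
    (3000, 0.5103, 0.1351, 0.4172),
    (3200, 0.5115, 0.1352, 0.4184),
    (3400, 0.5094, 0.1347, 0.4169),
    (3600, 0.5083, 0.1348, 0.4147),
]

COLUMNS = {"validation_loss": 1, "cer": 2, "wer": 3}


def best_by(rows, metric):
    """Return the row with the lowest value for the named metric."""
    return min(rows, key=lambda row: row[COLUMNS[metric]])


if __name__ == "__main__":
    for metric in COLUMNS:
        row = best_by(ROWS, metric)
        print(f"lowest {metric}: {row[COLUMNS[metric]]} at step {row[0]}")
```

On these rows, the lowest validation loss and Wer both occur at step 3600, while the lowest Cer (0.1346) occurs at step 2000.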
Framework versions
- Transformers 4.57.2
- Pytorch 2.9.1+cu128
- Datasets 3.6.0
- Tokenizers 0.22.0