ssc-ady-mms-model-mix-adapt-max-lowlr
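This model is a fine-tuned version of facebook/mms-1b-all on an unspecified dataset (the training script recorded the dataset name as None). It achieves the following results on the evaluation set, which match the final logged evaluation at step 21600: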
- Loss: 3.2081
- Cer: 0.3836
- Wer: 1.1772
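
A WER above 1 is not necessarily a reporting error: insertions count as errors, so a hypothesis with more words than the reference can accumulate more edits than the reference has words. The sketch below assumes the standard edit-distance definitions of WER and CER; the card does not state how its metrics were computed, and the function names here are illustrative only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: the hypothesis splits each word in two.
reference = "hello world"
hypothesis = "hel lo wor ld"
example_wer = wer(reference, hypothesis)  # 4 word edits / 2 reference words
example_cer = cer(reference, hypothesis)  # 2 inserted spaces / 11 characters
```

In this made-up example the character-level output is nearly correct while every word is counted as wrong, which is one way a moderate CER can coexist with a WER above 1.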
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 30
- mixed_precision_training: Native AMP
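
Two of the values above are derived rather than set directly, and a short sketch may make them concrete. It assumes a single training device (consistent with 8 x 2 = 16) and the usual linear-warmup-then-linear-decay shape for a `linear` scheduler. `ASSUMED_TOTAL_STEPS` is an estimate from the logged epoch/step ratio, not a figure stated in this card, and `linear_lr` is an illustrative helper, not the training code.

```python
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 2
LEARNING_RATE = 0.0003
WARMUP_STEPS = 100
ASSUMED_TOTAL_STEPS = 21_780  # assumption: about 726 optimizer steps per epoch x 30 epochs

# Effective batch size per optimizer step (single device assumed).
total_train_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS,
              total=ASSUMED_TOTAL_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup:
        # Ramp up linearly from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr at the end of warmup to 0 at the last step.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


lr_mid_warmup = linear_lr(50)               # halfway through warmup
lr_peak = linear_lr(WARMUP_STEPS)           # peak learning rate
lr_end = linear_lr(ASSUMED_TOTAL_STEPS)     # end of training
```

Under these assumptions the learning rate peaks at 0.0003 after 100 steps and then decays toward zero over the rest of training.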
Training results
| Training Loss | Epoch | Step | Validation Loss | Cer | Wer |
|---|---|---|---|---|---|
| 3.6455 | 0.2757 | 200 | 3.7070 | 0.8617 | 1.0127 |
| 2.9122 | 0.5513 | 400 | 2.2209 | 0.5506 | 1.1299 |
| 1.431 | 0.8270 | 600 | 1.4588 | 0.2989 | 0.9696 |
| 1.0536 | 1.1020 | 800 | 1.0366 | 0.2482 | 0.9074 |
| 0.8802 | 1.3777 | 1000 | 0.7922 | 0.1982 | 0.8436 |
| 0.7653 | 1.6533 | 1200 | 0.8002 | 0.1798 | 0.8048 |
| 0.7314 | 1.9290 | 1400 | 0.6938 | 0.1772 | 0.7945 |
| 0.6541 | 2.2040 | 1600 | 0.6886 | 0.1688 | 0.7730 |
| 0.7196 | 2.4797 | 1800 | 0.6683 | 0.1644 | 0.7625 |
| 0.6347 | 2.7553 | 2000 | 0.6049 | 0.1659 | 0.7694 |
| 0.6022 | 3.0303 | 2200 | 0.5791 | 0.1587 | 0.7362 |
| 0.5968 | 3.3060 | 2400 | 0.6022 | 0.1557 | 0.7290 |
| 0.5944 | 3.5817 | 2600 | 0.5807 | 0.1574 | 0.7271 |
| 0.5261 | 3.8573 | 2800 | 0.5674 | 0.1530 | 0.7226 |
| 0.5391 | 4.1323 | 3000 | 0.5609 | 0.1543 | 0.7276 |
| 0.5872 | 4.4080 | 3200 | 0.6383 | 0.1656 | 0.7345 |
| 0.5658 | 4.6837 | 3400 | 0.6094 | 0.1537 | 0.7429 |
| 0.5499 | 4.9593 | 3600 | 0.5879 | 0.1521 | 0.7111 |
| 0.5567 | 5.2343 | 3800 | 0.5392 | 0.1574 | 0.7271 |
| 0.5838 | 5.5100 | 4000 | 0.5671 | 0.1531 | 0.7242 |
| 0.6511 | 5.7857 | 4200 | 0.7079 | 0.1646 | 0.7558 |
| 0.7377 | 6.0606 | 4400 | 0.6979 | 0.1679 | 0.7656 |
| 0.9913 | 6.3363 | 4600 | 1.0002 | 0.2081 | 0.8551 |
| 1.4417 | 6.6120 | 4800 | 1.4586 | 0.3675 | 1.0175 |
| 1.8834 | 6.8877 | 5000 | 1.8701 | 0.4284 | 1.0514 |
| 1.9547 | 7.1626 | 5200 | 1.9173 | 0.4393 | 1.1942 |
| 1.9077 | 7.4383 | 5400 | 1.8904 | 0.4376 | 1.0328 |
| 1.92 | 7.7140 | 5600 | 1.7565 | 0.4748 | 1.0677 |
| 1.8001 | 7.9897 | 5800 | 1.6262 | 0.3889 | 1.0464 |
| 2.0177 | 8.2646 | 6000 | 1.8403 | 0.3779 | 1.1019 |
| 2.1182 | 8.5403 | 6200 | 2.0323 | 0.3515 | 1.0517 |
| 2.393 | 8.8160 | 6400 | 2.3072 | 0.3382 | 1.0292 |
| 2.6789 | 9.0910 | 6600 | 2.7029 | 0.3212 | 1.0010 |
| 3.0282 | 9.3666 | 6800 | 2.8690 | 0.3231 | 1.0356 |
| 3.2046 | 9.6423 | 7000 | 3.0714 | 0.3164 | 1.0804 |
| 3.3921 | 9.9180 | 7200 | 3.2561 | 0.3614 | 1.2050 |
| 3.7754 | 10.1930 | 7400 | 3.6960 | 0.4188 | 1.3707 |
| 3.6393 | 10.4686 | 7600 | 3.3532 | 0.3763 | 1.2712 |
| 3.3758 | 10.7443 | 7800 | 3.3073 | 0.4073 | 1.2590 |
| 3.3032 | 11.0193 | 8000 | 3.2004 | 0.3817 | 1.1811 |
| 3.1949 | 11.2950 | 8200 | 3.2079 | 0.3837 | 1.1796 |
| 3.1668 | 11.5706 | 8400 | 3.2079 | 0.3834 | 1.1794 |
| 3.3528 | 11.8463 | 8600 | 3.2079 | 0.3837 | 1.1782 |
| 3.3205 | 12.1213 | 8800 | 3.2079 | 0.3836 | 1.1787 |
| 3.3361 | 12.3970 | 9000 | 3.2080 | 0.3832 | 1.1789 |
| 3.2947 | 12.6726 | 9200 | 3.2082 | 0.3838 | 1.1794 |
| 3.2295 | 12.9483 | 9400 | 3.2080 | 0.3838 | 1.1803 |
| 3.3105 | 13.2233 | 9600 | 3.2080 | 0.3830 | 1.1791 |
| 3.2781 | 13.4990 | 9800 | 3.2078 | 0.3836 | 1.1796 |
| 3.2698 | 13.7746 | 10000 | 3.2080 | 0.3837 | 1.1787 |
| 3.3087 | 14.0496 | 10200 | 3.2081 | 0.3835 | 1.1784 |
| 3.3047 | 14.3253 | 10400 | 3.2080 | 0.3832 | 1.1791 |
| 3.3498 | 14.6010 | 10600 | 3.2079 | 0.3832 | 1.1782 |
| 3.2942 | 14.8766 | 10800 | 3.2086 | 0.3835 | 1.1777 |
| 3.3476 | 15.1516 | 11000 | 3.2081 | 0.3829 | 1.1791 |
| 3.2896 | 15.4273 | 11200 | 3.2080 | 0.3833 | 1.1799 |
| 3.2675 | 15.7030 | 11400 | 3.2085 | 0.3827 | 1.1775 |
| 3.2612 | 15.9786 | 11600 | 3.2081 | 0.3841 | 1.1794 |
| 3.3005 | 16.2536 | 11800 | 3.2081 | 0.3841 | 1.1789 |
| 3.375 | 16.5293 | 12000 | 3.2083 | 0.3833 | 1.1789 |
| 3.2658 | 16.8050 | 12200 | 3.2083 | 0.3830 | 1.1789 |
| 3.3458 | 17.0799 | 12400 | 3.2080 | 0.3831 | 1.1796 |
| 3.3581 | 17.3556 | 12600 | 3.2080 | 0.3834 | 1.1791 |
| 3.2375 | 17.6313 | 12800 | 3.2078 | 0.3827 | 1.1787 |
| 3.2324 | 17.9070 | 13000 | 3.2081 | 0.3837 | 1.1779 |
| 3.2995 | 18.1819 | 13200 | 3.2078 | 0.3836 | 1.1779 |
| 3.2778 | 18.4576 | 13400 | 3.2080 | 0.3834 | 1.1789 |
| 3.204 | 18.7333 | 13600 | 3.2079 | 0.3837 | 1.1789 |
| 3.3797 | 19.0083 | 13800 | 3.2079 | 0.3842 | 1.1789 |
| 3.3611 | 19.2839 | 14000 | 3.2080 | 0.3835 | 1.1787 |
| 3.2607 | 19.5596 | 14200 | 3.2081 | 0.3837 | 1.1789 |
| 3.2885 | 19.8353 | 14400 | 3.2079 | 0.3836 | 1.1796 |
| 3.2039 | 20.1103 | 14600 | 3.2081 | 0.3841 | 1.1775 |
| 3.2986 | 20.3859 | 14800 | 3.2082 | 0.3838 | 1.1772 |
| 3.3355 | 20.6616 | 15000 | 3.2082 | 0.3830 | 1.1765 |
| 3.3123 | 20.9373 | 15200 | 3.2080 | 0.3828 | 1.1772 |
| 3.3064 | 21.2123 | 15400 | 3.2079 | 0.3831 | 1.1768 |
| 3.3206 | 21.4879 | 15600 | 3.2078 | 0.3840 | 1.1789 |
| 3.3813 | 21.7636 | 15800 | 3.2081 | 0.3831 | 1.1791 |
| 3.3911 | 22.0386 | 16000 | 3.2081 | 0.3830 | 1.1784 |
| 3.2509 | 22.3143 | 16200 | 3.2082 | 0.3836 | 1.1791 |
| 3.1714 | 22.5899 | 16400 | 3.2080 | 0.3834 | 1.1784 |
| 3.3419 | 22.8656 | 16600 | 3.2078 | 0.3836 | 1.1789 |
| 3.264 | 23.1406 | 16800 | 3.2080 | 0.3834 | 1.1791 |
| 3.2609 | 23.4163 | 17000 | 3.2079 | 0.3836 | 1.1794 |
| 3.212 | 23.6919 | 17200 | 3.2083 | 0.3837 | 1.1775 |
| 3.3195 | 23.9676 | 17400 | 3.2076 | 0.3830 | 1.1791 |
| 3.2974 | 24.2426 | 17600 | 3.2077 | 0.3828 | 1.1784 |
| 3.2724 | 24.5183 | 17800 | 3.2080 | 0.3830 | 1.1782 |
| 3.3547 | 24.7939 | 18000 | 3.2082 | 0.3830 | 1.1782 |
| 3.2917 | 25.0689 | 18200 | 3.2083 | 0.3831 | 1.1775 |
| 3.3446 | 25.3446 | 18400 | 3.2080 | 0.3829 | 1.1787 |
| 3.2528 | 25.6203 | 18600 | 3.2081 | 0.3835 | 1.1789 |
| 3.3191 | 25.8959 | 18800 | 3.2078 | 0.3836 | 1.1794 |
| 3.401 | 26.1709 | 19000 | 3.2082 | 0.3843 | 1.1787 |
| 3.2492 | 26.4466 | 19200 | 3.2079 | 0.3836 | 1.1784 |
| 3.224 | 26.7223 | 19400 | 3.2080 | 0.3832 | 1.1784 |
| 3.3592 | 26.9979 | 19600 | 3.2079 | 0.3842 | 1.1794 |
| 3.3822 | 27.2729 | 19800 | 3.2082 | 0.3829 | 1.1772 |
| 3.324 | 27.5486 | 20000 | 3.2077 | 0.3839 | 1.1782 |
| 3.2325 | 27.8243 | 20200 | 3.2080 | 0.3829 | 1.1777 |
| 3.2622 | 28.0992 | 20400 | 3.2077 | 0.3829 | 1.1779 |
| 3.2791 | 28.3749 | 20600 | 3.2075 | 0.3833 | 1.1787 |
| 3.3578 | 28.6506 | 20800 | 3.2080 | 0.3839 | 1.1777 |
| 3.2311 | 28.9263 | 21000 | 3.2076 | 0.3829 | 1.1782 |
| 3.3323 | 29.2012 | 21200 | 3.2082 | 0.3835 | 1.1782 |
| 3.284 | 29.4769 | 21400 | 3.2080 | 0.3830 | 1.1779 |
| 3.3228 | 29.7526 | 21600 | 3.2081 | 0.3836 | 1.1772 |
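
The table shows validation metrics bottoming out around steps 3600 to 3800, degrading from roughly step 4200 onward, and then plateauing near a validation loss of 3.208 from about step 8200 until the end. The summary results at the top of this card correspond to the last row, not the best one. As a minimal sketch, the snippet below picks the best evaluation point from a handful of rows copied from the table; the tuple layout and variable names are illustrative, and whether the run saved or restored a best checkpoint is not stated in this card.

```python
# (step, validation_loss, cer, wer) for a few rows copied from the table above.
rows = [
    (2800, 0.5674, 0.1530, 0.7226),
    (3600, 0.5879, 0.1521, 0.7111),
    (3800, 0.5392, 0.1574, 0.7271),
    (4400, 0.6979, 0.1679, 0.7656),
    (21600, 3.2081, 0.3836, 1.1772),  # final row, reported in the summary
]

# Lower is better for all three metrics.
best_by_wer = min(rows, key=lambda row: row[3])
best_by_cer = min(rows, key=lambda row: row[2])
best_by_loss = min(rows, key=lambda row: row[1])

final = rows[-1]
wer_gap = final[3] - best_by_wer[3]  # how much worse the final WER is
```

If the training script is rerun, the `load_best_model_at_end` and `metric_for_best_model` options of Transformers' `TrainingArguments` are one way to keep the best checkpoint instead of the last.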
Framework versions
- Transformers 4.57.2
- Pytorch 2.9.1+cu128
- Datasets 3.6.0
- Tokenizers 0.22.0
Model tree for ctaguchi/ssc-ady-mms-model-mix-adapt-max-lowlr
Base model
facebook/mms-1b-all