ssc-kbd-mms-model-mix-adapt-max-lowlr
This model is a fine-tuned version of facebook/mms-1b-all on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.3069
- Cer: 0.1006
- Wer: 0.5105
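
The gap between CER (about 0.10) and WER (about 0.51) is easier to read with the metric definitions in hand. The following is a minimal sketch, assuming the usual edit-distance definitions of both metrics; the card does not say which library computed them, and the helper names `edit_distance`, `wer`, and `cer` are illustrative only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# One wrong character makes one of two words wrong:
# the WER is 1/2 while the CER is only 1/11.
print(wer("hello world", "hallo world"))
print(cer("hello world", "hallo world"))

# Insertions are counted against the reference length, so WER can exceed 1.
print(wer("a b", "a b c d e"))
```

A single character error invalidates a whole word, which is why a model can have a low CER and a much higher WER. The last call shows how a WER above 1.0, as in the first row of the training results, is possible.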
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: AdamW (fused PyTorch implementation, OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 30
- mixed_precision_training: Native AMP
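
The relationship between these values can be sketched in plain Python. This is an illustration under stated assumptions, not the training code: it assumes a single device (consistent with 8 × 2 = 16), assumes the linear scheduler ramps up from zero over the warmup steps and then decays linearly to zero, and takes 1104 optimizer steps per epoch as an inference from the results table (step 27600 at epoch 25.0). The function names are hypothetical.

```python
LEARNING_RATE = 3e-4
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 100
NUM_EPOCHS = 30
STEPS_PER_EPOCH = 1104  # assumption: inferred from the results table, not stated in the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Number of examples contributing to each optimizer step."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS,
              total_steps=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
for step in (0, 50, 100, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate reaches its peak of 0.0003 at step 100 and is close to zero by the last logged evaluation at step 33000.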
Training results
| Training Loss | Epoch | Step | Validation Loss | Cer | Wer |
|---|---|---|---|---|---|
| 3.3474 | 0.1812 | 200 | 2.8716 | 0.8594 | 1.0112 |
| 0.6619 | 0.3625 | 400 | 0.5829 | 0.1891 | 0.8549 |
| 0.5716 | 0.5437 | 600 | 0.4984 | 0.1661 | 0.7840 |
| 0.4733 | 0.7250 | 800 | 0.4458 | 0.1516 | 0.7311 |
| 0.4485 | 0.9062 | 1000 | 0.4506 | 0.1503 | 0.7259 |
| 0.4282 | 1.0870 | 1200 | 0.4042 | 0.1419 | 0.6946 |
| 0.393 | 1.2682 | 1400 | 0.3975 | 0.1393 | 0.6871 |
| 0.3859 | 1.4495 | 1600 | 0.3941 | 0.1377 | 0.6833 |
| 0.4281 | 1.6307 | 1800 | 0.3931 | 0.1357 | 0.6762 |
| 0.3874 | 1.8120 | 2000 | 0.3771 | 0.1342 | 0.6656 |
| 0.387 | 1.9932 | 2200 | 0.3703 | 0.1320 | 0.6606 |
| 0.3742 | 2.1740 | 2400 | 0.3632 | 0.1292 | 0.6544 |
| 0.349 | 2.3552 | 2600 | 0.3615 | 0.1302 | 0.6598 |
| 0.3557 | 2.5365 | 2800 | 0.3593 | 0.1284 | 0.6480 |
| 0.3365 | 2.7177 | 3000 | 0.3528 | 0.1254 | 0.6457 |
| 0.3332 | 2.8990 | 3200 | 0.3507 | 0.1250 | 0.6345 |
| 0.3324 | 3.0797 | 3400 | 0.3458 | 0.1236 | 0.6339 |
| 0.4346 | 3.2610 | 3600 | 0.3409 | 0.1243 | 0.6332 |
| 0.3632 | 3.4422 | 3800 | 0.3378 | 0.1223 | 0.6201 |
| 0.3464 | 3.6235 | 4000 | 0.3348 | 0.1206 | 0.6148 |
| 0.3246 | 3.8047 | 4200 | 0.3334 | 0.1203 | 0.6137 |
| 0.3134 | 3.9860 | 4400 | 0.3438 | 0.1236 | 0.6221 |
| 0.2967 | 4.1667 | 4600 | 0.3358 | 0.1204 | 0.6134 |
| 0.3427 | 4.3480 | 4800 | 0.3380 | 0.1200 | 0.6126 |
| 0.3032 | 4.5292 | 5000 | 0.3230 | 0.1183 | 0.6038 |
| 0.318 | 4.7105 | 5200 | 0.3254 | 0.1188 | 0.6086 |
| 0.3939 | 4.8917 | 5400 | 0.3189 | 0.1177 | 0.6013 |
| 0.2977 | 5.0725 | 5600 | 0.3239 | 0.1175 | 0.5993 |
| 0.3021 | 5.2537 | 5800 | 0.3253 | 0.1164 | 0.5920 |
| 0.3235 | 5.4350 | 6000 | 0.3183 | 0.1154 | 0.5896 |
| 0.3106 | 5.6162 | 6200 | 0.3166 | 0.1159 | 0.5919 |
| 0.3677 | 5.7975 | 6400 | 0.3230 | 0.1173 | 0.5940 |
| 0.2933 | 5.9787 | 6600 | 0.3171 | 0.1153 | 0.5895 |
| 0.3006 | 6.1595 | 6800 | 0.3168 | 0.1158 | 0.5843 |
| 0.3968 | 6.3407 | 7000 | 0.3160 | 0.1146 | 0.5832 |
| 0.2729 | 6.5220 | 7200 | 0.3163 | 0.1128 | 0.5777 |
| 0.3113 | 6.7032 | 7400 | 0.3158 | 0.1142 | 0.5831 |
| 0.2784 | 6.8845 | 7600 | 0.3155 | 0.1153 | 0.5863 |
| 0.274 | 7.0652 | 7800 | 0.3083 | 0.1123 | 0.5766 |
| 0.2934 | 7.2465 | 8000 | 0.3048 | 0.1113 | 0.5695 |
| 0.2837 | 7.4277 | 8200 | 0.3090 | 0.1109 | 0.5670 |
| 0.2643 | 7.6090 | 8400 | 0.3052 | 0.1103 | 0.5600 |
| 0.2857 | 7.7902 | 8600 | 0.3068 | 0.1123 | 0.5693 |
| 0.2913 | 7.9715 | 8800 | 0.3025 | 0.1117 | 0.5658 |
| 0.2681 | 8.1522 | 9000 | 0.3119 | 0.1110 | 0.5640 |
| 0.2858 | 8.3335 | 9200 | 0.3110 | 0.1117 | 0.5703 |
| 0.2643 | 8.5147 | 9400 | 0.3023 | 0.1103 | 0.5655 |
| 0.2579 | 8.6960 | 9600 | 0.3067 | 0.1118 | 0.5736 |
| 0.2824 | 8.8772 | 9800 | 0.3076 | 0.1105 | 0.5665 |
| 0.3574 | 9.0580 | 10000 | 0.3354 | 0.1106 | 0.5666 |
| 0.2464 | 9.2392 | 10200 | 0.3086 | 0.1097 | 0.5594 |
| 0.2454 | 9.4205 | 10400 | 0.3098 | 0.1106 | 0.5632 |
| 0.2649 | 9.6017 | 10600 | 0.3053 | 0.1099 | 0.5606 |
| 0.2666 | 9.7830 | 10800 | 0.3050 | 0.1097 | 0.5591 |
| 0.2555 | 9.9642 | 11000 | 0.3019 | 0.1099 | 0.5635 |
| 0.2405 | 10.1450 | 11200 | 0.3057 | 0.1096 | 0.5632 |
| 0.2551 | 10.3262 | 11400 | 0.3035 | 0.1105 | 0.5634 |
| 0.2586 | 10.5075 | 11600 | 0.3026 | 0.1097 | 0.5578 |
| 0.2646 | 10.6887 | 11800 | 0.3042 | 0.1101 | 0.5651 |
| 0.2783 | 10.8700 | 12000 | 0.3085 | 0.1102 | 0.5622 |
| 0.4114 | 11.0507 | 12200 | 0.3016 | 0.1093 | 0.5591 |
| 0.2332 | 11.2320 | 12400 | 0.3068 | 0.1080 | 0.5540 |
| 0.2406 | 11.4132 | 12600 | 0.3079 | 0.1092 | 0.5556 |
| 0.2477 | 11.5945 | 12800 | 0.3051 | 0.1091 | 0.5583 |
| 0.2537 | 11.7757 | 13000 | 0.2983 | 0.1081 | 0.5527 |
| 0.2404 | 11.9570 | 13200 | 0.3051 | 0.1088 | 0.5540 |
| 0.2285 | 12.1377 | 13400 | 0.3054 | 0.1090 | 0.5548 |
| 0.2248 | 12.3190 | 13600 | 0.3026 | 0.1080 | 0.5540 |
| 0.2376 | 12.5002 | 13800 | 0.3093 | 0.1084 | 0.5535 |
| 0.2441 | 12.6815 | 14000 | 0.2999 | 0.1074 | 0.5489 |
| 0.2383 | 12.8627 | 14200 | 0.3020 | 0.1071 | 0.5438 |
| 0.2349 | 13.0435 | 14400 | 0.2943 | 0.1054 | 0.5379 |
| 0.2336 | 13.2247 | 14600 | 0.3016 | 0.1070 | 0.5432 |
| 0.2128 | 13.4060 | 14800 | 0.3028 | 0.1056 | 0.5404 |
| 0.2209 | 13.5872 | 15000 | 0.2960 | 0.1051 | 0.5358 |
| 0.2283 | 13.7685 | 15200 | 0.3066 | 0.1081 | 0.5461 |
| 0.239 | 13.9497 | 15400 | 0.3011 | 0.1057 | 0.5355 |
| 0.2271 | 14.1305 | 15600 | 0.2997 | 0.1056 | 0.5385 |
| 0.2196 | 14.3117 | 15800 | 0.2988 | 0.1060 | 0.5414 |
| 0.2232 | 14.4930 | 16000 | 0.2926 | 0.1048 | 0.5379 |
| 0.2234 | 14.6742 | 16200 | 0.2976 | 0.1064 | 0.5386 |
| 0.2327 | 14.8555 | 16400 | 0.2926 | 0.1043 | 0.5357 |
| 0.2096 | 15.0362 | 16600 | 0.2995 | 0.1059 | 0.5400 |
| 0.2174 | 15.2175 | 16800 | 0.2951 | 0.1039 | 0.5334 |
| 0.2153 | 15.3987 | 17000 | 0.2959 | 0.1054 | 0.5359 |
| 0.2098 | 15.5800 | 17200 | 0.2930 | 0.1044 | 0.5313 |
| 0.2213 | 15.7612 | 17400 | 0.2945 | 0.1044 | 0.5295 |
| 0.2232 | 15.9425 | 17600 | 0.2923 | 0.1048 | 0.5358 |
| 0.206 | 16.1232 | 17800 | 0.2969 | 0.1047 | 0.5322 |
| 0.1963 | 16.3045 | 18000 | 0.2978 | 0.1047 | 0.5278 |
| 0.217 | 16.4857 | 18200 | 0.2941 | 0.1047 | 0.5345 |
| 0.2088 | 16.6670 | 18400 | 0.2951 | 0.1040 | 0.5333 |
| 0.2268 | 16.8482 | 18600 | 0.2958 | 0.1043 | 0.5313 |
| 0.223 | 17.0290 | 18800 | 0.2988 | 0.1049 | 0.5322 |
| 0.2083 | 17.2102 | 19000 | 0.2958 | 0.1043 | 0.5292 |
| 0.2046 | 17.3915 | 19200 | 0.2974 | 0.1038 | 0.5264 |
| 0.2003 | 17.5727 | 19400 | 0.2939 | 0.1038 | 0.5306 |
| 0.2143 | 17.7540 | 19600 | 0.2962 | 0.1032 | 0.5243 |
| 0.2084 | 17.9352 | 19800 | 0.2943 | 0.1032 | 0.5244 |
| 0.2055 | 18.1160 | 20000 | 0.2929 | 0.1034 | 0.5276 |
| 0.2077 | 18.2972 | 20200 | 0.2954 | 0.1027 | 0.5227 |
| 0.2038 | 18.4785 | 20400 | 0.2973 | 0.1028 | 0.5238 |
| 0.2119 | 18.6597 | 20600 | 0.2973 | 0.1034 | 0.5236 |
| 0.2059 | 18.8410 | 20800 | 0.2975 | 0.1036 | 0.5271 |
| 0.1963 | 19.0217 | 21000 | 0.2952 | 0.1033 | 0.5255 |
| 0.1913 | 19.2030 | 21200 | 0.2983 | 0.1032 | 0.5287 |
| 0.2041 | 19.3842 | 21400 | 0.2935 | 0.1029 | 0.5242 |
| 0.2008 | 19.5655 | 21600 | 0.2965 | 0.1023 | 0.5205 |
| 0.1961 | 19.7467 | 21800 | 0.2960 | 0.1030 | 0.5197 |
| 0.2027 | 19.9280 | 22000 | 0.2981 | 0.1029 | 0.5222 |
| 0.1977 | 20.1087 | 22200 | 0.2980 | 0.1025 | 0.5226 |
| 0.1816 | 20.2900 | 22400 | 0.2998 | 0.1025 | 0.5216 |
| 0.2012 | 20.4712 | 22600 | 0.2932 | 0.1029 | 0.5231 |
| 0.2004 | 20.6525 | 22800 | 0.2977 | 0.1021 | 0.5179 |
| 0.191 | 20.8337 | 23000 | 0.2978 | 0.1025 | 0.5240 |
| 0.1941 | 21.0145 | 23200 | 0.3030 | 0.1027 | 0.5213 |
| 0.1766 | 21.1957 | 23400 | 0.3060 | 0.1021 | 0.5207 |
| 0.1867 | 21.3770 | 23600 | 0.3025 | 0.1029 | 0.5268 |
| 0.1949 | 21.5582 | 23800 | 0.3020 | 0.1022 | 0.5180 |
| 0.1951 | 21.7395 | 24000 | 0.2974 | 0.1018 | 0.5172 |
| 0.1894 | 21.9207 | 24200 | 0.3014 | 0.1025 | 0.5181 |
| 0.1893 | 22.1015 | 24400 | 0.3061 | 0.1020 | 0.5173 |
| 0.1834 | 22.2827 | 24600 | 0.3035 | 0.1032 | 0.5234 |
| 0.1876 | 22.4640 | 24800 | 0.3039 | 0.1021 | 0.5218 |
| 0.1903 | 22.6452 | 25000 | 0.3051 | 0.1024 | 0.5195 |
| 0.1852 | 22.8265 | 25200 | 0.3037 | 0.1024 | 0.5213 |
| 0.1875 | 23.0072 | 25400 | 0.2985 | 0.1019 | 0.5215 |
| 0.1893 | 23.1885 | 25600 | 0.3029 | 0.1019 | 0.5177 |
| 0.1796 | 23.3697 | 25800 | 0.3049 | 0.1013 | 0.5167 |
| 0.1841 | 23.5510 | 26000 | 0.3082 | 0.1015 | 0.5160 |
| 0.1927 | 23.7322 | 26200 | 0.3047 | 0.1019 | 0.5176 |
| 0.1843 | 23.9135 | 26400 | 0.3026 | 0.1010 | 0.5116 |
| 0.1707 | 24.0942 | 26600 | 0.3065 | 0.1010 | 0.5133 |
| 0.1833 | 24.2755 | 26800 | 0.3092 | 0.1011 | 0.5180 |
| 0.1778 | 24.4567 | 27000 | 0.3125 | 0.1008 | 0.5101 |
| 0.1822 | 24.6380 | 27200 | 0.3035 | 0.1012 | 0.5153 |
| 0.1859 | 24.8192 | 27400 | 0.3050 | 0.1016 | 0.5169 |
| 0.1801 | 25.0 | 27600 | 0.3047 | 0.1012 | 0.5136 |
| 0.1758 | 25.1812 | 27800 | 0.3047 | 0.1009 | 0.5167 |
| 0.1764 | 25.3625 | 28000 | 0.3045 | 0.1011 | 0.5133 |
| 0.185 | 25.5437 | 28200 | 0.3078 | 0.1016 | 0.5213 |
| 0.1744 | 25.7250 | 28400 | 0.3030 | 0.1015 | 0.5156 |
| 0.1809 | 25.9062 | 28600 | 0.3043 | 0.1014 | 0.5140 |
| 0.165 | 26.0870 | 28800 | 0.3063 | 0.1011 | 0.5105 |
| 0.184 | 26.2682 | 29000 | 0.3050 | 0.1010 | 0.5131 |
| 0.1722 | 26.4495 | 29200 | 0.3069 | 0.1012 | 0.5139 |
| 0.1643 | 26.6307 | 29400 | 0.3079 | 0.1012 | 0.5118 |
| 0.1747 | 26.8120 | 29600 | 0.3059 | 0.1005 | 0.5096 |
| 0.1699 | 26.9932 | 29800 | 0.3078 | 0.1012 | 0.5160 |
| 0.1657 | 27.1740 | 30000 | 0.3074 | 0.1008 | 0.5122 |
| 0.1731 | 27.3552 | 30200 | 0.3054 | 0.1005 | 0.5133 |
| 0.1722 | 27.5365 | 30400 | 0.3066 | 0.1007 | 0.5108 |
| 0.178 | 27.7177 | 30600 | 0.3055 | 0.1008 | 0.5121 |
| 0.1764 | 27.8990 | 30800 | 0.3063 | 0.1004 | 0.5101 |
| 0.1602 | 28.0797 | 31000 | 0.3092 | 0.1007 | 0.5097 |
| 0.1748 | 28.2610 | 31200 | 0.3080 | 0.1003 | 0.5086 |
| 0.1677 | 28.4422 | 31400 | 0.3077 | 0.1011 | 0.5136 |
| 0.1778 | 28.6235 | 31600 | 0.3084 | 0.1008 | 0.5116 |
| 0.1746 | 28.8047 | 31800 | 0.3075 | 0.1003 | 0.5082 |
| 0.1668 | 28.9860 | 32000 | 0.3074 | 0.1007 | 0.5118 |
| 0.176 | 29.1667 | 32200 | 0.3071 | 0.1006 | 0.5104 |
| 0.1664 | 29.3480 | 32400 | 0.3073 | 0.1006 | 0.5108 |
| 0.1672 | 29.5292 | 32600 | 0.3067 | 0.1006 | 0.5114 |
| 0.166 | 29.7105 | 32800 | 0.3072 | 0.1006 | 0.5109 |
| 0.1731 | 29.8917 | 33000 | 0.3069 | 0.1006 | 0.5105 |
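
The headline metrics at the top of the card match the final row (step 33000), which is not the row with the lowest WER or validation loss. The sketch below shows one way to find such rows from the log; the six rows are copied from the table above, and `COLUMNS` and `parse_rows` are illustrative names rather than part of any tooling.

```python
LOG_EXCERPT = """
| 3.3474 | 0.1812 | 200 | 2.8716 | 0.8594 | 1.0112 |
| 0.2232 | 15.9425 | 17600 | 0.2923 | 0.1048 | 0.5358 |
| 0.1747 | 26.8120 | 29600 | 0.3059 | 0.1005 | 0.5096 |
| 0.1748 | 28.2610 | 31200 | 0.3080 | 0.1003 | 0.5086 |
| 0.1746 | 28.8047 | 31800 | 0.3075 | 0.1003 | 0.5082 |
| 0.1731 | 29.8917 | 33000 | 0.3069 | 0.1006 | 0.5105 |
"""

COLUMNS = ("train_loss", "epoch", "step", "val_loss", "cer", "wer")


def parse_rows(markdown):
    """Turn markdown table rows into dictionaries keyed by column name."""
    rows = []
    for line in markdown.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        row = dict(zip(COLUMNS, map(float, cells)))
        row["step"] = int(row["step"])
        rows.append(row)
    return rows


rows = parse_rows(LOG_EXCERPT)
best_wer = min(rows, key=lambda r: r["wer"])
best_loss = min(rows, key=lambda r: r["val_loss"])

print("lowest WER:", best_wer["step"], best_wer["wer"])
print("lowest validation loss:", best_loss["step"], best_loss["val_loss"])
```

In this excerpt the lowest WER (0.5082) is at step 31800 and the lowest validation loss (0.2923) at step 17600, so validation loss and WER do not pick the same checkpoint.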
Framework versions
- Transformers 4.57.2
- Pytorch 2.9.1+cu128
- Datasets 3.6.0
- Tokenizers 0.22.0