ssc-kbd-mms-model-mix-adapt-max-longcv

This model is a fine-tuned version of facebook/mms-1b-all. The training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 0.2775
  • CER (character error rate): 0.0946
  • WER (word error rate): 0.5334
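
Since this is a CTC fine-tune of facebook/mms-1b-all, it should load with the standard Transformers Wav2Vec2 API. The following is a minimal, hedged sketch (not from the original card): `sample.wav` is a hypothetical input file, assumed to be 16 kHz mono, the sample rate MMS models expect.

```python
# Minimal inference sketch, assuming the processor files and weights in this
# repo load with the standard Wav2Vec2 CTC API; "sample.wav" is a
# hypothetical 16 kHz mono recording.
import librosa
import torch
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "ctaguchi/ssc-kbd-mms-model-mix-adapt-max-longcv"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

speech, _ = librosa.load("sample.wav", sr=16_000)  # hypothetical input file
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: argmax over the vocabulary, then collapse repeats
# and blanks inside batch_decode.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```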

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 15
  • mixed_precision_training: Native AMP
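
For reference, here is a sketch of how these settings map onto transformers.TrainingArguments. The output_dir name and the 200-step evaluation/logging cadence (read off the results table below) are assumptions, not settings stated in the card.

```python
# Sketch of a TrainingArguments setup matching the hyperparameters above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="ssc-kbd-mms-model-mix-adapt-max-longcv",  # assumed name
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 8 * 2 = 16
    optim="adamw_torch",            # AdamW; betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=15,
    fp16=True,                      # "Native AMP" mixed precision
    eval_strategy="steps",          # assumed: the table logs every 200 steps
    eval_steps=200,
    logging_steps=200,
)
```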

Training results

| Training Loss | Epoch   | Step  | Validation Loss | CER    | WER    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
| 0.6561        | 0.1762  | 200   | 0.5488          | 0.1709 | 0.8063 |
| 0.5184        | 0.3524  | 400   | 0.4763          | 0.1553 | 0.7542 |
| 0.4737        | 0.5286  | 600   | 0.4499          | 0.1455 | 0.7187 |
| 0.4482        | 0.7048  | 800   | 0.4106          | 0.1327 | 0.6830 |
| 0.4239        | 0.8811  | 1000  | 0.4123          | 0.1325 | 0.6827 |
| 0.3968        | 1.0573  | 1200  | 0.4069          | 0.1306 | 0.6743 |
| 0.4092        | 1.2335  | 1400  | 0.3814          | 0.1248 | 0.6495 |
| 0.4052        | 1.4097  | 1600  | 0.3746          | 0.1229 | 0.6473 |
| 0.3912        | 1.5859  | 1800  | 0.3828          | 0.1219 | 0.6487 |
| 0.3712        | 1.7621  | 2000  | 0.3624          | 0.1175 | 0.6263 |
| 0.3796        | 1.9383  | 2200  | 0.3600          | 0.1211 | 0.6593 |
| 0.3556        | 2.1145  | 2400  | 0.3544          | 0.1179 | 0.6393 |
| 0.3636        | 2.2907  | 2600  | 0.3527          | 0.1158 | 0.6248 |
| 0.3662        | 2.4670  | 2800  | 0.3484          | 0.1143 | 0.6176 |
| 0.3477        | 2.6432  | 3000  | 0.3405          | 0.1133 | 0.6088 |
| 0.3633        | 2.8194  | 3200  | 0.3430          | 0.1151 | 0.6182 |
| 0.3584        | 2.9956  | 3400  | 0.3439          | 0.1158 | 0.6259 |
| 0.3488        | 3.1718  | 3600  | 0.3465          | 0.1153 | 0.6283 |
| 0.3425        | 3.3480  | 3800  | 0.3393          | 0.1107 | 0.5959 |
| 0.3312        | 3.5242  | 4000  | 0.3371          | 0.1130 | 0.6077 |
| 0.339         | 3.7004  | 4200  | 0.3395          | 0.1117 | 0.5997 |
| 0.3301        | 3.8767  | 4400  | 0.3340          | 0.1109 | 0.5973 |
| 0.3335        | 4.0529  | 4600  | 0.3389          | 0.1114 | 0.6052 |
| 0.3276        | 4.2291  | 4800  | 0.3280          | 0.1112 | 0.6076 |
| 0.3276        | 4.4053  | 5000  | 0.3359          | 0.1137 | 0.6056 |
| 0.3159        | 4.5815  | 5200  | 0.3262          | 0.1097 | 0.5929 |
| 0.325         | 4.7577  | 5400  | 0.3207          | 0.1073 | 0.5837 |
| 0.3196        | 4.9339  | 5600  | 0.3299          | 0.1107 | 0.6048 |
| 0.3101        | 5.1101  | 5800  | 0.3153          | 0.1063 | 0.5872 |
| 0.3114        | 5.2863  | 6000  | 0.3205          | 0.1064 | 0.5827 |
| 0.3029        | 5.4626  | 6200  | 0.3102          | 0.1056 | 0.5788 |
| 0.3169        | 5.6388  | 6400  | 0.3132          | 0.1049 | 0.5744 |
| 0.2966        | 5.8150  | 6600  | 0.3114          | 0.1049 | 0.5809 |
| 0.3127        | 5.9912  | 6800  | 0.3167          | 0.1069 | 0.5875 |
| 0.2977        | 6.1674  | 7000  | 0.3124          | 0.1054 | 0.5871 |
| 0.2919        | 6.3436  | 7200  | 0.3159          | 0.1066 | 0.5851 |
| 0.2942        | 6.5198  | 7400  | 0.3112          | 0.1039 | 0.5741 |
| 0.2933        | 6.6960  | 7600  | 0.3071          | 0.1045 | 0.5773 |
| 0.2994        | 6.8722  | 7800  | 0.3107          | 0.1048 | 0.5741 |
| 0.2922        | 7.0485  | 8000  | 0.3195          | 0.1054 | 0.5741 |
| 0.2902        | 7.2247  | 8200  | 0.3152          | 0.1051 | 0.5745 |
| 0.2794        | 7.4009  | 8400  | 0.3101          | 0.1042 | 0.5712 |
| 0.2804        | 7.5771  | 8600  | 0.3110          | 0.1041 | 0.5728 |
| 0.2868        | 7.7533  | 8800  | 0.3030          | 0.1033 | 0.5765 |
| 0.2873        | 7.9295  | 9000  | 0.3081          | 0.1037 | 0.5720 |
| 0.272         | 8.1057  | 9200  | 0.2972          | 0.1009 | 0.5596 |
| 0.2672        | 8.2819  | 9400  | 0.3066          | 0.1014 | 0.5630 |
| 0.2825        | 8.4581  | 9600  | 0.2982          | 0.1013 | 0.5651 |
| 0.2694        | 8.6344  | 9800  | 0.2984          | 0.1000 | 0.5487 |
| 0.2766        | 8.8106  | 10000 | 0.2987          | 0.0995 | 0.5500 |
| 0.2725        | 8.9868  | 10200 | 0.2996          | 0.1003 | 0.5548 |
| 0.2653        | 9.1630  | 10400 | 0.2929          | 0.0994 | 0.5546 |
| 0.2607        | 9.3392  | 10600 | 0.2942          | 0.0988 | 0.5513 |
| 0.2776        | 9.5154  | 10800 | 0.2956          | 0.0992 | 0.5521 |
| 0.2584        | 9.6916  | 11000 | 0.2924          | 0.0984 | 0.5493 |
| 0.2585        | 9.8678  | 11200 | 0.2940          | 0.0985 | 0.5509 |
| 0.2671        | 10.0441 | 11400 | 0.2933          | 0.0970 | 0.5433 |
| 0.2535        | 10.2203 | 11600 | 0.2898          | 0.0981 | 0.5453 |
| 0.2595        | 10.3965 | 11800 | 0.2878          | 0.0980 | 0.5517 |
| 0.2508        | 10.5727 | 12000 | 0.2868          | 0.0975 | 0.5451 |
| 0.2486        | 10.7489 | 12200 | 0.2905          | 0.0977 | 0.5459 |
| 0.2549        | 10.9251 | 12400 | 0.2900          | 0.0975 | 0.5440 |
| 0.2397        | 11.1013 | 12600 | 0.2880          | 0.0978 | 0.5463 |
| 0.2501        | 11.2775 | 12800 | 0.2856          | 0.0966 | 0.5413 |
| 0.2457        | 11.4537 | 13000 | 0.2866          | 0.0973 | 0.5401 |
| 0.2459        | 11.6300 | 13200 | 0.2867          | 0.0967 | 0.5426 |
| 0.2438        | 11.8062 | 13400 | 0.2829          | 0.0974 | 0.5455 |
| 0.2477        | 11.9824 | 13600 | 0.2823          | 0.0963 | 0.5433 |
| 0.2301        | 12.1586 | 13800 | 0.2817          | 0.0964 | 0.5420 |
| 0.2448        | 12.3348 | 14000 | 0.2807          | 0.0951 | 0.5351 |
| 0.242         | 12.5110 | 14200 | 0.2820          | 0.0955 | 0.5374 |
| 0.2313        | 12.6872 | 14400 | 0.2814          | 0.0948 | 0.5350 |
| 0.2375        | 12.8634 | 14600 | 0.2794          | 0.0951 | 0.5372 |
| 0.2365        | 13.0396 | 14800 | 0.2824          | 0.0948 | 0.5370 |
| 0.2243        | 13.2159 | 15000 | 0.2817          | 0.0947 | 0.5347 |
| 0.2297        | 13.3921 | 15200 | 0.2817          | 0.0949 | 0.5329 |
| 0.23          | 13.5683 | 15400 | 0.2815          | 0.0944 | 0.5294 |
| 0.225         | 13.7445 | 15600 | 0.2807          | 0.0953 | 0.5345 |
| 0.2416        | 13.9207 | 15800 | 0.2805          | 0.0953 | 0.5325 |
| 0.2276        | 14.0969 | 16000 | 0.2806          | 0.0948 | 0.5368 |
| 0.227         | 14.2731 | 16200 | 0.2789          | 0.0952 | 0.5334 |
| 0.2232        | 14.4493 | 16400 | 0.2786          | 0.0943 | 0.5282 |
| 0.2267        | 14.6256 | 16600 | 0.2771          | 0.0947 | 0.5326 |
| 0.2178        | 14.8018 | 16800 | 0.2774          | 0.0946 | 0.5329 |
| 0.2223        | 14.9780 | 17000 | 0.2775          | 0.0946 | 0.5334 |
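
The CER and WER columns are error rates on the evaluation set (lower is better). Below is a hedged sketch of computing them with the Hugging Face `evaluate` library; the card does not say which implementation produced the numbers above, and the example transcripts are made up. Note that `cer` additionally requires the `jiwer` package.

```python
# Sketch of computing WER/CER with the `evaluate` library (assumed workflow,
# not stated in the card).
import evaluate

wer = evaluate.load("wer")
cer = evaluate.load("cer")  # also needs the `jiwer` package installed

predictions = ["hello world"]  # hypothetical model output
references = ["hello word"]    # hypothetical ground truth

print("WER:", wer.compute(predictions=predictions, references=references))
print("CER:", cer.compute(predictions=predictions, references=references))
```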

Framework versions

  • Transformers 4.52.1
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.4
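
To check a local environment against these versions, a small convenience snippet (not from the original card):

```python
# Compare installed library versions with the ones this card was built with.
import transformers, torch, datasets, tokenizers

expected = {
    "transformers": "4.52.1",
    "torch": "2.9.1+cu128",
    "datasets": "3.6.0",
    "tokenizers": "0.21.4",
}
for name, mod in [("transformers", transformers), ("torch", torch),
                  ("datasets", datasets), ("tokenizers", tokenizers)]:
    print(f"{name}: installed {mod.__version__}, card built with {expected[name]}")
```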