ssc-bas-mms-model-mix-adapt-max-lowlr

This model is a fine-tuned version of facebook/mms-1b-all. The fine-tuning dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set:

  • Loss: 0.1145
  • CER (character error rate): 0.1361
  • WER (word error rate): 0.4189
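
For readers unfamiliar with the two error rates listed above, the following is a minimal sketch of their conventional definitions: the edit distance between the hypothesis and the reference, divided by the reference length, counted over characters (CER) or words (WER). The function names are illustrative, and the evaluation script actually used for this model is not specified in this card.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference token
                curr[j - 1] + 1,     # insert a hypothesis token
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate; assumes the reference has at least one word."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; assumes the reference is non-empty."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# One wrong character in one of three words:
print(wer("the cat sat", "the cat sit"))  # 1 word error out of 3 words
print(cer("the cat sat", "the cat sit"))  # 1 character error out of 11 characters
```

Under these definitions a single wrong character makes the whole word count as an error, which is one reason a WER can be several times larger than the corresponding CER.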

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 6
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED, the fused PyTorch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30
  • mixed_precision_training: Native AMP
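
As a rough illustration of how some of the values above relate, the sketch below computes the effective batch size and a linear learning-rate schedule with warmup. It assumes a single training device (inferred from 8 × 2 = 16) and a hypothetical total of 7110 optimizer steps (about 237 steps per epoch over 30 epochs, estimated from the results table); neither figure is stated in this card. The schedule is a plain-Python rendering of linear warmup followed by linear decay to zero, not the training code itself.

```python
BASE_LR = 3e-4          # learning_rate
WARMUP_STEPS = 100      # lr_scheduler_warmup_steps
PER_DEVICE_BATCH = 8    # train_batch_size
GRAD_ACCUM_STEPS = 2    # gradient_accumulation_steps
NUM_DEVICES = 1         # assumption: inferred from 8 * 2 = 16
ASSUMED_TOTAL_STEPS = 7110  # hypothetical: ~237 steps/epoch * 30 epochs


def effective_batch_size(per_device=PER_DEVICE_BATCH,
                         accum=GRAD_ACCUM_STEPS,
                         devices=NUM_DEVICES):
    """Number of examples contributing to each optimizer step."""
    return per_device * accum * devices


def linear_lr(step, total_steps=ASSUMED_TOTAL_STEPS,
              base_lr=BASE_LR, warmup_steps=WARMUP_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size())  # matches total_train_batch_size
for step in (0, 50, 100, 3600, ASSUMED_TOTAL_STEPS):
    print(step, linear_lr(step))
```

With only 100 warmup steps, the peak rate is reached well before the first evaluation at step 200, so almost all of training runs on the decaying part of the schedule.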

Training results

Training Loss  Epoch    Step  Validation Loss  CER     WER
0.4007         0.8457   200   0.1491           0.1468  0.4598
0.2849         1.6892   400   0.1274           0.1415  0.4395
0.278          2.5328   600   0.1249           0.1424  0.4395
0.2546         3.3763   800   0.1205           0.1401  0.4344
0.2516         4.2199   1000  0.1211           0.1421  0.4386
0.2338         5.0634   1200  0.1153           0.1394  0.4341
0.2492         5.9091   1400  0.1135           0.1386  0.4313
0.2269         6.7526   1600  0.1150           0.1397  0.4335
0.2033         7.5962   1800  0.1130           0.1386  0.4262
0.2114         8.4397   2000  0.1140           0.1393  0.4274
0.2022         9.2833   2200  0.1091           0.1378  0.4235
0.1901         10.1268  2400  0.1111           0.1369  0.4201
0.1983         10.9725  2600  0.1112           0.1377  0.4211
0.1829         11.8161  2800  0.1093           0.1377  0.4229
0.1831         12.6596  3000  0.1100           0.1365  0.4183
0.1726         13.5032  3200  0.1110           0.1367  0.4174
0.1721         14.3467  3400  0.1127           0.1371  0.4198
0.1686         15.1903  3600  0.1136           0.1361  0.4150
0.1768         16.0338  3800  0.1137           0.1374  0.4201
0.1573         16.8795  4000  0.1112           0.1363  0.4153
0.1631         17.7230  4200  0.1094           0.1355  0.4126
0.1507         18.5666  4400  0.1129           0.1363  0.4186
0.1537         19.4101  4600  0.1115           0.1369  0.4177
0.1579         20.2537  4800  0.1126           0.1360  0.4141
0.143          21.0973  5000  0.1128           0.1371  0.4189
0.1446         21.9429  5200  0.1147           0.1369  0.4189
0.1369         22.7865  5400  0.1141           0.1378  0.4229
0.1438         23.6300  5600  0.1131           0.1363  0.4168
0.1402         24.4736  5800  0.1159           0.1369  0.4192
0.1328         25.3171  6000  0.1143           0.1357  0.4159
0.1318         26.1607  6200  0.1149           0.1356  0.4150
0.1409         27.0042  6400  0.1148           0.1359  0.4165
0.1321         27.8499  6600  0.1150           0.1359  0.4150
0.1295         28.6934  6800  0.1148           0.1362  0.4174
0.1311         29.5370  7000  0.1145           0.1361  0.4189
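
To make the table easier to scan, the sketch below picks out the evaluation step with the lowest value in each metric column. The tuples are copied from the rows above as (step, validation loss, CER, WER); the helper name best_by is illustrative and not part of any library.

```python
# (step, validation_loss, cer, wer), copied from the training results table
LOG = [
    (200, 0.1491, 0.1468, 0.4598), (400, 0.1274, 0.1415, 0.4395),
    (600, 0.1249, 0.1424, 0.4395), (800, 0.1205, 0.1401, 0.4344),
    (1000, 0.1211, 0.1421, 0.4386), (1200, 0.1153, 0.1394, 0.4341),
    (1400, 0.1135, 0.1386, 0.4313), (1600, 0.1150, 0.1397, 0.4335),
    (1800, 0.1130, 0.1386, 0.4262), (2000, 0.1140, 0.1393, 0.4274),
    (2200, 0.1091, 0.1378, 0.4235), (2400, 0.1111, 0.1369, 0.4201),
    (2600, 0.1112, 0.1377, 0.4211), (2800, 0.1093, 0.1377, 0.4229),
    (3000, 0.1100, 0.1365, 0.4183), (3200, 0.1110, 0.1367, 0.4174),
    (3400, 0.1127, 0.1371, 0.4198), (3600, 0.1136, 0.1361, 0.4150),
    (3800, 0.1137, 0.1374, 0.4201), (4000, 0.1112, 0.1363, 0.4153),
    (4200, 0.1094, 0.1355, 0.4126), (4400, 0.1129, 0.1363, 0.4186),
    (4600, 0.1115, 0.1369, 0.4177), (4800, 0.1126, 0.1360, 0.4141),
    (5000, 0.1128, 0.1371, 0.4189), (5200, 0.1147, 0.1369, 0.4189),
    (5400, 0.1141, 0.1378, 0.4229), (5600, 0.1131, 0.1363, 0.4168),
    (5800, 0.1159, 0.1369, 0.4192), (6000, 0.1143, 0.1357, 0.4159),
    (6200, 0.1149, 0.1356, 0.4150), (6400, 0.1148, 0.1359, 0.4165),
    (6600, 0.1150, 0.1359, 0.4150), (6800, 0.1148, 0.1362, 0.4174),
    (7000, 0.1145, 0.1361, 0.4189),
]

STEP, VAL_LOSS, CER, WER = range(4)  # column indices into each tuple


def best_by(rows, column):
    """Return the row with the lowest value in the given column."""
    return min(rows, key=lambda row: row[column])


for name, column in (("validation loss", VAL_LOSS), ("CER", CER), ("WER", WER)):
    row = best_by(LOG, column)
    print(f"lowest {name}: {row[column]} at step {row[STEP]}")

print("final row:", LOG[-1])
```

The final row (step 7000) matches the evaluation results reported at the top of this card, while the lowest validation loss and the lowest error rates in the log occur at earlier steps. Whether those earlier checkpoints were saved is not stated here.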

Framework versions

  • Transformers 4.57.2
  • Pytorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.0

Model size: 1.0B params (Safetensors, tensor type F32)

Model tree for ctaguchi/ssc-bas-mms-model-mix-adapt-max-lowlr: fine-tuned from facebook/mms-1b-all