ssc-qxp-mms-model-mix-adapt-max-lowlr

This model is a fine-tuned version of facebook/mms-1b-all. The training dataset is not recorded in this card. It achieves the following results on the evaluation set (these match the final row, step 6000, of the training results):

  • Loss: 0.1488
  • Cer (character error rate): 0.0937
  • Wer (word error rate): 0.5184
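
The gap between the two error rates above (Cer about 0.09, Wer about 0.52) is typical: a single wrong character counts as one character error but makes the whole word wrong. The sketch below illustrates how the two metrics are conventionally defined, as edit distance divided by reference length. It is not the evaluation code used for this model; the function names and the example sentences are hypothetical, and real evaluation libraries may apply text normalization that this sketch omits.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example: one wrong character in a six-word sentence.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on the mat"

print(f"WER = {wer(reference, hypothesis):.4f}")  # 1 of 6 words is wrong
print(f"CER = {cer(reference, hypothesis):.4f}")  # 1 of 22 characters is wrong
```

In this toy case one substituted character yields a WER of 1/6 but a CER of only 1/22, which is the same pattern the reported numbers show at a larger scale.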

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 6
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30
  • mixed_precision_training: Native AMP
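
As a rough illustration of how these hyperparameters interact, the sketch below computes the effective batch size and the learning rate under a linear schedule with warmup (linear ramp from 0 to the peak over the warmup steps, then linear decay to 0). The function names are hypothetical, a single device is assumed (consistent with 8 × 2 = 16), and `TOTAL_STEPS = 6000` is an assumption taken from the last logged step rather than a value stated in this card.

```python
# Values taken from the hyperparameter list above.
LEARNING_RATE = 3e-4
WARMUP_STEPS = 100
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 2

# Assumption: the schedule length is not stated in the card; 6000 is the
# last logged step and is used here only for illustration.
TOTAL_STEPS = 6000


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Samples contributing to one optimizer update (num_devices=1 is assumed)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at a given optimizer step under linear warmup + linear decay."""
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0.0, (total - step) / (total - warmup))


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
for step in (0, 50, 100, 3050, 6000):
    print(step, f"{linear_lr(step):.6f}")
```

Under these assumptions the learning rate peaks at 0.0003 at step 100 and is back to half of that by step 3050.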

Training results

| Training Loss | Epoch   | Step | Validation Loss | Cer    | Wer    |
|---------------|---------|------|-----------------|--------|--------|
| 0.3816        | 0.9975  | 200  | 0.2246          | 0.1189 | 0.6397 |
| 0.1851        | 1.9925  | 400  | 0.1731          | 0.1049 | 0.5763 |
| 0.1762        | 2.9875  | 600  | 0.1681          | 0.1034 | 0.5689 |
| 0.1644        | 3.9825  | 800  | 0.1574          | 0.1025 | 0.5643 |
| 0.1541        | 4.9776  | 1000 | 0.1578          | 0.1020 | 0.5570 |
| 0.1467        | 5.9726  | 1200 | 0.1587          | 0.1012 | 0.5533 |
| 0.14          | 6.9676  | 1400 | 0.1569          | 0.1017 | 0.5579 |
| 0.1453        | 7.9626  | 1600 | 0.1616          | 0.1003 | 0.5432 |
| 0.1323        | 8.9576  | 1800 | 0.1529          | 0.1006 | 0.5469 |
| 0.1212        | 9.9526  | 2000 | 0.1558          | 0.1006 | 0.5441 |
| 0.126         | 10.9476 | 2200 | 0.1511          | 0.0997 | 0.5450 |
| 0.1249        | 11.9426 | 2400 | 0.1468          | 0.0991 | 0.5368 |
| 0.1197        | 12.9377 | 2600 | 0.1455          | 0.0980 | 0.5276 |
| 0.1183        | 13.9327 | 2800 | 0.1493          | 0.0982 | 0.5340 |
| 0.1143        | 14.9277 | 3000 | 0.1456          | 0.0975 | 0.5368 |
| 0.1104        | 15.9227 | 3200 | 0.1550          | 0.0973 | 0.5239 |
| 0.1018        | 16.9177 | 3400 | 0.1537          | 0.0970 | 0.5267 |
| 0.1069        | 17.9127 | 3600 | 0.1488          | 0.0976 | 0.5294 |
| 0.1041        | 18.9077 | 3800 | 0.1448          | 0.0933 | 0.5156 |
| 0.0985        | 19.9027 | 4000 | 0.1536          | 0.0962 | 0.5294 |
| 0.0969        | 20.8978 | 4200 | 0.1484          | 0.0933 | 0.5156 |
| 0.0935        | 21.8928 | 4400 | 0.1495          | 0.0957 | 0.5267 |
| 0.0903        | 22.8878 | 4600 | 0.1503          | 0.0962 | 0.5285 |
| 0.0914        | 23.8828 | 4800 | 0.1514          | 0.0959 | 0.5276 |
| 0.0876        | 24.8778 | 5000 | 0.1485          | 0.0938 | 0.5165 |
| 0.0927        | 25.8728 | 5200 | 0.1469          | 0.0940 | 0.5165 |
| 0.0892        | 26.8678 | 5400 | 0.1454          | 0.0929 | 0.5165 |
| 0.082         | 27.8628 | 5600 | 0.1470          | 0.0926 | 0.5119 |
| 0.0828        | 28.8579 | 5800 | 0.1485          | 0.0937 | 0.5175 |
| 0.0867        | 29.8529 | 6000 | 0.1488          | 0.0937 | 0.5184 |
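
The evaluation metrics in these results do not improve monotonically, so the final checkpoint is not necessarily the best one on every metric. The sketch below copies the (step, validation loss, Cer, Wer) values from the results above and picks the row that minimizes each metric. The variable and function names are hypothetical, and nothing here indicates which checkpoint was actually saved or published.

```python
# (step, validation_loss, cer, wer), copied from the training results above.
RESULTS = [
    (200, 0.2246, 0.1189, 0.6397), (400, 0.1731, 0.1049, 0.5763),
    (600, 0.1681, 0.1034, 0.5689), (800, 0.1574, 0.1025, 0.5643),
    (1000, 0.1578, 0.1020, 0.5570), (1200, 0.1587, 0.1012, 0.5533),
    (1400, 0.1569, 0.1017, 0.5579), (1600, 0.1616, 0.1003, 0.5432),
    (1800, 0.1529, 0.1006, 0.5469), (2000, 0.1558, 0.1006, 0.5441),
    (2200, 0.1511, 0.0997, 0.5450), (2400, 0.1468, 0.0991, 0.5368),
    (2600, 0.1455, 0.0980, 0.5276), (2800, 0.1493, 0.0982, 0.5340),
    (3000, 0.1456, 0.0975, 0.5368), (3200, 0.1550, 0.0973, 0.5239),
    (3400, 0.1537, 0.0970, 0.5267), (3600, 0.1488, 0.0976, 0.5294),
    (3800, 0.1448, 0.0933, 0.5156), (4000, 0.1536, 0.0962, 0.5294),
    (4200, 0.1484, 0.0933, 0.5156), (4400, 0.1495, 0.0957, 0.5267),
    (4600, 0.1503, 0.0962, 0.5285), (4800, 0.1514, 0.0959, 0.5276),
    (5000, 0.1485, 0.0938, 0.5165), (5200, 0.1469, 0.0940, 0.5165),
    (5400, 0.1454, 0.0929, 0.5165), (5600, 0.1470, 0.0926, 0.5119),
    (5800, 0.1485, 0.0937, 0.5175), (6000, 0.1488, 0.0937, 0.5184),
]

# Column index of each metric inside a RESULTS row.
METRIC_COLUMNS = {"validation_loss": 1, "cer": 2, "wer": 3}


def best_row(metric):
    """Return the first row with the lowest value for the given metric."""
    column = METRIC_COLUMNS[metric]
    return min(RESULTS, key=lambda row: row[column])


for name in METRIC_COLUMNS:
    row = best_row(name)
    print(f"lowest {name}: {row[METRIC_COLUMNS[name]]} at step {row[0]}")

print("final row:", RESULTS[-1])
```

On these numbers, the lowest Wer and Cer both fall at step 5600 and the lowest validation loss at step 3800, while the headline results correspond to step 6000.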

Framework versions

  • Transformers 4.57.2
  • Pytorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.0

Model size: 1.0B params (Safetensors, tensor type F32)

Repository: ctaguchi/ssc-qxp-mms-model-mix-adapt-max-lowlr (fine-tuned from facebook/mms-1b-all)