ssc-ady-mms-model-mix-adapt-max-longcv

This model is a fine-tuned version of facebook/mms-1b-all (the training dataset is not specified in this card). It achieves the following results on the evaluation set:

  • Loss: 12.1108
  • Cer: 0.8989
  • Wer: 1.0
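For reference, CER and WER are both Levenshtein (edit) distances normalized by reference length, computed over characters and words respectively; a WER of 1.0 means the number of edits equals the number of reference words. A minimal sketch of these metrics (plain dynamic-programming edit distance, not the evaluation code actually used for this card):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, single-row DP."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(
                dp[j] + 1,          # deletion
                dp[j - 1] + 1,      # insertion
                prev + (r != h),    # substitution (free if chars match)
            )
    return dp[-1]

def cer(ref, hyp):
    """Character error rate: char-level edits / reference length."""
    return edit_distance(list(ref), list(hyp)) / len(ref)

def wer(ref, hyp):
    """Word error rate: word-level edits / reference word count."""
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())
```

Libraries such as `evaluate` or `jiwer` are normally used for this in practice; the sketch only illustrates what the reported numbers mean.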

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 15
  • mixed_precision_training: Native AMP
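The effective batch size follows from train_batch_size × gradient_accumulation_steps = 8 × 2 = 16, matching total_train_batch_size above. The linear schedule with 100 warmup steps can be sketched as below; note that total_steps is inferred from the results table (about 736 optimizer steps per epoch × 15 epochs ≈ 11,040) and is an assumption, not a logged value:

```python
def linear_schedule_lr(step, base_lr=1e-3, warmup_steps=100, total_steps=11040):
    """Linear warmup to base_lr, then linear decay to 0,
    mirroring the behavior of a 'linear' lr_scheduler_type."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps  # warmup ramp
    # linear decay from base_lr (end of warmup) down to 0 at total_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```

With only 100 warmup steps against ~11,000 total, the schedule spends almost the entire run in the slow linear decay phase.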

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Cer    | Wer    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
| 3.7443        | 0.2717  | 200   | 3.2557          | 0.9926 | 1.0    |
| 0.7396        | 0.5435  | 400   | 0.5329          | 0.1720 | 0.8128 |
| 0.5625        | 0.8152  | 600   | 0.4470          | 0.1455 | 0.7283 |
| 0.5028        | 1.0870  | 800   | 0.4204          | 0.1425 | 0.7189 |
| 0.4808        | 1.3587  | 1000  | 0.4070          | 0.1394 | 0.7088 |
| 0.4613        | 1.6304  | 1200  | 0.3806          | 0.1298 | 0.6834 |
| 0.4574        | 1.9022  | 1400  | 0.3863          | 0.1286 | 0.6820 |
| 0.4247        | 2.1739  | 1600  | 0.3623          | 0.1257 | 0.6674 |
| 0.4233        | 2.4457  | 1800  | 0.3794          | 0.1330 | 0.6916 |
| 0.4101        | 2.7174  | 2000  | 0.3497          | 0.1213 | 0.6532 |
| 0.4057        | 2.9891  | 2200  | 0.3569          | 0.1278 | 0.6727 |
| 0.4033        | 3.2609  | 2400  | 0.3481          | 0.1232 | 0.6552 |
| 0.4023        | 3.5326  | 2600  | 0.3437          | 0.1206 | 0.6482 |
| 0.3862        | 3.8043  | 2800  | 0.3484          | 0.1204 | 0.6566 |
| 0.4157        | 4.0761  | 3000  | 0.3586          | 0.1264 | 0.6568 |
| 0.4017        | 4.3478  | 3200  | 0.3744          | 0.1282 | 0.6628 |
| 0.4148        | 4.6196  | 3400  | 0.3591          | 0.1244 | 0.6530 |
| 0.4446        | 4.8913  | 3600  | 0.3905          | 0.1289 | 0.6561 |
| 0.4784        | 5.1630  | 3800  | 0.3949          | 0.1315 | 0.6882 |
| 0.4966        | 5.4348  | 4000  | 0.4179          | 0.1328 | 0.6741 |
| 0.5704        | 5.7065  | 4200  | 0.4569          | 0.1335 | 0.6825 |
| 0.6411        | 5.9783  | 4400  | 0.6167          | 0.1527 | 0.7170 |
| 0.8632        | 6.25    | 4600  | 0.9967          | 0.2227 | 0.8538 |
| 2.0204        | 6.5217  | 4800  | 2.4109          | 0.8268 | 1.0    |
| 2.7834        | 6.7935  | 5000  | 2.6203          | 0.9210 | 1.0    |
| 2.7752        | 7.0652  | 5200  | 2.7062          | 0.9524 | 1.0    |
| 2.8283        | 7.3370  | 5400  | 2.6976          | 0.9280 | 1.0    |
| 2.8589        | 7.6087  | 5600  | 2.7020          | 0.9235 | 1.0    |
| 3.0579        | 7.8804  | 5800  | 2.9812          | 0.8943 | 0.9998 |
| 3.5998        | 8.1522  | 6000  | 3.3562          | 0.8770 | 0.9993 |
| 3.98          | 8.4239  | 6200  | 3.8742          | 0.7393 | 0.9950 |
| 4.6409        | 8.6957  | 6400  | 4.5709          | 0.7550 | 0.9945 |
| 5.2842        | 8.9674  | 6600  | 5.5003          | 0.6588 | 0.9923 |
| 6.9814        | 9.2391  | 6800  | 7.0443          | 0.8092 | 1.0    |
| 8.6704        | 9.5109  | 7000  | 9.0629          | 0.7697 | 1.0    |
| 10.1129       | 9.7826  | 7200  | 10.1659         | 0.7456 | 1.0    |
| 10.964        | 10.0543 | 7400  | 10.9963         | 0.7948 | 1.0    |
| 11.7005       | 10.3261 | 7600  | 11.9229         | 0.7517 | 1.0    |
| 12.0909       | 10.5978 | 7800  | 12.3057         | 0.9080 | 1.0    |
| 12.385        | 10.8696 | 8000  | 12.3060         | 0.9079 | 1.0    |
| 12.1537       | 11.1413 | 8200  | 12.1111         | 0.8990 | 1.0    |
| 12.0604       | 11.4130 | 8400  | 12.1109         | 0.8993 | 1.0    |
| 11.9687       | 11.6848 | 8600  | 12.1105         | 0.8991 | 1.0    |
| 12.0688       | 11.9565 | 8800  | 12.1108         | 0.8990 | 1.0    |
| 11.9894       | 12.2283 | 9000  | 12.1103         | 0.8990 | 1.0    |
| 12.1037       | 12.5    | 9200  | 12.1110         | 0.8990 | 1.0    |
| 11.9453       | 12.7717 | 9400  | 12.1102         | 0.8986 | 1.0    |
| 11.959        | 13.0435 | 9600  | 12.1109         | 0.8990 | 1.0    |
| 12.1515       | 13.3152 | 9800  | 12.1104         | 0.8991 | 1.0    |
| 11.8908       | 13.5870 | 10000 | 12.1108         | 0.8992 | 1.0    |
| 12.0349       | 13.8587 | 10200 | 12.1108         | 0.8995 | 1.0    |
| 12.0769       | 14.1304 | 10400 | 12.1110         | 0.8993 | 1.0    |
| 11.8601       | 14.4022 | 10600 | 12.1110         | 0.8990 | 1.0    |
| 12.2427       | 14.6739 | 10800 | 12.1108         | 0.8989 | 1.0    |
| 11.9567       | 14.9457 | 11000 | 12.1108         | 0.8989 | 1.0    |

Note that training is stable only through roughly epoch 4: from about step 3000 onward the validation loss rises steadily and the run eventually diverges, ending at WER 1.0. The best checkpoint by validation loss is step 2600 (loss 0.3437, CER 0.1206, WER 0.6482), so the headline metrics at the top of this card reflect the diverged final state rather than the model's best performance.
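Because the run diverges late in training, the checkpoint worth keeping is not the last one. A sketch of selecting the best step by validation loss (the pairs below are copied from rows of the results table; trainers usually automate this via a load-best-model-at-end option):

```python
# (step, validation_loss) pairs taken from the results table above
history = [
    (2400, 0.3481),
    (2600, 0.3437),
    (2800, 0.3484),
    (11000, 12.1108),  # diverged final checkpoint
]

# pick the checkpoint with the lowest validation loss
best_step, best_loss = min(history, key=lambda row: row[1])
print(best_step, best_loss)  # step 2600 has the lowest validation loss
```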

Framework versions

  • Transformers 4.52.1
  • Pytorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.4
Model size: 1.0B params (F32, Safetensors)
Full model ID: ctaguchi/ssc-ady-mms-model-mix-adapt-max-longcv (fine-tuned from facebook/mms-1b-all)