ssc-meh-mms-model-mix-adapt-max-lowlr
This model is a fine-tuned version of facebook/mms-1b-all on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.6881
- Cer: 0.1731
- Wer: 0.4714
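CER and WER are conventionally edit-distance rates: the Levenshtein distance between hypothesis and reference, divided by the reference length, counted over characters and words respectively. The card does not say which implementation produced the numbers above, so the following is only a minimal sketch of the standard definition. The function names `edit_distance`, `wer`, and `cer` are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))  # distance from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate of one utterance: word-level edits / reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate of one utterance: character edits / reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


print(wer("the cat sat", "the cat sit"))  # one substituted word out of three
print(cer("abc", "abd"))                  # one substituted character out of three
```

Read this way, a Wer of 0.4714 corresponds to roughly 47 word-level errors per 100 reference words. Corpus-level scores are usually obtained by summing edits and reference lengths over all utterances before dividing, rather than by averaging per-utterance rates.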
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: AdamW (fused PyTorch implementation, OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 10
- mixed_precision_training: Native AMP
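Two of the values above are derived rather than set independently. The total_train_batch_size of 16 is train_batch_size times gradient_accumulation_steps (8 x 2), which is consistent with a single device. The linear scheduler with 100 warmup steps ramps the learning rate from 0 to 0.0003 and then decays it linearly to 0 at the last step. The sketch below reproduces both calculations in plain Python. The total of 6080 optimizer steps is an assumption inferred from the results table (step 3800 is logged at epoch 6.25, so 608 steps per epoch over 10 epochs), and the helper names are illustrative.

```python
PEAK_LR = 3e-4       # learning_rate
WARMUP_STEPS = 100   # lr_scheduler_warmup_steps
TOTAL_STEPS = 6080   # assumption: 608 optimizer steps per epoch x 10 epochs


def effective_batch_size(per_device, grad_accum, num_devices=1):
    """Number of examples contributing to one optimizer step."""
    return per_device * grad_accum * num_devices


def linear_lr(step, peak_lr=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at `step` for linear warmup followed by linear decay to 0."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    return peak_lr * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch_size(8, 2))  # 16, matching total_train_batch_size
for step in (0, 50, 100, 3090, 6080):
    print(step, linear_lr(step))
```

Under these assumptions the rate peaks at step 100, well before the first evaluation at step 200, and has fallen to about half of its peak by the middle of training.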
Training results
| Training Loss | Epoch | Step | Validation Loss | Cer | Wer |
|---|---|---|---|---|---|
| 0.9811 | 0.3289 | 200 | 0.7751 | 0.2017 | 0.5926 |
| 0.7561 | 0.6579 | 400 | 0.7307 | 0.1919 | 0.5439 |
| 0.7058 | 0.9868 | 600 | 0.7007 | 0.1838 | 0.5200 |
| 0.6423 | 1.3158 | 800 | 0.6787 | 0.1802 | 0.4994 |
| 0.6639 | 1.6447 | 1000 | 0.6731 | 0.1793 | 0.4953 |
| 0.6096 | 1.9737 | 1200 | 0.6722 | 0.1784 | 0.4929 |
| 0.6177 | 2.3026 | 1400 | 0.6858 | 0.1776 | 0.4847 |
| 0.5689 | 2.6316 | 1600 | 0.6642 | 0.1767 | 0.4848 |
| 0.5697 | 2.9605 | 1800 | 0.6654 | 0.1742 | 0.4752 |
| 0.5563 | 3.2895 | 2000 | 0.6636 | 0.1739 | 0.4715 |
| 0.5722 | 3.6184 | 2200 | 0.6772 | 0.1736 | 0.4729 |
| 0.5781 | 3.9474 | 2400 | 0.6745 | 0.1733 | 0.4723 |
| 0.5316 | 4.2763 | 2600 | 0.6741 | 0.1750 | 0.4801 |
| 0.5538 | 4.6053 | 2800 | 0.6747 | 0.1741 | 0.4688 |
| 0.5620 | 4.9342 | 3000 | 0.6701 | 0.1734 | 0.4706 |
| 0.5555 | 5.2632 | 3200 | 0.6770 | 0.1743 | 0.4780 |
| 0.5361 | 5.5921 | 3400 | 0.6752 | 0.1743 | 0.4782 |
| 0.5254 | 5.9211 | 3600 | 0.6836 | 0.1754 | 0.4792 |
| 0.5095 | 6.2500 | 3800 | 0.6823 | 0.1748 | 0.4770 |
| 0.5482 | 6.5789 | 4000 | 0.6768 | 0.1736 | 0.4721 |
| 0.5180 | 6.9079 | 4200 | 0.6786 | 0.1730 | 0.4689 |
| 0.4757 | 7.2368 | 4400 | 0.6978 | 0.1750 | 0.4803 |
| 0.5063 | 7.5658 | 4600 | 0.6799 | 0.1728 | 0.4717 |
| 0.4943 | 7.8947 | 4800 | 0.6860 | 0.1737 | 0.4757 |
| 0.4962 | 8.2237 | 5000 | 0.6865 | 0.1735 | 0.4752 |
| 0.4943 | 8.5526 | 5200 | 0.6903 | 0.1739 | 0.4760 |
| 0.5035 | 8.8816 | 5400 | 0.6983 | 0.1752 | 0.4795 |
| 0.4842 | 9.2105 | 5600 | 0.6862 | 0.1731 | 0.4682 |
| 0.4704 | 9.5395 | 5800 | 0.6897 | 0.1733 | 0.4719 |
| 0.4939 | 9.8684 | 6000 | 0.6881 | 0.1731 | 0.4714 |
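The metrics reported at the top of the card are those of the final evaluation at step 6000, which is not the best row on every metric. The following sketch scans the evaluation columns of the table above for the lowest value of each metric. The rows are transcribed from the table; the names `ROWS`, `Eval`, and `best` are illustrative, and the card does not state which checkpoint the published weights correspond to.

```python
from typing import NamedTuple


class Eval(NamedTuple):
    step: int
    val_loss: float
    cer: float
    wer: float


# Columns: step, validation loss, CER, WER (transcribed from the results table).
ROWS = """
200 0.7751 0.2017 0.5926
400 0.7307 0.1919 0.5439
600 0.7007 0.1838 0.5200
800 0.6787 0.1802 0.4994
1000 0.6731 0.1793 0.4953
1200 0.6722 0.1784 0.4929
1400 0.6858 0.1776 0.4847
1600 0.6642 0.1767 0.4848
1800 0.6654 0.1742 0.4752
2000 0.6636 0.1739 0.4715
2200 0.6772 0.1736 0.4729
2400 0.6745 0.1733 0.4723
2600 0.6741 0.1750 0.4801
2800 0.6747 0.1741 0.4688
3000 0.6701 0.1734 0.4706
3200 0.6770 0.1743 0.4780
3400 0.6752 0.1743 0.4782
3600 0.6836 0.1754 0.4792
3800 0.6823 0.1748 0.4770
4000 0.6768 0.1736 0.4721
4200 0.6786 0.1730 0.4689
4400 0.6978 0.1750 0.4803
4600 0.6799 0.1728 0.4717
4800 0.6860 0.1737 0.4757
5000 0.6865 0.1735 0.4752
5200 0.6903 0.1739 0.4760
5400 0.6983 0.1752 0.4795
5600 0.6862 0.1731 0.4682
5800 0.6897 0.1733 0.4719
6000 0.6881 0.1731 0.4714
"""

EVALS = [
    Eval(int(step), float(loss), float(cer), float(wer))
    for step, loss, cer, wer in (line.split() for line in ROWS.strip().splitlines())
]


def best(metric):
    """Return the evaluation row with the lowest value of `metric`."""
    return min(EVALS, key=lambda row: getattr(row, metric))


for metric in ("val_loss", "cer", "wer"):
    row = best(metric)
    print(f"lowest {metric}: {getattr(row, metric):.4f} at step {row.step}")
```

On this table the lowest validation loss falls at step 2000, the lowest Cer at step 4600, and the lowest Wer at step 5600. Validation loss drifts upward after roughly epoch 3 while training loss keeps falling, which may indicate mild overfitting, although Cer and Wer stay nearly flat over the same range.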
Framework versions
- Transformers 4.57.2
- Pytorch 2.9.1+cu128
- Datasets 3.6.0
- Tokenizers 0.22.0
Model tree for ctaguchi/ssc-meh-mms-model-mix-adapt-max-lowlr
Base model
facebook/mms-1b-all