Wave2Vec2-Bert2.0 Nepali - Kiran Pantha
This model is a fine-tuned version of facebook/w2v-bert-2.0-nepali on the NepaliParliamentDS dataset. It achieves the following results on the evaluation set:
- Loss: 1.2860
- Wer: 0.6172
- Cer: 0.2217
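For readers unfamiliar with the two error metrics, the sketch below shows how WER and CER are conventionally defined: edit distance divided by reference length, over words and characters respectively. It is an illustration only, not the evaluation code used for this model. The function names are hypothetical, and the CER here counts Unicode code points, which for Devanagari text may differ from a grapheme-based count.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: code-point-level edit distance / reference length.

    Assumption: characters are Unicode code points, not grapheme clusters.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

Under this definition, a WER of 0.6172 means roughly 62 word-level edits per 100 reference words.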
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: SGD (OptimizerNames.SGD), with no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 5
- mixed_precision_training: Native AMP
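The learning-rate settings above imply a linear schedule with warmup. A minimal sketch of that schedule is given below, assuming the usual shape: a linear ramp from 0 to the base rate over the warmup steps, then a linear decay to 0 at the final step. The total step count is not stated in this card. The value of 824 steps per epoch is inferred from the training results table (step 300 at epoch 0.3641), and the function name is hypothetical.

```python
BASE_LR = 5e-05
WARMUP_STEPS = 500
STEPS_PER_EPOCH = 824   # assumption: inferred from 300 steps ~ 0.3641 epochs
NUM_EPOCHS = 5
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 4120 under the assumption above


def linear_lr(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at a given optimizer step for linear warmup then linear decay."""
    if step < warmup:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup)
    # Decay phase: fall linearly from base_lr to 0 at the final step.
    remaining = max(0, total - step)
    return base_lr * remaining / max(1, total - warmup)


if __name__ == "__main__":
    for s in (0, 250, 500, 2310, 4120):
        print(s, linear_lr(s))
```

Under these assumptions, the rate peaks at 5e-05 at step 500, so the first evaluation at step 300 falls inside the warmup phase.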
Training results
| Training Loss | Epoch | Step | Cer | Validation Loss | Wer |
|---|---|---|---|---|---|
| 1.1608 | 0.3641 | 300 | 0.2240 | 1.3237 | 0.6323 |
| 1.1726 | 0.7282 | 600 | 0.2236 | 1.3175 | 0.6307 |
| 1.3014 | 1.0922 | 900 | 0.2231 | 1.3112 | 0.6284 |
| 1.1958 | 1.4563 | 1200 | 0.2229 | 1.3076 | 0.6276 |
| 1.1507 | 1.8204 | 1500 | 0.2228 | 1.3058 | 0.6269 |
| 1.1777 | 2.1845 | 1800 | 0.2227 | 1.3030 | 0.6253 |
| 1.1428 | 2.5485 | 2100 | 0.2223 | 1.2989 | 0.6235 |
| 1.2121 | 2.9126 | 2400 | 0.2222 | 1.2953 | 0.6220 |
| 1.221 | 3.2767 | 2700 | 0.2222 | 1.2920 | 0.6207 |
| 1.1533 | 3.6408 | 3000 | 0.2222 | 1.2897 | 0.6198 |
| 1.1758 | 4.0049 | 3300 | 0.2220 | 1.2878 | 0.6185 |
| 1.1918 | 4.3689 | 3600 | 0.2219 | 1.2867 | 0.6177 |
| 1.1421 | 4.7330 | 3900 | 0.2217 | 1.2860 | 0.6172 |
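To put the trend in the table into perspective, the short sketch below computes the relative change of each evaluation metric between the first logged checkpoint (step 300) and the last (step 3900). The numbers are copied from the table. The helper name is hypothetical.

```python
# Evaluation metrics at the first and last logged checkpoints (from the table).
first = {"validation_loss": 1.3237, "wer": 0.6323, "cer": 0.2240}  # step 300
last = {"validation_loss": 1.2860, "wer": 0.6172, "cer": 0.2217}   # step 3900


def relative_reduction(before, after):
    """Fractional reduction from `before` to `after` (positive means improvement)."""
    return (before - after) / before


reductions = {name: relative_reduction(first[name], last[name]) for name in first}

if __name__ == "__main__":
    for name, value in reductions.items():
        print(f"{name}: {value:.2%} relative reduction")
```

All three metrics improve by only a few percent over the logged range, which suggests that this run changed the model only slightly.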
Framework versions
- Transformers 4.47.1
- Pytorch 2.6.0+xpu
- Datasets 3.2.0
- Tokenizers 0.21.0
Dataset used to train kiranpantha/w2v-bert-2.0-nepali-ft-parliament
Evaluation results
- Wer on NepaliParliamentDS (self-reported): 0.617