# ikema-asr-youtube-romaji

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 5.4360
- Cer: 0.5728

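The model can be loaded like any other fine-tuned wav2vec2 CTC checkpoint. Below is a minimal inference sketch (not part of the original card); it assumes the repository ships the usual processor files and that `sample.wav` is a hypothetical 16 kHz-compatible recording.

```python
import torch
import librosa
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

model_id = "ctaguchi/ikema-asr-youtube-romaji"
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# XLS-R checkpoints expect 16 kHz audio, so resample on load.
speech, _ = librosa.load("sample.wav", sr=16_000)  # "sample.wav" is hypothetical
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

# Greedy CTC decoding: take the argmax over the vocabulary at each frame.
with torch.no_grad():
    logits = model(inputs.input_values).logits
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```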
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training; a `TrainingArguments` sketch reconstructing them follows the list:

- learning_rate: 0.0003
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 150
- mixed_precision_training: Native AMP

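These settings map directly onto `transformers.TrainingArguments`. The following is a hedged reconstruction (the `output_dir` is hypothetical, and the model, dataset, and data collator setup are omitted):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ikema-asr-youtube-romaji",  # hypothetical output path
    learning_rate=3e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # 16 x 2 = effective train batch size of 32
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=150,
    fp16=True,  # native AMP mixed-precision training
)
```
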
### Training results

| Training Loss | Epoch    | Step  | Validation Loss | Cer    |
|:-------------:|:--------:|:-----:|:---------------:|:------:|
| 5.7654        | 1.1117   | 100   | 2.9458          | 1.0    |
| 3.1287        | 2.2235   | 200   | 2.8488          | 1.0    |
| 2.9631        | 3.3352   | 300   | 2.9340          | 1.0    |
| 2.8705        | 4.4469   | 400   | 2.8182          | 1.0    |
| 2.7162        | 5.5587   | 500   | 2.7401          | 1.0    |
| 2.4733        | 6.6704   | 600   | 2.5031          | 0.7435 |
| 2.1943        | 7.7821   | 700   | 2.3506          | 0.7736 |
| 1.9286        | 8.8939   | 800   | 2.2760          | 0.6852 |
| 1.719         | 10.0     | 900   | 2.3650          | 0.6970 |
| 1.5783        | 11.1117  | 1000  | 2.3247          | 0.6237 |
| 1.4361        | 12.2235  | 1100  | 2.1219          | 0.5678 |
| 1.338         | 13.3352  | 1200  | 2.2438          | 0.5707 |
| 1.2396        | 14.4469  | 1300  | 2.5065          | 0.5927 |
| 1.2052        | 15.5587  | 1400  | 2.3047          | 0.6268 |
| 1.0853        | 16.6704  | 1500  | 2.4457          | 0.5839 |
| 1.0515        | 17.7821  | 1600  | 2.3201          | 0.5521 |
| 1.0278        | 18.8939  | 1700  | 2.4907          | 0.5751 |
| 0.9595        | 20.0     | 1800  | 2.7885          | 0.5627 |
| 0.8868        | 21.1117  | 1900  | 2.7341          | 0.5844 |
| 0.8572        | 22.2235  | 2000  | 2.8150          | 0.5741 |
| 0.8335        | 23.3352  | 2100  | 2.6458          | 0.5686 |
| 0.8002        | 24.4469  | 2200  | 2.7114          | 0.5580 |
| 0.7593        | 25.5587  | 2300  | 2.6934          | 0.5555 |
| 0.7298        | 26.6704  | 2400  | 2.8385          | 0.5753 |
| 0.6951        | 27.7821  | 2500  | 2.8914          | 0.5651 |
| 0.6568        | 28.8939  | 2600  | 2.8913          | 0.5712 |
| 0.6522        | 30.0     | 2700  | 2.9266          | 0.5818 |
| 0.6165        | 31.1117  | 2800  | 2.9721          | 0.5626 |
| 0.6044        | 32.2235  | 2900  | 3.1186          | 0.6141 |
| 0.5791        | 33.3352  | 3000  | 3.1073          | 0.5619 |
| 0.5613        | 34.4469  | 3100  | 3.1489          | 0.5626 |
| 0.5616        | 35.5587  | 3200  | 3.3946          | 0.5865 |
| 0.528         | 36.6704  | 3300  | 3.2624          | 0.5566 |
| 0.5308        | 37.7821  | 3400  | 3.0666          | 0.5800 |
| 0.4913        | 38.8939  | 3500  | 3.3816          | 0.5830 |
| 0.4989        | 40.0     | 3600  | 3.6244          | 0.5663 |
| 0.4657        | 41.1117  | 3700  | 3.5415          | 0.5849 |
| 0.4693        | 42.2235  | 3800  | 3.6088          | 0.5834 |
| 0.4479        | 43.3352  | 3900  | 3.8216          | 0.5726 |
| 0.4331        | 44.4469  | 4000  | 3.5104          | 0.5866 |
| 0.4364        | 45.5587  | 4100  | 3.2249          | 0.5602 |
| 0.4195        | 46.6704  | 4200  | 3.3988          | 0.5659 |
| 0.4215        | 47.7821  | 4300  | 3.5439          | 0.5613 |
| 0.3899        | 48.8939  | 4400  | 3.7876          | 0.5835 |
| 0.3906        | 50.0     | 4500  | 3.6664          | 0.5647 |
| 0.3777        | 51.1117  | 4600  | 3.8201          | 0.5887 |
| 0.3875        | 52.2235  | 4700  | 3.8447          | 0.5933 |
| 0.3542        | 53.3352  | 4800  | 4.0995          | 0.5845 |
| 0.3513        | 54.4469  | 4900  | 4.2251          | 0.5892 |
| 0.3471        | 55.5587  | 5000  | 4.1187          | 0.5852 |
| 0.339         | 56.6704  | 5100  | 4.2435          | 0.5868 |
| 0.3299        | 57.7821  | 5200  | 4.1356          | 0.5750 |
| 0.3242        | 58.8939  | 5300  | 4.0832          | 0.5985 |
| 0.3143        | 60.0     | 5400  | 4.4107          | 0.5636 |
| 0.3025        | 61.1117  | 5500  | 4.3838          | 0.5915 |
| 0.3101        | 62.2235  | 5600  | 4.4010          | 0.5827 |
| 0.2909        | 63.3352  | 5700  | 4.1933          | 0.5727 |
| 0.2942        | 64.4469  | 5800  | 4.7073          | 0.6168 |
| 0.2805        | 65.5587  | 5900  | 4.3165          | 0.5717 |
| 0.2784        | 66.6704  | 6000  | 4.3823          | 0.5760 |
| 0.2667        | 67.7821  | 6100  | 4.4065          | 0.5885 |
| 0.2612        | 68.8939  | 6200  | 4.5967          | 0.6013 |
| 0.2634        | 70.0     | 6300  | 4.5820          | 0.5947 |
| 0.2494        | 71.1117  | 6400  | 4.7173          | 0.5738 |
| 0.2538        | 72.2235  | 6500  | 4.6189          | 0.5717 |
| 0.2483        | 73.3352  | 6600  | 4.5211          | 0.5829 |
| 0.2295        | 74.4469  | 6700  | 4.4063          | 0.5692 |
| 0.2272        | 75.5587  | 6800  | 4.5388          | 0.5769 |
| 0.2364        | 76.6704  | 6900  | 4.1843          | 0.5412 |
| 0.2322        | 77.7821  | 7000  | 4.2108          | 0.5678 |
| 0.2144        | 78.8939  | 7100  | 4.3772          | 0.5618 |
| 0.216         | 80.0     | 7200  | 4.5920          | 0.6020 |
| 0.2116        | 81.1117  | 7300  | 4.4980          | 0.5811 |
| 0.2088        | 82.2235  | 7400  | 4.5054          | 0.5804 |
| 0.1999        | 83.3352  | 7500  | 4.3327          | 0.5696 |
| 0.1913        | 84.4469  | 7600  | 4.5404          | 0.5639 |
| 0.1909        | 85.5587  | 7700  | 4.5547          | 0.5722 |
| 0.1854        | 86.6704  | 7800  | 4.6510          | 0.5619 |
| 0.187         | 87.7821  | 7900  | 4.4342          | 0.5746 |
| 0.1785        | 88.8939  | 8000  | 4.8637          | 0.5746 |
| 0.1725        | 90.0     | 8100  | 4.7284          | 0.5605 |
| 0.1744        | 91.1117  | 8200  | 4.6714          | 0.5536 |
| 0.1654        | 92.2235  | 8300  | 4.6927          | 0.5663 |
| 0.1703        | 93.3352  | 8400  | 4.7706          | 0.5655 |
| 0.1544        | 94.4469  | 8500  | 4.7921          | 0.5631 |
| 0.1536        | 95.5587  | 8600  | 4.6269          | 0.5744 |
| 0.1558        | 96.6704  | 8700  | 4.4396          | 0.5716 |
| 0.151         | 97.7821  | 8800  | 4.5563          | 0.5618 |
| 0.1455        | 98.8939  | 8900  | 4.6194          | 0.5549 |
| 0.1463        | 100.0    | 9000  | 4.5304          | 0.5543 |
| 0.139         | 101.1117 | 9100  | 4.7211          | 0.5653 |
| 0.1313        | 102.2235 | 9200  | 4.9493          | 0.5656 |
| 0.1362        | 103.3352 | 9300  | 4.6552          | 0.5554 |
| 0.1307        | 104.4469 | 9400  | 5.0708          | 0.5656 |
| 0.1271        | 105.5587 | 9500  | 5.0151          | 0.5683 |
| 0.1248        | 106.6704 | 9600  | 4.9161          | 0.5758 |
| 0.1182        | 107.7821 | 9700  | 5.3568          | 0.5768 |
| 0.1182        | 108.8939 | 9800  | 4.9611          | 0.5646 |
| 0.1197        | 110.0    | 9900  | 5.1136          | 0.5580 |
| 0.1121        | 111.1117 | 10000 | 4.9628          | 0.5701 |
| 0.1106        | 112.2235 | 10100 | 5.1576          | 0.5856 |
| 0.1102        | 113.3352 | 10200 | 5.1091          | 0.5738 |
| 0.1036        | 114.4469 | 10300 | 5.1654          | 0.5721 |
| 0.1048        | 115.5587 | 10400 | 4.9932          | 0.5665 |
| 0.1019        | 116.6704 | 10500 | 5.0801          | 0.5611 |
| 0.1034        | 117.7821 | 10600 | 4.8809          | 0.5650 |
| 0.1004        | 118.8939 | 10700 | 5.0913          | 0.5752 |
| 0.0936        | 120.0    | 10800 | 5.0903          | 0.5691 |
| 0.095         | 121.1117 | 10900 | 5.0648          | 0.5760 |
| 0.0893        | 122.2235 | 11000 | 5.1733          | 0.5827 |
| 0.0896        | 123.3352 | 11100 | 5.2183          | 0.5715 |
| 0.086         | 124.4469 | 11200 | 5.1838          | 0.5751 |
| 0.0869        | 125.5587 | 11300 | 5.1150          | 0.5744 |
| 0.0791        | 126.6704 | 11400 | 5.1499          | 0.5656 |
| 0.0832        | 127.7821 | 11500 | 5.2312          | 0.5695 |
| 0.0827        | 128.8939 | 11600 | 5.2175          | 0.5732 |
| 0.0799        | 130.0    | 11700 | 5.2239          | 0.5777 |
| 0.0747        | 131.1117 | 11800 | 5.2310          | 0.5690 |
| 0.0729        | 132.2235 | 11900 | 5.2735          | 0.5764 |
| 0.073         | 133.3352 | 12000 | 5.2856          | 0.5721 |
| 0.0755        | 134.4469 | 12100 | 5.2211          | 0.5716 |
| 0.0748        | 135.5587 | 12200 | 5.3144          | 0.5639 |
| 0.0695        | 136.6704 | 12300 | 5.3409          | 0.5734 |
| 0.0673        | 137.7821 | 12400 | 5.4097          | 0.5758 |
| 0.0674        | 138.8939 | 12500 | 5.3630          | 0.5738 |
| 0.0657        | 140.0    | 12600 | 5.4513          | 0.5713 |
| 0.0669        | 141.1117 | 12700 | 5.4735          | 0.5784 |
| 0.0627        | 142.2235 | 12800 | 5.4461          | 0.5727 |
| 0.0665        | 143.3352 | 12900 | 5.4985          | 0.5752 |
| 0.0626        | 144.4469 | 13000 | 5.4834          | 0.5734 |
| 0.0616        | 145.5587 | 13100 | 5.4716          | 0.5728 |
| 0.062         | 146.6704 | 13200 | 5.4455          | 0.5735 |
| 0.0601        | 147.7821 | 13300 | 5.4347          | 0.5721 |
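
The Cer column above is a character error rate. A minimal sketch of how such a score can be computed with the `evaluate` library (the example strings are illustrative, not taken from the training data):

```python
import evaluate

cer = evaluate.load("cer")
score = cer.compute(
    predictions=["a hypothetical model transcription"],
    references=["a hypothetical reference transcription"],
)
print(score)  # fraction of character-level edits needed to match the reference
```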

### Framework versions

- Transformers 4.51.2
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1