swh-waxal-audio-speecht5

This model is a fine-tuned version of microsoft/speecht5_tts (the fine-tuning dataset is not specified in this card). It achieves the following results on the evaluation set:

  • Loss: 0.1222

Model description

More information needed

Intended uses & limitations

More information needed
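
Although intended uses are not documented, the model follows the standard SpeechT5 text-to-speech interface in 🤗 Transformers. The snippet below is a minimal inference sketch rather than an official usage recipe: the Swahili example sentence, the microsoft/speecht5_hifigan vocoder, and the placeholder zero speaker embedding are assumptions; a real x-vector for the fine-tuning speaker should replace the placeholder.

```python
import torch
import soundfile as sf
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

# Load the fine-tuned checkpoint plus the standard SpeechT5 HiFi-GAN vocoder.
processor = SpeechT5Processor.from_pretrained("sil-ai/swh-waxal-audio-speecht5")
model = SpeechT5ForTextToSpeech.from_pretrained("sil-ai/swh-waxal-audio-speecht5")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

# Example Swahili input (illustrative only).
inputs = processor(text="Habari ya asubuhi.", return_tensors="pt")

# SpeechT5 conditions generation on a 512-dim x-vector speaker embedding.
# A zero vector is only a placeholder; substitute an embedding extracted
# from the speaker the model was fine-tuned on.
speaker_embeddings = torch.zeros((1, 512))

with torch.no_grad():
    speech = model.generate_speech(
        inputs["input_ids"], speaker_embeddings, vocoder=vocoder
    )

sf.write("output.wav", speech.numpy(), samplerate=16000)
```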

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 3407
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 4000
  • training_steps: 40000
  • mixed_precision_training: Native AMP
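
For reference, these hyperparameters map onto the 🤗 Trainer configuration roughly as follows. This is a sketch under stated assumptions rather than the exact training script: the output directory is hypothetical, and the model, dataset, and data-collator setup are omitted.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="swh-waxal-audio-speecht5",  # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,          # effective train batch size: 8 * 4 = 32
    seed=3407,
    optim="adamw_torch_fused",              # AdamW, betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="cosine",
    warmup_steps=4000,
    max_steps=40000,
    fp16=True,                              # native AMP mixed precision
)
```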

Training results

Training Loss   Epoch       Step     Validation Loss
0.0947          62.5161     1000     0.0838
0.0807          125.0       2000     0.0836
0.0723          187.5161    3000     0.0845
0.0653          250.0       4000     0.0872
0.0594          312.5161    5000     0.0854
0.0553          375.0       6000     0.0904
0.0582          437.5161    7000     0.0926
0.0512          500.0       8000     0.0965
0.0488          562.5161    9000     0.0969
0.0467          625.0       10000    0.0991
0.0455          687.5161    11000    0.1021
0.0442          750.0       12000    0.1033
0.0458          812.5161    13000    0.1055
0.0416          875.0       14000    0.1071
0.0401          937.5161    15000    0.1045
0.0410          1000.0      16000    0.1066
0.0384          1062.5161   17000    0.1130
0.0444          1125.0      18000    0.1118
0.0385          1187.5161   19000    0.1102
0.0365          1250.0      20000    0.1121
0.0347          1312.5161   21000    0.1127
0.0348          1375.0      22000    0.1146
0.0348          1437.5161   23000    0.1153
0.0351          1500.0      24000    0.1163
0.0346          1562.5161   25000    0.1175
0.0341          1625.0      26000    0.1183
0.0327          1687.5161   27000    0.1199
0.0341          1750.0      28000    0.1201
0.0341          1812.5161   29000    0.1198
0.0327          1875.0      30000    0.1205
0.0324          1937.5161   31000    0.1200
0.0381          2000.0      32000    0.1205
0.0353          2062.5161   33000    0.1213
0.0331          2125.0      34000    0.1214
0.0328          2187.5161   35000    0.1221
0.0304          2250.0      36000    0.1227
0.0312          2312.5161   37000    0.1220
0.0306          2375.0      38000    0.1213
0.0310          2437.5161   39000    0.1227
0.0309          2500.0      40000    0.1222

Framework versions

  • Transformers 4.57.1
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.2
