ssc-ssc-audio-aligned-speecht5

This model is a fine-tuned version of microsoft/speecht5_tts. The training dataset is not specified (the auto-generated card records it as "None"). It achieves the following results on the evaluation set:

  • Loss: 0.0702
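
Because the checkpoint is a fine-tune of microsoft/speecht5_tts, it can presumably be loaded with the standard SpeechT5 classes from Transformers. The following is a minimal sketch under several assumptions, not a documented usage recipe: the repository id `sil-ai/ssc-ssc-audio-aligned-speecht5`, the use of the base model's processor, the `microsoft/speecht5_hifigan` vocoder, and the all-zero speaker embedding are all placeholders chosen for illustration. The helper name `synthesize` is hypothetical.

```python
# Assumed repository ids; adjust to the actual locations of the weights.
FINETUNED_REPO = "sil-ai/ssc-ssc-audio-aligned-speecht5"
PROCESSOR_REPO = "microsoft/speecht5_tts"      # assumption: base processor is compatible
VOCODER_REPO = "microsoft/speecht5_hifigan"    # assumption: stock HiFi-GAN vocoder


def synthesize(text, speaker_embeddings=None,
               model_repo=FINETUNED_REPO, processor_repo=PROCESSOR_REPO):
    """Return a 1-D NumPy waveform (16 kHz) for `text`. Illustrative only."""
    # Heavy imports are kept local so the module can be imported without them.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    processor = SpeechT5Processor.from_pretrained(processor_repo)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_repo)
    vocoder = SpeechT5HifiGan.from_pretrained(VOCODER_REPO)

    if speaker_embeddings is None:
        # Placeholder 512-dim x-vector. A real speaker embedding matching the
        # training setup is needed for usable audio quality.
        speaker_embeddings = torch.zeros((1, 512))

    inputs = processor(text=text, return_tensors="pt")
    with torch.no_grad():
        speech = model.generate_speech(
            inputs["input_ids"], speaker_embeddings, vocoder=vocoder
        )
    return speech.numpy()
```

If the fine-tuned repository ships its own processor or tokenizer files, passing `processor_repo=FINETUNED_REPO` is likely the better choice, since the card does not say whether the vocabulary was changed during fine-tuning.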

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 3407
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED, the fused PyTorch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 4000
  • training_steps: 40000
  • mixed_precision_training: Native AMP
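
To make the schedule implied by these values concrete, here is a small sketch in plain Python. It assumes the usual linear-warmup-then-cosine-decay shape (a half cosine from the peak rate down to zero) and a single training device, which is consistent with 8 × 4 = 32 but is not stated in the card. The function names are illustrative.

```python
import math

# Values copied from the hyperparameter list above.
BASE_LR = 1e-4
WARMUP_STEPS = 4000
TOTAL_STEPS = 40000
PER_DEVICE_BATCH = 8
GRAD_ACCUM_STEPS = 4


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


def effective_batch_size(num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (assumes `num_devices` GPUs)."""
    return PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * num_devices


if __name__ == "__main__":
    for step in (0, 2000, 4000, 22000, 40000):
        print(step, lr_at(step))
    print("effective batch size:", effective_batch_size())
```

Under these assumptions the rate peaks at 1e-4 at step 4000, passes through half that value at step 22000, and reaches zero at step 40000.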

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.0837 | 3.9089 | 1000 | 0.0754 |
| 0.0722 | 7.8149 | 2000 | 0.0724 |
| 0.0722 | 11.7209 | 3000 | 0.0722 |
| 0.0703 | 15.6268 | 4000 | 0.0729 |
| 0.0655 | 19.5328 | 5000 | 0.0710 |
| 0.0622 | 23.4388 | 6000 | 0.0694 |
| 0.0703 | 27.3448 | 7000 | 0.0716 |
| 0.0625 | 31.2507 | 8000 | 0.0705 |
| 0.0604 | 35.1567 | 9000 | 0.0692 |
| 0.0599 | 39.0627 | 10000 | 0.0722 |
| 0.0579 | 42.9716 | 11000 | 0.0689 |
| 0.0567 | 46.8776 | 12000 | 0.0704 |
| 0.0568 | 50.7835 | 13000 | 0.0694 |
| 0.0592 | 54.6895 | 14000 | 0.0703 |
| 0.056 | 58.5955 | 15000 | 0.0714 |
| 0.0549 | 62.5015 | 16000 | 0.0693 |
| 0.054 | 66.4074 | 17000 | 0.0694 |
| 0.0545 | 70.3134 | 18000 | 0.0693 |
| 0.0544 | 74.2194 | 19000 | 0.0704 |
| 0.053 | 78.1254 | 20000 | 0.0701 |
| 0.0523 | 82.0313 | 21000 | 0.0699 |
| 0.0513 | 85.9403 | 22000 | 0.0719 |
| 0.0515 | 89.8462 | 23000 | 0.0699 |
| 0.051 | 93.7522 | 24000 | 0.0705 |
| 0.0509 | 97.6582 | 25000 | 0.0702 |
| 0.0509 | 101.5642 | 26000 | 0.0699 |
| 0.0506 | 105.4701 | 27000 | 0.0705 |
| 0.0506 | 109.3761 | 28000 | 0.0702 |
| 0.0495 | 113.2821 | 29000 | 0.0694 |
| 0.0497 | 117.1881 | 30000 | 0.0700 |
| 0.0498 | 121.0940 | 31000 | 0.0709 |
| 0.0499 | 125.0 | 32000 | 0.0702 |
| 0.0495 | 128.9089 | 33000 | 0.0701 |
| 0.0491 | 132.8149 | 34000 | 0.0703 |
| 0.0492 | 136.7209 | 35000 | 0.0701 |
| 0.0507 | 140.6268 | 36000 | 0.0699 |
| 0.0486 | 144.5328 | 37000 | 0.0703 |
| 0.049 | 148.4388 | 38000 | 0.0703 |
| 0.0486 | 152.3448 | 39000 | 0.0702 |
| 0.0493 | 156.2507 | 40000 | 0.0702 |
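
One way to read these results is to compare the step with the lowest validation loss against the final step. The sketch below does this on a hand-copied subset of the rows above (step, training loss, validation loss). The helper names are illustrative, and the card does not say whether intermediate checkpoints were saved.

```python
# (step, training_loss, validation_loss), copied from selected rows of the table.
ROWS = [
    (1000, 0.0837, 0.0754),
    (6000, 0.0622, 0.0694),
    (11000, 0.0579, 0.0689),
    (20000, 0.0530, 0.0701),
    (30000, 0.0497, 0.0700),
    (40000, 0.0493, 0.0702),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def generalization_gap(row):
    """Validation loss minus training loss for one row."""
    _, train_loss, val_loss = row
    return val_loss - train_loss


if __name__ == "__main__":
    best = best_checkpoint(ROWS)
    final = ROWS[-1]
    print("best step:", best[0], "val loss:", best[2])
    print("gap at best step:", round(generalization_gap(best), 4))
    print("gap at final step:", round(generalization_gap(final), 4))
```

In the full table the lowest validation loss (0.0689) also occurs at step 11000. After that point the training loss keeps falling while the validation loss stays near 0.070, so the gap between the two widens over the remaining steps.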

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.2