# speecht5-ngiemboon
This model is a fine-tuned version of [microsoft/speecht5_tts](https://huggingface.co/microsoft/speecht5_tts) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.5604
## Model description
More information needed
## Intended uses & limitations
More information needed
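Since no usage instructions are documented, the following is a minimal inference sketch, assuming the checkpoint is published as `mimba/speecht5-ngiemboon` and follows the standard SpeechT5 text-to-speech setup (processor, `generate_speech`, and the `microsoft/speecht5_hifigan` vocoder). The zero speaker embedding is only a placeholder.

```python
import torch
import soundfile as sf
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

# Load the fine-tuned checkpoint plus the standard SpeechT5 HiFi-GAN vocoder.
processor = SpeechT5Processor.from_pretrained("mimba/speecht5-ngiemboon")
model = SpeechT5ForTextToSpeech.from_pretrained("mimba/speecht5-ngiemboon")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="Text in the target language goes here.", return_tensors="pt")

# SpeechT5 conditions on a 512-dim speaker x-vector; a zero vector is a
# placeholder — substitute an embedding of the fine-tuning speaker if available.
speaker_embeddings = torch.zeros((1, 512))

speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("speech.wav", speech.numpy(), samplerate=16000)  # SpeechT5 outputs 16 kHz audio
```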
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: fused AdamW (`adamw_torch_fused`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 15
- mixed_precision_training: Native AMP
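As a reproducibility aid, the list above maps onto `Seq2SeqTrainingArguments` roughly as in the sketch below. This is a reconstruction under stated assumptions (argument names from recent Transformers releases; the 500-step evaluation cadence inferred from the results table), not the original training script, and the dataset, collator, and `Seq2SeqTrainer` wiring are omitted.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the reported hyperparameters.
training_args = Seq2SeqTrainingArguments(
    output_dir="speecht5-ngiemboon",
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # effective train batch size of 16
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=15,
    seed=42,
    optim="adamw_torch_fused",      # fused AdamW, betas=(0.9, 0.999), eps=1e-8
    fp16=True,                      # "Native AMP" mixed precision
    eval_strategy="steps",
    eval_steps=500,                 # assumed from the 500-step cadence in the results table
    logging_steps=500,
)
```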
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.5579 | 0.7252 | 500 | 0.7537 |
| 1.3604 | 1.4496 | 1000 | 0.6883 |
| 1.2814 | 2.1740 | 1500 | 0.6343 |
| 1.2308 | 2.8992 | 2000 | 0.6164 |
| 1.2133 | 3.6236 | 2500 | 0.6057 |
| 1.1829 | 4.3481 | 3000 | 0.5962 |
| 1.1787 | 5.0725 | 3500 | 0.5967 |
| 1.1692 | 5.7977 | 4000 | 0.5852 |
| 1.1472 | 6.5221 | 4500 | 0.5811 |
| 1.1374 | 7.2466 | 5000 | 0.5768 |
| 1.1643 | 7.9717 | 5500 | 0.5711 |
| 1.1385 | 8.6962 | 6000 | 0.5713 |
| 1.1334 | 9.4206 | 6500 | 0.5670 |
| 1.1564 | 10.1450 | 7000 | 0.5684 |
| 1.1158 | 10.8702 | 7500 | 0.5622 |
| 1.1158 | 11.5946 | 8000 | 0.5628 |
| 1.1149 | 12.3191 | 8500 | 0.5611 |
| 1.1088 | 13.0435 | 9000 | 0.5597 |
| 1.1191 | 13.7687 | 9500 | 0.5615 |
| 1.1097 | 14.4931 | 10000 | 0.5604 |
### Framework versions
- Transformers 5.0.0
- Pytorch 2.10.0+cu130
- Datasets 2.18.0
- Tokenizers 0.22.2