whisper-small-small-learning-rate-kpo-gbotemi

This model is a fine-tuned version of openai/whisper-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5538
  • WER: 0.8534
  • CER: 0.3531
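
For quick transcription with this checkpoint, a minimal sketch is shown below. The repo id is taken from this card's model tree and is assumed to be public on the Hub; the audio path is a placeholder:

```python
from transformers import pipeline

# Repo id as listed in this card's model tree; adjust if the checkpoint
# lives under a different namespace.
asr = pipeline(
    "automatic-speech-recognition",
    model="waxal-benchmarking/whisper-small-small-learning-rate-kpo-gbotemi",
)

# Whisper checkpoints expect 16 kHz mono audio; the pipeline resamples
# file inputs automatically (requires ffmpeg).
print(asr("sample.wav")["text"])  # "sample.wav" is a placeholder path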

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 1e-06
  • train_batch_size: 16
  • eval_batch_size: 64
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 30
  • mixed_precision_training: Native AMP
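
These settings map onto Seq2SeqTrainingArguments roughly as follows. This is a reconstruction from the list above, not the original training script; the model, dataset, and data collator setup are omitted, and the output directory name is made up:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-kpo",  # hypothetical path
    learning_rate=1e-6,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=2,   # effective train batch size: 32
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=30,
    optim="adamw_torch_fused",       # AdamW; betas=(0.9, 0.999), eps=1e-8 are the defaults
    fp16=True,                       # "Native AMP" mixed precision
    eval_strategy="steps",           # `evaluation_strategy` in older Transformers releases
    eval_steps=500,                  # the results table below logs every 500 steps
    logging_steps=500,
    predict_with_generate=True,      # required to compute WER/CER during eval
)
```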

Training results

| Training Loss | Epoch   | Step  | Validation Loss | WER    | CER    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
| 1.9732        | 1.1088  | 500   | 0.9970          | 1.0046 | 0.4873 |
| 1.5393        | 2.2175  | 1000  | 0.7822          | 0.9320 | 0.4096 |
| 1.3729        | 3.3263  | 1500  | 0.7096          | 0.9108 | 0.3873 |
| 1.3089        | 4.4351  | 2000  | 0.6706          | 0.8956 | 0.3751 |
| 1.1984        | 5.5438  | 2500  | 0.6415          | 0.8952 | 0.3712 |
| 1.1340        | 6.6526  | 3000  | 0.6248          | 0.8826 | 0.3671 |
| 1.0730        | 7.7614  | 3500  | 0.6085          | 0.8807 | 0.3624 |
| 1.0823        | 8.8701  | 4000  | 0.5965          | 0.8665 | 0.3520 |
| 1.0192        | 9.9789  | 4500  | 0.5898          | 0.8681 | 0.3538 |
| 0.9833        | 11.0866 | 5000  | 0.5820          | 0.8633 | 0.3480 |
| 0.9961        | 12.1953 | 5500  | 0.5757          | 0.8567 | 0.3423 |
| 0.9510        | 13.3041 | 6000  | 0.5722          | 0.8605 | 0.3493 |
| 0.9854        | 14.4129 | 6500  | 0.5668          | 0.8613 | 0.3554 |
| 0.9232        | 15.5216 | 7000  | 0.5647          | 0.8537 | 0.3497 |
| 0.9232        | 16.6304 | 7500  | 0.5600          | 0.8548 | 0.3514 |
| 0.9097        | 17.7392 | 8000  | 0.5593          | 0.8488 | 0.3435 |
| 0.9163        | 18.8479 | 8500  | 0.5570          | 0.8522 | 0.3470 |
| 0.8608        | 19.9567 | 9000  | 0.5556          | 0.8540 | 0.3510 |
| 0.8672        | 21.0644 | 9500  | 0.5554          | 0.8478 | 0.3455 |
| 0.8713        | 22.1731 | 10000 | 0.5538          | 0.8499 | 0.3475 |
| 0.8416        | 23.2819 | 10500 | 0.5542          | 0.8563 | 0.3555 |
| 0.8506        | 24.3907 | 11000 | 0.5543          | 0.8541 | 0.3538 |
| 0.8630        | 25.4994 | 11500 | 0.5538          | 0.8534 | 0.3531 |
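
The WER and CER columns can be reproduced with the Hugging Face evaluate library (which wraps jiwer). A minimal sketch, assuming predictions and references are plain transcript strings; the strings here are toy examples, not data from this model's evaluation set:

```python
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Toy inputs for illustration; real evaluation would pair the model's
# generated transcripts with the held-out reference transcripts.
predictions = ["the cat sat on the mat"]
references = ["the cat sat on a mat"]

wer = wer_metric.compute(predictions=predictions, references=references)
cer = cer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}  CER: {cer:.4f}")
```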

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2