wh-ft-lr5e6-dtstf5-adm-ga1ba16-st15k-v2-evalstp100-pat15-trainvalch

This model is a fine-tuned version of HouraMor/wh-ft-lr5e6-dtstf5-adm-ga1ba16-st15k-v2-evalstp500-pat5 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7545
  • Wer: 0.2875
  • Cer: 0.2170

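The card does not include a usage example. The Wer/Cer metrics and the "wh-ft" naming suggest a Whisper-style speech-recognition checkpoint; assuming that holds, a minimal loading sketch with the transformers ASR pipeline might look like the following (the audio filename is a placeholder):

```python
from transformers import pipeline

# Assumption: this checkpoint is loadable as a standard ASR model.
asr = pipeline(
    "automatic-speech-recognition",
    model="HouraMor/wh-ft-lr5e6-dtstf5-adm-ga1ba16-st15k-v2-evalstp100-pat15-trainvalch",
)

# "sample.wav" is a placeholder audio file, not data shipped with the model.
print(asr("sample.wav")["text"])
```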
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 750
  • training_steps: 15000
  • mixed_precision_training: Native AMP
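
These values map directly onto the Hugging Face Trainer's arguments. A minimal sketch of an equivalent configuration, assuming a seq2seq fine-tuning setup; the output_dir is illustrative, and fp16 is one way (bf16 is another) to get Native AMP mixed precision:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="wh-ft-sketch",        # illustrative name, not the real run directory
    learning_rate=5e-6,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=750,
    max_steps=15000,
    fp16=True,                        # "Native AMP" mixed-precision training
)
```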

Training results

| Training Loss | Epoch  | Step | Validation Loss | Wer    | Cer    |
|:-------------:|:------:|:----:|:---------------:|:------:|:------:|
| 0.2915        | 0.2008 | 100  | 0.5596          | 0.2856 | 0.2155 |
| 0.2523        | 0.4016 | 200  | 0.5651          | 0.2701 | 0.2039 |
| 0.2447        | 0.6024 | 300  | 0.5693          | 0.2784 | 0.2093 |
| 0.2297        | 0.8032 | 400  | 0.5770          | 0.2784 | 0.2120 |
| 0.3404        | 1.0040 | 500  | 0.5713          | 0.2770 | 0.2097 |
| 0.1959        | 1.2048 | 600  | 0.6140          | 0.2955 | 0.2145 |
| 0.1946        | 1.4056 | 700  | 0.6199          | 0.3093 | 0.2413 |
| 0.1835        | 1.6064 | 800  | 0.6318          | 0.2903 | 0.2193 |
| 0.212         | 1.8072 | 900  | 0.6096          | 0.3030 | 0.2325 |
| 0.2341        | 2.0080 | 1000 | 0.6090          | 0.3507 | 0.2680 |
| 0.122         | 2.2088 | 1100 | 0.6882          | 0.2816 | 0.2099 |
| 0.1162        | 2.4096 | 1200 | 0.6873          | 0.4320 | 0.3549 |
| 0.1073        | 2.6104 | 1300 | 0.7016          | 0.2987 | 0.2279 |
| 0.1109        | 2.8112 | 1400 | 0.6768          | 0.3260 | 0.2485 |
| 0.091         | 3.0120 | 1500 | 0.6904          | 0.2987 | 0.2315 |
| 0.053         | 3.2129 | 1600 | 0.7545          | 0.2875 | 0.2170 |
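
The Wer and Cer columns are word and character error rates on the validation set. A minimal sketch of how such scores are typically computed with the Hugging Face evaluate library (the transcript strings below are placeholders, not data from this model; evaluate's WER/CER metrics require the jiwer package):

```python
import evaluate

# Load the standard WER/CER metrics from the evaluate hub.
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Placeholder transcripts; a real evaluation would decode model outputs.
references = ["the quick brown fox", "hello world"]
predictions = ["the quick brown fx", "hello word"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```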

Framework versions

  • Transformers 4.55.2
  • PyTorch 2.7.0+cu118
  • Datasets 2.21.0
  • Tokenizers 0.21.4