wh-ft-lr1e6-dtstf5-adm-ga1ba16-st15k-v2-evalstp10-pat20-trainvalch

This model is a fine-tuned version of HouraMor/wh-ft-lr5e6-dtstf5-adm-ga1ba16-st15k-v2-evalstp500-pat5 on an unknown dataset. Its evaluation-set results (validation loss, WER, and CER) at each logged step are listed in the results table in this card.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-06
train_batch_size: 16
eval_batch_size: 8
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 250
training_steps: 5000
mixed_precision_training: Native AMP
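To make the schedule settings above concrete, the sketch below computes the learning rate at a given step. It assumes that `lr_scheduler_type: linear` has the usual shape: a linear ramp from 0 to the peak rate over the warmup steps, then a linear decay to 0 at the final training step. The function name `linear_lr` is hypothetical and is not taken from the training code.

```python
def linear_lr(step, peak_lr=1e-6, warmup_steps=250, total_steps=5000):
    """Learning rate at an optimizer step, assuming linear warmup then linear decay."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from peak_lr (at warmup_steps) to 0 (at total_steps).
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Print the rate at a few steps of interest.
for step in (10, 250, 270, 5000):
    print(step, linear_lr(step))
```

Under this assumption the rate at step 10 is only 4e-08, and it reaches the 1e-06 peak at step 250. The last row logged in this card is step 270, so almost all of the logged training took place during warmup.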

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
0.3641	0.0201	10	0.5640	0.3217	0.2365
0.3049	0.0402	20	0.5639	0.3212	0.2361
0.2036	0.0602	30	0.5635	0.3207	0.2359
0.3869	0.0803	40	0.5630	0.2988	0.2261
0.296	0.1004	50	0.5624	0.2980	0.2255
0.2669	0.1205	60	0.5619	0.3005	0.2276
0.2338	0.1406	70	0.5616	0.3006	0.2275
0.2956	0.1606	80	0.5609	0.2897	0.2190
0.2909	0.1807	90	0.5604	0.2895	0.2188
0.3329	0.2008	100	0.5600	0.2895	0.2190
0.2706	0.2209	110	0.5597	0.2860	0.2161
0.2226	0.2410	120	0.5596	0.2833	0.2149
0.2741	0.2610	130	0.5600	0.2828	0.2148
0.2417	0.2811	140	0.5608	0.2826	0.2145
0.347	0.3012	150	0.5615	0.2811	0.2131
0.3609	0.3213	160	0.5609	0.2826	0.2142
0.2593	0.3414	170	0.5606	0.2809	0.2134
0.3067	0.3614	180	0.5605	0.2764	0.2092
0.1497	0.3815	190	0.5607	0.2741	0.2078
0.2763	0.4016	200	0.5621	0.2724	0.2064
0.2473	0.4217	210	0.5630	0.2724	0.2067
0.2224	0.4418	220	0.5634	0.2738	0.2080
0.2454	0.4618	230	0.5643	0.2764	0.2108
0.1783	0.4819	240	0.5658	0.2760	0.2109
0.2915	0.5020	250	0.5669	0.2756	0.2103
0.3192	0.5221	260	0.5662	0.2739	0.2078
0.3532	0.5422	270	0.5650	0.2748	0.2085
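The Wer and Cer columns above are fractions, so 0.2748 corresponds to a word error rate of 27.48%. The card does not say which implementation or text normalization produced them. The sketch below shows only the standard definitions: edit distance over words (WER) or characters (CER), divided by the reference length. The function names and the example sentences are illustrative and do not come from the evaluation data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate; assumes a non-empty reference."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate; assumes a non-empty reference."""
    return edit_distance(reference, hypothesis) / len(reference)


reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(round(wer(reference, hypothesis), 4))  # 2 word errors / 6 reference words
print(round(cer(reference, hypothesis), 4))  # 5 character errors / 22 reference characters
```

Because both metrics are normalized by the reference length, they can exceed 1.0 when the hypothesis contains many insertions.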

Model size: 2B params
Tensor type: F32
Weights format: Safetensors
