w2v-bert-2.0-mdc-chiga-asr-1.0.0

This model is a fine-tuned version of facebook/w2v-bert-2.0 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0621
  • WER: 0.3897
  • CER: 0.0914

Model description

More information needed

Intended uses & limitations

More information needed
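
Since usage is not yet documented, the sketch below shows one plausible way to run inference, assuming the standard transformers CTC interface for w2v-bert models and 16 kHz mono audio. The audio path is a placeholder and the snippet is untested against this checkpoint.

```python
import torch
import torchaudio
from transformers import AutoProcessor, Wav2Vec2BertForCTC

MODEL_ID = "Alvin-Nahabwe/w2v-bert-2.0-mdc-chiga-asr-1.0.0"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = Wav2Vec2BertForCTC.from_pretrained(MODEL_ID)
model.eval()

# "audio.wav" is a placeholder; w2v-bert-2.0 expects 16 kHz input.
waveform, sample_rate = torchaudio.load("audio.wav")
waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)
waveform = waveform.mean(dim=0)  # downmix to mono if needed

inputs = processor(waveform.numpy(), sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: take the most likely token per frame, then let
# the tokenizer collapse repeats and strip blank tokens.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```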

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
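
For reference, these settings map onto transformers TrainingArguments roughly as follows. This is a sketch of the reported values, not the original training script; the output directory and evaluation strategy are assumptions.

```python
from transformers import TrainingArguments

# Hedged reconstruction of the listed hyperparameters; output_dir and
# eval_strategy are placeholders/assumptions, not from the actual run.
training_args = TrainingArguments(
    output_dir="w2v-bert-2.0-mdc-chiga-asr",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",    # AdamW as implemented in torch
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=50,
    fp16=True,              # Native AMP mixed precision
    eval_strategy="epoch",  # assumption: per-epoch eval, matching the results table
)
```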

Training results

| Training Loss | Epoch | Step | Validation Loss | WER    | CER    |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|
| 4.3083        | 1.0   | 105  | 2.5842          | 1.0001 | 0.7687 |
| 1.0636        | 2.0   | 210  | 0.6019          | 0.6148 | 0.1390 |
| 0.5091        | 3.0   | 315  | 0.5296          | 0.5336 | 0.1220 |
| 0.4259        | 4.0   | 420  | 0.5080          | 0.4767 | 0.1095 |
| 0.3882        | 5.0   | 525  | 0.5008          | 0.5302 | 0.1364 |
| 0.3516        | 6.0   | 630  | 0.5096          | 0.4763 | 0.1139 |
| 0.306         | 7.0   | 735  | 0.4777          | 0.4803 | 0.1179 |
| 0.2632        | 8.0   | 840  | 0.4770          | 0.4864 | 0.1204 |
| 0.2246        | 9.0   | 945  | 0.4591          | 0.4334 | 0.1030 |
| 0.1991        | 10.0  | 1050 | 0.4926          | 0.4631 | 0.1182 |
| 0.1638        | 11.0  | 1155 | 0.5248          | 0.4394 | 0.1070 |
| 0.1375        | 12.0  | 1260 | 0.5634          | 0.4400 | 0.1054 |
| 0.1175        | 13.0  | 1365 | 0.6083          | 0.4303 | 0.1004 |
| 0.0958        | 14.0  | 1470 | 0.6107          | 0.4510 | 0.1071 |
| 0.0818        | 15.0  | 1575 | 0.6990          | 0.4194 | 0.0971 |
| 0.0685        | 16.0  | 1680 | 0.6716          | 0.4303 | 0.1007 |
| 0.0549        | 17.0  | 1785 | 0.6984          | 0.4251 | 0.1020 |
| 0.0454        | 18.0  | 1890 | 0.7109          | 0.4093 | 0.0971 |
| 0.0345        | 19.0  | 1995 | 0.6910          | 0.4272 | 0.1031 |
| 0.0279        | 20.0  | 2100 | 0.7130          | 0.4101 | 0.0971 |
| 0.0229        | 21.0  | 2205 | 0.7859          | 0.4046 | 0.0951 |
| 0.0231        | 22.0  | 2310 | 0.8019          | 0.4055 | 0.0976 |
| 0.0202        | 23.0  | 2415 | 0.7785          | 0.4140 | 0.0991 |
| 0.0167        | 24.0  | 2520 | 0.8128          | 0.4024 | 0.0963 |
| 0.0146        | 25.0  | 2625 | 0.8281          | 0.4067 | 0.0955 |
| 0.0082        | 26.0  | 2730 | 0.8360          | 0.4027 | 0.0961 |
| 0.0061        | 27.0  | 2835 | 0.8918          | 0.3998 | 0.0956 |
| 0.0038        | 28.0  | 2940 | 0.8891          | 0.3988 | 0.0946 |
| 0.0033        | 29.0  | 3045 | 0.9374          | 0.3960 | 0.0932 |
| 0.0035        | 30.0  | 3150 | 0.9357          | 0.3939 | 0.0939 |
| 0.003         | 31.0  | 3255 | 0.9555          | 0.3884 | 0.0928 |
| 0.0024        | 32.0  | 3360 | 0.9642          | 0.3951 | 0.0934 |
| 0.0029        | 33.0  | 3465 | 0.9584          | 0.4039 | 0.0957 |
| 0.0039        | 34.0  | 3570 | 0.9370          | 0.3918 | 0.0928 |
| 0.0027        | 35.0  | 3675 | 0.9722          | 0.3851 | 0.0906 |
| 0.0013        | 36.0  | 3780 | 0.9996          | 0.3941 | 0.0935 |
| 0.0008        | 37.0  | 3885 | 1.0096          | 0.3908 | 0.0921 |
| 0.0004        | 38.0  | 3990 | 1.0309          | 0.3861 | 0.0910 |
| 0.0002        | 39.0  | 4095 | 1.0472          | 0.3891 | 0.0920 |
| 0.0002        | 40.0  | 4200 | 1.0621          | 0.3897 | 0.0914 |
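
The WER and CER columns above are word and character error rates. As a minimal sketch, metrics like these can be computed with the evaluate library; the transcript pair below is hypothetical, not data from this model's evaluation set.

```python
import evaluate

# Load the standard WER and CER metrics (the CER metric requires jiwer).
wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

# Hypothetical example transcripts, purely for illustration.
predictions = ["the cat sat on the mat"]
references = ["the cat sat on a mat"]

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```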

Framework versions

  • Transformers 4.57.1
  • PyTorch 2.9.0+cu128
  • Datasets 4.4.0
  • Tokenizers 0.22.1