vlsp-mt5-30epoch

This model is a fine-tuned version of google/mt5-base on an unknown dataset. At the final epoch (30) it reaches a validation loss of 0.6393 and an exact match score of 0.126 on the evaluation set.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 30
mixed_precision_training: Native AMP
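
To make the linear scheduler setting concrete, here is a minimal sketch of the learning rate these hyperparameters would give at a given optimizer step. It assumes zero warmup steps (the card lists none) and takes the total of 5640 steps from the last row of the training results. `linear_lr` is a hypothetical helper written for illustration, not a library function.

```python
# Hyperparameters taken from the list above.
BASE_LR = 5e-05
TOTAL_STEPS = 5640   # 30 epochs x 188 steps per epoch, per the training results
WARMUP_STEPS = 0     # assumption: the card does not report any warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 1880, 3760, 5640):  # start, epoch 10, epoch 20, end
        print(step, linear_lr(step))
```

Under these assumptions the learning rate has already fallen to roughly a third of its initial value by epoch 20, which may partly explain why the metrics change little in the last ten epochs.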

Training Loss	Epoch	Step	Validation Loss	Exact Match
No log	1.0	188	1.1549	0.008
No log	2.0	376	1.1488	0.002
2.5372	3.0	564	1.1257	0.002
2.5372	4.0	752	1.0450	0.006
2.5372	5.0	940	1.0105	0.01
1.0672	6.0	1128	0.9664	0.012
1.0672	7.0	1316	0.9116	0.016
0.9192	8.0	1504	0.8681	0.024
0.9192	9.0	1692	0.8212	0.038
0.9192	10.0	1880	0.7854	0.048
0.7909	11.0	2068	0.7536	0.048
0.7909	12.0	2256	0.7386	0.07
0.7909	13.0	2444	0.7154	0.074
0.7027	14.0	2632	0.7005	0.088
0.7027	15.0	2820	0.6750	0.086
0.6389	16.0	3008	0.6638	0.1
0.6389	17.0	3196	0.6602	0.112
0.6389	18.0	3384	0.6464	0.116
0.583	19.0	3572	0.6418	0.128
0.583	20.0	3760	0.6387	0.13
0.583	21.0	3948	0.6391	0.128
0.5391	22.0	4136	0.6384	0.13
0.5391	23.0	4324	0.6372	0.13
0.5105	24.0	4512	0.6433	0.122
0.5105	25.0	4700	0.6360	0.128
0.5105	26.0	4888	0.6365	0.132
0.4866	27.0	5076	0.6372	0.128
0.4866	28.0	5264	0.6363	0.126
0.4866	29.0	5452	0.6387	0.13
0.4841	30.0	5640	0.6393	0.126
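
The table can be scanned programmatically to see which epoch did best on each metric. The sketch below copies the epoch, validation loss, and exact match columns from the table by hand; the names `RESULTS`, `best_by_loss`, and `best_by_exact_match` are illustrative only.

```python
# (epoch, validation_loss, exact_match) copied from the training results table.
RESULTS = [
    (1, 1.1549, 0.008), (2, 1.1488, 0.002), (3, 1.1257, 0.002),
    (4, 1.0450, 0.006), (5, 1.0105, 0.01), (6, 0.9664, 0.012),
    (7, 0.9116, 0.016), (8, 0.8681, 0.024), (9, 0.8212, 0.038),
    (10, 0.7854, 0.048), (11, 0.7536, 0.048), (12, 0.7386, 0.07),
    (13, 0.7154, 0.074), (14, 0.7005, 0.088), (15, 0.6750, 0.086),
    (16, 0.6638, 0.1), (17, 0.6602, 0.112), (18, 0.6464, 0.116),
    (19, 0.6418, 0.128), (20, 0.6387, 0.13), (21, 0.6391, 0.128),
    (22, 0.6384, 0.13), (23, 0.6372, 0.13), (24, 0.6433, 0.122),
    (25, 0.6360, 0.128), (26, 0.6365, 0.132), (27, 0.6372, 0.128),
    (28, 0.6363, 0.126), (29, 0.6387, 0.13), (30, 0.6393, 0.126),
]

# Lowest validation loss and highest exact match across all epochs.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_exact_match = max(RESULTS, key=lambda row: row[2])

if __name__ == "__main__":
    print("best validation loss:", best_by_loss)
    print("best exact match:", best_by_exact_match)
```

On these numbers, validation loss is lowest at epoch 25 and exact match peaks at epoch 26, while the final checkpoint (epoch 30) is slightly worse on both.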

Model size: 0.6B params (Safetensors), tensor type F32
