akkadian_translation

This model is a fine-tuned version of google/mt5-small; the training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 6.5583
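
The card does not yet document how to call the model, so here is a minimal inference sketch using the 🤗 Transformers seq2seq API. The example input (a transliterated Akkadian phrase) and the generation settings are illustrative assumptions; the card does not specify the expected input format or the translation direction.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "saivineetha/akkadian_translation"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical transliterated Akkadian input; the card does not document
# the expected input format.
text = "szum-ma a-wi-lum"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Given the final validation loss of 6.5583, generations from this checkpoint may still be rough.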

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged sketch mapping them onto training arguments follows the list):

  • learning_rate: 3e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 20
  • mixed_precision_training: Native AMP
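
For reference, the listed values map onto 🤗 Transformers training arguments roughly as below. This is a hedged sketch, not the author's actual script: the output path is a placeholder, data and model wiring are omitted, and the argument names follow the familiar `Seq2SeqTrainingArguments` signature, which may differ across Transformers versions.

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="akkadian_translation",  # placeholder path
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch_fused",          # AdamW, betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    num_train_epochs=20,
    fp16=True,                          # "Native AMP" mixed precision
)
```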

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 51.6427       | 1.0   | 88   | 43.5929         |
| 35.4809       | 2.0   | 176  | 29.6570         |
| 26.2054       | 3.0   | 264  | 23.5765         |
| 23.6990       | 4.0   | 352  | 19.5528         |
| 20.2700       | 5.0   | 440  | 16.8286         |
| 17.4142       | 6.0   | 528  | 15.0303         |
| 16.0694       | 7.0   | 616  | 13.5486         |
| 14.6152       | 8.0   | 704  | 12.1864         |
| 13.2197       | 9.0   | 792  | 10.8778         |
| 12.6036       | 10.0  | 880  | 10.0926         |
| 11.6968       | 11.0  | 968  | 9.4187          |
| 10.7210       | 12.0  | 1056 | 8.8837          |
| 10.1111       | 13.0  | 1144 | 8.2965          |
| 10.0251       | 14.0  | 1232 | 7.8286          |
| 8.8541        | 15.0  | 1320 | 7.3692          |
| 9.1151        | 16.0  | 1408 | 7.0018          |
| 8.4213        | 17.0  | 1496 | 6.7866          |
| 8.3269        | 18.0  | 1584 | 6.6407          |
| 8.4181        | 19.0  | 1672 | 6.5861          |
| 8.7340        | 20.0  | 1760 | 6.5583          |

Framework versions

  • Transformers 5.2.0
  • PyTorch 2.9.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Model details

  • Format: Safetensors
  • Model size: 0.6B params
  • Tensor type: F32
