# nllb-200-1.3B-ft-eng-to-cym
This model is a fine-tuned version of facebook/nllb-200-1.3B for English-to-Welsh translation (per the `eng-to-cym` suffix in the model name); the training dataset is not documented. It achieves the following results on the evaluation set (a minimal inference sketch follows the metrics):
- Loss: 0.5683
- Bleu: 39.1969
- Gen Len: 35.2025
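A minimal inference sketch, assuming the standard NLLB usage pattern from transformers; `eng_Latn` and `cym_Latn` are NLLB's language codes for English and Welsh:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "DewiBrynJones/nllb-200-1.3B-ft-eng-to-cym"
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("The weather is lovely today.", return_tensors="pt")

# NLLB models expect the target language code as the forced BOS token.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("cym_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```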
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch of the corresponding `Seq2SeqTrainingArguments` follows the list):
- learning_rate: 1e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 6000
- training_steps: 30000
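These settings map onto transformers' `Seq2SeqTrainingArguments` roughly as below. This is a hedged reconstruction, not the author's actual script: `output_dir` is a placeholder, and the 2000-step evaluation cadence is inferred from the results table.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the hyperparameters listed above; output_dir and eval_steps
# are assumptions (eval_steps=2000 is inferred from the results table).
training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-200-1.3B-ft-eng-to-cym",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=6000,
    max_steps=30000,
    eval_strategy="steps",
    eval_steps=2000,
    predict_with_generate=True,  # required for Bleu/Gen Len at evaluation time
)
```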
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 1.129 | 0.0455 | 2000 | 0.9534 | 28.7045 | 42.0879 |
| 0.989 | 0.0910 | 4000 | 0.8191 | 28.0358 | 46.0137 |
| 0.9079 | 0.1365 | 6000 | 0.7438 | 29.7605 | 49.7891 |
| 0.834 | 0.1820 | 8000 | 0.6941 | 31.4068 | 46.6953 |
| 0.7823 | 0.2275 | 10000 | 0.6595 | 31.7358 | 39.6693 |
| 0.756 | 0.2730 | 12000 | 0.6388 | 35.5019 | 39.7181 |
| 0.7265 | 0.3185 | 14000 | 0.6221 | 34.0568 | 41.3639 |
| 0.7173 | 0.3640 | 16000 | 0.6071 | 40.6291 | 38.7305 |
| 0.7075 | 0.4094 | 18000 | 0.5959 | 41.9787 | 37.3835 |
| 0.7038 | 0.4549 | 20000 | 0.5881 | 37.075 | 40.9609 |
| 0.6903 | 0.5004 | 22000 | 0.5817 | 38.2801 | 37.5365 |
| 0.6741 | 0.5459 | 24000 | 0.5764 | 37.3169 | 39.1055 |
| 0.6797 | 0.5914 | 26000 | 0.5712 | 38.957 | 36.1172 |
| 0.6761 | 0.6369 | 28000 | 0.5690 | 38.7328 | 35.4466 |
| 0.6665 | 0.6824 | 30000 | 0.5683 | 39.1969 | 35.2025 |
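The Bleu column is presumably sacreBLEU (the card does not name the metric implementation); a minimal sketch of that computation via the `evaluate` library, with hypothetical predictions and references:

```python
import evaluate

# Hypothetical illustration; a real evaluation would decode the model's
# generations for the whole validation set.
bleu = evaluate.load("sacrebleu")
predictions = ["Mae'r tywydd yn hyfryd heddiw."]
references = [["Mae'r tywydd yn braf heddiw."]]  # one reference list per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])
```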
### Framework versions
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0