# llama-3.2-1b-finetuned-1gb-cX-corpus

This model is a fine-tuned version of [unsloth/Llama-3.2-1B](https://huggingface.co/unsloth/Llama-3.2-1B) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.9704
## Model description

More information needed
## Intended uses & limitations

More information needed
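As a minimal starting point, the checkpoint can be loaded like any other causal LM. The repo id below is an assumption (the card does not state the published Hub path); substitute the actual path or a local checkpoint directory:

```python
# Minimal loading/generation sketch; the repo id is hypothetical.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/llama-3.2-1b-finetuned-1gb-cX-corpus"  # hypothetical path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```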
## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 16
- total_train_batch_size: 256
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1.0
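A hedged sketch of a `TrainingArguments` configuration matching the hyperparameters above; the output directory is a placeholder, and the multi-GPU launch (4 devices, e.g. via `torchrun` or `accelerate`) is handled outside these arguments:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="llama-3.2-1b-finetuned",  # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=16,
    num_train_epochs=1.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    # Effective train batch size: 4 (per device) x 4 (GPUs) x 16 (accumulation)
    # = 256, matching total_train_batch_size above.
)
```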
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 2.3088 | 0.0456 | 200 | 2.3241 |
| 2.2343 | 0.0912 | 400 | 2.2458 |
| 2.1814 | 0.1368 | 600 | 2.1902 |
| 2.1300 | 0.1825 | 800 | 2.1396 |
| 2.0962 | 0.2281 | 1000 | 2.1050 |
| 2.0690 | 0.2737 | 1200 | 2.0800 |
| 2.0318 | 0.3193 | 1400 | 2.0589 |
| 2.0099 | 0.3649 | 1600 | 2.0411 |
| 2.0139 | 0.4105 | 1800 | 2.0263 |
| 1.9980 | 0.4562 | 2000 | 2.0131 |
| 1.9800 | 0.5018 | 2200 | 2.0024 |
| 1.9634 | 0.5474 | 2400 | 1.9930 |
| 1.9574 | 0.5930 | 2600 | 1.9856 |
| 1.9555 | 0.6386 | 2800 | 1.9801 |
| 1.9591 | 0.6842 | 3000 | 1.9760 |
| 1.9586 | 0.7299 | 3200 | 1.9733 |
| 1.9381 | 0.7755 | 3400 | 1.9716 |
| 1.9386 | 0.8211 | 3600 | 1.9709 |
| 1.9412 | 0.8667 | 3800 | 1.9705 |
| 1.9507 | 0.9123 | 4000 | 1.9704 |
| 1.9338 | 0.9579 | 4200 | 1.9704 |
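Assuming the reported loss is the standard per-token cross-entropy in nats, the final validation loss of 1.9704 corresponds to a perplexity of roughly exp(1.9704) ≈ 7.17:

```python
import math

final_eval_loss = 1.9704
print(f"perplexity ≈ {math.exp(final_eval_loss):.2f}")  # ≈ 7.17
```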
### Framework versions
- Transformers 4.50.0
- PyTorch 2.5.1+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1
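To reproduce the setup, it can help to verify that the runtime matches the versions above; a quick check:

```python
# Sanity-check that the installed library versions match the card.
import datasets
import tokenizers
import torch
import transformers

print(transformers.__version__)  # expected: 4.50.0
print(torch.__version__)         # expected: 2.5.1+cu124
print(datasets.__version__)      # expected: 3.6.0
print(tokenizers.__version__)    # expected: 0.21.1
```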