llama-3.2-1b-finetuned-1gb-cX-corpus

This model is a fine-tuned version of unsloth/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9704
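
Since only the loss is reported, the corresponding token-level perplexity can be derived directly, assuming the value is the mean per-token cross-entropy that the Trainer reports by default:

```python
import math

eval_loss = 1.9704                # evaluation loss reported above
perplexity = math.exp(eval_loss)  # perplexity = exp(cross-entropy)
print(f"{perplexity:.2f}")        # ~7.17
```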

Model description

More information needed

Intended uses & limitations

More information needed
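
No usage guidance is included with the card. Below is a minimal text-generation sketch, assuming the repository id sghosts/llama-3.2-1b-finetuned-1gb-cX-corpus shown in this card and a standard Llama-style causal-LM checkpoint; the prompt is illustrative only, since the target domain of the fine-tune is not documented.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sghosts/llama-3.2-1b-finetuned-1gb-cX-corpus"  # repo id from this card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Illustrative prompt; replace with text from your own domain.
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```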

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 256
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1.0
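
The training script itself is not published; the sketch below shows how the values above would map onto transformers.TrainingArguments (output_dir is a placeholder, and bf16 is inferred from the checkpoint's BF16 tensor type). Note that the per-device train batch of 4 across 4 GPUs with 16 gradient-accumulation steps yields the reported total train batch size of 4 × 4 × 16 = 256, and the per-device eval batch of 8 across 4 GPUs yields the total eval batch size of 32.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above; the actual
# training script for this checkpoint is not included with the card.
args = TrainingArguments(
    output_dir="llama-3.2-1b-finetuned-1gb-cX-corpus",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=4,   # x 4 devices x 16 accum steps = 256 total
    per_device_eval_batch_size=8,    # x 4 devices = 32 total
    gradient_accumulation_steps=16,
    num_train_epochs=1.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    bf16=True,                       # inferred from the BF16 tensor type
)
```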

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.3088        | 0.0456 | 200  | 2.3241          |
| 2.2343        | 0.0912 | 400  | 2.2458          |
| 2.1814        | 0.1368 | 600  | 2.1902          |
| 2.1300        | 0.1825 | 800  | 2.1396          |
| 2.0962        | 0.2281 | 1000 | 2.1050          |
| 2.0690        | 0.2737 | 1200 | 2.0800          |
| 2.0318        | 0.3193 | 1400 | 2.0589          |
| 2.0099        | 0.3649 | 1600 | 2.0411          |
| 2.0139        | 0.4105 | 1800 | 2.0263          |
| 1.9980        | 0.4562 | 2000 | 2.0131          |
| 1.9800        | 0.5018 | 2200 | 2.0024          |
| 1.9634        | 0.5474 | 2400 | 1.9930          |
| 1.9574        | 0.5930 | 2600 | 1.9856          |
| 1.9555        | 0.6386 | 2800 | 1.9801          |
| 1.9591        | 0.6842 | 3000 | 1.9760          |
| 1.9586        | 0.7299 | 3200 | 1.9733          |
| 1.9381        | 0.7755 | 3400 | 1.9716          |
| 1.9386        | 0.8211 | 3600 | 1.9709          |
| 1.9412        | 0.8667 | 3800 | 1.9705          |
| 1.9507        | 0.9123 | 4000 | 1.9704          |
| 1.9338        | 0.9579 | 4200 | 1.9704          |

Framework versions

  • Transformers 4.50.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.1

Model details

  • Format: Safetensors
  • Model size: 1B params
  • Tensor type: BF16
