train_cola_42_1774791066

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the cola dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1780
  • Num Input Tokens Seen: 1932608

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
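
The card does not describe the data beyond naming the cola dataset. Assuming this refers to the GLUE CoLA task (Corpus of Linguistic Acceptability), a minimal loading sketch with 🤗 Datasets would be:

```python
# Minimal sketch, assuming "cola" refers to the GLUE CoLA task
# (Corpus of Linguistic Acceptability); not confirmed by the card.
from datasets import load_dataset

dataset = load_dataset("glue", "cola")
print(dataset["train"][0])  # e.g. {'sentence': ..., 'label': 0 or 1, 'idx': ...}
```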

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
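
For reference, here is a minimal sketch of how these settings map onto Hugging Face TrainingArguments; the actual training script is not part of this card, and the output directory is assumed:

```python
# A minimal sketch mapping the listed hyperparameters onto Hugging Face
# TrainingArguments; not the exact script used for this run.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_cola_42_1774791066",  # assumed output directory
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5,
)
```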

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| 0.257         | 0.2505 | 241  | 0.2653          | 97664             |
| 0.2465        | 0.5010 | 482  | 0.2394          | 194560            |
| 0.2246        | 0.7516 | 723  | 0.2675          | 291712            |
| 0.2126        | 1.0021 | 964  | 0.2090          | 387464            |
| 0.2112        | 1.2526 | 1205 | 0.2047          | 485192            |
| 0.2259        | 1.5031 | 1446 | 0.2002          | 581704            |
| 0.0688        | 1.7536 | 1687 | 0.1908          | 677576            |
| 0.1317        | 2.0042 | 1928 | 0.2061          | 775312            |
| 0.1631        | 2.2547 | 2169 | 0.1976          | 873104            |
| 0.1441        | 2.5052 | 2410 | 0.1862          | 969360            |
| 0.0995        | 2.7557 | 2651 | 0.1822          | 1065232           |
| 0.1275        | 3.0062 | 2892 | 0.1780          | 1162016           |
| 0.2642        | 3.2568 | 3133 | 0.1875          | 1259168           |
| 0.1987        | 3.5073 | 3374 | 0.1781          | 1355552           |
| 0.2233        | 3.7578 | 3615 | 0.1860          | 1453088           |
| 0.1377        | 4.0083 | 3856 | 0.1792          | 1549360           |
| 0.0611        | 4.2588 | 4097 | 0.1857          | 1645808           |
| 0.1136        | 4.5094 | 4338 | 0.1867          | 1742960           |
| 0.1672        | 4.7599 | 4579 | 0.1868          | 1839344           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
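
This checkpoint is a PEFT adapter rather than a full model, so it is loaded on top of the base model. A minimal loading sketch, assuming the adapter is published under the repository id rbelanec/train_cola_42_1774791066 (taken from this card's model tree):

```python
# Minimal loading sketch for a PEFT adapter; assumes the adapter repo id
# rbelanec/train_cola_42_1774791066 from this card's model tree.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-3.2-1B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, "rbelanec/train_cola_42_1774791066")
model.eval()
```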