train_boolq_42_1774791063

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the boolq dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3229
  • Num input tokens seen: 12,333,600

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| 0.3397        | 0.2507 | 266  | 0.3339          | 618432            |
| 0.3515        | 0.5014 | 532  | 0.3600          | 1225408           |
| 0.3116        | 0.7521 | 798  | 0.3553          | 1851072           |
| 0.3267        | 1.0028 | 1064 | 0.3294          | 2475808           |
| 0.3519        | 1.2535 | 1330 | 0.3309          | 3091552           |
| 0.3512        | 1.5042 | 1596 | 0.3332          | 3699104           |
| 0.3549        | 1.7549 | 1862 | 0.3334          | 4324256           |
| 0.3755        | 2.0057 | 2128 | 0.3262          | 4940992           |
| 0.3193        | 2.2564 | 2394 | 0.3295          | 5558144           |
| 0.3700        | 2.5071 | 2660 | 0.3561          | 6183872           |
| 0.2998        | 2.7578 | 2926 | 0.3229          | 6806208           |
| 0.3151        | 3.0085 | 3192 | 0.3395          | 7421856           |
| 0.2325        | 3.2592 | 3458 | 0.3571          | 8043744           |
| 0.2696        | 3.5099 | 3724 | 0.3871          | 8660768           |
| 0.2628        | 3.7606 | 3990 | 0.3420          | 9286304           |
| 0.2222        | 4.0113 | 4256 | 0.3638          | 9894624           |
| 0.1352        | 4.2620 | 4522 | 0.5489          | 10512416          |
| 0.1785        | 4.5127 | 4788 | 0.5068          | 11115040          |
| 0.2304        | 4.7634 | 5054 | 0.5038          | 11736672          |
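
A minimal inference sketch follows. The card does not document how BoolQ examples were templated during training, so `build_boolq_prompt` and the true/false answer format are illustrative assumptions; swap in the actual training prompt format for best results.

```python
def build_boolq_prompt(passage: str, question: str) -> str:
    """Format a BoolQ-style (passage, yes/no question) pair as a prompt.
    This template is an assumption, not the card's documented format."""
    return (
        f"{passage}\n"
        f"Question: {question}\n"
        "Answer (true or false):"
    )

prompt = build_boolq_prompt(
    "The aurora is caused by charged particles from the sun.",
    "is the aurora caused by solar particles",
)
print(prompt)

# With transformers installed and the checkpoint available:
# from transformers import AutoModelForCausalLM, AutoTokenizer
# repo = "rbelanec/train_boolq_42_1774791063"
# tok = AutoTokenizer.from_pretrained(repo)
# model = AutoModelForCausalLM.from_pretrained(repo)
# out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=3)
# print(tok.decode(out[0], skip_special_tokens=True))
```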

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4