BoolQ_Llama-3.2-1B-131yj8sj

This model is a fine-tuned version of meta-llama/Llama-3.2-1B. The training dataset is not documented in this card, though the model name and the 3270-example evaluation set suggest BoolQ. It achieves the following results on the evaluation set:

  • Loss: 1.4452
  • Model Preparation Time: 0.0057
  • Mdl: 6818.1174
  • Accumulated Loss: 4725.9588
  • Correct Preds: 2702.0
  • Total Preds: 3270.0
  • Accuracy: 0.8263
  • Correct Gen Preds: 2701.0
  • Gen Accuracy: 0.8260

Per-label results follow; the numeric suffixes (9642 and 2822) appear to be the token IDs of the two answer labels:

  • Correct Gen Preds 9642: 1791.0
  • Correct Preds 9642: 1798.0
  • Total Labels 9642: 2026.0
  • Accuracy 9642: 0.8875
  • Gen Accuracy 9642: 0.8840
  • Correct Gen Preds 2822: 901.0
  • Correct Preds 2822: 904.0
  • Total Labels 2822: 1231.0
  • Accuracy 2822: 0.7344
  • Gen Accuracy 2822: 0.7319
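
The headline figures are simple ratios of the counts above; a minimal sketch that reproduces them:

```python
# Reproduce the reported accuracies from the raw prediction counts above.
total_preds = 3270

print(f"Accuracy: {2702 / total_preds:.4f}")      # 0.8263
print(f"Gen Accuracy: {2701 / total_preds:.4f}")  # 0.8260

# Per-label accuracies divide by each label's own total.
print(f"Accuracy 9642: {1798 / 2026:.4f}")        # 0.8875
print(f"Accuracy 2822: {904 / 1231:.4f}")         # 0.7344
```

The gap between the two per-label accuracies (0.8875 vs. 0.7344) shows the model is noticeably stronger on the majority label (2026 of 3257 labeled examples).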

Model description

More information needed

Intended uses & limitations

More information needed
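
In the absence of documented usage guidance, here is a minimal sketch of loading the checkpoint with the transformers version listed under Framework versions. The prompt template is an assumption; the card does not document how examples were serialized during fine-tuning.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id taken from this card; the prompt format below is a guess,
# not the documented training format.
model_id = "donoway/BoolQ_Llama-3.2-1B-131yj8sj"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

prompt = (
    "Passage: The Eiffel Tower is located in Paris, France.\n"
    "Question: is the eiffel tower in france?\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=3)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```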

Training and evaluation data

More information needed
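
If the data is indeed BoolQ (an assumption based on the model name; the 3270 evaluation examples reported above match the size of the BoolQ validation split), it can be loaded with the datasets version pinned under Framework versions:

```python
from datasets import load_dataset

# Assumption: the fine-tuning data is BoolQ, inferred from the model name;
# the card itself leaves the dataset undocumented.
boolq = load_dataset("google/boolq")

print(boolq)                   # train: 9427 examples, validation: 3270
print(boolq["validation"][0])  # keys: 'question', 'answer', 'passage'
```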

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 120
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
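
These values map onto a transformers TrainingArguments configuration roughly as follows. This is a reconstruction from the list above, not the original training script; output_dir and any settings not listed (checkpointing, best-model selection, etc.) are placeholders.

```python
from transformers import TrainingArguments

# Reconstructed from the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="BoolQ_Llama-3.2-1B-131yj8sj",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,  # the results table below stops at epoch 26,
                           # suggesting training was halted early
)
```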

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0057 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.2476 | 1.0 | 143 | 0.4988 | 0.0057 | 2353.0385 | 1631.0020 | 2591.0 | 3270.0 | 0.7924 | 2599.0 | 0.7948 | 1843.0 | 1843.0 | 2026.0 | 0.9097 | 0.9097 | 747.0 | 748.0 | 1231.0 | 0.6076 | 0.6068 |
| 0.0885 | 2.0 | 286 | 0.5426 | 0.0057 | 2559.9190 | 1774.4006 | 2626.0 | 3270.0 | 0.8031 | 2626.0 | 0.8031 | 1900.0 | 1906.0 | 2026.0 | 0.9408 | 0.9378 | 717.0 | 720.0 | 1231.0 | 0.5849 | 0.5825 |
| 0.0086 | 3.0 | 429 | 0.7471 | 0.0057 | 3524.5342 | 2443.0209 | 2655.0 | 3270.0 | 0.8119 | 2625.0 | 0.8028 | 1638.0 | 1667.0 | 2026.0 | 0.8228 | 0.8085 | 978.0 | 988.0 | 1231.0 | 0.8026 | 0.7945 |
| 0.0002 | 4.0 | 572 | 1.1866 | 0.0057 | 5597.8044 | 3880.1023 | 2662.0 | 3270.0 | 0.8141 | 2663.0 | 0.8144 | 1703.0 | 1707.0 | 2026.0 | 0.8425 | 0.8406 | 953.0 | 955.0 | 1231.0 | 0.7758 | 0.7742 |
| 0.0115 | 5.0 | 715 | 1.3058 | 0.0057 | 6160.2400 | 4269.9530 | 2673.0 | 3270.0 | 0.8174 | 2664.0 | 0.8147 | 1791.0 | 1797.0 | 2026.0 | 0.8870 | 0.8840 | 864.0 | 876.0 | 1231.0 | 0.7116 | 0.7019 |
| 0.0 | 6.0 | 858 | 1.4452 | 0.0057 | 6818.1174 | 4725.9588 | 2702.0 | 3270.0 | 0.8263 | 2701.0 | 0.8260 | 1791.0 | 1798.0 | 2026.0 | 0.8875 | 0.8840 | 901.0 | 904.0 | 1231.0 | 0.7344 | 0.7319 |
| 0.0 | 7.0 | 1001 | 1.4433 | 0.0057 | 6808.9128 | 4719.5787 | 2698.0 | 3270.0 | 0.8251 | 2704.0 | 0.8269 | 1812.0 | 1814.0 | 2026.0 | 0.8954 | 0.8944 | 883.0 | 884.0 | 1231.0 | 0.7181 | 0.7173 |
| 0.0 | 8.0 | 1144 | 1.3856 | 0.0057 | 6536.7240 | 4530.9118 | 2691.0 | 3270.0 | 0.8229 | 2694.0 | 0.8239 | 1768.0 | 1772.0 | 2026.0 | 0.8746 | 0.8727 | 917.0 | 919.0 | 1231.0 | 0.7465 | 0.7449 |
| 0.9802 | 9.0 | 1287 | 1.4773 | 0.0057 | 6969.2721 | 4830.7313 | 2692.0 | 3270.0 | 0.8232 | 2698.0 | 0.8251 | 1793.0 | 1795.0 | 2026.0 | 0.8860 | 0.8850 | 897.0 | 897.0 | 1231.0 | 0.7287 | 0.7287 |
| 0.0 | 10.0 | 1430 | 1.5437 | 0.0057 | 7282.6372 | 5047.9395 | 2695.0 | 3270.0 | 0.8242 | 2701.0 | 0.8260 | 1775.0 | 1777.0 | 2026.0 | 0.8771 | 0.8761 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 11.0 | 1573 | 1.5490 | 0.0057 | 7307.5108 | 5065.1805 | 2690.0 | 3270.0 | 0.8226 | 2696.0 | 0.8245 | 1771.0 | 1773.0 | 2026.0 | 0.8751 | 0.8741 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 12.0 | 1716 | 1.5529 | 0.0057 | 7325.9736 | 5077.9779 | 2692.0 | 3270.0 | 0.8232 | 2697.0 | 0.8248 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 13.0 | 1859 | 1.5565 | 0.0057 | 7343.1664 | 5089.8951 | 2691.0 | 3270.0 | 0.8229 | 2696.0 | 0.8245 | 1771.0 | 1773.0 | 2026.0 | 0.8751 | 0.8741 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 14.0 | 2002 | 1.5552 | 0.0057 | 7336.7036 | 5085.4154 | 2692.0 | 3270.0 | 0.8232 | 2697.0 | 0.8248 | 1772.0 | 1774.0 | 2026.0 | 0.8756 | 0.8746 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.9802 | 15.0 | 2145 | 1.5579 | 0.0057 | 7349.6490 | 5094.3885 | 2695.0 | 3270.0 | 0.8242 | 2700.0 | 0.8257 | 1774.0 | 1776.0 | 2026.0 | 0.8766 | 0.8756 | 918.0 | 919.0 | 1231.0 | 0.7465 | 0.7457 |
| 0.0 | 16.0 | 2288 | 1.5570 | 0.0057 | 7345.2574 | 5091.3444 | 2689.0 | 3270.0 | 0.8223 | 2694.0 | 0.8239 | 1770.0 | 1772.0 | 2026.0 | 0.8746 | 0.8736 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 17.0 | 2431 | 1.5594 | 0.0057 | 7356.5874 | 5099.1978 | 2693.0 | 3270.0 | 0.8235 | 2699.0 | 0.8254 | 1772.0 | 1774.0 | 2026.0 | 0.8756 | 0.8746 | 918.0 | 919.0 | 1231.0 | 0.7465 | 0.7457 |
| 0.0 | 18.0 | 2574 | 1.5588 | 0.0057 | 7354.0051 | 5097.4079 | 2693.0 | 3270.0 | 0.8235 | 2699.0 | 0.8254 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 19.0 | 2717 | 1.5574 | 0.0057 | 7347.1134 | 5092.6310 | 2694.0 | 3270.0 | 0.8239 | 2700.0 | 0.8257 | 1775.0 | 1777.0 | 2026.0 | 0.8771 | 0.8761 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 20.0 | 2860 | 1.5598 | 0.0057 | 7358.7582 | 5100.7025 | 2694.0 | 3270.0 | 0.8239 | 2699.0 | 0.8254 | 1776.0 | 1778.0 | 2026.0 | 0.8776 | 0.8766 | 915.0 | 916.0 | 1231.0 | 0.7441 | 0.7433 |
| 0.0 | 21.0 | 3003 | 1.5610 | 0.0057 | 7364.2419 | 5104.5035 | 2693.0 | 3270.0 | 0.8235 | 2699.0 | 0.8254 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 22.0 | 3146 | 1.5590 | 0.0057 | 7354.8963 | 5098.0257 | 2695.0 | 3270.0 | 0.8242 | 2700.0 | 0.8257 | 1775.0 | 1777.0 | 2026.0 | 0.8771 | 0.8761 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 23.0 | 3289 | 1.5609 | 0.0057 | 7363.6331 | 5104.0815 | 2692.0 | 3270.0 | 0.8232 | 2698.0 | 0.8251 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 24.0 | 3432 | 1.5620 | 0.0057 | 7368.7476 | 5107.6266 | 2694.0 | 3270.0 | 0.8239 | 2699.0 | 0.8254 | 1775.0 | 1777.0 | 2026.0 | 0.8771 | 0.8761 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 25.0 | 3575 | 1.5613 | 0.0057 | 7365.4606 | 5105.3482 | 2693.0 | 3270.0 | 0.8235 | 2699.0 | 0.8254 | 1774.0 | 1776.0 | 2026.0 | 0.8766 | 0.8756 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 26.0 | 3718 | 1.5604 | 0.0057 | 7361.4952 | 5102.5996 | 2692.0 | 3270.0 | 0.8232 | 2697.0 | 0.8248 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1