BoolQ_Llama-3.2-1B-4lp9g5r3

This model is a fine-tuned version of meta-llama/Llama-3.2-1B, most likely on the BoolQ dataset (inferred from the model name and the evaluation-set size; the card itself does not name the training data). It achieves the following results on the evaluation set:

  • Loss: 1.5927
  • Model Preparation Time: 0.0058
  • MDL (see the note after this list): 7513.6957
  • Accumulated Loss: 5208.0970
  • Correct Preds: 2609.0
  • Total Preds: 3270.0
  • Accuracy: 0.7979
  • Correct Gen Preds: 2610.0
  • Gen Accuracy: 0.7982
  • Correct Gen Preds 9642: 1661.0
  • Correct Preds 9642: 1668.0
  • Total Labels 9642: 2026.0
  • Accuracy 9642: 0.8233
  • Gen Accuracy 9642: 0.8198
  • Correct Gen Preds 2822: 940.0
  • Correct Preds 2822: 941.0
  • Total Labels 2822: 1231.0
  • Accuracy 2822: 0.7644
  • Gen Accuracy 2822: 0.7636

The card does not define these metric names. The metrics suffixed 9642 and 2822 are per-label breakdowns (the suffixes appear to be the token IDs of the two answer labels), and the "Gen" variants appear to score the model's generated answer rather than a likelihood comparison; both readings are inferred, not documented.
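
The MDL figure is consistent with the accumulated loss converted from nats to bits, a relationship inferred from the reported numbers rather than stated in the card:

$$\mathrm{MDL} = \frac{\text{Accumulated Loss}}{\ln 2} = \frac{5208.0970}{0.693147} \approx 7513.70 \ \text{bits},$$

which matches the reported 7513.6957. Likewise, Accuracy is simply Correct Preds over Total Preds (2609 / 3270 ≈ 0.7979).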

Model description

More information needed

Intended uses & limitations

More information needed
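
Pending proper documentation, below is a minimal inference sketch. Assumptions are flagged in the comments: the Hub repo id is taken from this model's Hub page, and the passage/question prompt format is a guess, since the card does not specify how inputs were formatted during fine-tuning.

```python
# Minimal inference sketch (not from the card). Assumptions:
# - the Hub repo id below is taken from this model's Hub page;
# - the passage/question prompt format is a guess, since the card does not
#   document how inputs were formatted during fine-tuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/BoolQ_Llama-3.2-1B-4lp9g5r3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

prompt = (
    "Passage: The Amazon is the largest rainforest in the world.\n"
    "Question: Is the Amazon the largest rainforest in the world?\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=3, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```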

Training and evaluation data

More information needed
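
The card leaves the data undocumented, but the 3,270 evaluation predictions reported above equal the size of the standard BoolQ validation split, which supports the BoolQ reading of the model name. A loading sketch under that assumption (google/boolq is the Hub id of the standard BoolQ dataset; it is not named by this card):

```python
# Sketch: load BoolQ with the datasets library. Assumption: the card does not
# name its dataset, but its 3270 eval predictions equal the BoolQ validation
# split size, so BoolQ is the likely training/evaluation corpus.
from datasets import load_dataset

boolq = load_dataset("google/boolq")
print(boolq)                   # train: 9427 rows, validation: 3270 rows
print(boolq["validation"][0])  # fields: question, answer (bool), passage
```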

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch reproducing them follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 120
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
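
A sketch of how these settings map onto transformers TrainingArguments (argument names follow the transformers 4.51 API; the output_dir is hypothetical, and anything not listed above is left at its default):

```python
# Sketch: the hyperparameters above expressed as TrainingArguments.
# Only values reported in this card are set; output_dir is hypothetical.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="BoolQ_Llama-3.2-1B",  # hypothetical; not stated in the card
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```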

Training results

The headline metrics at the top of this card correspond to the epoch 7 checkpoint (validation loss 1.5927, the highest-accuracy row). Training stopped after epoch 37 of the configured 100 epochs, presumably via early stopping, though the card does not say. A sanity check on the accuracy arithmetic follows the table.

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | MDL | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0058 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.5297 | 1.0 | 54 | 0.5655 | 0.0058 | 2667.8169 | 1849.1897 | 2391.0 | 3270.0 | 0.7312 | 2397.0 | 0.7330 | 1482.0 | 1483.0 | 2026.0 | 0.7320 | 0.7315 | 906.0 | 908.0 | 1231.0 | 0.7376 | 0.7360 |
| 0.429 | 2.0 | 108 | 0.6555 | 0.0058 | 3092.6188 | 2143.6400 | 2562.0 | 3270.0 | 0.7835 | 2524.0 | 0.7719 | 1836.0 | 1856.0 | 2026.0 | 0.9161 | 0.9062 | 680.0 | 706.0 | 1231.0 | 0.5735 | 0.5524 |
| 0.1091 | 3.0 | 162 | 0.9086 | 0.0058 | 4286.2025 | 2970.9692 | 2566.0 | 3270.0 | 0.7847 | 2517.0 | 0.7697 | 1682.0 | 1729.0 | 2026.0 | 0.8534 | 0.8302 | 827.0 | 837.0 | 1231.0 | 0.6799 | 0.6718 |
| 0.3827 | 4.0 | 216 | 1.0952 | 0.0058 | 5166.9108 | 3581.4297 | 2592.0 | 3270.0 | 0.7927 | 2498.0 | 0.7639 | 1556.0 | 1643.0 | 2026.0 | 0.8110 | 0.7680 | 933.0 | 949.0 | 1231.0 | 0.7709 | 0.7579 |
| 0.0001 | 5.0 | 270 | 1.4491 | 0.0058 | 6836.4286 | 4738.6512 | 2590.0 | 3270.0 | 0.7920 | 2589.0 | 0.7917 | 1644.0 | 1653.0 | 2026.0 | 0.8159 | 0.8115 | 936.0 | 937.0 | 1231.0 | 0.7612 | 0.7604 |
| 0.0006 | 6.0 | 324 | 1.6538 | 0.0058 | 7802.1499 | 5408.0382 | 2602.0 | 3270.0 | 0.7957 | 2605.0 | 0.7966 | 1797.0 | 1801.0 | 2026.0 | 0.8889 | 0.8870 | 799.0 | 801.0 | 1231.0 | 0.6507 | 0.6491 |
| 0.0001 | 7.0 | 378 | 1.5927 | 0.0058 | 7513.6957 | 5208.0970 | 2609.0 | 3270.0 | 0.7979 | 2610.0 | 0.7982 | 1661.0 | 1668.0 | 2026.0 | 0.8233 | 0.8198 | 940.0 | 941.0 | 1231.0 | 0.7644 | 0.7636 |
| 0.0 | 8.0 | 432 | 1.6183 | 0.0058 | 7634.6804 | 5291.9572 | 2601.0 | 3270.0 | 0.7954 | 2603.0 | 0.7960 | 1657.0 | 1663.0 | 2026.0 | 0.8208 | 0.8179 | 937.0 | 938.0 | 1231.0 | 0.7620 | 0.7612 |
| 0.0 | 9.0 | 486 | 1.6373 | 0.0058 | 7724.0835 | 5353.9267 | 2602.0 | 3270.0 | 0.7957 | 2605.0 | 0.7966 | 1659.0 | 1664.0 | 2026.0 | 0.8213 | 0.8189 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 10.0 | 540 | 1.6487 | 0.0058 | 7777.7540 | 5391.1283 | 2603.0 | 3270.0 | 0.7960 | 2607.0 | 0.7972 | 1660.0 | 1665.0 | 2026.0 | 0.8218 | 0.8193 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 11.0 | 594 | 1.6598 | 0.0058 | 7830.1323 | 5427.4341 | 2603.0 | 3270.0 | 0.7960 | 2607.0 | 0.7972 | 1659.0 | 1664.0 | 2026.0 | 0.8213 | 0.8189 | 939.0 | 939.0 | 1231.0 | 0.7628 | 0.7628 |
| 0.0 | 12.0 | 648 | 1.6698 | 0.0058 | 7877.2390 | 5460.0860 | 2604.0 | 3270.0 | 0.7963 | 2607.0 | 0.7972 | 1661.0 | 1667.0 | 2026.0 | 0.8228 | 0.8198 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 13.0 | 702 | 1.6788 | 0.0058 | 7920.0781 | 5489.7798 | 2603.0 | 3270.0 | 0.7960 | 2606.0 | 0.7969 | 1660.0 | 1666.0 | 2026.0 | 0.8223 | 0.8193 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 14.0 | 756 | 1.6880 | 0.0058 | 7963.3426 | 5519.7685 | 2604.0 | 3270.0 | 0.7963 | 2607.0 | 0.7972 | 1660.0 | 1666.0 | 2026.0 | 0.8223 | 0.8193 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 15.0 | 810 | 1.6918 | 0.0058 | 7981.2343 | 5532.1700 | 2603.0 | 3270.0 | 0.7960 | 2607.0 | 0.7972 | 1660.0 | 1665.0 | 2026.0 | 0.8218 | 0.8193 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 16.0 | 864 | 1.6997 | 0.0058 | 8018.5627 | 5558.0441 | 2606.0 | 3270.0 | 0.7969 | 2610.0 | 0.7982 | 1663.0 | 1668.0 | 2026.0 | 0.8233 | 0.8208 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 17.0 | 918 | 1.7052 | 0.0058 | 8044.3508 | 5575.9191 | 2601.0 | 3270.0 | 0.7954 | 2605.0 | 0.7966 | 1659.0 | 1664.0 | 2026.0 | 0.8213 | 0.8189 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 18.0 | 972 | 1.7084 | 0.0058 | 8059.3827 | 5586.3384 | 2602.0 | 3270.0 | 0.7957 | 2606.0 | 0.7969 | 1660.0 | 1665.0 | 2026.0 | 0.8218 | 0.8193 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 19.0 | 1026 | 1.7110 | 0.0058 | 8071.9707 | 5595.0637 | 2603.0 | 3270.0 | 0.7960 | 2607.0 | 0.7972 | 1661.0 | 1666.0 | 2026.0 | 0.8223 | 0.8198 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 20.0 | 1080 | 1.7147 | 0.0058 | 8089.1272 | 5606.9557 | 2602.0 | 3270.0 | 0.7957 | 2605.0 | 0.7966 | 1659.0 | 1665.0 | 2026.0 | 0.8218 | 0.8189 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 21.0 | 1134 | 1.7125 | 0.0058 | 8079.1133 | 5600.0146 | 2605.0 | 3270.0 | 0.7966 | 2609.0 | 0.7979 | 1662.0 | 1667.0 | 2026.0 | 0.8228 | 0.8203 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 22.0 | 1188 | 1.7151 | 0.0058 | 8091.0910 | 5608.3169 | 2604.0 | 3270.0 | 0.7963 | 2607.0 | 0.7972 | 1659.0 | 1665.0 | 2026.0 | 0.8218 | 0.8189 | 939.0 | 939.0 | 1231.0 | 0.7628 | 0.7628 |
| 0.0 | 23.0 | 1242 | 1.7160 | 0.0058 | 8095.3381 | 5611.2608 | 2602.0 | 3270.0 | 0.7957 | 2606.0 | 0.7969 | 1660.0 | 1665.0 | 2026.0 | 0.8218 | 0.8193 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 24.0 | 1296 | 1.7144 | 0.0058 | 8087.7767 | 5606.0196 | 2605.0 | 3270.0 | 0.7966 | 2609.0 | 0.7979 | 1663.0 | 1668.0 | 2026.0 | 0.8233 | 0.8208 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 25.0 | 1350 | 1.7150 | 0.0058 | 8090.7756 | 5608.0983 | 2604.0 | 3270.0 | 0.7963 | 2608.0 | 0.7976 | 1662.0 | 1667.0 | 2026.0 | 0.8228 | 0.8203 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 26.0 | 1404 | 1.7128 | 0.0058 | 8080.3770 | 5600.8906 | 2603.0 | 3270.0 | 0.7960 | 2607.0 | 0.7972 | 1662.0 | 1667.0 | 2026.0 | 0.8228 | 0.8203 | 936.0 | 936.0 | 1231.0 | 0.7604 | 0.7604 |
| 0.0 | 27.0 | 1458 | 1.7153 | 0.0058 | 8092.1583 | 5609.0567 | 2605.0 | 3270.0 | 0.7966 | 2609.0 | 0.7979 | 1662.0 | 1667.0 | 2026.0 | 0.8228 | 0.8203 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 28.0 | 1512 | 1.7140 | 0.0058 | 8086.0919 | 5604.8518 | 2604.0 | 3270.0 | 0.7963 | 2607.0 | 0.7972 | 1660.0 | 1666.0 | 2026.0 | 0.8223 | 0.8193 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 29.0 | 1566 | 1.7153 | 0.0058 | 8092.1390 | 5609.0433 | 2602.0 | 3270.0 | 0.7957 | 2606.0 | 0.7969 | 1660.0 | 1665.0 | 2026.0 | 0.8218 | 0.8193 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 30.0 | 1620 | 1.7134 | 0.0058 | 8083.1388 | 5602.8049 | 2605.0 | 3270.0 | 0.7966 | 2609.0 | 0.7979 | 1663.0 | 1668.0 | 2026.0 | 0.8233 | 0.8208 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 31.0 | 1674 | 1.7155 | 0.0058 | 8092.8767 | 5609.5547 | 2604.0 | 3270.0 | 0.7963 | 2608.0 | 0.7976 | 1661.0 | 1666.0 | 2026.0 | 0.8223 | 0.8198 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 32.0 | 1728 | 1.7118 | 0.0058 | 8075.5841 | 5597.5684 | 2605.0 | 3270.0 | 0.7966 | 2610.0 | 0.7982 | 1663.0 | 1667.0 | 2026.0 | 0.8228 | 0.8208 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 33.0 | 1782 | 1.7150 | 0.0058 | 8090.7632 | 5608.0897 | 2605.0 | 3270.0 | 0.7966 | 2609.0 | 0.7979 | 1663.0 | 1668.0 | 2026.0 | 0.8233 | 0.8208 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 34.0 | 1836 | 1.7158 | 0.0058 | 8094.6523 | 5610.7854 | 2605.0 | 3270.0 | 0.7966 | 2609.0 | 0.7979 | 1663.0 | 1668.0 | 2026.0 | 0.8233 | 0.8208 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 35.0 | 1890 | 1.7144 | 0.0058 | 8087.7042 | 5605.9694 | 2606.0 | 3270.0 | 0.7969 | 2610.0 | 0.7982 | 1664.0 | 1669.0 | 2026.0 | 0.8238 | 0.8213 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 36.0 | 1944 | 1.7155 | 0.0058 | 8093.0727 | 5609.6905 | 2602.0 | 3270.0 | 0.7957 | 2607.0 | 0.7972 | 1660.0 | 1664.0 | 2026.0 | 0.8213 | 0.8193 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 37.0 | 1998 | 1.7157 | 0.0058 | 8094.0158 | 5610.3442 | 2604.0 | 3270.0 | 0.7963 | 2608.0 | 0.7976 | 1662.0 | 1667.0 | 2026.0 | 0.8228 | 0.8203 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
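
As a sanity check on the table's arithmetic, the epoch-7 per-label counts reproduce the headline accuracies (the label ids 9642 and 2822 are taken from the metric names; this snippet only redoes the division):

```python
# Recompute the headline (epoch 7) accuracies from the per-label counts.
correct = {9642: 1668, 2822: 941}   # Correct Preds per label id
totals = {9642: 2026, 2822: 1231}   # Total Labels per label id

per_label = {k: correct[k] / totals[k] for k in correct}
print(per_label)  # {9642: 0.8233..., 2822: 0.7644...}

overall = sum(correct.values()) / 3270  # 3270 total predictions
print(round(overall, 4))  # 0.7979
# The two label totals sum to 3257, so 13 of the 3270 evaluation
# examples are not covered by either label id.
```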

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1