BoolQ_Llama-3.2-1B-n5s6b4x8

This model is a fine-tuned version of meta-llama/Llama-3.2-1B, evidently fine-tuned on BoolQ (per the model name; the card itself does not name the dataset). It achieves the following results on the evaluation set:

  • Loss: 1.4568
  • Model Preparation Time: 0.0058
  • MDL: 6872.6248 (the accumulated loss converted from nats to bits)
  • Accumulated Loss: 4763.7405
  • Correct Preds: 2760
  • Total Preds: 3270
  • Accuracy: 0.8440
  • Correct Gen Preds: 2757
  • Gen Accuracy: 0.8431
  • Correct Gen Preds 9642: 1814
  • Correct Preds 9642: 1823
  • Total Labels 9642: 2026
  • Accuracy 9642: 0.8998
  • Gen Accuracy 9642: 0.8954
  • Correct Gen Preds 2822: 934
  • Correct Preds 2822: 937
  • Total Labels 2822: 1231
  • Accuracy 2822: 0.7612
  • Gen Accuracy 2822: 0.7587

Metrics suffixed 9642 and 2822 are per-class breakdowns; the suffixes appear to be the token IDs of the two answer labels. The "Gen" variants presumably count predictions obtained by free generation, which is why they can differ slightly from the plain counts.
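
The derived metrics are internally consistent with the raw counts. The sketch below recomputes the headline numbers; it is a verification aid written for this card, not part of the original evaluation code.

```python
import math

# Raw counts reported above (evaluation set).
correct_preds = 2760
correct_gen_preds = 2757
total_preds = 3270
accumulated_loss = 4763.7405  # summed cross-entropy over the eval set, in nats

# Headline metrics follow directly from the counts.
print(f"accuracy:     {correct_preds / total_preds:.4f}")        # 0.8440
print(f"gen accuracy: {correct_gen_preds / total_preds:.4f}")    # 0.8431
print(f"loss:         {accumulated_loss / total_preds:.4f}")     # 1.4568 (mean loss per example)
print(f"mdl:          {accumulated_loss / math.log(2):.4f}")     # 6872.6248 (same loss in bits)

# Per-class breakdowns, keyed by what appear to be label token IDs.
for token_id, correct, correct_gen, total in [(9642, 1823, 1814, 2026),
                                              (2822, 937, 934, 1231)]:
    print(f"accuracy {token_id}:     {correct / total:.4f}")      # 0.8998 and 0.7612
    print(f"gen accuracy {token_id}: {correct_gen / total:.4f}")  # 0.8954 and 0.7587
```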

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 120
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
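
These settings map onto a Hugging Face TrainingArguments configuration roughly as follows. This is a reconstruction from the list above, not the author's published training script; dataset loading, tokenization, and the Trainer call are omitted.

```python
from transformers import TrainingArguments

# Sketch of a TrainingArguments setup matching the listed hyperparameters
# (reconstructed for illustration; the original training script is not published).
training_args = TrainingArguments(
    output_dir="BoolQ_Llama-3.2-1B-n5s6b4x8",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",          # AdamW defaults: betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```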

Training results

Although num_epochs was set to 100, the log below ends at epoch 26, so the run was apparently stopped early; the headline metrics above match the epoch-14 row.

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | MDL | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0058 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.3324 | 1.0 | 232 | 0.4426 | 0.0058 | 2088.1525 | 1447.3970 | 2676.0 | 3270.0 | 0.8183 | 2679.0 | 0.8193 | 1808.0 | 1809.0 | 2026.0 | 0.8929 | 0.8924 | 862.0 | 867.0 | 1231.0 | 0.7043 | 0.7002 |
| 0.4917 | 2.0 | 464 | 0.4887 | 0.0058 | 2305.5456 | 1598.0824 | 2697.0 | 3270.0 | 0.8248 | 2690.0 | 0.8226 | 1887.0 | 1893.0 | 2026.0 | 0.9344 | 0.9314 | 797.0 | 804.0 | 1231.0 | 0.6531 | 0.6474 |
| 0.0028 | 3.0 | 696 | 0.7594 | 0.0058 | 3582.3567 | 2483.1004 | 2676.0 | 3270.0 | 0.8183 | 2659.0 | 0.8131 | 1641.0 | 1663.0 | 2026.0 | 0.8208 | 0.8100 | 1009.0 | 1013.0 | 1231.0 | 0.8229 | 0.8197 |
| 0.0011 | 4.0 | 928 | 0.9795 | 0.0058 | 4620.9576 | 3203.0038 | 2731.0 | 3270.0 | 0.8352 | 2723.0 | 0.8327 | 1788.0 | 1799.0 | 2026.0 | 0.8880 | 0.8825 | 927.0 | 932.0 | 1231.0 | 0.7571 | 0.7530 |
| 0.1389 | 5.0 | 1160 | 1.0611 | 0.0058 | 5005.9836 | 3469.8834 | 2739.0 | 3270.0 | 0.8376 | 2737.0 | 0.8370 | 1847.0 | 1853.0 | 2026.0 | 0.9146 | 0.9116 | 882.0 | 886.0 | 1231.0 | 0.7197 | 0.7165 |
| 0.0002 | 6.0 | 1392 | 1.1056 | 0.0058 | 5215.9426 | 3615.4159 | 2749.0 | 3270.0 | 0.8407 | 2751.0 | 0.8413 | 1881.0 | 1885.0 | 2026.0 | 0.9304 | 0.9284 | 862.0 | 864.0 | 1231.0 | 0.7019 | 0.7002 |
| 0.0001 | 7.0 | 1624 | 1.2332 | 0.0058 | 5817.6850 | 4032.5120 | 2754.0 | 3270.0 | 0.8422 | 2738.0 | 0.8373 | 1806.0 | 1824.0 | 2026.0 | 0.9003 | 0.8914 | 923.0 | 930.0 | 1231.0 | 0.7555 | 0.7498 |
| 0.0 | 8.0 | 1856 | 1.2209 | 0.0058 | 5759.8281 | 3992.4086 | 2754.0 | 3270.0 | 0.8422 | 2753.0 | 0.8419 | 1788.0 | 1798.0 | 2026.0 | 0.8875 | 0.8825 | 956.0 | 956.0 | 1231.0 | 0.7766 | 0.7766 |
| 0.0 | 9.0 | 2088 | 1.4452 | 0.0058 | 6817.7274 | 4725.6885 | 2750.0 | 3270.0 | 0.8410 | 2746.0 | 0.8398 | 1815.0 | 1825.0 | 2026.0 | 0.9008 | 0.8959 | 922.0 | 925.0 | 1231.0 | 0.7514 | 0.7490 |
| 0.0 | 10.0 | 2320 | 1.4119 | 0.0058 | 6660.5648 | 4616.7517 | 2752.0 | 3270.0 | 0.8416 | 2749.0 | 0.8407 | 1797.0 | 1807.0 | 2026.0 | 0.8919 | 0.8870 | 943.0 | 945.0 | 1231.0 | 0.7677 | 0.7660 |
| 0.0 | 11.0 | 2552 | 1.4389 | 0.0058 | 6788.4022 | 4705.3618 | 2753.0 | 3270.0 | 0.8419 | 2751.0 | 0.8413 | 1813.0 | 1822.0 | 2026.0 | 0.8993 | 0.8949 | 929.0 | 931.0 | 1231.0 | 0.7563 | 0.7547 |
| 0.0 | 12.0 | 2784 | 1.4300 | 0.0058 | 6746.3247 | 4676.1959 | 2755.0 | 3270.0 | 0.8425 | 2752.0 | 0.8416 | 1812.0 | 1821.0 | 2026.0 | 0.8988 | 0.8944 | 931.0 | 934.0 | 1231.0 | 0.7587 | 0.7563 |
| 0.0 | 13.0 | 3016 | 1.4335 | 0.0058 | 6762.4940 | 4687.4036 | 2756.0 | 3270.0 | 0.8428 | 2750.0 | 0.8410 | 1806.0 | 1819.0 | 2026.0 | 0.8978 | 0.8914 | 935.0 | 937.0 | 1231.0 | 0.7612 | 0.7595 |
| 0.0 | 14.0 | 3248 | 1.4568 | 0.0058 | 6872.6248 | 4763.7405 | 2760.0 | 3270.0 | 0.8440 | 2757.0 | 0.8431 | 1814.0 | 1823.0 | 2026.0 | 0.8998 | 0.8954 | 934.0 | 937.0 | 1231.0 | 0.7612 | 0.7587 |
| 0.0 | 15.0 | 3480 | 1.4631 | 0.0058 | 6902.2813 | 4784.2968 | 2750.0 | 3270.0 | 0.8410 | 2739.0 | 0.8376 | 1792.0 | 1809.0 | 2026.0 | 0.8929 | 0.8845 | 938.0 | 941.0 | 1231.0 | 0.7644 | 0.7620 |
| 0.0 | 16.0 | 3712 | 1.4765 | 0.0058 | 6965.4556 | 4828.0859 | 2754.0 | 3270.0 | 0.8422 | 2743.0 | 0.8388 | 1797.0 | 1814.0 | 2026.0 | 0.8954 | 0.8870 | 937.0 | 940.0 | 1231.0 | 0.7636 | 0.7612 |
| 0.0 | 17.0 | 3944 | 1.4796 | 0.0058 | 6980.1585 | 4838.2772 | 2751.0 | 3270.0 | 0.8413 | 2745.0 | 0.8394 | 1799.0 | 1812.0 | 2026.0 | 0.8944 | 0.8880 | 937.0 | 939.0 | 1231.0 | 0.7628 | 0.7612 |
| 0.0 | 18.0 | 4176 | 1.4793 | 0.0058 | 6978.7939 | 4837.3313 | 2755.0 | 3270.0 | 0.8425 | 2748.0 | 0.8404 | 1799.0 | 1813.0 | 2026.0 | 0.8949 | 0.8880 | 940.0 | 942.0 | 1231.0 | 0.7652 | 0.7636 |
| 0.0 | 19.0 | 4408 | 1.4822 | 0.0058 | 6992.2377 | 4846.6498 | 2752.0 | 3270.0 | 0.8416 | 2742.0 | 0.8385 | 1798.0 | 1815.0 | 2026.0 | 0.8959 | 0.8875 | 935.0 | 937.0 | 1231.0 | 0.7612 | 0.7595 |
| 0.0 | 20.0 | 4640 | 1.4798 | 0.0058 | 6980.8944 | 4838.7873 | 2753.0 | 3270.0 | 0.8419 | 2745.0 | 0.8394 | 1798.0 | 1812.0 | 2026.0 | 0.8944 | 0.8875 | 938.0 | 941.0 | 1231.0 | 0.7644 | 0.7620 |
| 0.0 | 21.0 | 4872 | 1.4847 | 0.0058 | 7004.0401 | 4854.8307 | 2755.0 | 3270.0 | 0.8425 | 2748.0 | 0.8404 | 1801.0 | 1815.0 | 2026.0 | 0.8959 | 0.8889 | 938.0 | 940.0 | 1231.0 | 0.7636 | 0.7620 |
| 0.0 | 22.0 | 5104 | 1.4801 | 0.0058 | 6982.3382 | 4839.7880 | 2754.0 | 3270.0 | 0.8422 | 2746.0 | 0.8398 | 1797.0 | 1812.0 | 2026.0 | 0.8944 | 0.8870 | 940.0 | 942.0 | 1231.0 | 0.7652 | 0.7636 |
| 0.0 | 23.0 | 5336 | 1.4791 | 0.0058 | 6977.9730 | 4836.7623 | 2756.0 | 3270.0 | 0.8428 | 2747.0 | 0.8401 | 1801.0 | 1816.0 | 2026.0 | 0.8963 | 0.8889 | 937.0 | 940.0 | 1231.0 | 0.7636 | 0.7612 |
| 0.0 | 24.0 | 5568 | 1.4821 | 0.0058 | 6991.9891 | 4846.4775 | 2751.0 | 3270.0 | 0.8413 | 2743.0 | 0.8388 | 1797.0 | 1812.0 | 2026.0 | 0.8944 | 0.8870 | 937.0 | 939.0 | 1231.0 | 0.7628 | 0.7612 |
| 0.0 | 25.0 | 5800 | 1.4844 | 0.0058 | 7003.0013 | 4854.1106 | 2754.0 | 3270.0 | 0.8422 | 2746.0 | 0.8398 | 1799.0 | 1812.0 | 2026.0 | 0.8944 | 0.8880 | 938.0 | 942.0 | 1231.0 | 0.7652 | 0.7620 |
| 0.0 | 26.0 | 6032 | 1.4848 | 0.0058 | 7004.8082 | 4855.3631 | 2760.0 | 3270.0 | 0.8440 | 2750.0 | 0.8410 | 1800.0 | 1816.0 | 2026.0 | 0.8963 | 0.8885 | 941.0 | 944.0 | 1231.0 | 0.7669 | 0.7644 |

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
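
For completeness, a minimal inference sketch using this repo id (see the model tree below). The prompt template is an assumption; the card does not document how BoolQ examples were formatted during fine-tuning.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "donoway/BoolQ_Llama-3.2-1B-n5s6b4x8"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

# Hypothetical BoolQ-style prompt; the training template is not documented.
prompt = "Passage: The Eiffel Tower is in Paris.\nQuestion: Is the Eiffel Tower in France?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=1, do_sample=False)
print(tokenizer.decode(outputs[0, inputs["input_ids"].shape[1]:]))
```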
Model tree

donoway/BoolQ_Llama-3.2-1B-n5s6b4x8 is fine-tuned from meta-llama/Llama-3.2-1B.