# BoolQ_Llama-3.2-1B-n5s6b4x8

This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset (the model name suggests BoolQ). It achieves the following results on the evaluation set:
- Loss: 1.4568
- Model Preparation Time: 0.0058
- Mdl: 6872.6248
- Accumulated Loss: 4763.7405
- Correct Preds: 2760.0
- Total Preds: 3270.0
- Accuracy: 0.8440
- Correct Gen Preds: 2757.0
- Gen Accuracy: 0.8431
- Correct Gen Preds 9642: 1814.0
- Correct Preds 9642: 1823.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.8998
- Gen Accuracy 9642: 0.8954
- Correct Gen Preds 2822: 934.0
- Correct Preds 2822: 937.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.7612
- Gen Accuracy 2822: 0.7587
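The headline accuracies above follow directly from the raw counts; a minimal sanity check using only the numbers reported in this card:

```python
# Verify the reported accuracies from the raw prediction counts above.
def accuracy(correct, total):
    return correct / total

overall = accuracy(2760, 3270)      # reported Accuracy: 0.8440
gen_overall = accuracy(2757, 3270)  # reported Gen Accuracy: 0.8431
label_9642 = accuracy(1823, 2026)   # reported Accuracy 9642: 0.8998
label_2822 = accuracy(937, 1231)    # reported Accuracy 2822: 0.7612

print(round(overall, 4), round(label_9642, 4), round(label_2822, 4))
```

Note the per-label gap: the model is markedly stronger on label 9642 (~0.90) than on label 2822 (~0.76).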
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
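As a sketch, these hyperparameters map onto `transformers.TrainingArguments` keyword arguments roughly as follows (an assumed mapping based on the standard `TrainingArguments` API, not a confirmed reconstruction of the original training script):

```python
# Hyperparameters from this card, expressed as the keyword arguments one
# would typically pass to transformers.TrainingArguments (assumed mapping).
training_args = dict(
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",       # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```

Although `num_train_epochs` was set to 100, the results table below stops at epoch 26.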
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0058 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.3324 | 1.0 | 232 | 0.4426 | 0.0058 | 2088.1525 | 1447.3970 | 2676.0 | 3270.0 | 0.8183 | 2679.0 | 0.8193 | 1808.0 | 1809.0 | 2026.0 | 0.8929 | 0.8924 | 862.0 | 867.0 | 1231.0 | 0.7043 | 0.7002 |
| 0.4917 | 2.0 | 464 | 0.4887 | 0.0058 | 2305.5456 | 1598.0824 | 2697.0 | 3270.0 | 0.8248 | 2690.0 | 0.8226 | 1887.0 | 1893.0 | 2026.0 | 0.9344 | 0.9314 | 797.0 | 804.0 | 1231.0 | 0.6531 | 0.6474 |
| 0.0028 | 3.0 | 696 | 0.7594 | 0.0058 | 3582.3567 | 2483.1004 | 2676.0 | 3270.0 | 0.8183 | 2659.0 | 0.8131 | 1641.0 | 1663.0 | 2026.0 | 0.8208 | 0.8100 | 1009.0 | 1013.0 | 1231.0 | 0.8229 | 0.8197 |
| 0.0011 | 4.0 | 928 | 0.9795 | 0.0058 | 4620.9576 | 3203.0038 | 2731.0 | 3270.0 | 0.8352 | 2723.0 | 0.8327 | 1788.0 | 1799.0 | 2026.0 | 0.8880 | 0.8825 | 927.0 | 932.0 | 1231.0 | 0.7571 | 0.7530 |
| 0.1389 | 5.0 | 1160 | 1.0611 | 0.0058 | 5005.9836 | 3469.8834 | 2739.0 | 3270.0 | 0.8376 | 2737.0 | 0.8370 | 1847.0 | 1853.0 | 2026.0 | 0.9146 | 0.9116 | 882.0 | 886.0 | 1231.0 | 0.7197 | 0.7165 |
| 0.0002 | 6.0 | 1392 | 1.1056 | 0.0058 | 5215.9426 | 3615.4159 | 2749.0 | 3270.0 | 0.8407 | 2751.0 | 0.8413 | 1881.0 | 1885.0 | 2026.0 | 0.9304 | 0.9284 | 862.0 | 864.0 | 1231.0 | 0.7019 | 0.7002 |
| 0.0001 | 7.0 | 1624 | 1.2332 | 0.0058 | 5817.6850 | 4032.5120 | 2754.0 | 3270.0 | 0.8422 | 2738.0 | 0.8373 | 1806.0 | 1824.0 | 2026.0 | 0.9003 | 0.8914 | 923.0 | 930.0 | 1231.0 | 0.7555 | 0.7498 |
| 0.0 | 8.0 | 1856 | 1.2209 | 0.0058 | 5759.8281 | 3992.4086 | 2754.0 | 3270.0 | 0.8422 | 2753.0 | 0.8419 | 1788.0 | 1798.0 | 2026.0 | 0.8875 | 0.8825 | 956.0 | 956.0 | 1231.0 | 0.7766 | 0.7766 |
| 0.0 | 9.0 | 2088 | 1.4452 | 0.0058 | 6817.7274 | 4725.6885 | 2750.0 | 3270.0 | 0.8410 | 2746.0 | 0.8398 | 1815.0 | 1825.0 | 2026.0 | 0.9008 | 0.8959 | 922.0 | 925.0 | 1231.0 | 0.7514 | 0.7490 |
| 0.0 | 10.0 | 2320 | 1.4119 | 0.0058 | 6660.5648 | 4616.7517 | 2752.0 | 3270.0 | 0.8416 | 2749.0 | 0.8407 | 1797.0 | 1807.0 | 2026.0 | 0.8919 | 0.8870 | 943.0 | 945.0 | 1231.0 | 0.7677 | 0.7660 |
| 0.0 | 11.0 | 2552 | 1.4389 | 0.0058 | 6788.4022 | 4705.3618 | 2753.0 | 3270.0 | 0.8419 | 2751.0 | 0.8413 | 1813.0 | 1822.0 | 2026.0 | 0.8993 | 0.8949 | 929.0 | 931.0 | 1231.0 | 0.7563 | 0.7547 |
| 0.0 | 12.0 | 2784 | 1.4300 | 0.0058 | 6746.3247 | 4676.1959 | 2755.0 | 3270.0 | 0.8425 | 2752.0 | 0.8416 | 1812.0 | 1821.0 | 2026.0 | 0.8988 | 0.8944 | 931.0 | 934.0 | 1231.0 | 0.7587 | 0.7563 |
| 0.0 | 13.0 | 3016 | 1.4335 | 0.0058 | 6762.4940 | 4687.4036 | 2756.0 | 3270.0 | 0.8428 | 2750.0 | 0.8410 | 1806.0 | 1819.0 | 2026.0 | 0.8978 | 0.8914 | 935.0 | 937.0 | 1231.0 | 0.7612 | 0.7595 |
| 0.0 | 14.0 | 3248 | 1.4568 | 0.0058 | 6872.6248 | 4763.7405 | 2760.0 | 3270.0 | 0.8440 | 2757.0 | 0.8431 | 1814.0 | 1823.0 | 2026.0 | 0.8998 | 0.8954 | 934.0 | 937.0 | 1231.0 | 0.7612 | 0.7587 |
| 0.0 | 15.0 | 3480 | 1.4631 | 0.0058 | 6902.2813 | 4784.2968 | 2750.0 | 3270.0 | 0.8410 | 2739.0 | 0.8376 | 1792.0 | 1809.0 | 2026.0 | 0.8929 | 0.8845 | 938.0 | 941.0 | 1231.0 | 0.7644 | 0.7620 |
| 0.0 | 16.0 | 3712 | 1.4765 | 0.0058 | 6965.4556 | 4828.0859 | 2754.0 | 3270.0 | 0.8422 | 2743.0 | 0.8388 | 1797.0 | 1814.0 | 2026.0 | 0.8954 | 0.8870 | 937.0 | 940.0 | 1231.0 | 0.7636 | 0.7612 |
| 0.0 | 17.0 | 3944 | 1.4796 | 0.0058 | 6980.1585 | 4838.2772 | 2751.0 | 3270.0 | 0.8413 | 2745.0 | 0.8394 | 1799.0 | 1812.0 | 2026.0 | 0.8944 | 0.8880 | 937.0 | 939.0 | 1231.0 | 0.7628 | 0.7612 |
| 0.0 | 18.0 | 4176 | 1.4793 | 0.0058 | 6978.7939 | 4837.3313 | 2755.0 | 3270.0 | 0.8425 | 2748.0 | 0.8404 | 1799.0 | 1813.0 | 2026.0 | 0.8949 | 0.8880 | 940.0 | 942.0 | 1231.0 | 0.7652 | 0.7636 |
| 0.0 | 19.0 | 4408 | 1.4822 | 0.0058 | 6992.2377 | 4846.6498 | 2752.0 | 3270.0 | 0.8416 | 2742.0 | 0.8385 | 1798.0 | 1815.0 | 2026.0 | 0.8959 | 0.8875 | 935.0 | 937.0 | 1231.0 | 0.7612 | 0.7595 |
| 0.0 | 20.0 | 4640 | 1.4798 | 0.0058 | 6980.8944 | 4838.7873 | 2753.0 | 3270.0 | 0.8419 | 2745.0 | 0.8394 | 1798.0 | 1812.0 | 2026.0 | 0.8944 | 0.8875 | 938.0 | 941.0 | 1231.0 | 0.7644 | 0.7620 |
| 0.0 | 21.0 | 4872 | 1.4847 | 0.0058 | 7004.0401 | 4854.8307 | 2755.0 | 3270.0 | 0.8425 | 2748.0 | 0.8404 | 1801.0 | 1815.0 | 2026.0 | 0.8959 | 0.8889 | 938.0 | 940.0 | 1231.0 | 0.7636 | 0.7620 |
| 0.0 | 22.0 | 5104 | 1.4801 | 0.0058 | 6982.3382 | 4839.7880 | 2754.0 | 3270.0 | 0.8422 | 2746.0 | 0.8398 | 1797.0 | 1812.0 | 2026.0 | 0.8944 | 0.8870 | 940.0 | 942.0 | 1231.0 | 0.7652 | 0.7636 |
| 0.0 | 23.0 | 5336 | 1.4791 | 0.0058 | 6977.9730 | 4836.7623 | 2756.0 | 3270.0 | 0.8428 | 2747.0 | 0.8401 | 1801.0 | 1816.0 | 2026.0 | 0.8963 | 0.8889 | 937.0 | 940.0 | 1231.0 | 0.7636 | 0.7612 |
| 0.0 | 24.0 | 5568 | 1.4821 | 0.0058 | 6991.9891 | 4846.4775 | 2751.0 | 3270.0 | 0.8413 | 2743.0 | 0.8388 | 1797.0 | 1812.0 | 2026.0 | 0.8944 | 0.8870 | 937.0 | 939.0 | 1231.0 | 0.7628 | 0.7612 |
| 0.0 | 25.0 | 5800 | 1.4844 | 0.0058 | 7003.0013 | 4854.1106 | 2754.0 | 3270.0 | 0.8422 | 2746.0 | 0.8398 | 1799.0 | 1812.0 | 2026.0 | 0.8944 | 0.8880 | 938.0 | 942.0 | 1231.0 | 0.7652 | 0.7620 |
| 0.0 | 26.0 | 6032 | 1.4848 | 0.0058 | 7004.8082 | 4855.3631 | 2760.0 | 3270.0 | 0.8440 | 2750.0 | 0.8410 | 1800.0 | 1816.0 | 2026.0 | 0.8963 | 0.8885 | 941.0 | 944.0 | 1231.0 | 0.7669 | 0.7644 |
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1