BoolQ_Llama-3.2-1B-0ql21waq
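This model is a fine-tuned version of meta-llama/Llama-3.2-1B. The training dataset is not recorded in this card, although the model name suggests BoolQ. The results below are from the evaluation set and match the epoch 17 row (step 1904) of the training results table: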
- Loss: 1.5178
- Model Preparation Time: 0.0058
- Mdl: 7160.3909
- Accumulated Loss: 4963.2047
- Correct Preds: 2692.0
- Total Preds: 3270.0
- Accuracy: 0.8232
- Correct Gen Preds: 2690.0
- Gen Accuracy: 0.8226
- Correct Gen Preds 9642: 1792.0
- Correct Preds 9642: 1797.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.8870
- Gen Accuracy 9642: 0.8845
- Correct Gen Preds 2822: 892.0
- Correct Preds 2822: 895.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.7271
- Gen Accuracy 2822: 0.7246
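The summary figures above appear to be related by simple arithmetic. Each accuracy is a correct count divided by a total, and Loss is Accumulated Loss divided by Total Preds. The sketch below checks these relationships using only numbers copied from the list. Reading Mdl as the accumulated loss converted from nats to bits (division by ln 2) is an assumption inferred from the values, not something the card states. The variable names are illustrative.

```python
import math

# Values copied from the evaluation summary in this card.
accumulated_loss = 4963.2047
total_preds = 3270
correct_preds = 2692
correct_gen_preds = 2690

# Per-label counts, keyed by the label identifiers used in the card.
per_label = {
    "9642": {"correct": 1797, "correct_gen": 1792, "total": 2026},
    "2822": {"correct": 895, "correct_gen": 892, "total": 1231},
}

# Mean loss per prediction.
loss = accumulated_loss / total_preds

# Assumption: Mdl is the accumulated loss expressed in bits instead of nats.
mdl_bits = accumulated_loss / math.log(2)

accuracy = correct_preds / total_preds
gen_accuracy = correct_gen_preds / total_preds

# Per-label accuracy and generation accuracy.
label_accuracy = {k: v["correct"] / v["total"] for k, v in per_label.items()}
label_gen_accuracy = {k: v["correct_gen"] / v["total"] for k, v in per_label.items()}

print(f"loss={loss:.4f} mdl={mdl_bits:.2f} acc={accuracy:.4f} gen_acc={gen_accuracy:.4f}")
for label in per_label:
    print(f"label {label}: acc={label_accuracy[label]:.4f} "
          f"gen_acc={label_gen_accuracy[label]:.4f}")
```

The recomputed values agree with the reported ones to the precision shown in the card, which supports the reading but does not confirm how the evaluation script defines each metric.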
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100 (the training results table below ends at epoch 43)
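To make the learning-rate settings concrete, the following is a minimal sketch of a linear warmup followed by a half-cosine decay, using the values listed above. It assumes the schedule was planned over all 100 configured epochs and that one epoch is 112 optimizer steps (the step count logged at epoch 1.0 in the results table). The function `lr_at` and the constants are hypothetical names for illustration; the exact schedule implementation used for this run is not stated in the card.

```python
import math

BASE_LR = 2e-5          # learning_rate
WARMUP_RATIO = 0.01     # lr_scheduler_warmup_ratio
STEPS_PER_EPOCH = 112   # assumption: taken from the step logged at epoch 1.0
NUM_EPOCHS = 100        # num_epochs

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step under warmup plus cosine decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the base learning rate.
        return BASE_LR * step / WARMUP_STEPS
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay from the base learning rate down to 0.
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 56, 112, 1904, 4816, TOTAL_STEPS):
    print(f"step {step:>5}: lr = {lr_at(step):.3e}")
```

Under these assumptions the warmup covers exactly the first epoch, and the learning rate is still above half its peak at step 4816, the last step logged in the results table.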
Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0058 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.5195 | 1.0 | 112 | 0.5523 | 0.0058 | 2605.3824 | 1805.9135 | 2426.0 | 3270.0 | 0.7419 | 2431.0 | 0.7434 | 1333.0 | 1335.0 | 2026.0 | 0.6589 | 0.6579 | 1089.0 | 1091.0 | 1231.0 | 0.8863 | 0.8846 |
| 0.1921 | 2.0 | 224 | 0.5354 | 0.0058 | 2525.8902 | 1750.8137 | 2611.0 | 3270.0 | 0.7985 | 2584.0 | 0.7902 | 1896.0 | 1915.0 | 2026.0 | 0.9452 | 0.9358 | 681.0 | 696.0 | 1231.0 | 0.5654 | 0.5532 |
| 0.0072 | 3.0 | 336 | 0.8070 | 0.0058 | 3807.0161 | 2638.8225 | 2588.0 | 3270.0 | 0.7914 | 2276.0 | 0.6960 | 1334.0 | 1615.0 | 2026.0 | 0.7971 | 0.6584 | 933.0 | 973.0 | 1231.0 | 0.7904 | 0.7579 |
| 0.0008 | 4.0 | 448 | 1.0117 | 0.0058 | 4772.7257 | 3308.2014 | 2666.0 | 3270.0 | 0.8153 | 2616.0 | 0.8000 | 1641.0 | 1690.0 | 2026.0 | 0.8342 | 0.8100 | 966.0 | 976.0 | 1231.0 | 0.7929 | 0.7847 |
| 0.0001 | 5.0 | 560 | 1.2247 | 0.0058 | 5777.5533 | 4004.6948 | 2673.0 | 3270.0 | 0.8174 | 2667.0 | 0.8156 | 1727.0 | 1738.0 | 2026.0 | 0.8578 | 0.8524 | 932.0 | 935.0 | 1231.0 | 0.7595 | 0.7571 |
| 0.0003 | 6.0 | 672 | 1.1632 | 0.0058 | 5487.5973 | 3803.7126 | 2670.0 | 3270.0 | 0.8165 | 2659.0 | 0.8131 | 1720.0 | 1730.0 | 2026.0 | 0.8539 | 0.8490 | 933.0 | 940.0 | 1231.0 | 0.7636 | 0.7579 |
| 0.0001 | 7.0 | 784 | 1.4257 | 0.0058 | 6725.7305 | 4661.9211 | 2690.0 | 3270.0 | 0.8226 | 2691.0 | 0.8229 | 1827.0 | 1830.0 | 2026.0 | 0.9033 | 0.9018 | 858.0 | 860.0 | 1231.0 | 0.6986 | 0.6970 |
| 0.0001 | 8.0 | 896 | 1.3071 | 0.0058 | 6166.4439 | 4274.2532 | 2684.0 | 3270.0 | 0.8208 | 2681.0 | 0.8199 | 1759.0 | 1768.0 | 2026.0 | 0.8727 | 0.8682 | 916.0 | 916.0 | 1231.0 | 0.7441 | 0.7441 |
| 0.0 | 9.0 | 1008 | 1.3898 | 0.0058 | 6556.5494 | 4544.6537 | 2681.0 | 3270.0 | 0.8199 | 2679.0 | 0.8193 | 1774.0 | 1781.0 | 2026.0 | 0.8791 | 0.8756 | 899.0 | 900.0 | 1231.0 | 0.7311 | 0.7303 |
| 0.0 | 10.0 | 1120 | 1.4448 | 0.0058 | 6815.9145 | 4724.4319 | 2682.0 | 3270.0 | 0.8202 | 2679.0 | 0.8193 | 1762.0 | 1771.0 | 2026.0 | 0.8741 | 0.8697 | 911.0 | 911.0 | 1231.0 | 0.7400 | 0.7400 |
| 0.0 | 11.0 | 1232 | 1.4550 | 0.0058 | 6864.1802 | 4757.8871 | 2688.0 | 3270.0 | 0.8220 | 2685.0 | 0.8211 | 1786.0 | 1792.0 | 2026.0 | 0.8845 | 0.8815 | 893.0 | 896.0 | 1231.0 | 0.7279 | 0.7254 |
| 0.0001 | 12.0 | 1344 | 1.4899 | 0.0058 | 7028.5828 | 4871.8424 | 2688.0 | 3270.0 | 0.8220 | 2686.0 | 0.8214 | 1793.0 | 1798.0 | 2026.0 | 0.8875 | 0.8850 | 887.0 | 890.0 | 1231.0 | 0.7230 | 0.7206 |
| 0.0 | 13.0 | 1456 | 1.5028 | 0.0058 | 7089.6025 | 4914.1380 | 2687.0 | 3270.0 | 0.8217 | 2682.0 | 0.8202 | 1789.0 | 1797.0 | 2026.0 | 0.8870 | 0.8830 | 887.0 | 890.0 | 1231.0 | 0.7230 | 0.7206 |
| 0.0 | 14.0 | 1568 | 1.5135 | 0.0058 | 7140.0229 | 4949.0867 | 2686.0 | 3270.0 | 0.8214 | 2681.0 | 0.8199 | 1789.0 | 1797.0 | 2026.0 | 0.8870 | 0.8830 | 886.0 | 889.0 | 1231.0 | 0.7222 | 0.7197 |
| 0.0 | 15.0 | 1680 | 1.5171 | 0.0058 | 7157.1157 | 4960.9346 | 2685.0 | 3270.0 | 0.8211 | 2682.0 | 0.8202 | 1788.0 | 1794.0 | 2026.0 | 0.8855 | 0.8825 | 888.0 | 891.0 | 1231.0 | 0.7238 | 0.7214 |
| 0.0 | 16.0 | 1792 | 1.5195 | 0.0058 | 7168.3843 | 4968.7454 | 2681.0 | 3270.0 | 0.8199 | 2677.0 | 0.8187 | 1785.0 | 1792.0 | 2026.0 | 0.8845 | 0.8810 | 886.0 | 889.0 | 1231.0 | 0.7222 | 0.7197 |
| 0.0 | 17.0 | 1904 | 1.5178 | 0.0058 | 7160.3909 | 4963.2047 | 2692.0 | 3270.0 | 0.8232 | 2690.0 | 0.8226 | 1792.0 | 1797.0 | 2026.0 | 0.8870 | 0.8845 | 892.0 | 895.0 | 1231.0 | 0.7271 | 0.7246 |
| 0.0 | 18.0 | 2016 | 1.5196 | 0.0058 | 7169.0138 | 4969.1817 | 2689.0 | 3270.0 | 0.8223 | 2684.0 | 0.8208 | 1788.0 | 1796.0 | 2026.0 | 0.8865 | 0.8825 | 890.0 | 893.0 | 1231.0 | 0.7254 | 0.7230 |
| 0.0 | 19.0 | 2128 | 1.5198 | 0.0058 | 7169.6546 | 4969.6259 | 2687.0 | 3270.0 | 0.8217 | 2684.0 | 0.8208 | 1787.0 | 1793.0 | 2026.0 | 0.8850 | 0.8820 | 891.0 | 894.0 | 1231.0 | 0.7262 | 0.7238 |
| 0.0 | 20.0 | 2240 | 1.5250 | 0.0058 | 7194.4023 | 4986.7797 | 2683.0 | 3270.0 | 0.8205 | 2680.0 | 0.8196 | 1786.0 | 1792.0 | 2026.0 | 0.8845 | 0.8815 | 888.0 | 891.0 | 1231.0 | 0.7238 | 0.7214 |
| 0.0 | 21.0 | 2352 | 1.5238 | 0.0058 | 7188.8707 | 4982.9454 | 2687.0 | 3270.0 | 0.8217 | 2682.0 | 0.8202 | 1783.0 | 1791.0 | 2026.0 | 0.8840 | 0.8801 | 893.0 | 896.0 | 1231.0 | 0.7279 | 0.7254 |
| 0.0 | 22.0 | 2464 | 1.5235 | 0.0058 | 7187.4765 | 4981.9791 | 2686.0 | 3270.0 | 0.8214 | 2684.0 | 0.8208 | 1786.0 | 1792.0 | 2026.0 | 0.8845 | 0.8815 | 892.0 | 894.0 | 1231.0 | 0.7262 | 0.7246 |
| 0.0 | 23.0 | 2576 | 1.5280 | 0.0058 | 7208.5330 | 4996.5743 | 2685.0 | 3270.0 | 0.8211 | 2681.0 | 0.8199 | 1787.0 | 1794.0 | 2026.0 | 0.8855 | 0.8820 | 888.0 | 891.0 | 1231.0 | 0.7238 | 0.7214 |
| 0.0 | 24.0 | 2688 | 1.5250 | 0.0058 | 7194.3822 | 4986.7657 | 2685.0 | 3270.0 | 0.8211 | 2682.0 | 0.8202 | 1786.0 | 1792.0 | 2026.0 | 0.8845 | 0.8815 | 890.0 | 893.0 | 1231.0 | 0.7254 | 0.7230 |
| 0.0 | 25.0 | 2800 | 1.5260 | 0.0058 | 7199.0114 | 4989.9744 | 2685.0 | 3270.0 | 0.8211 | 2682.0 | 0.8202 | 1785.0 | 1792.0 | 2026.0 | 0.8845 | 0.8810 | 891.0 | 893.0 | 1231.0 | 0.7254 | 0.7238 |
| 0.0 | 26.0 | 2912 | 1.5256 | 0.0058 | 7197.4171 | 4988.8693 | 2684.0 | 3270.0 | 0.8208 | 2681.0 | 0.8199 | 1784.0 | 1791.0 | 2026.0 | 0.8840 | 0.8806 | 891.0 | 893.0 | 1231.0 | 0.7254 | 0.7238 |
| 0.0 | 27.0 | 3024 | 1.5273 | 0.0058 | 7205.3807 | 4994.3893 | 2686.0 | 3270.0 | 0.8214 | 2681.0 | 0.8199 | 1783.0 | 1792.0 | 2026.0 | 0.8845 | 0.8801 | 892.0 | 894.0 | 1231.0 | 0.7262 | 0.7246 |
| 0.0 | 28.0 | 3136 | 1.5258 | 0.0058 | 7198.0159 | 4989.2844 | 2686.0 | 3270.0 | 0.8214 | 2682.0 | 0.8202 | 1782.0 | 1790.0 | 2026.0 | 0.8835 | 0.8796 | 894.0 | 896.0 | 1231.0 | 0.7279 | 0.7262 |
| 0.0 | 29.0 | 3248 | 1.5263 | 0.0058 | 7200.3983 | 4990.9358 | 2685.0 | 3270.0 | 0.8211 | 2683.0 | 0.8205 | 1786.0 | 1792.0 | 2026.0 | 0.8845 | 0.8815 | 891.0 | 893.0 | 1231.0 | 0.7254 | 0.7238 |
| 0.0 | 30.0 | 3360 | 1.5263 | 0.0058 | 7200.5562 | 4991.0452 | 2687.0 | 3270.0 | 0.8217 | 2683.0 | 0.8205 | 1784.0 | 1792.0 | 2026.0 | 0.8845 | 0.8806 | 893.0 | 895.0 | 1231.0 | 0.7271 | 0.7254 |
| 0.0 | 31.0 | 3472 | 1.5284 | 0.0058 | 7210.2626 | 4997.7732 | 2683.0 | 3270.0 | 0.8205 | 2678.0 | 0.8190 | 1781.0 | 1789.0 | 2026.0 | 0.8830 | 0.8791 | 891.0 | 894.0 | 1231.0 | 0.7262 | 0.7238 |
| 0.0 | 32.0 | 3584 | 1.5260 | 0.0058 | 7198.8642 | 4989.8724 | 2687.0 | 3270.0 | 0.8217 | 2682.0 | 0.8202 | 1786.0 | 1794.0 | 2026.0 | 0.8855 | 0.8815 | 890.0 | 893.0 | 1231.0 | 0.7254 | 0.7230 |
| 0.0 | 33.0 | 3696 | 1.5283 | 0.0058 | 7209.9507 | 4997.5570 | 2688.0 | 3270.0 | 0.8220 | 2682.0 | 0.8202 | 1783.0 | 1791.0 | 2026.0 | 0.8840 | 0.8801 | 893.0 | 897.0 | 1231.0 | 0.7287 | 0.7254 |
| 0.0 | 34.0 | 3808 | 1.5291 | 0.0058 | 7213.9354 | 5000.3190 | 2682.0 | 3270.0 | 0.8202 | 2679.0 | 0.8193 | 1783.0 | 1790.0 | 2026.0 | 0.8835 | 0.8801 | 890.0 | 892.0 | 1231.0 | 0.7246 | 0.7230 |
| 0.0001 | 35.0 | 3920 | 1.5265 | 0.0058 | 7201.3382 | 4991.5873 | 2691.0 | 3270.0 | 0.8229 | 2687.0 | 0.8217 | 1787.0 | 1794.0 | 2026.0 | 0.8855 | 0.8820 | 894.0 | 897.0 | 1231.0 | 0.7287 | 0.7262 |
| 0.0 | 36.0 | 4032 | 1.5271 | 0.0058 | 7204.0642 | 4993.4768 | 2689.0 | 3270.0 | 0.8223 | 2686.0 | 0.8214 | 1789.0 | 1795.0 | 2026.0 | 0.8860 | 0.8830 | 891.0 | 894.0 | 1231.0 | 0.7262 | 0.7238 |
| 0.0 | 37.0 | 4144 | 1.5280 | 0.0058 | 7208.6128 | 4996.6297 | 2683.0 | 3270.0 | 0.8205 | 2679.0 | 0.8193 | 1783.0 | 1790.0 | 2026.0 | 0.8835 | 0.8801 | 890.0 | 893.0 | 1231.0 | 0.7254 | 0.7230 |
| 0.0 | 38.0 | 4256 | 1.5250 | 0.0058 | 7194.2884 | 4986.7007 | 2686.0 | 3270.0 | 0.8214 | 2682.0 | 0.8202 | 1784.0 | 1792.0 | 2026.0 | 0.8845 | 0.8806 | 892.0 | 894.0 | 1231.0 | 0.7262 | 0.7246 |
| 0.0 | 39.0 | 4368 | 1.5264 | 0.0058 | 7201.0779 | 4991.4068 | 2686.0 | 3270.0 | 0.8214 | 2683.0 | 0.8205 | 1785.0 | 1792.0 | 2026.0 | 0.8845 | 0.8810 | 892.0 | 894.0 | 1231.0 | 0.7262 | 0.7246 |
| 0.0 | 40.0 | 4480 | 1.5270 | 0.0058 | 7203.7772 | 4993.2779 | 2686.0 | 3270.0 | 0.8214 | 2683.0 | 0.8205 | 1784.0 | 1791.0 | 2026.0 | 0.8840 | 0.8806 | 893.0 | 895.0 | 1231.0 | 0.7271 | 0.7254 |
| 1.0228 | 41.0 | 4592 | 1.5249 | 0.0058 | 7193.8448 | 4986.3933 | 2687.0 | 3270.0 | 0.8217 | 2683.0 | 0.8205 | 1785.0 | 1793.0 | 2026.0 | 0.8850 | 0.8810 | 892.0 | 894.0 | 1231.0 | 0.7262 | 0.7246 |
| 0.0 | 42.0 | 4704 | 1.5279 | 0.0058 | 7207.8794 | 4996.1213 | 2688.0 | 3270.0 | 0.8220 | 2686.0 | 0.8214 | 1789.0 | 1795.0 | 2026.0 | 0.8860 | 0.8830 | 891.0 | 893.0 | 1231.0 | 0.7254 | 0.7238 |
| 0.0 | 43.0 | 4816 | 1.5262 | 0.0058 | 7199.9043 | 4990.5934 | 2685.0 | 3270.0 | 0.8211 | 2682.0 | 0.8202 | 1786.0 | 1793.0 | 2026.0 | 0.8850 | 0.8815 | 890.0 | 892.0 | 1231.0 | 0.7246 | 0.7230 |
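In the table above, the lowest validation loss and the highest accuracy fall on different epochs, so the choice of selection metric changes which checkpoint looks best. The sketch below illustrates this with a handful of rows copied from the table (epoch, validation loss, correct predictions). The row subset and the variable names are chosen for illustration only; the card does not say which criterion was used to pick the reported checkpoint.

```python
# (epoch, validation_loss, correct_preds) for selected rows of the table.
rows = [
    (0, 0.7080, 2032),
    (1, 0.5523, 2426),
    (2, 0.5354, 2611),
    (3, 0.8070, 2588),
    (7, 1.4257, 2690),
    (17, 1.5178, 2692),
    (35, 1.5265, 2691),
    (43, 1.5262, 2685),
]
TOTAL_PREDS = 3270

# Checkpoint with the lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[1])

# Checkpoint with the most correct predictions (highest accuracy).
best_by_acc = max(rows, key=lambda r: r[2])

print(f"lowest validation loss: epoch {best_by_loss[0]} "
      f"(loss {best_by_loss[1]:.4f}, accuracy {best_by_loss[2] / TOTAL_PREDS:.4f})")
print(f"highest accuracy:       epoch {best_by_acc[0]} "
      f"(loss {best_by_acc[1]:.4f}, accuracy {best_by_acc[2] / TOTAL_PREDS:.4f})")
```

Selecting by validation loss picks epoch 2, while selecting by accuracy picks epoch 17, which is the row matching the summary at the top of this card. The training loss is already near zero from epoch 3 onward while validation loss keeps rising, so the later checkpoints are likely overconfident even where accuracy is slightly higher.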
Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
Model tree for donoway/BoolQ_Llama-3.2-1B-0ql21waq
Base model
meta-llama/Llama-3.2-1B