# BoolQ_Llama-3.2-1B-4lp9g5r3
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.5927
- Model Preparation Time: 0.0058
- Mdl: 7513.6957
- Accumulated Loss: 5208.0970
- Correct Preds: 2609.0
- Total Preds: 3270.0
- Accuracy: 0.7979
- Correct Gen Preds: 2610.0
- Gen Accuracy: 0.7982
- Correct Gen Preds 9642: 1661.0
- Correct Preds 9642: 1668.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.8233
- Gen Accuracy 9642: 0.8198
- Correct Gen Preds 2822: 940.0
- Correct Preds 2822: 941.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.7644
- Gen Accuracy 2822: 0.7636
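The headline numbers above are related by simple identities, which makes them easy to sanity-check. As a sketch (assuming `Mdl` is the minimum description length in bits, i.e. the accumulated cross-entropy loss in nats divided by ln 2 — an interpretation inferred from the numbers, not stated by the card):

```python
import math

# Values copied from the evaluation summary above.
correct_preds = 2609
total_preds = 3270
accumulated_loss_nats = 5208.0970  # summed cross-entropy over the eval set

# Accuracy is simply correct predictions over total predictions.
accuracy = correct_preds / total_preds          # ≈ 0.7979

# Converting the accumulated loss from nats to bits reproduces the Mdl value.
mdl_bits = accumulated_loss_nats / math.log(2)  # ≈ 7513.70

print(round(accuracy, 4), round(mdl_bits, 2))
```

The same check works for the per-label breakdowns (e.g. 1668 / 2026 ≈ 0.8233 for label 9642).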
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
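With `lr_scheduler_type: cosine` and `warmup_ratio: 0.01`, the learning rate ramps up linearly for the first 1% of optimizer steps and then decays along a half-cosine. A minimal sketch of that shape in plain Python, assuming 54 optimizer steps per epoch (as the results table below shows) and the standard linear-warmup-then-cosine form:

```python
import math

def lr_at_step(step, base_lr=2e-5, num_epochs=100, steps_per_epoch=54,
               warmup_ratio=0.01):
    """Learning rate at a given optimizer step under warmup + cosine decay."""
    total_steps = num_epochs * steps_per_epoch       # 5400 steps
    warmup_steps = int(warmup_ratio * total_steps)   # 54 warmup steps
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Half-cosine decay from base_lr down to 0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at_step(54))    # peak: base_lr, reached right after warmup
print(lr_at_step(5400))  # decayed to ~0 at the end of training
```

Note that training here stopped at epoch 37 of the configured 100, so only the early part of this schedule was actually traversed.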
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0058 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.5297 | 1.0 | 54 | 0.5655 | 0.0058 | 2667.8169 | 1849.1897 | 2391.0 | 3270.0 | 0.7312 | 2397.0 | 0.7330 | 1482.0 | 1483.0 | 2026.0 | 0.7320 | 0.7315 | 906.0 | 908.0 | 1231.0 | 0.7376 | 0.7360 |
| 0.429 | 2.0 | 108 | 0.6555 | 0.0058 | 3092.6188 | 2143.6400 | 2562.0 | 3270.0 | 0.7835 | 2524.0 | 0.7719 | 1836.0 | 1856.0 | 2026.0 | 0.9161 | 0.9062 | 680.0 | 706.0 | 1231.0 | 0.5735 | 0.5524 |
| 0.1091 | 3.0 | 162 | 0.9086 | 0.0058 | 4286.2025 | 2970.9692 | 2566.0 | 3270.0 | 0.7847 | 2517.0 | 0.7697 | 1682.0 | 1729.0 | 2026.0 | 0.8534 | 0.8302 | 827.0 | 837.0 | 1231.0 | 0.6799 | 0.6718 |
| 0.3827 | 4.0 | 216 | 1.0952 | 0.0058 | 5166.9108 | 3581.4297 | 2592.0 | 3270.0 | 0.7927 | 2498.0 | 0.7639 | 1556.0 | 1643.0 | 2026.0 | 0.8110 | 0.7680 | 933.0 | 949.0 | 1231.0 | 0.7709 | 0.7579 |
| 0.0001 | 5.0 | 270 | 1.4491 | 0.0058 | 6836.4286 | 4738.6512 | 2590.0 | 3270.0 | 0.7920 | 2589.0 | 0.7917 | 1644.0 | 1653.0 | 2026.0 | 0.8159 | 0.8115 | 936.0 | 937.0 | 1231.0 | 0.7612 | 0.7604 |
| 0.0006 | 6.0 | 324 | 1.6538 | 0.0058 | 7802.1499 | 5408.0382 | 2602.0 | 3270.0 | 0.7957 | 2605.0 | 0.7966 | 1797.0 | 1801.0 | 2026.0 | 0.8889 | 0.8870 | 799.0 | 801.0 | 1231.0 | 0.6507 | 0.6491 |
| 0.0001 | 7.0 | 378 | 1.5927 | 0.0058 | 7513.6957 | 5208.0970 | 2609.0 | 3270.0 | 0.7979 | 2610.0 | 0.7982 | 1661.0 | 1668.0 | 2026.0 | 0.8233 | 0.8198 | 940.0 | 941.0 | 1231.0 | 0.7644 | 0.7636 |
| 0.0 | 8.0 | 432 | 1.6183 | 0.0058 | 7634.6804 | 5291.9572 | 2601.0 | 3270.0 | 0.7954 | 2603.0 | 0.7960 | 1657.0 | 1663.0 | 2026.0 | 0.8208 | 0.8179 | 937.0 | 938.0 | 1231.0 | 0.7620 | 0.7612 |
| 0.0 | 9.0 | 486 | 1.6373 | 0.0058 | 7724.0835 | 5353.9267 | 2602.0 | 3270.0 | 0.7957 | 2605.0 | 0.7966 | 1659.0 | 1664.0 | 2026.0 | 0.8213 | 0.8189 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 10.0 | 540 | 1.6487 | 0.0058 | 7777.7540 | 5391.1283 | 2603.0 | 3270.0 | 0.7960 | 2607.0 | 0.7972 | 1660.0 | 1665.0 | 2026.0 | 0.8218 | 0.8193 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 11.0 | 594 | 1.6598 | 0.0058 | 7830.1323 | 5427.4341 | 2603.0 | 3270.0 | 0.7960 | 2607.0 | 0.7972 | 1659.0 | 1664.0 | 2026.0 | 0.8213 | 0.8189 | 939.0 | 939.0 | 1231.0 | 0.7628 | 0.7628 |
| 0.0 | 12.0 | 648 | 1.6698 | 0.0058 | 7877.2390 | 5460.0860 | 2604.0 | 3270.0 | 0.7963 | 2607.0 | 0.7972 | 1661.0 | 1667.0 | 2026.0 | 0.8228 | 0.8198 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 13.0 | 702 | 1.6788 | 0.0058 | 7920.0781 | 5489.7798 | 2603.0 | 3270.0 | 0.7960 | 2606.0 | 0.7969 | 1660.0 | 1666.0 | 2026.0 | 0.8223 | 0.8193 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 14.0 | 756 | 1.6880 | 0.0058 | 7963.3426 | 5519.7685 | 2604.0 | 3270.0 | 0.7963 | 2607.0 | 0.7972 | 1660.0 | 1666.0 | 2026.0 | 0.8223 | 0.8193 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 15.0 | 810 | 1.6918 | 0.0058 | 7981.2343 | 5532.1700 | 2603.0 | 3270.0 | 0.7960 | 2607.0 | 0.7972 | 1660.0 | 1665.0 | 2026.0 | 0.8218 | 0.8193 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 16.0 | 864 | 1.6997 | 0.0058 | 8018.5627 | 5558.0441 | 2606.0 | 3270.0 | 0.7969 | 2610.0 | 0.7982 | 1663.0 | 1668.0 | 2026.0 | 0.8233 | 0.8208 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 17.0 | 918 | 1.7052 | 0.0058 | 8044.3508 | 5575.9191 | 2601.0 | 3270.0 | 0.7954 | 2605.0 | 0.7966 | 1659.0 | 1664.0 | 2026.0 | 0.8213 | 0.8189 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 18.0 | 972 | 1.7084 | 0.0058 | 8059.3827 | 5586.3384 | 2602.0 | 3270.0 | 0.7957 | 2606.0 | 0.7969 | 1660.0 | 1665.0 | 2026.0 | 0.8218 | 0.8193 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 19.0 | 1026 | 1.7110 | 0.0058 | 8071.9707 | 5595.0637 | 2603.0 | 3270.0 | 0.7960 | 2607.0 | 0.7972 | 1661.0 | 1666.0 | 2026.0 | 0.8223 | 0.8198 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 20.0 | 1080 | 1.7147 | 0.0058 | 8089.1272 | 5606.9557 | 2602.0 | 3270.0 | 0.7957 | 2605.0 | 0.7966 | 1659.0 | 1665.0 | 2026.0 | 0.8218 | 0.8189 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 21.0 | 1134 | 1.7125 | 0.0058 | 8079.1133 | 5600.0146 | 2605.0 | 3270.0 | 0.7966 | 2609.0 | 0.7979 | 1662.0 | 1667.0 | 2026.0 | 0.8228 | 0.8203 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 22.0 | 1188 | 1.7151 | 0.0058 | 8091.0910 | 5608.3169 | 2604.0 | 3270.0 | 0.7963 | 2607.0 | 0.7972 | 1659.0 | 1665.0 | 2026.0 | 0.8218 | 0.8189 | 939.0 | 939.0 | 1231.0 | 0.7628 | 0.7628 |
| 0.0 | 23.0 | 1242 | 1.7160 | 0.0058 | 8095.3381 | 5611.2608 | 2602.0 | 3270.0 | 0.7957 | 2606.0 | 0.7969 | 1660.0 | 1665.0 | 2026.0 | 0.8218 | 0.8193 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 24.0 | 1296 | 1.7144 | 0.0058 | 8087.7767 | 5606.0196 | 2605.0 | 3270.0 | 0.7966 | 2609.0 | 0.7979 | 1663.0 | 1668.0 | 2026.0 | 0.8233 | 0.8208 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 25.0 | 1350 | 1.7150 | 0.0058 | 8090.7756 | 5608.0983 | 2604.0 | 3270.0 | 0.7963 | 2608.0 | 0.7976 | 1662.0 | 1667.0 | 2026.0 | 0.8228 | 0.8203 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 26.0 | 1404 | 1.7128 | 0.0058 | 8080.3770 | 5600.8906 | 2603.0 | 3270.0 | 0.7960 | 2607.0 | 0.7972 | 1662.0 | 1667.0 | 2026.0 | 0.8228 | 0.8203 | 936.0 | 936.0 | 1231.0 | 0.7604 | 0.7604 |
| 0.0 | 27.0 | 1458 | 1.7153 | 0.0058 | 8092.1583 | 5609.0567 | 2605.0 | 3270.0 | 0.7966 | 2609.0 | 0.7979 | 1662.0 | 1667.0 | 2026.0 | 0.8228 | 0.8203 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 28.0 | 1512 | 1.7140 | 0.0058 | 8086.0919 | 5604.8518 | 2604.0 | 3270.0 | 0.7963 | 2607.0 | 0.7972 | 1660.0 | 1666.0 | 2026.0 | 0.8223 | 0.8193 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 29.0 | 1566 | 1.7153 | 0.0058 | 8092.1390 | 5609.0433 | 2602.0 | 3270.0 | 0.7957 | 2606.0 | 0.7969 | 1660.0 | 1665.0 | 2026.0 | 0.8218 | 0.8193 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 30.0 | 1620 | 1.7134 | 0.0058 | 8083.1388 | 5602.8049 | 2605.0 | 3270.0 | 0.7966 | 2609.0 | 0.7979 | 1663.0 | 1668.0 | 2026.0 | 0.8233 | 0.8208 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 31.0 | 1674 | 1.7155 | 0.0058 | 8092.8767 | 5609.5547 | 2604.0 | 3270.0 | 0.7963 | 2608.0 | 0.7976 | 1661.0 | 1666.0 | 2026.0 | 0.8223 | 0.8198 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 32.0 | 1728 | 1.7118 | 0.0058 | 8075.5841 | 5597.5684 | 2605.0 | 3270.0 | 0.7966 | 2610.0 | 0.7982 | 1663.0 | 1667.0 | 2026.0 | 0.8228 | 0.8208 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 33.0 | 1782 | 1.7150 | 0.0058 | 8090.7632 | 5608.0897 | 2605.0 | 3270.0 | 0.7966 | 2609.0 | 0.7979 | 1663.0 | 1668.0 | 2026.0 | 0.8233 | 0.8208 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 34.0 | 1836 | 1.7158 | 0.0058 | 8094.6523 | 5610.7854 | 2605.0 | 3270.0 | 0.7966 | 2609.0 | 0.7979 | 1663.0 | 1668.0 | 2026.0 | 0.8233 | 0.8208 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 35.0 | 1890 | 1.7144 | 0.0058 | 8087.7042 | 5605.9694 | 2606.0 | 3270.0 | 0.7969 | 2610.0 | 0.7982 | 1664.0 | 1669.0 | 2026.0 | 0.8238 | 0.8213 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
| 0.0 | 36.0 | 1944 | 1.7155 | 0.0058 | 8093.0727 | 5609.6905 | 2602.0 | 3270.0 | 0.7957 | 2607.0 | 0.7972 | 1660.0 | 1664.0 | 2026.0 | 0.8213 | 0.8193 | 938.0 | 938.0 | 1231.0 | 0.7620 | 0.7620 |
| 0.0 | 37.0 | 1998 | 1.7157 | 0.0058 | 8094.0158 | 5610.3442 | 2604.0 | 3270.0 | 0.7963 | 2608.0 | 0.7976 | 1662.0 | 1667.0 | 2026.0 | 0.8228 | 0.8203 | 937.0 | 937.0 | 1231.0 | 0.7612 | 0.7612 |
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1