# BoolQ_Llama-3.2-1B-131yj8sj
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset (the model name suggests BoolQ). It achieves the following results on the evaluation set:
- Loss: 1.4452
- Model Preparation Time: 0.0057
- Mdl: 6818.1174
- Accumulated Loss: 4725.9588
- Correct Preds: 2702.0
- Total Preds: 3270.0
- Accuracy: 0.8263
- Correct Gen Preds: 2701.0
- Gen Accuracy: 0.8260
- Correct Gen Preds 9642: 1791.0
- Correct Preds 9642: 1798.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.8875
- Gen Accuracy 9642: 0.8840
- Correct Gen Preds 2822: 901.0
- Correct Preds 2822: 904.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.7344
- Gen Accuracy 2822: 0.7319

The metrics suffixed with 9642 and 2822 appear to be per-label breakdowns of the aggregate numbers (the suffixes are most likely the token IDs of the two answer labels).
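The aggregate numbers above are internally consistent under a few simple relations. As a hedged reading (not confirmed by the card): accuracy is correct over total predictions, the accumulated loss is roughly the mean loss times the number of predictions, and Mdl looks like the accumulated loss converted from nats to bits:

```python
import math

# Reported evaluation metrics, copied from the list above.
loss = 1.4452                 # mean eval loss (nats per example)
accumulated_loss = 4725.9588
mdl = 6818.1174
correct_preds, total_preds = 2702, 3270

# Accuracy = correct predictions / total predictions.
accuracy = correct_preds / total_preds
print(round(accuracy, 4))                        # 0.8263

# Accumulated loss ~ mean loss x number of predictions.
print(round(loss * total_preds, 1))              # ~4725.8

# Mdl (minimum description length, in bits) ~ accumulated loss / ln(2).
print(round(accumulated_loss / math.log(2), 1))  # ~6818.1
```

The same check reproduces the per-label accuracies: 1798/2026 ≈ 0.8875 and 904/1231 ≈ 0.7344.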
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
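The cosine schedule with a 0.01 warmup ratio can be sketched in plain Python. The total step count of 14300 is an assumption inferred from the results table (143 optimizer steps per epoch times `num_epochs: 100`; training logs below stop at epoch 26, but the schedule would have been laid out over the full run):

```python
import math

def lr_at_step(step, total_steps, base_lr=2e-5, warmup_ratio=0.01):
    """Cosine decay with linear warmup, mirroring the listed
    learning_rate, lr_scheduler_type, and lr_scheduler_warmup_ratio."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the base learning rate.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 14300  # assumed: 143 steps/epoch (from the table) x 100 epochs
print(lr_at_step(0, total))      # 0.0 (start of warmup)
print(lr_at_step(143, total))    # 2e-05 (peak, end of warmup)
print(lr_at_step(total, total))  # ~0.0 (fully decayed)
```

This mirrors the behavior of the scheduler used by the `transformers` Trainer for `lr_scheduler_type: cosine`; it is an illustrative sketch, not the exact library implementation.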
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0057 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.2476 | 1.0 | 143 | 0.4988 | 0.0057 | 2353.0385 | 1631.0020 | 2591.0 | 3270.0 | 0.7924 | 2599.0 | 0.7948 | 1843.0 | 1843.0 | 2026.0 | 0.9097 | 0.9097 | 747.0 | 748.0 | 1231.0 | 0.6076 | 0.6068 |
| 0.0885 | 2.0 | 286 | 0.5426 | 0.0057 | 2559.9190 | 1774.4006 | 2626.0 | 3270.0 | 0.8031 | 2626.0 | 0.8031 | 1900.0 | 1906.0 | 2026.0 | 0.9408 | 0.9378 | 717.0 | 720.0 | 1231.0 | 0.5849 | 0.5825 |
| 0.0086 | 3.0 | 429 | 0.7471 | 0.0057 | 3524.5342 | 2443.0209 | 2655.0 | 3270.0 | 0.8119 | 2625.0 | 0.8028 | 1638.0 | 1667.0 | 2026.0 | 0.8228 | 0.8085 | 978.0 | 988.0 | 1231.0 | 0.8026 | 0.7945 |
| 0.0002 | 4.0 | 572 | 1.1866 | 0.0057 | 5597.8044 | 3880.1023 | 2662.0 | 3270.0 | 0.8141 | 2663.0 | 0.8144 | 1703.0 | 1707.0 | 2026.0 | 0.8425 | 0.8406 | 953.0 | 955.0 | 1231.0 | 0.7758 | 0.7742 |
| 0.0115 | 5.0 | 715 | 1.3058 | 0.0057 | 6160.2400 | 4269.9530 | 2673.0 | 3270.0 | 0.8174 | 2664.0 | 0.8147 | 1791.0 | 1797.0 | 2026.0 | 0.8870 | 0.8840 | 864.0 | 876.0 | 1231.0 | 0.7116 | 0.7019 |
| 0.0 | 6.0 | 858 | 1.4452 | 0.0057 | 6818.1174 | 4725.9588 | 2702.0 | 3270.0 | 0.8263 | 2701.0 | 0.8260 | 1791.0 | 1798.0 | 2026.0 | 0.8875 | 0.8840 | 901.0 | 904.0 | 1231.0 | 0.7344 | 0.7319 |
| 0.0 | 7.0 | 1001 | 1.4433 | 0.0057 | 6808.9128 | 4719.5787 | 2698.0 | 3270.0 | 0.8251 | 2704.0 | 0.8269 | 1812.0 | 1814.0 | 2026.0 | 0.8954 | 0.8944 | 883.0 | 884.0 | 1231.0 | 0.7181 | 0.7173 |
| 0.0 | 8.0 | 1144 | 1.3856 | 0.0057 | 6536.7240 | 4530.9118 | 2691.0 | 3270.0 | 0.8229 | 2694.0 | 0.8239 | 1768.0 | 1772.0 | 2026.0 | 0.8746 | 0.8727 | 917.0 | 919.0 | 1231.0 | 0.7465 | 0.7449 |
| 0.9802 | 9.0 | 1287 | 1.4773 | 0.0057 | 6969.2721 | 4830.7313 | 2692.0 | 3270.0 | 0.8232 | 2698.0 | 0.8251 | 1793.0 | 1795.0 | 2026.0 | 0.8860 | 0.8850 | 897.0 | 897.0 | 1231.0 | 0.7287 | 0.7287 |
| 0.0 | 10.0 | 1430 | 1.5437 | 0.0057 | 7282.6372 | 5047.9395 | 2695.0 | 3270.0 | 0.8242 | 2701.0 | 0.8260 | 1775.0 | 1777.0 | 2026.0 | 0.8771 | 0.8761 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 11.0 | 1573 | 1.5490 | 0.0057 | 7307.5108 | 5065.1805 | 2690.0 | 3270.0 | 0.8226 | 2696.0 | 0.8245 | 1771.0 | 1773.0 | 2026.0 | 0.8751 | 0.8741 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 12.0 | 1716 | 1.5529 | 0.0057 | 7325.9736 | 5077.9779 | 2692.0 | 3270.0 | 0.8232 | 2697.0 | 0.8248 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 13.0 | 1859 | 1.5565 | 0.0057 | 7343.1664 | 5089.8951 | 2691.0 | 3270.0 | 0.8229 | 2696.0 | 0.8245 | 1771.0 | 1773.0 | 2026.0 | 0.8751 | 0.8741 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 14.0 | 2002 | 1.5552 | 0.0057 | 7336.7036 | 5085.4154 | 2692.0 | 3270.0 | 0.8232 | 2697.0 | 0.8248 | 1772.0 | 1774.0 | 2026.0 | 0.8756 | 0.8746 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.9802 | 15.0 | 2145 | 1.5579 | 0.0057 | 7349.6490 | 5094.3885 | 2695.0 | 3270.0 | 0.8242 | 2700.0 | 0.8257 | 1774.0 | 1776.0 | 2026.0 | 0.8766 | 0.8756 | 918.0 | 919.0 | 1231.0 | 0.7465 | 0.7457 |
| 0.0 | 16.0 | 2288 | 1.5570 | 0.0057 | 7345.2574 | 5091.3444 | 2689.0 | 3270.0 | 0.8223 | 2694.0 | 0.8239 | 1770.0 | 1772.0 | 2026.0 | 0.8746 | 0.8736 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 17.0 | 2431 | 1.5594 | 0.0057 | 7356.5874 | 5099.1978 | 2693.0 | 3270.0 | 0.8235 | 2699.0 | 0.8254 | 1772.0 | 1774.0 | 2026.0 | 0.8756 | 0.8746 | 918.0 | 919.0 | 1231.0 | 0.7465 | 0.7457 |
| 0.0 | 18.0 | 2574 | 1.5588 | 0.0057 | 7354.0051 | 5097.4079 | 2693.0 | 3270.0 | 0.8235 | 2699.0 | 0.8254 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 19.0 | 2717 | 1.5574 | 0.0057 | 7347.1134 | 5092.6310 | 2694.0 | 3270.0 | 0.8239 | 2700.0 | 0.8257 | 1775.0 | 1777.0 | 2026.0 | 0.8771 | 0.8761 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 20.0 | 2860 | 1.5598 | 0.0057 | 7358.7582 | 5100.7025 | 2694.0 | 3270.0 | 0.8239 | 2699.0 | 0.8254 | 1776.0 | 1778.0 | 2026.0 | 0.8776 | 0.8766 | 915.0 | 916.0 | 1231.0 | 0.7441 | 0.7433 |
| 0.0 | 21.0 | 3003 | 1.5610 | 0.0057 | 7364.2419 | 5104.5035 | 2693.0 | 3270.0 | 0.8235 | 2699.0 | 0.8254 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 22.0 | 3146 | 1.5590 | 0.0057 | 7354.8963 | 5098.0257 | 2695.0 | 3270.0 | 0.8242 | 2700.0 | 0.8257 | 1775.0 | 1777.0 | 2026.0 | 0.8771 | 0.8761 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 23.0 | 3289 | 1.5609 | 0.0057 | 7363.6331 | 5104.0815 | 2692.0 | 3270.0 | 0.8232 | 2698.0 | 0.8251 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 24.0 | 3432 | 1.5620 | 0.0057 | 7368.7476 | 5107.6266 | 2694.0 | 3270.0 | 0.8239 | 2699.0 | 0.8254 | 1775.0 | 1777.0 | 2026.0 | 0.8771 | 0.8761 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 25.0 | 3575 | 1.5613 | 0.0057 | 7365.4606 | 5105.3482 | 2693.0 | 3270.0 | 0.8235 | 2699.0 | 0.8254 | 1774.0 | 1776.0 | 2026.0 | 0.8766 | 0.8756 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 26.0 | 3718 | 1.5604 | 0.0057 | 7361.4952 | 5102.5996 | 2692.0 | 3270.0 | 0.8232 | 2697.0 | 0.8248 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
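The headline metrics at the top of this card match the epoch-6 row of the table (validation loss 1.4452, accuracy 0.8263), which suggests the reported checkpoint was selected by best accuracy rather than lowest loss. A minimal sketch of that selection over a few rows copied from the table:

```python
# (epoch, validation_loss, accuracy) for a few rows of the table above.
rows = [
    (1, 0.4988, 0.7924),
    (3, 0.7471, 0.8119),
    (6, 1.4452, 0.8263),
    (7, 1.4433, 0.8251),
    (26, 1.5604, 0.8232),
]

# Pick the best checkpoint by accuracy (the apparent selection criterion).
best = max(rows, key=lambda r: r[2])
print(best)  # (6, 1.4452, 0.8263)
```

Note that validation loss keeps rising after epoch 1 while accuracy plateaus around 0.82, a typical overfitting pattern for a small model fine-tuned for many epochs.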
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1