BoolQ_Llama-3.2-1B-50ztosrf
This model is a fine-tuned version of meta-llama/Llama-3.2-1B; the training dataset is not recorded in this card. The results below, measured on the evaluation set, match the epoch 4 row (step 1180) of the training results table:
- Loss: 0.9785
- Model Preparation Time: 0.0058
- Mdl: 4616.0341
- Accumulated Loss: 3199.5911
- Correct Preds: 2776.0
- Total Preds: 3270.0
- Accuracy: 0.8489
- Correct Gen Preds: 2773.0
- Gen Accuracy: 0.8480
- Correct Gen Preds 9642: 1791.0
- Correct Preds 9642: 1798.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.8875
- Gen Accuracy 9642: 0.8840
- Correct Gen Preds 2822: 973.0
- Correct Preds 2822: 978.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.7945
- Gen Accuracy 2822: 0.7904
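The summary figures above appear to be related by simple arithmetic. The sketch below recomputes them from the raw counts. It assumes that Accumulated Loss is the summed evaluation loss in nats and that Mdl is the same quantity expressed in bits; the card does not state either definition, but the numbers are consistent with that reading. The suffixes 9642 and 2822 are treated as opaque label identifiers, and all variable names are illustrative.

```python
import math

# Values copied from the evaluation summary in this card.
accumulated_loss = 3199.5911  # assumed: summed loss in nats
total_preds = 3270
correct_preds = 2776
correct_gen_preds = 2773

# Per-label counts; the keys are the label identifiers used in the card.
per_label = {
    "9642": {"correct": 1798, "correct_gen": 1791, "total": 2026},
    "2822": {"correct": 978, "correct_gen": 973, "total": 1231},
}

# Mean loss per prediction (compare with "Loss").
mean_loss = accumulated_loss / total_preds

# Summed loss converted from nats to bits (compare with "Mdl").
mdl_bits = accumulated_loss / math.log(2)

# Overall accuracies (compare with "Accuracy" and "Gen Accuracy").
accuracy = correct_preds / total_preds
gen_accuracy = correct_gen_preds / total_preds

# Per-label accuracies (compare with "Accuracy 9642", "Accuracy 2822", etc.).
label_accuracy = {k: v["correct"] / v["total"] for k, v in per_label.items()}
label_gen_accuracy = {k: v["correct_gen"] / v["total"] for k, v in per_label.items()}

print(f"mean loss     : {mean_loss:.4f}")
print(f"mdl (bits)    : {mdl_bits:.4f}")
print(f"accuracy      : {accuracy:.4f}")
print(f"gen accuracy  : {gen_accuracy:.4f}")
for label in per_label:
    print(f"label {label}: acc={label_accuracy[label]:.4f} "
          f"gen_acc={label_gen_accuracy[label]:.4f}")
```

The two per-label totals sum to 3257, not 3270, so 13 evaluation examples are not attributed to either label in this breakdown; the card does not explain the difference.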
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
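For readers who want to reproduce a similar setup, the hyperparameters above map roughly onto Hugging Face `TrainingArguments` as sketched below. This is an approximation, not the original training script: `output_dir` is a hypothetical path, the per-device batch sizes assume a single device, and the helper `build_training_arguments` is illustrative. The import is deferred so the dictionary can be inspected without `transformers` installed.

```python
# Keyword arguments mirroring the hyperparameter list in this card.
training_kwargs = {
    "output_dir": "boolq-llama-3.2-1b-run",  # hypothetical path, not from the card
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,   # assumes one device
    "per_device_eval_batch_size": 120,   # assumes one device
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    # Listed in the card; a plain constant schedule likely ignores warmup.
    "warmup_ratio": 0.001,
    "num_train_epochs": 100,
}


def build_training_arguments(kwargs=None):
    """Build TrainingArguments from the dictionary above (requires transformers)."""
    from transformers import TrainingArguments

    return TrainingArguments(**(kwargs or training_kwargs))


if __name__ == "__main__":
    for name, value in training_kwargs.items():
        print(f"{name} = {value}")
```

Note that the card lists `num_epochs: 100`, while the results table in this card ends at epoch 14; the reason for the difference is not stated.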
Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0058 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.3209 | 1.0 | 295 | 0.4495 | 0.0058 | 2120.4374 | 1469.7752 | 2653.0 | 3270.0 | 0.8113 | 2047.0 | 0.6260 | 1160.0 | 1557.0 | 2026.0 | 0.7685 | 0.5726 | 877.0 | 1096.0 | 1231.0 | 0.8903 | 0.7124 |
| 0.2469 | 2.0 | 590 | 0.4858 | 0.0058 | 2291.6124 | 1588.4247 | 2711.0 | 3270.0 | 0.8291 | 2070.0 | 0.6330 | 1100.0 | 1638.0 | 2026.0 | 0.8085 | 0.5429 | 963.0 | 1073.0 | 1231.0 | 0.8716 | 0.7823 |
| 0.001 | 3.0 | 885 | 0.9723 | 0.0058 | 4586.9576 | 3179.4368 | 2740.0 | 3270.0 | 0.8379 | 2711.0 | 0.8291 | 1752.0 | 1781.0 | 2026.0 | 0.8791 | 0.8648 | 950.0 | 959.0 | 1231.0 | 0.7790 | 0.7717 |
| 0.0001 | 4.0 | 1180 | 0.9785 | 0.0058 | 4616.0341 | 3199.5911 | 2776.0 | 3270.0 | 0.8489 | 2773.0 | 0.8480 | 1791.0 | 1798.0 | 2026.0 | 0.8875 | 0.8840 | 973.0 | 978.0 | 1231.0 | 0.7945 | 0.7904 |
| 0.0 | 5.0 | 1475 | 1.0825 | 0.0058 | 5106.9664 | 3539.8794 | 2764.0 | 3270.0 | 0.8453 | 2752.0 | 0.8416 | 1812.0 | 1828.0 | 2026.0 | 0.9023 | 0.8944 | 931.0 | 936.0 | 1231.0 | 0.7604 | 0.7563 |
| 0.0 | 6.0 | 1770 | 1.1417 | 0.0058 | 5385.9903 | 3733.2840 | 2768.0 | 3270.0 | 0.8465 | 2745.0 | 0.8394 | 1811.0 | 1832.0 | 2026.0 | 0.9042 | 0.8939 | 927.0 | 936.0 | 1231.0 | 0.7604 | 0.7530 |
| 0.0 | 7.0 | 2065 | 1.2458 | 0.0058 | 5877.3016 | 4073.8351 | 2762.0 | 3270.0 | 0.8446 | 2744.0 | 0.8391 | 1787.0 | 1807.0 | 2026.0 | 0.8919 | 0.8820 | 950.0 | 955.0 | 1231.0 | 0.7758 | 0.7717 |
| 0.0 | 8.0 | 2360 | 1.2477 | 0.0058 | 5886.0491 | 4079.8983 | 2762.0 | 3270.0 | 0.8446 | 2744.0 | 0.8391 | 1802.0 | 1821.0 | 2026.0 | 0.8988 | 0.8894 | 935.0 | 941.0 | 1231.0 | 0.7644 | 0.7595 |
| 0.6191 | 9.0 | 2655 | 1.2608 | 0.0058 | 5947.9186 | 4122.7830 | 2768.0 | 3270.0 | 0.8465 | 2749.0 | 0.8407 | 1805.0 | 1824.0 | 2026.0 | 0.9003 | 0.8909 | 937.0 | 944.0 | 1231.0 | 0.7669 | 0.7612 |
| 0.0 | 10.0 | 2950 | 1.2320 | 0.0058 | 5812.1306 | 4028.6620 | 2765.0 | 3270.0 | 0.8456 | 2746.0 | 0.8398 | 1807.0 | 1826.0 | 2026.0 | 0.9013 | 0.8919 | 932.0 | 939.0 | 1231.0 | 0.7628 | 0.7571 |
| 0.0001 | 11.0 | 3245 | 1.2418 | 0.0058 | 5858.3967 | 4060.7312 | 2767.0 | 3270.0 | 0.8462 | 2745.0 | 0.8394 | 1807.0 | 1829.0 | 2026.0 | 0.9028 | 0.8919 | 931.0 | 938.0 | 1231.0 | 0.7620 | 0.7563 |
| 0.0 | 12.0 | 3540 | 1.2715 | 0.0058 | 5998.6521 | 4157.9488 | 2765.0 | 3270.0 | 0.8456 | 2750.0 | 0.8410 | 1800.0 | 1817.0 | 2026.0 | 0.8968 | 0.8885 | 943.0 | 948.0 | 1231.0 | 0.7701 | 0.7660 |
| 0.0 | 13.0 | 3835 | 1.2808 | 0.0058 | 6042.3338 | 4188.2267 | 2760.0 | 3270.0 | 0.8440 | 2745.0 | 0.8394 | 1799.0 | 1816.0 | 2026.0 | 0.8963 | 0.8880 | 939.0 | 944.0 | 1231.0 | 0.7669 | 0.7628 |
| 0.0 | 14.0 | 4130 | 1.2792 | 0.0058 | 6034.8824 | 4183.0618 | 2765.0 | 3270.0 | 0.8456 | 2751.0 | 0.8413 | 1800.0 | 1818.0 | 2026.0 | 0.8973 | 0.8885 | 944.0 | 947.0 | 1231.0 | 0.7693 | 0.7669 |
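To see which checkpoint the table favours under different criteria, the sketch below transcribes three columns (epoch, validation loss, accuracy) and picks the best row by each. The variable names are illustrative, and the selection rule is an assumption; the card does not say how the reported checkpoint was chosen.

```python
# (epoch, validation_loss, accuracy) transcribed from the training results table.
rows = [
    (0, 0.7080, 0.6214),
    (1, 0.4495, 0.8113),
    (2, 0.4858, 0.8291),
    (3, 0.9723, 0.8379),
    (4, 0.9785, 0.8489),
    (5, 1.0825, 0.8453),
    (6, 1.1417, 0.8465),
    (7, 1.2458, 0.8446),
    (8, 1.2477, 0.8446),
    (9, 1.2608, 0.8465),
    (10, 1.2320, 0.8456),
    (11, 1.2418, 0.8462),
    (12, 1.2715, 0.8456),
    (13, 1.2808, 0.8440),
    (14, 1.2792, 0.8456),
]

# Highest accuracy; max() returns the earliest row on ties.
best_by_accuracy = max(rows, key=lambda r: r[2])

# Lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[1])

print("best by accuracy:", best_by_accuracy)
print("best by val loss:", best_by_loss)
```

Accuracy peaks at epoch 4, which matches the headline results, while validation loss is lowest at epoch 1. Rising validation loss alongside flat accuracy, with training loss near zero, is often a sign of overfitting or growing overconfidence on misclassified examples, though the card offers no analysis on this point.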
Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
Model tree for donoway/BoolQ_Llama-3.2-1B-50ztosrf
Base model
meta-llama/Llama-3.2-1B