# BoolQ_Llama-3.2-1B-xemwi1ki

This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset (the model name suggests BoolQ). It achieves the following results on the evaluation set:
- Loss: 1.1526
- Model Preparation Time: 0.0056
- Mdl: 5437.5984
- Accumulated Loss: 3769.0560
- Correct Preds: 2788.0
- Total Preds: 3270.0
- Accuracy: 0.8526
- Correct Gen Preds: 2792.0
- Gen Accuracy: 0.8538
- Correct Gen Preds 9642: 1819.0
- Correct Preds 9642: 1819.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.8978
- Gen Accuracy 9642: 0.8978
- Correct Gen Preds 2822: 969.0
- Correct Preds 2822: 969.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.7872
- Gen Accuracy 2822: 0.7872
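The headline figures above are internally consistent. As a quick sanity check (a sketch using only numbers reported in this card): the evaluation loss is the accumulated loss (summed cross-entropy in nats) averaged over predictions, the MDL figure is that same sum converted to bits, and the overall correct-prediction count is the sum of the two per-label counts:

```python
import math

# Figures copied from the evaluation summary above.
accumulated_loss = 3769.0560   # summed cross-entropy, in nats
total_preds = 3270
correct_preds = 2788
correct_preds_9642 = 1819      # correct predictions for label 9642
correct_preds_2822 = 969       # correct predictions for label 2822

# Mean loss per prediction matches the reported Loss of 1.1526.
print(round(accumulated_loss / total_preds, 4))    # -> 1.1526

# MDL is the accumulated loss converted from nats to bits.
print(round(accumulated_loss / math.log(2), 4))    # -> 5437.5984

# Overall correct predictions are the per-label counts summed.
print(correct_preds_9642 + correct_preds_2822)     # -> 2788

# Accuracy matches the reported 0.8526.
print(round(correct_preds / total_preds, 4))       # -> 0.8526
```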
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
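For reproduction, the hyperparameters above map onto a `transformers` `TrainingArguments` configuration roughly as follows. This is a hedged sketch, not the training script: `output_dir` and every argument not listed in this card are assumptions.

```python
from transformers import TrainingArguments

# Sketch of the reported configuration. "output_dir" is a placeholder,
# and any argument not listed in the card is left at its default.
training_args = TrainingArguments(
    output_dir="BoolQ_Llama-3.2-1B-xemwi1ki",  # assumed name
    learning_rate=2e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```

Note that with `lr_scheduler_type="constant"` the warmup ratio has no effect in `transformers`; a warmup phase would require `constant_with_warmup`.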
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0056 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.3269 | 1.0 | 280 | 0.4252 | 0.0056 | 2005.8025 | 1390.3164 | 2742.0 | 3270.0 | 0.8385 | 2747.0 | 0.8401 | 1852.0 | 1855.0 | 2026.0 | 0.9156 | 0.9141 | 885.0 | 887.0 | 1231.0 | 0.7206 | 0.7189 |
| 0.0893 | 2.0 | 560 | 0.4500 | 0.0056 | 2123.1104 | 1471.6280 | 2759.0 | 3270.0 | 0.8437 | 2597.0 | 0.7942 | 1714.0 | 1823.0 | 2026.0 | 0.8998 | 0.8460 | 875.0 | 936.0 | 1231.0 | 0.7604 | 0.7108 |
| 0.001 | 3.0 | 840 | 0.8164 | 0.0056 | 3851.5423 | 2669.6857 | 2759.0 | 3270.0 | 0.8437 | 2739.0 | 0.8376 | 1764.0 | 1785.0 | 2026.0 | 0.8810 | 0.8707 | 967.0 | 974.0 | 1231.0 | 0.7912 | 0.7855 |
| 0.0101 | 4.0 | 1120 | 0.9761 | 0.0056 | 4604.8533 | 3191.8411 | 2776.0 | 3270.0 | 0.8489 | 2781.0 | 0.8505 | 1869.0 | 1870.0 | 2026.0 | 0.9230 | 0.9225 | 906.0 | 906.0 | 1231.0 | 0.7360 | 0.7360 |
| 0.0 | 5.0 | 1400 | 1.1266 | 0.0056 | 5314.7413 | 3683.8980 | 2771.0 | 3270.0 | 0.8474 | 2774.0 | 0.8483 | 1790.0 | 1791.0 | 2026.0 | 0.8840 | 0.8835 | 980.0 | 980.0 | 1231.0 | 0.7961 | 0.7961 |
| 0.0 | 6.0 | 1680 | 1.0680 | 0.0056 | 5038.6131 | 3492.5005 | 2772.0 | 3270.0 | 0.8477 | 2776.0 | 0.8489 | 1855.0 | 1855.0 | 2026.0 | 0.9156 | 0.9156 | 917.0 | 917.0 | 1231.0 | 0.7449 | 0.7449 |
| 0.0001 | 7.0 | 1960 | 1.1526 | 0.0056 | 5437.5984 | 3769.0560 | 2788.0 | 3270.0 | 0.8526 | 2792.0 | 0.8538 | 1819.0 | 1819.0 | 2026.0 | 0.8978 | 0.8978 | 969.0 | 969.0 | 1231.0 | 0.7872 | 0.7872 |
| 0.0 | 8.0 | 2240 | 1.1742 | 0.0056 | 5539.3681 | 3839.5974 | 2783.0 | 3270.0 | 0.8511 | 2788.0 | 0.8526 | 1819.0 | 1819.0 | 2026.0 | 0.8978 | 0.8978 | 964.0 | 964.0 | 1231.0 | 0.7831 | 0.7831 |
| 0.0 | 9.0 | 2520 | 1.1807 | 0.0056 | 5570.2856 | 3861.0278 | 2786.0 | 3270.0 | 0.8520 | 2791.0 | 0.8535 | 1820.0 | 1820.0 | 2026.0 | 0.8983 | 0.8983 | 966.0 | 966.0 | 1231.0 | 0.7847 | 0.7847 |
| 0.0 | 10.0 | 2800 | 1.2039 | 0.0056 | 5679.6573 | 3936.8384 | 2783.0 | 3270.0 | 0.8511 | 2787.0 | 0.8523 | 1818.0 | 1818.0 | 2026.0 | 0.8973 | 0.8973 | 965.0 | 965.0 | 1231.0 | 0.7839 | 0.7839 |
| 0.0 | 11.0 | 3080 | 1.2093 | 0.0056 | 5704.9015 | 3954.3364 | 2784.0 | 3270.0 | 0.8514 | 2789.0 | 0.8529 | 1821.0 | 1821.0 | 2026.0 | 0.8988 | 0.8988 | 963.0 | 963.0 | 1231.0 | 0.7823 | 0.7823 |
| 0.0 | 12.0 | 3360 | 1.2143 | 0.0056 | 5728.6013 | 3970.7638 | 2785.0 | 3270.0 | 0.8517 | 2790.0 | 0.8532 | 1820.0 | 1820.0 | 2026.0 | 0.8983 | 0.8983 | 965.0 | 965.0 | 1231.0 | 0.7839 | 0.7839 |
| 0.0 | 13.0 | 3640 | 1.2183 | 0.0056 | 5747.3020 | 3983.7262 | 2785.0 | 3270.0 | 0.8517 | 2789.0 | 0.8529 | 1820.0 | 1820.0 | 2026.0 | 0.8983 | 0.8983 | 965.0 | 965.0 | 1231.0 | 0.7839 | 0.7839 |
| 0.0001 | 14.0 | 3920 | 1.2219 | 0.0056 | 5764.5638 | 3995.6911 | 2786.0 | 3270.0 | 0.8520 | 2791.0 | 0.8535 | 1821.0 | 1821.0 | 2026.0 | 0.8988 | 0.8988 | 965.0 | 965.0 | 1231.0 | 0.7839 | 0.7839 |
| 0.0 | 15.0 | 4200 | 1.2225 | 0.0056 | 5767.2744 | 3997.5700 | 2786.0 | 3270.0 | 0.8520 | 2791.0 | 0.8535 | 1820.0 | 1820.0 | 2026.0 | 0.8983 | 0.8983 | 966.0 | 966.0 | 1231.0 | 0.7847 | 0.7847 |
| 0.0 | 16.0 | 4480 | 1.2261 | 0.0056 | 5784.1350 | 4009.2569 | 2786.0 | 3270.0 | 0.8520 | 2791.0 | 0.8535 | 1819.0 | 1819.0 | 2026.0 | 0.8978 | 0.8978 | 967.0 | 967.0 | 1231.0 | 0.7855 | 0.7855 |
| 0.0 | 17.0 | 4760 | 1.2237 | 0.0056 | 5772.7257 | 4001.3485 | 2785.0 | 3270.0 | 0.8517 | 2790.0 | 0.8532 | 1820.0 | 1820.0 | 2026.0 | 0.8983 | 0.8983 | 965.0 | 965.0 | 1231.0 | 0.7839 | 0.7839 |
## Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
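A minimal usage sketch for the checkpoint `donoway/BoolQ_Llama-3.2-1B-xemwi1ki`. The prompt template and the one-token answer decoding below are assumptions; the card does not document how BoolQ examples were formatted during fine-tuning, so adjust both to match the actual training format.

```python
def build_prompt(passage: str, question: str) -> str:
    # Assumed BoolQ-style template -- not documented in the card.
    return f"{passage}\nQuestion: {question}\nAnswer:"


def predict(passage: str, question: str) -> str:
    # Imported lazily so build_prompt works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "donoway/BoolQ_Llama-3.2-1B-xemwi1ki"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(passage, question), return_tensors="pt")
    with torch.no_grad():
        # Assumes the fine-tuned model emits its yes/no answer as the
        # next generated token.
        out = model.generate(**inputs, max_new_tokens=1)
    return tokenizer.decode(out[0, inputs["input_ids"].shape[1]:])
```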