# BoolQ_Llama-3.2-1B-5r42yp3k
This model is a fine-tuned version of meta-llama/Llama-3.2-1B on the BoolQ dataset. It achieves the following results on the evaluation set:
- Loss: 1.5466
- Model Preparation Time: 0.0056
- Mdl: 7296.3329
- Accumulated Loss: 5057.4326
- Correct Preds: 2619.0
- Total Preds: 3270.0
- Accuracy: 0.8009
- Correct Gen Preds: 2594.0
- Gen Accuracy: 0.7933
- Correct Gen Preds 9642: 1748.0
- Correct Preds 9642: 1776.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.8766
- Gen Accuracy 9642: 0.8628
- Correct Gen Preds 2822: 838.0
- Correct Preds 2822: 843.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.6848
- Gen Accuracy 2822: 0.6807
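The headline metrics above follow directly from the raw counts. The Mdl value appears to be the accumulated cross-entropy loss converted from nats to bits (an assumption based on the reported numbers; the two values differ by a factor of ln 2):

```python
import math

# Reported evaluation counts (taken from the list above)
correct_preds = 2619
total_preds = 3270
accumulated_loss_nats = 5057.4326  # summed cross-entropy over the eval set, in nats

# Accuracy is simply correct / total
accuracy = correct_preds / total_preds  # ~0.8009

# Assumption: "Mdl" is the accumulated loss expressed in bits
# (minimum description length of the labels under the model)
mdl_bits = accumulated_loss_nats / math.log(2)  # ~7296.33
```

The same arithmetic applied to the per-label counts (suffixes 9642 and 2822) reproduces the per-label accuracies, e.g. 1776 / 2026 ≈ 0.8766.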
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
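The learning-rate schedule implied by these settings (linear warmup over 1% of training, then cosine decay) can be sketched in plain Python. The 43 optimizer steps per epoch are inferred from the Step column of the results table below; the exact transformers scheduler implementation may differ in minor details:

```python
import math

base_lr = 2e-5
steps_per_epoch = 43                          # inferred from the results table
num_epochs = 100
total_steps = steps_per_epoch * num_epochs    # 4300
warmup_steps = math.ceil(0.01 * total_steps)  # warmup_ratio = 0.01 -> 43 steps

def lr_at(step: int) -> float:
    """Learning rate after `step` optimizer steps: linear warmup, cosine decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

At step 0 the rate is 0, it peaks at 2e-05 once warmup completes, and decays back to 0 by step 4300; note that training below stops well before the configured 100 epochs.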
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0056 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.4335 | 1.0 | 43 | 0.5330 | 0.0056 | 2514.6645 | 1743.0326 | 2457.0 | 3270.0 | 0.7514 | 2447.0 | 0.7483 | 1619.0 | 1630.0 | 2026.0 | 0.8045 | 0.7991 | 819.0 | 827.0 | 1231.0 | 0.6718 | 0.6653 |
| 0.2605 | 2.0 | 86 | 0.6563 | 0.0056 | 3096.0653 | 2146.0289 | 2450.0 | 3270.0 | 0.7492 | 1969.0 | 0.6021 | 1023.0 | 1427.0 | 2026.0 | 0.7043 | 0.5049 | 939.0 | 1023.0 | 1231.0 | 0.8310 | 0.7628 |
| 0.0158 | 3.0 | 129 | 1.0674 | 0.0056 | 5035.6484 | 3490.4455 | 2536.0 | 3270.0 | 0.7755 | 2378.0 | 0.7272 | 1717.0 | 1872.0 | 2026.0 | 0.9240 | 0.8475 | 654.0 | 664.0 | 1231.0 | 0.5394 | 0.5313 |
| 0.1505 | 4.0 | 172 | 1.4954 | 0.0056 | 7054.8825 | 4890.0719 | 2587.0 | 3270.0 | 0.7911 | 2572.0 | 0.7865 | 1811.0 | 1831.0 | 2026.0 | 0.9038 | 0.8939 | 752.0 | 756.0 | 1231.0 | 0.6141 | 0.6109 |
| 0.0 | 5.0 | 215 | 1.4715 | 0.0056 | 6942.0371 | 4811.8535 | 2611.0 | 3270.0 | 0.7985 | 2575.0 | 0.7875 | 1690.0 | 1727.0 | 2026.0 | 0.8524 | 0.8342 | 877.0 | 884.0 | 1231.0 | 0.7181 | 0.7124 |
| 0.0004 | 6.0 | 258 | 1.5466 | 0.0056 | 7296.3329 | 5057.4326 | 2619.0 | 3270.0 | 0.8009 | 2594.0 | 0.7933 | 1748.0 | 1776.0 | 2026.0 | 0.8766 | 0.8628 | 838.0 | 843.0 | 1231.0 | 0.6848 | 0.6807 |
| 0.0 | 7.0 | 301 | 1.5498 | 0.0056 | 7311.3028 | 5067.8089 | 2617.0 | 3270.0 | 0.8003 | 2587.0 | 0.7911 | 1708.0 | 1739.0 | 2026.0 | 0.8583 | 0.8430 | 871.0 | 878.0 | 1231.0 | 0.7132 | 0.7076 |
| 0.0 | 8.0 | 344 | 1.5583 | 0.0056 | 7351.5687 | 5095.7191 | 2617.0 | 3270.0 | 0.8003 | 2591.0 | 0.7924 | 1708.0 | 1737.0 | 2026.0 | 0.8574 | 0.8430 | 875.0 | 880.0 | 1231.0 | 0.7149 | 0.7108 |
| 0.0 | 9.0 | 387 | 1.5645 | 0.0056 | 7380.4891 | 5115.7652 | 2615.0 | 3270.0 | 0.7997 | 2589.0 | 0.7917 | 1710.0 | 1738.0 | 2026.0 | 0.8578 | 0.8440 | 871.0 | 877.0 | 1231.0 | 0.7124 | 0.7076 |
| 0.0 | 10.0 | 430 | 1.5689 | 0.0056 | 7401.5336 | 5130.3521 | 2615.0 | 3270.0 | 0.7997 | 2593.0 | 0.7930 | 1712.0 | 1738.0 | 2026.0 | 0.8578 | 0.8450 | 873.0 | 877.0 | 1231.0 | 0.7124 | 0.7092 |
| 0.0 | 11.0 | 473 | 1.5753 | 0.0056 | 7431.6332 | 5151.2156 | 2618.0 | 3270.0 | 0.8006 | 2595.0 | 0.7936 | 1713.0 | 1738.0 | 2026.0 | 0.8578 | 0.8455 | 873.0 | 880.0 | 1231.0 | 0.7149 | 0.7092 |
| 0.0 | 12.0 | 516 | 1.5764 | 0.0056 | 7436.8304 | 5154.8180 | 2617.0 | 3270.0 | 0.8003 | 2594.0 | 0.7933 | 1714.0 | 1739.0 | 2026.0 | 0.8583 | 0.8460 | 872.0 | 878.0 | 1231.0 | 0.7132 | 0.7084 |
| 0.0 | 13.0 | 559 | 1.5821 | 0.0056 | 7463.8777 | 5173.5658 | 2616.0 | 3270.0 | 0.8 | 2592.0 | 0.7927 | 1712.0 | 1738.0 | 2026.0 | 0.8578 | 0.8450 | 872.0 | 878.0 | 1231.0 | 0.7132 | 0.7084 |
| 0.0 | 14.0 | 602 | 1.5848 | 0.0056 | 7476.3623 | 5182.2194 | 2615.0 | 3270.0 | 0.7997 | 2592.0 | 0.7927 | 1711.0 | 1737.0 | 2026.0 | 0.8574 | 0.8445 | 873.0 | 878.0 | 1231.0 | 0.7132 | 0.7092 |
| 0.0 | 15.0 | 645 | 1.5866 | 0.0056 | 7484.9367 | 5188.1628 | 2617.0 | 3270.0 | 0.8003 | 2595.0 | 0.7936 | 1712.0 | 1738.0 | 2026.0 | 0.8578 | 0.8450 | 874.0 | 879.0 | 1231.0 | 0.7141 | 0.7100 |
| 0.9802 | 16.0 | 688 | 1.5898 | 0.0056 | 7499.9718 | 5198.5843 | 2617.0 | 3270.0 | 0.8003 | 2597.0 | 0.7942 | 1714.0 | 1738.0 | 2026.0 | 0.8578 | 0.8460 | 875.0 | 879.0 | 1231.0 | 0.7141 | 0.7108 |
| 0.0 | 17.0 | 731 | 1.5963 | 0.0056 | 7530.6554 | 5219.8526 | 2616.0 | 3270.0 | 0.8 | 2597.0 | 0.7942 | 1715.0 | 1739.0 | 2026.0 | 0.8583 | 0.8465 | 874.0 | 877.0 | 1231.0 | 0.7124 | 0.7100 |
| 0.0 | 18.0 | 774 | 1.6015 | 0.0056 | 7555.0401 | 5236.7547 | 2613.0 | 3270.0 | 0.7991 | 2592.0 | 0.7927 | 1712.0 | 1737.0 | 2026.0 | 0.8574 | 0.8450 | 872.0 | 876.0 | 1231.0 | 0.7116 | 0.7084 |
| 0.0 | 19.0 | 817 | 1.5991 | 0.0056 | 7543.8108 | 5228.9712 | 2618.0 | 3270.0 | 0.8006 | 2597.0 | 0.7942 | 1713.0 | 1738.0 | 2026.0 | 0.8578 | 0.8455 | 876.0 | 880.0 | 1231.0 | 0.7149 | 0.7116 |
| 0.0 | 20.0 | 860 | 1.6021 | 0.0056 | 7558.1173 | 5238.8877 | 2616.0 | 3270.0 | 0.8 | 2596.0 | 0.7939 | 1715.0 | 1739.0 | 2026.0 | 0.8583 | 0.8465 | 873.0 | 877.0 | 1231.0 | 0.7124 | 0.7092 |
| 0.0 | 21.0 | 903 | 1.6036 | 0.0056 | 7565.0561 | 5243.6973 | 2614.0 | 3270.0 | 0.7994 | 2594.0 | 0.7933 | 1713.0 | 1737.0 | 2026.0 | 0.8574 | 0.8455 | 873.0 | 877.0 | 1231.0 | 0.7124 | 0.7092 |
| 0.0 | 22.0 | 946 | 1.6052 | 0.0056 | 7572.8549 | 5249.1031 | 2615.0 | 3270.0 | 0.7997 | 2596.0 | 0.7939 | 1713.0 | 1737.0 | 2026.0 | 0.8574 | 0.8455 | 874.0 | 878.0 | 1231.0 | 0.7132 | 0.7100 |
| 0.0 | 23.0 | 989 | 1.6049 | 0.0056 | 7571.4610 | 5248.1369 | 2614.0 | 3270.0 | 0.7994 | 2595.0 | 0.7936 | 1712.0 | 1736.0 | 2026.0 | 0.8569 | 0.8450 | 875.0 | 878.0 | 1231.0 | 0.7132 | 0.7108 |
| 0.0 | 24.0 | 1032 | 1.6037 | 0.0056 | 7565.6381 | 5244.1007 | 2616.0 | 3270.0 | 0.8 | 2597.0 | 0.7942 | 1716.0 | 1739.0 | 2026.0 | 0.8583 | 0.8470 | 873.0 | 877.0 | 1231.0 | 0.7124 | 0.7092 |
| 0.0 | 25.0 | 1075 | 1.6096 | 0.0056 | 7593.4658 | 5263.3894 | 2615.0 | 3270.0 | 0.7997 | 2595.0 | 0.7936 | 1714.0 | 1738.0 | 2026.0 | 0.8578 | 0.8460 | 873.0 | 877.0 | 1231.0 | 0.7124 | 0.7092 |
| 0.0 | 26.0 | 1118 | 1.6081 | 0.0056 | 7586.3418 | 5258.4514 | 2618.0 | 3270.0 | 0.8006 | 2600.0 | 0.7951 | 1717.0 | 1739.0 | 2026.0 | 0.8583 | 0.8475 | 875.0 | 879.0 | 1231.0 | 0.7141 | 0.7108 |
| 0.0 | 27.0 | 1161 | 1.6060 | 0.0056 | 7576.7036 | 5251.7707 | 2615.0 | 3270.0 | 0.7997 | 2594.0 | 0.7933 | 1712.0 | 1737.0 | 2026.0 | 0.8574 | 0.8450 | 874.0 | 878.0 | 1231.0 | 0.7132 | 0.7100 |
| 0.0 | 28.0 | 1204 | 1.6088 | 0.0056 | 7589.7099 | 5260.7860 | 2617.0 | 3270.0 | 0.8003 | 2598.0 | 0.7945 | 1717.0 | 1739.0 | 2026.0 | 0.8583 | 0.8475 | 873.0 | 878.0 | 1231.0 | 0.7132 | 0.7092 |
| 0.0 | 29.0 | 1247 | 1.6068 | 0.0056 | 7580.2581 | 5254.2345 | 2613.0 | 3270.0 | 0.7991 | 2595.0 | 0.7936 | 1717.0 | 1740.0 | 2026.0 | 0.8588 | 0.8475 | 869.0 | 873.0 | 1231.0 | 0.7092 | 0.7059 |
| 0.0 | 30.0 | 1290 | 1.6088 | 0.0056 | 7589.7604 | 5260.8210 | 2616.0 | 3270.0 | 0.8 | 2599.0 | 0.7948 | 1716.0 | 1738.0 | 2026.0 | 0.8578 | 0.8470 | 875.0 | 878.0 | 1231.0 | 0.7132 | 0.7108 |
| 0.0 | 31.0 | 1333 | 1.6060 | 0.0056 | 7576.4338 | 5251.5837 | 2611.0 | 3270.0 | 0.7985 | 2592.0 | 0.7927 | 1713.0 | 1736.0 | 2026.0 | 0.8569 | 0.8455 | 871.0 | 875.0 | 1231.0 | 0.7108 | 0.7076 |
| 0.0 | 32.0 | 1376 | 1.6103 | 0.0056 | 7596.7626 | 5265.6746 | 2618.0 | 3270.0 | 0.8006 | 2599.0 | 0.7948 | 1716.0 | 1740.0 | 2026.0 | 0.8588 | 0.8470 | 875.0 | 878.0 | 1231.0 | 0.7132 | 0.7108 |
| 0.0 | 33.0 | 1419 | 1.6099 | 0.0056 | 7594.6633 | 5264.2194 | 2612.0 | 3270.0 | 0.7988 | 2594.0 | 0.7933 | 1715.0 | 1737.0 | 2026.0 | 0.8574 | 0.8465 | 871.0 | 875.0 | 1231.0 | 0.7108 | 0.7076 |
| 0.0 | 34.0 | 1462 | 1.6107 | 0.0056 | 7598.6742 | 5266.9996 | 2616.0 | 3270.0 | 0.8 | 2597.0 | 0.7942 | 1716.0 | 1738.0 | 2026.0 | 0.8578 | 0.8470 | 873.0 | 878.0 | 1231.0 | 0.7132 | 0.7092 |
| 0.0 | 35.0 | 1505 | 1.6082 | 0.0056 | 7586.7298 | 5258.7204 | 2617.0 | 3270.0 | 0.8003 | 2601.0 | 0.7954 | 1718.0 | 1738.0 | 2026.0 | 0.8578 | 0.8480 | 874.0 | 879.0 | 1231.0 | 0.7141 | 0.7100 |
| 0.0 | 36.0 | 1548 | 1.6120 | 0.0056 | 7604.7402 | 5271.2042 | 2617.0 | 3270.0 | 0.8003 | 2601.0 | 0.7954 | 1718.0 | 1738.0 | 2026.0 | 0.8578 | 0.8480 | 875.0 | 879.0 | 1231.0 | 0.7141 | 0.7108 |
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1