# BoolQ_Llama-3.2-1B-dlyt1wr4

This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.3222
- Model Preparation Time: 0.0041
- Mdl: 6237.6140
- Accumulated Loss: 4323.5846
- Correct Preds: 2327.0
- Total Preds: 3270.0
- Accuracy: 0.7116
- Correct Gen Preds: 2316.0
- Gen Accuracy: 0.7083
- Correct Gen Preds 9642: 1502.0
- Correct Preds 9642: 1512.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.7463
- Gen Accuracy 9642: 0.7414
- Correct Gen Preds 2822: 805.0
- Correct Preds 2822: 815.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.6621
- Gen Accuracy 2822: 0.6539
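As a sanity check, the headline numbers above are mutually consistent: the per-label correct counts (1512 for label 9642, 815 for label 2822) sum to the overall correct count, and each reported accuracy is simply correct/total. A minimal recomputation from the raw counts in the list:

```python
# Counts copied from the evaluation summary above.
correct_9642, total_9642 = 1512, 2026
correct_2822, total_2822 = 815, 1231
correct_all, total_all = 2327, 3270

# The per-label correct counts sum to the overall correct count.
assert correct_9642 + correct_2822 == correct_all

# Each accuracy is correct / total, matching the reported values.
acc_9642 = correct_9642 / total_9642   # 0.7463
acc_2822 = correct_2822 / total_2822   # 0.6621
acc_all = correct_all / total_all      # 0.7116
print(round(acc_9642, 4), round(acc_2822, 4), round(acc_all, 4))
```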
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
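The learning-rate schedule above (cosine with a 1% warmup ratio) can be sketched in plain Python. This mirrors the shape of Hugging Face's `get_cosine_schedule_with_warmup` (linear warmup to the peak rate, then cosine decay to zero); the 300-step total is taken from the results table, where each epoch spans 3 optimizer steps and training is configured for 100 epochs.

```python
import math

def cosine_lr(step, total_steps, peak_lr=2e-05, warmup_ratio=0.01):
    """Learning rate at a given optimizer step: linear warmup to peak_lr,
    then cosine decay to zero. Illustrative sketch of the schedule shape."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 300  # 3 steps/epoch x 100 epochs, per the configuration above
# Warmup covers the first 3 steps (1% of 300); the peak is reached at step 3.
print(cosine_lr(0, total), cosine_lr(3, total), cosine_lr(300, total))
```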
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0041 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.7339 | 1.0 | 3 | 0.6871 | 0.0041 | 3241.5040 | 2246.8394 | 2049.0 | 3270.0 | 0.6266 | 1935.0 | 0.5917 | 1121.0 | 1187.0 | 2026.0 | 0.5859 | 0.5533 | 805.0 | 862.0 | 1231.0 | 0.7002 | 0.6539 |
| 0.2203 | 2.0 | 6 | 0.7155 | 0.0041 | 3375.4781 | 2339.7031 | 2151.0 | 3270.0 | 0.6578 | 1873.0 | 0.5728 | 1185.0 | 1356.0 | 2026.0 | 0.6693 | 0.5849 | 679.0 | 795.0 | 1231.0 | 0.6458 | 0.5516 |
| 0.0775 | 3.0 | 9 | 1.5181 | 0.0041 | 7161.7687 | 4964.1598 | 2122.0 | 3270.0 | 0.6489 | 2127.0 | 0.6505 | 1936.0 | 1938.0 | 2026.0 | 0.9566 | 0.9556 | 182.0 | 184.0 | 1231.0 | 0.1495 | 0.1478 |
| 0.0019 | 4.0 | 12 | 1.3222 | 0.0041 | 6237.6140 | 4323.5846 | 2327.0 | 3270.0 | 0.7116 | 2316.0 | 0.7083 | 1502.0 | 1512.0 | 2026.0 | 0.7463 | 0.7414 | 805.0 | 815.0 | 1231.0 | 0.6621 | 0.6539 |
| 0.0006 | 5.0 | 15 | 2.1361 | 0.0041 | 10077.3021 | 6985.0536 | 2195.0 | 3270.0 | 0.6713 | 2062.0 | 0.6306 | 1097.0 | 1185.0 | 2026.0 | 0.5849 | 0.5415 | 956.0 | 1010.0 | 1231.0 | 0.8205 | 0.7766 |
| 0.0001 | 6.0 | 18 | 2.5726 | 0.0041 | 12136.6096 | 8412.4567 | 2200.0 | 3270.0 | 0.6728 | 1960.0 | 0.5994 | 1078.0 | 1230.0 | 2026.0 | 0.6071 | 0.5321 | 873.0 | 970.0 | 1231.0 | 0.7880 | 0.7092 |
| 0.0001 | 7.0 | 21 | 2.8170 | 0.0041 | 13289.4765 | 9211.5632 | 2203.0 | 3270.0 | 0.6737 | 1910.0 | 0.5841 | 1086.0 | 1268.0 | 2026.0 | 0.6259 | 0.5360 | 815.0 | 935.0 | 1231.0 | 0.7595 | 0.6621 |
| 0.0 | 8.0 | 24 | 2.9660 | 0.0041 | 13992.3481 | 9698.7567 | 2207.0 | 3270.0 | 0.6749 | 1914.0 | 0.5853 | 1112.0 | 1287.0 | 2026.0 | 0.6352 | 0.5489 | 793.0 | 920.0 | 1231.0 | 0.7474 | 0.6442 |
| 0.0 | 9.0 | 27 | 3.0542 | 0.0041 | 14408.4594 | 9987.1830 | 2215.0 | 3270.0 | 0.6774 | 1939.0 | 0.5930 | 1141.0 | 1301.0 | 2026.0 | 0.6422 | 0.5632 | 789.0 | 914.0 | 1231.0 | 0.7425 | 0.6409 |
| 0.0 | 10.0 | 30 | 3.1077 | 0.0041 | 14661.1462 | 10162.3322 | 2211.0 | 3270.0 | 0.6761 | 1955.0 | 0.5979 | 1161.0 | 1305.0 | 2026.0 | 0.6441 | 0.5731 | 785.0 | 906.0 | 1231.0 | 0.7360 | 0.6377 |
| 0.0 | 11.0 | 33 | 3.1439 | 0.0041 | 14831.7314 | 10280.5728 | 2207.0 | 3270.0 | 0.6749 | 1970.0 | 0.6024 | 1170.0 | 1306.0 | 2026.0 | 0.6446 | 0.5775 | 791.0 | 901.0 | 1231.0 | 0.7319 | 0.6426 |
| 0.0 | 12.0 | 36 | 3.1697 | 0.0041 | 14953.5728 | 10365.0268 | 2207.0 | 3270.0 | 0.6749 | 1979.0 | 0.6052 | 1179.0 | 1310.0 | 2026.0 | 0.6466 | 0.5819 | 791.0 | 897.0 | 1231.0 | 0.7287 | 0.6426 |
| 0.0 | 13.0 | 39 | 3.1841 | 0.0041 | 15021.1170 | 10411.8449 | 2204.0 | 3270.0 | 0.6740 | 1990.0 | 0.6086 | 1186.0 | 1311.0 | 2026.0 | 0.6471 | 0.5854 | 795.0 | 893.0 | 1231.0 | 0.7254 | 0.6458 |
| 0.0 | 14.0 | 42 | 3.1923 | 0.0041 | 15060.2306 | 10438.9564 | 2207.0 | 3270.0 | 0.6749 | 1997.0 | 0.6107 | 1186.0 | 1313.0 | 2026.0 | 0.6481 | 0.5854 | 802.0 | 894.0 | 1231.0 | 0.7262 | 0.6515 |
| 0.0 | 15.0 | 45 | 3.2004 | 0.0041 | 15098.1299 | 10465.2261 | 2204.0 | 3270.0 | 0.6740 | 2012.0 | 0.6153 | 1195.0 | 1312.0 | 2026.0 | 0.6476 | 0.5898 | 808.0 | 892.0 | 1231.0 | 0.7246 | 0.6564 |
| 0.0 | 16.0 | 48 | 3.2018 | 0.0041 | 15105.0253 | 10470.0057 | 2207.0 | 3270.0 | 0.6749 | 2013.0 | 0.6156 | 1194.0 | 1312.0 | 2026.0 | 0.6476 | 0.5893 | 810.0 | 895.0 | 1231.0 | 0.7271 | 0.6580 |
| 0.0 | 17.0 | 51 | 3.2077 | 0.0041 | 15132.8760 | 10489.3104 | 2208.0 | 3270.0 | 0.6752 | 2020.0 | 0.6177 | 1200.0 | 1315.0 | 2026.0 | 0.6491 | 0.5923 | 811.0 | 893.0 | 1231.0 | 0.7254 | 0.6588 |
| 0.0 | 18.0 | 54 | 3.2123 | 0.0041 | 15154.4908 | 10504.2926 | 2206.0 | 3270.0 | 0.6746 | 2021.0 | 0.6180 | 1197.0 | 1313.0 | 2026.0 | 0.6481 | 0.5908 | 815.0 | 893.0 | 1231.0 | 0.7254 | 0.6621 |
| 0.0 | 19.0 | 57 | 3.2174 | 0.0041 | 15178.6611 | 10521.0462 | 2207.0 | 3270.0 | 0.6749 | 2024.0 | 0.6190 | 1201.0 | 1315.0 | 2026.0 | 0.6491 | 0.5928 | 814.0 | 892.0 | 1231.0 | 0.7246 | 0.6613 |
| 0.0 | 20.0 | 60 | 3.2183 | 0.0041 | 15182.4947 | 10523.7034 | 2211.0 | 3270.0 | 0.6761 | 2028.0 | 0.6202 | 1203.0 | 1318.0 | 2026.0 | 0.6505 | 0.5938 | 816.0 | 893.0 | 1231.0 | 0.7254 | 0.6629 |
| 0.0 | 21.0 | 63 | 3.2196 | 0.0041 | 15188.7486 | 10528.0383 | 2208.0 | 3270.0 | 0.6752 | 2026.0 | 0.6196 | 1201.0 | 1313.0 | 2026.0 | 0.6481 | 0.5928 | 816.0 | 895.0 | 1231.0 | 0.7271 | 0.6629 |
| 0.0 | 22.0 | 66 | 3.2239 | 0.0041 | 15208.9157 | 10542.0171 | 2204.0 | 3270.0 | 0.6740 | 2035.0 | 0.6223 | 1204.0 | 1315.0 | 2026.0 | 0.6491 | 0.5943 | 822.0 | 889.0 | 1231.0 | 0.7222 | 0.6677 |
| 0.0 | 23.0 | 69 | 3.2223 | 0.0041 | 15201.5672 | 10536.9235 | 2206.0 | 3270.0 | 0.6746 | 2030.0 | 0.6208 | 1202.0 | 1311.0 | 2026.0 | 0.6471 | 0.5933 | 819.0 | 895.0 | 1231.0 | 0.7271 | 0.6653 |
| 0.0 | 24.0 | 72 | 3.2266 | 0.0041 | 15221.7566 | 10550.9177 | 2206.0 | 3270.0 | 0.6746 | 2034.0 | 0.6220 | 1204.0 | 1312.0 | 2026.0 | 0.6476 | 0.5943 | 821.0 | 894.0 | 1231.0 | 0.7262 | 0.6669 |
| 0.0 | 25.0 | 75 | 3.2295 | 0.0041 | 15235.6263 | 10560.5314 | 2205.0 | 3270.0 | 0.6743 | 2039.0 | 0.6235 | 1208.0 | 1314.0 | 2026.0 | 0.6486 | 0.5962 | 822.0 | 891.0 | 1231.0 | 0.7238 | 0.6677 |
| 0.0 | 26.0 | 78 | 3.2290 | 0.0041 | 15233.1976 | 10558.8480 | 2210.0 | 3270.0 | 0.6758 | 2042.0 | 0.6245 | 1210.0 | 1317.0 | 2026.0 | 0.6500 | 0.5972 | 823.0 | 893.0 | 1231.0 | 0.7254 | 0.6686 |
| 0.0 | 27.0 | 81 | 3.2308 | 0.0041 | 15241.5995 | 10564.6717 | 2204.0 | 3270.0 | 0.6740 | 2044.0 | 0.6251 | 1208.0 | 1312.0 | 2026.0 | 0.6476 | 0.5962 | 827.0 | 892.0 | 1231.0 | 0.7246 | 0.6718 |
| 0.0 | 28.0 | 84 | 3.2303 | 0.0041 | 15239.1917 | 10563.0028 | 2209.0 | 3270.0 | 0.6755 | 2046.0 | 0.6257 | 1210.0 | 1316.0 | 2026.0 | 0.6496 | 0.5972 | 827.0 | 893.0 | 1231.0 | 0.7254 | 0.6718 |
| 0.0 | 29.0 | 87 | 3.2346 | 0.0041 | 15259.7713 | 10577.2675 | 2206.0 | 3270.0 | 0.6746 | 2051.0 | 0.6272 | 1214.0 | 1315.0 | 2026.0 | 0.6491 | 0.5992 | 828.0 | 891.0 | 1231.0 | 0.7238 | 0.6726 |
| 0.0 | 30.0 | 90 | 3.2359 | 0.0041 | 15265.6533 | 10581.3446 | 2207.0 | 3270.0 | 0.6749 | 2053.0 | 0.6278 | 1213.0 | 1315.0 | 2026.0 | 0.6491 | 0.5987 | 831.0 | 892.0 | 1231.0 | 0.7246 | 0.6751 |
| 0.0 | 31.0 | 93 | 3.2395 | 0.0041 | 15282.6219 | 10593.1063 | 2203.0 | 3270.0 | 0.6737 | 2059.0 | 0.6297 | 1215.0 | 1311.0 | 2026.0 | 0.6471 | 0.5997 | 835.0 | 892.0 | 1231.0 | 0.7246 | 0.6783 |
| 0.0 | 32.0 | 96 | 3.2376 | 0.0041 | 15273.6693 | 10586.9008 | 2204.0 | 3270.0 | 0.6740 | 2058.0 | 0.6294 | 1215.0 | 1312.0 | 2026.0 | 0.6476 | 0.5997 | 834.0 | 892.0 | 1231.0 | 0.7246 | 0.6775 |
| 0.0 | 33.0 | 99 | 3.2397 | 0.0041 | 15283.7410 | 10593.8820 | 2205.0 | 3270.0 | 0.6743 | 2065.0 | 0.6315 | 1221.0 | 1312.0 | 2026.0 | 0.6476 | 0.6027 | 835.0 | 893.0 | 1231.0 | 0.7254 | 0.6783 |
| 0.0 | 34.0 | 102 | 3.2420 | 0.0041 | 15294.5155 | 10601.3503 | 2204.0 | 3270.0 | 0.6740 | 2060.0 | 0.6300 | 1219.0 | 1313.0 | 2026.0 | 0.6481 | 0.6017 | 832.0 | 891.0 | 1231.0 | 0.7238 | 0.6759 |
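The Mdl and Accumulated Loss columns appear to be related by simple bookkeeping: in each row, the accumulated loss is close to the mean validation loss times the 3270 total predictions, and Mdl is close to the accumulated loss divided by ln 2 (i.e., total negative log-likelihood converted from nats to bits). This is an inferred relationship, checked here against the best row (epoch 4):

```python
import math

# Epoch-4 row values from the table above.
val_loss = 1.3222        # mean validation loss (nats per example)
acc_loss = 4323.5846     # accumulated loss (nats)
mdl = 6237.6140          # reported Mdl (bits)
total_preds = 3270

# Accumulated loss ~ mean loss x number of predictions.
assert abs(val_loss * total_preds - acc_loss) < 1.0

# Mdl ~ accumulated loss converted from nats to bits.
assert abs(acc_loss / math.log(2) - mdl) < 0.01
```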
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1