BoolQ_Llama-3.2-1B-dlyt1wr4

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on the BoolQ dataset. It achieves the following results on the evaluation set (these figures correspond to the epoch-4 checkpoint, the highest-accuracy row in the training table below):

  • Loss: 1.3222
  • Model Preparation Time: 0.0041
  • Mdl: 6237.6140
  • Accumulated Loss: 4323.5846
  • Correct Preds: 2327.0
  • Total Preds: 3270.0
  • Accuracy: 0.7116
  • Correct Gen Preds: 2316.0
  • Gen Accuracy: 0.7083
  • Correct Gen Preds 9642: 1502.0
  • Correct Preds 9642: 1512.0
  • Total Labels 9642: 2026.0
  • Accuracy 9642: 0.7463
  • Gen Accuracy 9642: 0.7414
  • Correct Gen Preds 2822: 805.0
  • Correct Preds 2822: 815.0
  • Total Labels 2822: 1231.0
  • Accuracy 2822: 0.6621
  • Gen Accuracy 2822: 0.6539
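
The card does not define Mdl or Accumulated Loss, but the numbers are internally consistent with Accumulated Loss being the summed evaluation negative log-likelihood in nats and Mdl the same quantity converted to bits; the per-label suffixes 9642 and 2822 appear to denote the two answer classes (presumably label token IDs). A minimal sanity check of that reading, using only the figures above:

```python
import math

# Headline metrics copied from the list above.
accumulated_loss_nats = 4323.5846   # summed NLL over the eval set (assumed unit: nats)
mdl_bits = 6237.6140                # description length (assumed unit: bits)
correct_preds, total_preds = 2327.0, 3270.0

# Mdl == Accumulated Loss converted from nats to bits: bits = nats / ln(2).
assert abs(accumulated_loss_nats / math.log(2) - mdl_bits) < 0.01

# Accuracy == Correct Preds / Total Preds.
assert abs(correct_preds / total_preds - 0.7116) < 5e-5

# Mean per-example loss: 4323.5846 / 3270 ≈ 1.3222, matching the reported Loss.
assert abs(accumulated_loss_nats / total_preds - 1.3222) < 5e-5
```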

Model description

More information needed

Intended uses & limitations

More information needed
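
Pending documentation from the authors, here is a minimal inference sketch that treats the model as a yes/no classifier by comparing the log-likelihood it assigns to the two answers. The prompt template and the " yes"/" no" answer strings are assumptions for illustration, not the (undocumented) training format:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/BoolQ_Llama-3.2-1B-dlyt1wr4"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

# Hypothetical BoolQ-style prompt; the actual training template is undocumented.
prompt = (
    "Passage: The Berlin Wall fell in 1989.\n"
    "Question: did the berlin wall fall in 1989?\n"
    "Answer:"
)

def answer_logprob(answer: str) -> float:
    """Sum of log-probabilities the model assigns to `answer` after the prompt."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = logits[:, :-1].log_softmax(-1)  # next-token distributions
    answer_positions = range(prompt_ids.shape[1] - 1, full_ids.shape[1] - 1)
    return sum(
        log_probs[0, pos, full_ids[0, pos + 1]].item() for pos in answer_positions
    )

print("yes" if answer_logprob(" yes") > answer_logprob(" no") else "no")
```

Greedy generation after the prompt should behave similarly (the separate Gen Accuracy metrics suggest generated answers were also scored), but scoring both candidate answers avoids parsing free-form output.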

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 120
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
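
These settings map onto Hugging Face TrainingArguments roughly as follows; this is a sketch under stated assumptions, not the authors' training script, and dataset preparation and Trainer wiring are omitted:

```python
from transformers import TrainingArguments

# Hyperparameters transcribed from the list above; everything else is left at defaults.
training_args = TrainingArguments(
    output_dir="BoolQ_Llama-3.2-1B-dlyt1wr4",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",        # AdamW; betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
    bf16=True,                  # assumption: the published checkpoint is stored in BF16
)
```

Note that the log below stops at epoch 34 even though num_epochs was set to 100, which would be consistent with early stopping; the card does not say.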

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0041 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.7339 | 1.0 | 3 | 0.6871 | 0.0041 | 3241.5040 | 2246.8394 | 2049.0 | 3270.0 | 0.6266 | 1935.0 | 0.5917 | 1121.0 | 1187.0 | 2026.0 | 0.5859 | 0.5533 | 805.0 | 862.0 | 1231.0 | 0.7002 | 0.6539 |
| 0.2203 | 2.0 | 6 | 0.7155 | 0.0041 | 3375.4781 | 2339.7031 | 2151.0 | 3270.0 | 0.6578 | 1873.0 | 0.5728 | 1185.0 | 1356.0 | 2026.0 | 0.6693 | 0.5849 | 679.0 | 795.0 | 1231.0 | 0.6458 | 0.5516 |
| 0.0775 | 3.0 | 9 | 1.5181 | 0.0041 | 7161.7687 | 4964.1598 | 2122.0 | 3270.0 | 0.6489 | 2127.0 | 0.6505 | 1936.0 | 1938.0 | 2026.0 | 0.9566 | 0.9556 | 182.0 | 184.0 | 1231.0 | 0.1495 | 0.1478 |
| 0.0019 | 4.0 | 12 | 1.3222 | 0.0041 | 6237.6140 | 4323.5846 | 2327.0 | 3270.0 | 0.7116 | 2316.0 | 0.7083 | 1502.0 | 1512.0 | 2026.0 | 0.7463 | 0.7414 | 805.0 | 815.0 | 1231.0 | 0.6621 | 0.6539 |
| 0.0006 | 5.0 | 15 | 2.1361 | 0.0041 | 10077.3021 | 6985.0536 | 2195.0 | 3270.0 | 0.6713 | 2062.0 | 0.6306 | 1097.0 | 1185.0 | 2026.0 | 0.5849 | 0.5415 | 956.0 | 1010.0 | 1231.0 | 0.8205 | 0.7766 |
| 0.0001 | 6.0 | 18 | 2.5726 | 0.0041 | 12136.6096 | 8412.4567 | 2200.0 | 3270.0 | 0.6728 | 1960.0 | 0.5994 | 1078.0 | 1230.0 | 2026.0 | 0.6071 | 0.5321 | 873.0 | 970.0 | 1231.0 | 0.7880 | 0.7092 |
| 0.0001 | 7.0 | 21 | 2.8170 | 0.0041 | 13289.4765 | 9211.5632 | 2203.0 | 3270.0 | 0.6737 | 1910.0 | 0.5841 | 1086.0 | 1268.0 | 2026.0 | 0.6259 | 0.5360 | 815.0 | 935.0 | 1231.0 | 0.7595 | 0.6621 |
| 0.0 | 8.0 | 24 | 2.9660 | 0.0041 | 13992.3481 | 9698.7567 | 2207.0 | 3270.0 | 0.6749 | 1914.0 | 0.5853 | 1112.0 | 1287.0 | 2026.0 | 0.6352 | 0.5489 | 793.0 | 920.0 | 1231.0 | 0.7474 | 0.6442 |
| 0.0 | 9.0 | 27 | 3.0542 | 0.0041 | 14408.4594 | 9987.1830 | 2215.0 | 3270.0 | 0.6774 | 1939.0 | 0.5930 | 1141.0 | 1301.0 | 2026.0 | 0.6422 | 0.5632 | 789.0 | 914.0 | 1231.0 | 0.7425 | 0.6409 |
| 0.0 | 10.0 | 30 | 3.1077 | 0.0041 | 14661.1462 | 10162.3322 | 2211.0 | 3270.0 | 0.6761 | 1955.0 | 0.5979 | 1161.0 | 1305.0 | 2026.0 | 0.6441 | 0.5731 | 785.0 | 906.0 | 1231.0 | 0.7360 | 0.6377 |
| 0.0 | 11.0 | 33 | 3.1439 | 0.0041 | 14831.7314 | 10280.5728 | 2207.0 | 3270.0 | 0.6749 | 1970.0 | 0.6024 | 1170.0 | 1306.0 | 2026.0 | 0.6446 | 0.5775 | 791.0 | 901.0 | 1231.0 | 0.7319 | 0.6426 |
| 0.0 | 12.0 | 36 | 3.1697 | 0.0041 | 14953.5728 | 10365.0268 | 2207.0 | 3270.0 | 0.6749 | 1979.0 | 0.6052 | 1179.0 | 1310.0 | 2026.0 | 0.6466 | 0.5819 | 791.0 | 897.0 | 1231.0 | 0.7287 | 0.6426 |
| 0.0 | 13.0 | 39 | 3.1841 | 0.0041 | 15021.1170 | 10411.8449 | 2204.0 | 3270.0 | 0.6740 | 1990.0 | 0.6086 | 1186.0 | 1311.0 | 2026.0 | 0.6471 | 0.5854 | 795.0 | 893.0 | 1231.0 | 0.7254 | 0.6458 |
| 0.0 | 14.0 | 42 | 3.1923 | 0.0041 | 15060.2306 | 10438.9564 | 2207.0 | 3270.0 | 0.6749 | 1997.0 | 0.6107 | 1186.0 | 1313.0 | 2026.0 | 0.6481 | 0.5854 | 802.0 | 894.0 | 1231.0 | 0.7262 | 0.6515 |
| 0.0 | 15.0 | 45 | 3.2004 | 0.0041 | 15098.1299 | 10465.2261 | 2204.0 | 3270.0 | 0.6740 | 2012.0 | 0.6153 | 1195.0 | 1312.0 | 2026.0 | 0.6476 | 0.5898 | 808.0 | 892.0 | 1231.0 | 0.7246 | 0.6564 |
| 0.0 | 16.0 | 48 | 3.2018 | 0.0041 | 15105.0253 | 10470.0057 | 2207.0 | 3270.0 | 0.6749 | 2013.0 | 0.6156 | 1194.0 | 1312.0 | 2026.0 | 0.6476 | 0.5893 | 810.0 | 895.0 | 1231.0 | 0.7271 | 0.6580 |
| 0.0 | 17.0 | 51 | 3.2077 | 0.0041 | 15132.8760 | 10489.3104 | 2208.0 | 3270.0 | 0.6752 | 2020.0 | 0.6177 | 1200.0 | 1315.0 | 2026.0 | 0.6491 | 0.5923 | 811.0 | 893.0 | 1231.0 | 0.7254 | 0.6588 |
| 0.0 | 18.0 | 54 | 3.2123 | 0.0041 | 15154.4908 | 10504.2926 | 2206.0 | 3270.0 | 0.6746 | 2021.0 | 0.6180 | 1197.0 | 1313.0 | 2026.0 | 0.6481 | 0.5908 | 815.0 | 893.0 | 1231.0 | 0.7254 | 0.6621 |
| 0.0 | 19.0 | 57 | 3.2174 | 0.0041 | 15178.6611 | 10521.0462 | 2207.0 | 3270.0 | 0.6749 | 2024.0 | 0.6190 | 1201.0 | 1315.0 | 2026.0 | 0.6491 | 0.5928 | 814.0 | 892.0 | 1231.0 | 0.7246 | 0.6613 |
| 0.0 | 20.0 | 60 | 3.2183 | 0.0041 | 15182.4947 | 10523.7034 | 2211.0 | 3270.0 | 0.6761 | 2028.0 | 0.6202 | 1203.0 | 1318.0 | 2026.0 | 0.6505 | 0.5938 | 816.0 | 893.0 | 1231.0 | 0.7254 | 0.6629 |
| 0.0 | 21.0 | 63 | 3.2196 | 0.0041 | 15188.7486 | 10528.0383 | 2208.0 | 3270.0 | 0.6752 | 2026.0 | 0.6196 | 1201.0 | 1313.0 | 2026.0 | 0.6481 | 0.5928 | 816.0 | 895.0 | 1231.0 | 0.7271 | 0.6629 |
| 0.0 | 22.0 | 66 | 3.2239 | 0.0041 | 15208.9157 | 10542.0171 | 2204.0 | 3270.0 | 0.6740 | 2035.0 | 0.6223 | 1204.0 | 1315.0 | 2026.0 | 0.6491 | 0.5943 | 822.0 | 889.0 | 1231.0 | 0.7222 | 0.6677 |
| 0.0 | 23.0 | 69 | 3.2223 | 0.0041 | 15201.5672 | 10536.9235 | 2206.0 | 3270.0 | 0.6746 | 2030.0 | 0.6208 | 1202.0 | 1311.0 | 2026.0 | 0.6471 | 0.5933 | 819.0 | 895.0 | 1231.0 | 0.7271 | 0.6653 |
| 0.0 | 24.0 | 72 | 3.2266 | 0.0041 | 15221.7566 | 10550.9177 | 2206.0 | 3270.0 | 0.6746 | 2034.0 | 0.6220 | 1204.0 | 1312.0 | 2026.0 | 0.6476 | 0.5943 | 821.0 | 894.0 | 1231.0 | 0.7262 | 0.6669 |
| 0.0 | 25.0 | 75 | 3.2295 | 0.0041 | 15235.6263 | 10560.5314 | 2205.0 | 3270.0 | 0.6743 | 2039.0 | 0.6235 | 1208.0 | 1314.0 | 2026.0 | 0.6486 | 0.5962 | 822.0 | 891.0 | 1231.0 | 0.7238 | 0.6677 |
| 0.0 | 26.0 | 78 | 3.2290 | 0.0041 | 15233.1976 | 10558.8480 | 2210.0 | 3270.0 | 0.6758 | 2042.0 | 0.6245 | 1210.0 | 1317.0 | 2026.0 | 0.6500 | 0.5972 | 823.0 | 893.0 | 1231.0 | 0.7254 | 0.6686 |
| 0.0 | 27.0 | 81 | 3.2308 | 0.0041 | 15241.5995 | 10564.6717 | 2204.0 | 3270.0 | 0.6740 | 2044.0 | 0.6251 | 1208.0 | 1312.0 | 2026.0 | 0.6476 | 0.5962 | 827.0 | 892.0 | 1231.0 | 0.7246 | 0.6718 |
| 0.0 | 28.0 | 84 | 3.2303 | 0.0041 | 15239.1917 | 10563.0028 | 2209.0 | 3270.0 | 0.6755 | 2046.0 | 0.6257 | 1210.0 | 1316.0 | 2026.0 | 0.6496 | 0.5972 | 827.0 | 893.0 | 1231.0 | 0.7254 | 0.6718 |
| 0.0 | 29.0 | 87 | 3.2346 | 0.0041 | 15259.7713 | 10577.2675 | 2206.0 | 3270.0 | 0.6746 | 2051.0 | 0.6272 | 1214.0 | 1315.0 | 2026.0 | 0.6491 | 0.5992 | 828.0 | 891.0 | 1231.0 | 0.7238 | 0.6726 |
| 0.0 | 30.0 | 90 | 3.2359 | 0.0041 | 15265.6533 | 10581.3446 | 2207.0 | 3270.0 | 0.6749 | 2053.0 | 0.6278 | 1213.0 | 1315.0 | 2026.0 | 0.6491 | 0.5987 | 831.0 | 892.0 | 1231.0 | 0.7246 | 0.6751 |
| 0.0 | 31.0 | 93 | 3.2395 | 0.0041 | 15282.6219 | 10593.1063 | 2203.0 | 3270.0 | 0.6737 | 2059.0 | 0.6297 | 1215.0 | 1311.0 | 2026.0 | 0.6471 | 0.5997 | 835.0 | 892.0 | 1231.0 | 0.7246 | 0.6783 |
| 0.0 | 32.0 | 96 | 3.2376 | 0.0041 | 15273.6693 | 10586.9008 | 2204.0 | 3270.0 | 0.6740 | 2058.0 | 0.6294 | 1215.0 | 1312.0 | 2026.0 | 0.6476 | 0.5997 | 834.0 | 892.0 | 1231.0 | 0.7246 | 0.6775 |
| 0.0 | 33.0 | 99 | 3.2397 | 0.0041 | 15283.7410 | 10593.8820 | 2205.0 | 3270.0 | 0.6743 | 2065.0 | 0.6315 | 1221.0 | 1312.0 | 2026.0 | 0.6476 | 0.6027 | 835.0 | 893.0 | 1231.0 | 0.7254 | 0.6783 |
| 0.0 | 34.0 | 102 | 3.2420 | 0.0041 | 15294.5155 | 10601.3503 | 2204.0 | 3270.0 | 0.6740 | 2060.0 | 0.6300 | 1219.0 | 1313.0 | 2026.0 | 0.6481 | 0.6017 | 832.0 | 891.0 | 1231.0 | 0.7238 | 0.6759 |

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1