GSM8K-Binary_Llama-3.2-1B-nbi456jq

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset (the model name suggests a binary-answer variant of GSM8K). It achieves the following results on the evaluation set; the metrics suffixed with 34192 and 41568 are per-label breakdowns, presumably keyed by the token IDs of the two answer labels:

  • Loss: 1.5355
  • Model Preparation Time: 0.0059
  • Mdl: 5482.8873
  • Accumulated Loss: 3800.4479
  • Correct Preds: 1901.0
  • Total Preds: 2475.0
  • Accuracy: 0.7681
  • Correct Gen Preds: 1902.0
  • Gen Accuracy: 0.7685
  • Correct Gen Preds 34192: 1031.0
  • Correct Preds 34192: 1036.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.8662
  • Gen Accuracy 34192: 0.8620
  • Correct Gen Preds 41568: 863.0
  • Correct Preds 41568: 865.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.6827
  • Gen Accuracy 41568: 0.6811
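
Two of these quantities appear to be mechanically related: Accumulated Loss matches the mean validation loss summed over all predictions (in nats), and Mdl matches that same total converted to bits. A quick sanity check (variable names are illustrative, not from the training code; the small discrepancies come from rounding the reported mean loss):

```python
import math

mean_val_loss = 1.5355  # validation loss from the card (nats per prediction)
total_preds = 2475

accumulated_loss = mean_val_loss * total_preds  # ≈ 3800.36 nats (card reports 3800.4479)
mdl_bits = accumulated_loss / math.log(2)       # ≈ 5482.77 bits (card reports 5482.8873)
print(f"{accumulated_loss:.2f} nats ≈ {mdl_bits:.2f} bits")
```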

Model description

More information needed

Intended uses & limitations

More information needed
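
In the absence of documented usage, the sketch below shows a minimal way to load the checkpoint with the transformers library. The expected prompt format is not documented on this card, so the prompt string is purely illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/GSM8K-Binary_Llama-3.2-1B-nbi456jq"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Illustrative prompt only: the fine-tune's actual input format is undocumented.
prompt = "Question: ... Is the proposed answer correct?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```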

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
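
For reference, a minimal sketch of how these settings map onto transformers.TrainingArguments. The card reports aggregate batch sizes, so the per-device values below assume a single device, and the output directory is an assumption:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="GSM8K-Binary_Llama-3.2-1B",  # assumed; not stated on the card
    learning_rate=2e-5,
    per_device_train_batch_size=32,   # card reports train_batch_size=32
    per_device_eval_batch_size=64,    # card reports eval_batch_size=64
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```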

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0059 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.7326 | 1.0 | 10 | 0.7570 | 0.0059 | 2702.9154 | 1873.5182 | 1282.0 | 2475.0 | 0.5180 | 8.0 | 0.0032 | 0.0 | 33.0 | 1196.0 | 0.0276 | 0.0 | 0.0 | 1249.0 | 1267.0 | 0.9858 | 0.0 |
| 0.9778 | 2.0 | 20 | 0.8808 | 0.0059 | 3144.9448 | 2179.9096 | 1267.0 | 2475.0 | 0.5119 | 9.0 | 0.0036 | 0.0 | 0.0 | 1196.0 | 0.0 | 0.0 | 0.0 | 1267.0 | 1267.0 | 1.0 | 0.0 |
| 0.4933 | 3.0 | 30 | 0.6766 | 0.0059 | 2415.9199 | 1674.5881 | 1548.0 | 2475.0 | 0.6255 | 8.0 | 0.0032 | 0.0 | 1160.0 | 1196.0 | 0.9699 | 0.0 | 0.0 | 388.0 | 1267.0 | 0.3062 | 0.0 |
| 1.1978 | 4.0 | 40 | 0.6905 | 0.0059 | 2465.4758 | 1708.9376 | 1727.0 | 2475.0 | 0.6978 | 27.0 | 0.0109 | 0.0 | 1109.0 | 1196.0 | 0.9273 | 0.0 | 18.0 | 618.0 | 1267.0 | 0.4878 | 0.0142 |
| 0.4315 | 5.0 | 50 | 0.6687 | 0.0059 | 2387.7910 | 1655.0906 | 1826.0 | 2475.0 | 0.7378 | 373.0 | 0.1507 | 54.0 | 754.0 | 1196.0 | 0.6304 | 0.0452 | 311.0 | 1072.0 | 1267.0 | 0.8461 | 0.2455 |
| 0.3427 | 6.0 | 60 | 0.7262 | 0.0059 | 2592.9923 | 1797.3253 | 1831.0 | 2475.0 | 0.7398 | 272.0 | 0.1099 | 100.0 | 1007.0 | 1196.0 | 0.8420 | 0.0836 | 164.0 | 824.0 | 1267.0 | 0.6504 | 0.1294 |
| 0.0081 | 7.0 | 70 | 1.4295 | 0.0059 | 5104.1074 | 3537.8977 | 1826.0 | 2475.0 | 0.7378 | 1745.0 | 0.7051 | 1066.0 | 1090.0 | 1196.0 | 0.9114 | 0.8913 | 671.0 | 736.0 | 1267.0 | 0.5809 | 0.5296 |
| 0.0085 | 8.0 | 80 | 1.0812 | 0.0059 | 3860.6583 | 2676.0044 | 1875.0 | 2475.0 | 0.7576 | 1447.0 | 0.5846 | 707.0 | 989.0 | 1196.0 | 0.8269 | 0.5911 | 731.0 | 886.0 | 1267.0 | 0.6993 | 0.5770 |
| 0.0003 | 9.0 | 90 | 1.4846 | 0.0059 | 5300.9803 | 3674.3596 | 1882.0 | 2475.0 | 0.7604 | 1872.0 | 0.7564 | 835.0 | 843.0 | 1196.0 | 0.7048 | 0.6982 | 1030.0 | 1039.0 | 1267.0 | 0.8200 | 0.8129 |
| 0.1005 | 10.0 | 100 | 1.4367 | 0.0059 | 5130.0125 | 3555.8537 | 1885.0 | 2475.0 | 0.7616 | 1674.0 | 0.6764 | 909.0 | 1020.0 | 1196.0 | 0.8528 | 0.7600 | 758.0 | 865.0 | 1267.0 | 0.6827 | 0.5983 |
| 0.0 | 11.0 | 110 | 1.9148 | 0.0059 | 6837.0276 | 4739.0664 | 1807.0 | 2475.0 | 0.7301 | 1758.0 | 0.7103 | 1077.0 | 1103.0 | 1196.0 | 0.9222 | 0.9005 | 673.0 | 704.0 | 1267.0 | 0.5556 | 0.5312 |
| 0.0813 | 12.0 | 120 | 1.8798 | 0.0059 | 6712.1494 | 4652.5074 | 1781.0 | 2475.0 | 0.7196 | 1715.0 | 0.6929 | 1117.0 | 1142.0 | 1196.0 | 0.9548 | 0.9339 | 590.0 | 639.0 | 1267.0 | 0.5043 | 0.4657 |
| 0.0064 | 13.0 | 130 | 1.8761 | 0.0059 | 6698.9090 | 4643.3299 | 1809.0 | 2475.0 | 0.7309 | 1772.0 | 0.7160 | 1102.0 | 1114.0 | 1196.0 | 0.9314 | 0.9214 | 662.0 | 695.0 | 1267.0 | 0.5485 | 0.5225 |
| 0.0039 | 14.0 | 140 | 1.5355 | 0.0059 | 5482.8873 | 3800.4479 | 1901.0 | 2475.0 | 0.7681 | 1902.0 | 0.7685 | 1031.0 | 1036.0 | 1196.0 | 0.8662 | 0.8620 | 863.0 | 865.0 | 1267.0 | 0.6827 | 0.6811 |
| 0.0 | 15.0 | 150 | 1.9186 | 0.0059 | 6850.6854 | 4748.5333 | 1880.0 | 2475.0 | 0.7596 | 1876.0 | 0.7580 | 1058.0 | 1061.0 | 1196.0 | 0.8871 | 0.8846 | 810.0 | 819.0 | 1267.0 | 0.6464 | 0.6393 |
| 0.8402 | 16.0 | 160 | 1.8719 | 0.0059 | 6684.1083 | 4633.0708 | 1881.0 | 2475.0 | 0.7600 | 1878.0 | 0.7588 | 1048.0 | 1052.0 | 1196.0 | 0.8796 | 0.8763 | 822.0 | 829.0 | 1267.0 | 0.6543 | 0.6488 |
| 0.0 | 17.0 | 170 | 1.8653 | 0.0059 | 6660.5351 | 4616.7311 | 1880.0 | 2475.0 | 0.7596 | 1875.0 | 0.7576 | 1046.0 | 1051.0 | 1196.0 | 0.8788 | 0.8746 | 821.0 | 829.0 | 1267.0 | 0.6543 | 0.6480 |
| 0.0 | 18.0 | 180 | 1.8660 | 0.0059 | 6662.8011 | 4618.3018 | 1883.0 | 2475.0 | 0.7608 | 1879.0 | 0.7592 | 1048.0 | 1052.0 | 1196.0 | 0.8796 | 0.8763 | 823.0 | 831.0 | 1267.0 | 0.6559 | 0.6496 |
| 0.0 | 19.0 | 190 | 1.8677 | 0.0059 | 6669.1108 | 4622.6753 | 1881.0 | 2475.0 | 0.7600 | 1878.0 | 0.7588 | 1047.0 | 1051.0 | 1196.0 | 0.8788 | 0.8754 | 823.0 | 830.0 | 1267.0 | 0.6551 | 0.6496 |
| 0.8401 | 20.0 | 200 | 1.8674 | 0.0059 | 6667.9216 | 4621.8510 | 1879.0 | 2475.0 | 0.7592 | 1877.0 | 0.7584 | 1047.0 | 1051.0 | 1196.0 | 0.8788 | 0.8754 | 822.0 | 828.0 | 1267.0 | 0.6535 | 0.6488 |
| 0.0 | 21.0 | 210 | 1.8694 | 0.0059 | 6675.1438 | 4626.8571 | 1880.0 | 2475.0 | 0.7596 | 1877.0 | 0.7584 | 1047.0 | 1051.0 | 1196.0 | 0.8788 | 0.8754 | 822.0 | 829.0 | 1267.0 | 0.6543 | 0.6488 |
| 0.0 | 22.0 | 220 | 1.8717 | 0.0059 | 6683.3395 | 4632.5379 | 1884.0 | 2475.0 | 0.7612 | 1880.0 | 0.7596 | 1048.0 | 1052.0 | 1196.0 | 0.8796 | 0.8763 | 824.0 | 832.0 | 1267.0 | 0.6567 | 0.6504 |
| 0.0 | 23.0 | 230 | 1.8715 | 0.0059 | 6682.6701 | 4632.0739 | 1879.0 | 2475.0 | 0.7592 | 1876.0 | 0.7580 | 1047.0 | 1051.0 | 1196.0 | 0.8788 | 0.8754 | 821.0 | 828.0 | 1267.0 | 0.6535 | 0.6480 |
| 0.0 | 24.0 | 240 | 1.8731 | 0.0059 | 6688.0780 | 4635.8224 | 1882.0 | 2475.0 | 0.7604 | 1879.0 | 0.7592 | 1048.0 | 1052.0 | 1196.0 | 0.8796 | 0.8763 | 823.0 | 830.0 | 1267.0 | 0.6551 | 0.6496 |
| 0.8401 | 25.0 | 250 | 1.8756 | 0.0059 | 6697.1790 | 4642.1307 | 1881.0 | 2475.0 | 0.7600 | 1877.0 | 0.7584 | 1049.0 | 1053.0 | 1196.0 | 0.8804 | 0.8771 | 820.0 | 828.0 | 1267.0 | 0.6535 | 0.6472 |
| 0.0 | 26.0 | 260 | 1.8749 | 0.0059 | 6694.5782 | 4640.3280 | 1883.0 | 2475.0 | 0.7608 | 1880.0 | 0.7596 | 1048.0 | 1052.0 | 1196.0 | 0.8796 | 0.8763 | 824.0 | 831.0 | 1267.0 | 0.6559 | 0.6504 |
| 0.0 | 27.0 | 270 | 1.8745 | 0.0059 | 6693.2298 | 4639.3933 | 1883.0 | 2475.0 | 0.7608 | 1880.0 | 0.7596 | 1049.0 | 1053.0 | 1196.0 | 0.8804 | 0.8771 | 823.0 | 830.0 | 1267.0 | 0.6551 | 0.6496 |
| 0.0 | 28.0 | 280 | 1.8745 | 0.0059 | 6693.1717 | 4639.3531 | 1881.0 | 2475.0 | 0.7600 | 1877.0 | 0.7584 | 1048.0 | 1052.0 | 1196.0 | 0.8796 | 0.8763 | 821.0 | 829.0 | 1267.0 | 0.6543 | 0.6480 |
| 0.0 | 29.0 | 290 | 1.8742 | 0.0059 | 6692.1128 | 4638.6191 | 1879.0 | 2475.0 | 0.7592 | 1876.0 | 0.7580 | 1047.0 | 1051.0 | 1196.0 | 0.8788 | 0.8754 | 821.0 | 828.0 | 1267.0 | 0.6535 | 0.6480 |
| 0.8401 | 30.0 | 300 | 1.8717 | 0.0059 | 6683.2398 | 4632.4689 | 1882.0 | 2475.0 | 0.7604 | 1878.0 | 0.7588 | 1047.0 | 1051.0 | 1196.0 | 0.8788 | 0.8754 | 823.0 | 831.0 | 1267.0 | 0.6559 | 0.6496 |
| 0.0 | 31.0 | 310 | 1.8724 | 0.0059 | 6685.8100 | 4634.2504 | 1884.0 | 2475.0 | 0.7612 | 1882.0 | 0.7604 | 1049.0 | 1053.0 | 1196.0 | 0.8804 | 0.8771 | 825.0 | 831.0 | 1267.0 | 0.6559 | 0.6511 |
| 0.0 | 32.0 | 320 | 1.8741 | 0.0059 | 6691.6236 | 4638.2800 | 1880.0 | 2475.0 | 0.7596 | 1877.0 | 0.7584 | 1047.0 | 1051.0 | 1196.0 | 0.8788 | 0.8754 | 822.0 | 829.0 | 1267.0 | 0.6543 | 0.6488 |
| 0.0 | 33.0 | 330 | 1.8752 | 0.0059 | 6695.7522 | 4641.1417 | 1881.0 | 2475.0 | 0.7600 | 1877.0 | 0.7584 | 1046.0 | 1050.0 | 1196.0 | 0.8779 | 0.8746 | 823.0 | 831.0 | 1267.0 | 0.6559 | 0.6496 |
| 0.0 | 34.0 | 340 | 1.8739 | 0.0059 | 6691.0397 | 4637.8753 | 1883.0 | 2475.0 | 0.7608 | 1880.0 | 0.7596 | 1049.0 | 1053.0 | 1196.0 | 0.8804 | 0.8771 | 823.0 | 830.0 | 1267.0 | 0.6551 | 0.6496 |
| 0.0 | 35.0 | 350 | 1.8770 | 0.0059 | 6702.0900 | 4645.5348 | 1880.0 | 2475.0 | 0.7596 | 1876.0 | 0.7580 | 1046.0 | 1050.0 | 1196.0 | 0.8779 | 0.8746 | 822.0 | 830.0 | 1267.0 | 0.6551 | 0.6488 |
| 0.8401 | 36.0 | 360 | 1.8746 | 0.0059 | 6693.5160 | 4639.5917 | 1881.0 | 2475.0 | 0.7600 | 1878.0 | 0.7588 | 1046.0 | 1050.0 | 1196.0 | 0.8779 | 0.8746 | 824.0 | 831.0 | 1267.0 | 0.6559 | 0.6504 |
| 0.0 | 37.0 | 370 | 1.8764 | 0.0059 | 6699.8720 | 4643.9974 | 1882.0 | 2475.0 | 0.7604 | 1879.0 | 0.7592 | 1049.0 | 1053.0 | 1196.0 | 0.8804 | 0.8771 | 822.0 | 829.0 | 1267.0 | 0.6543 | 0.6488 |
| 0.0 | 38.0 | 380 | 1.8760 | 0.0059 | 6698.4345 | 4643.0010 | 1883.0 | 2475.0 | 0.7608 | 1880.0 | 0.7596 | 1049.0 | 1053.0 | 1196.0 | 0.8804 | 0.8771 | 823.0 | 830.0 | 1267.0 | 0.6551 | 0.6496 |
| 0.8401 | 39.0 | 390 | 1.8762 | 0.0059 | 6699.1595 | 4643.5035 | 1882.0 | 2475.0 | 0.7604 | 1876.0 | 0.7580 | 1047.0 | 1051.0 | 1196.0 | 0.8788 | 0.8754 | 821.0 | 831.0 | 1267.0 | 0.6559 | 0.6480 |
| 0.0 | 40.0 | 400 | 1.8766 | 0.0059 | 6700.5968 | 4644.4997 | 1880.0 | 2475.0 | 0.7596 | 1876.0 | 0.7580 | 1046.0 | 1050.0 | 1196.0 | 0.8779 | 0.8746 | 822.0 | 830.0 | 1267.0 | 0.6551 | 0.6488 |
| 0.0 | 41.0 | 410 | 1.8781 | 0.0059 | 6706.0725 | 4648.2953 | 1883.0 | 2475.0 | 0.7608 | 1880.0 | 0.7596 | 1048.0 | 1052.0 | 1196.0 | 0.8796 | 0.8763 | 824.0 | 831.0 | 1267.0 | 0.6559 | 0.6504 |
| 0.0 | 42.0 | 420 | 1.8763 | 0.0059 | 6699.8219 | 4643.9626 | 1882.0 | 2475.0 | 0.7604 | 1878.0 | 0.7588 | 1047.0 | 1051.0 | 1196.0 | 0.8788 | 0.8754 | 823.0 | 831.0 | 1267.0 | 0.6559 | 0.6496 |
| 0.0 | 43.0 | 430 | 1.8763 | 0.0059 | 6699.6423 | 4643.8382 | 1880.0 | 2475.0 | 0.7596 | 1876.0 | 0.7580 | 1048.0 | 1052.0 | 1196.0 | 0.8796 | 0.8763 | 820.0 | 828.0 | 1267.0 | 0.6535 | 0.6472 |
| 0.0 | 44.0 | 440 | 1.8772 | 0.0059 | 6702.7796 | 4646.0128 | 1881.0 | 2475.0 | 0.7600 | 1878.0 | 0.7588 | 1046.0 | 1050.0 | 1196.0 | 0.8779 | 0.8746 | 824.0 | 831.0 | 1267.0 | 0.6559 | 0.6504 |

The evaluation results reported at the top of this card correspond to the epoch-14 checkpoint (validation loss 1.5355, accuracy 0.7681), which gives the highest evaluation accuracy in the table.

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1