GSM8K-Binary_Llama-3.2-1B-cmatu09u

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unspecified dataset (the model name suggests a binary-label variant of GSM8K). It achieves the following results on the evaluation set:

  • Loss: 1.3294
  • Model Preparation Time: 0.0057
  • Mdl: 4746.8979
  • Accumulated Loss: 3290.2989
  • Correct Preds: 1929.0
  • Total Preds: 2475.0
  • Accuracy: 0.7794
  • Correct Gen Preds: 1897.0
  • Gen Accuracy: 0.7665
  • Correct Gen Preds 34192: 1020.0
  • Correct Preds 34192: 1040.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.8696
  • Gen Accuracy 34192: 0.8528
  • Correct Gen Preds 41568: 869.0
  • Correct Preds 41568: 889.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.7017
  • Gen Accuracy 41568: 0.6859
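Up to rounding, the headline numbers above are internally consistent: accuracy is correct predictions over total predictions, the accumulated loss matches the mean evaluation loss times the number of predictions, and the MDL figure appears to be the accumulated loss (in nats) converted to bits. A minimal sketch checking these relations, with the values copied from the list above:

```python
import math

# Reported eval-set figures (copied from the list above).
loss = 1.3294              # mean cross-entropy per prediction, in nats
accumulated_loss = 3290.2989
mdl = 4746.8979            # description length, in bits
correct_preds = 1929.0
total_preds = 2475.0
accuracy = 0.7794

# Accuracy is the fraction of correct predictions.
assert abs(correct_preds / total_preds - accuracy) < 5e-5

# Accumulated loss ~= mean loss x number of predictions
# (loose tolerance because the mean loss is rounded to 4 decimals).
assert abs(loss * total_preds - accumulated_loss) < 0.5

# MDL in bits ~= accumulated loss in nats divided by ln 2.
assert abs(accumulated_loss / math.log(2) - mdl) < 0.01
```

The same relations hold row by row in the training-results table below (e.g. Gen Accuracy is Correct Gen Preds over Total Preds).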

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
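To make the schedule concrete: with `cosine` and `warmup_ratio=0.01`, the learning rate ramps linearly to 2e-05 during the first 1% of steps and then follows a cosine decay toward zero. This sketch assumes the standard transformers-style cosine-with-warmup shape; the step counts are inferred from the training log below (20 optimizer steps per epoch, so 2,000 steps for the configured 100 epochs, giving a 20-step warmup):

```python
import math

def cosine_lr(step, total_steps=2000, base_lr=2e-5, warmup_ratio=0.01):
    """Cosine schedule with linear warmup (a sketch; step counts are
    inferred from the training log, not taken from the training code)."""
    warmup_steps = int(total_steps * warmup_ratio)  # 20 steps here
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)  # linear ramp-up
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))  # decay to 0

peak = cosine_lr(20)     # base_lr, reached at the end of warmup
final = cosine_lr(2000)  # ~0 at the last scheduled step
```

Note that the log below stops at epoch 36, well short of the configured 100 epochs, so training presumably ended early and the schedule never reached its tail.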

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| No log | 0 | 0 | 1.4656 | 0.0057 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.7705 | 1.0 | 20 | 0.7229 | 0.0057 | 2581.2769 | 1789.2048 | 1387.0 | 2475.0 | 0.5604 | 1395.0 | 0.5636 | 151.0 | 151.0 | 1196.0 | 0.1263 | 0.1263 | 1236.0 | 1236.0 | 1267.0 | 0.9755 | 0.9755 |
| 0.7361 | 2.0 | 40 | 0.8157 | 0.0057 | 2912.7554 | 2018.9682 | 1357.0 | 2475.0 | 0.5483 | 12.0 | 0.0048 | 0.0 | 105.0 | 1196.0 | 0.0878 | 0.0 | 3.0 | 1252.0 | 1267.0 | 0.9882 | 0.0024 |
| 0.7985 | 3.0 | 60 | 0.6055 | 0.0057 | 2161.9344 | 1498.5388 | 1805.0 | 2475.0 | 0.7293 | 9.0 | 0.0036 | 0.0 | 697.0 | 1196.0 | 0.5828 | 0.0 | 0.0 | 1108.0 | 1267.0 | 0.8745 | 0.0 |
| 1.027 | 4.0 | 80 | 0.6487 | 0.0057 | 2316.2232 | 1605.4836 | 1907.0 | 2475.0 | 0.7705 | 827.0 | 0.3341 | 504.0 | 1037.0 | 1196.0 | 0.8671 | 0.4214 | 315.0 | 870.0 | 1267.0 | 0.6867 | 0.2486 |
| 1.0135 | 5.0 | 100 | 0.8148 | 0.0057 | 2909.2559 | 2016.5426 | 1876.0 | 2475.0 | 0.7580 | 444.0 | 0.1794 | 267.0 | 1061.0 | 1196.0 | 0.8871 | 0.2232 | 169.0 | 815.0 | 1267.0 | 0.6433 | 0.1334 |
| 0.6827 | 6.0 | 120 | 1.0484 | 0.0057 | 3743.6020 | 2594.8672 | 1910.0 | 2475.0 | 0.7717 | 1146.0 | 0.4630 | 604.0 | 1013.0 | 1196.0 | 0.8470 | 0.5050 | 534.0 | 897.0 | 1267.0 | 0.7080 | 0.4215 |
| 0.494 | 7.0 | 140 | 1.1964 | 0.0057 | 4272.1202 | 2961.2080 | 1753.0 | 2475.0 | 0.7083 | 1133.0 | 0.4578 | 802.0 | 1130.0 | 1196.0 | 0.9448 | 0.6706 | 322.0 | 623.0 | 1267.0 | 0.4917 | 0.2541 |
| 0.0061 | 8.0 | 160 | 1.0888 | 0.0057 | 3887.6479 | 2694.7122 | 1891.0 | 2475.0 | 0.7640 | 1312.0 | 0.5301 | 750.0 | 1062.0 | 1196.0 | 0.8880 | 0.6271 | 554.0 | 829.0 | 1267.0 | 0.6543 | 0.4373 |
| 0.0001 | 9.0 | 180 | 1.2353 | 0.0057 | 4410.9305 | 3057.4240 | 1894.0 | 2475.0 | 0.7653 | 1628.0 | 0.6578 | 928.0 | 1059.0 | 1196.0 | 0.8855 | 0.7759 | 690.0 | 835.0 | 1267.0 | 0.6590 | 0.5446 |
| 0.0352 | 10.0 | 200 | 1.3294 | 0.0057 | 4746.8979 | 3290.2989 | 1929.0 | 2475.0 | 0.7794 | 1897.0 | 0.7665 | 1020.0 | 1040.0 | 1196.0 | 0.8696 | 0.8528 | 869.0 | 889.0 | 1267.0 | 0.7017 | 0.6859 |
| 0.0002 | 11.0 | 220 | 1.6740 | 0.0057 | 5977.3293 | 4143.1690 | 1873.0 | 2475.0 | 0.7568 | 1862.0 | 0.7523 | 1066.0 | 1070.0 | 1196.0 | 0.8946 | 0.8913 | 788.0 | 803.0 | 1267.0 | 0.6338 | 0.6219 |
| 0.0 | 12.0 | 240 | 1.6150 | 0.0057 | 5766.6040 | 3997.1053 | 1899.0 | 2475.0 | 0.7673 | 1890.0 | 0.7636 | 1039.0 | 1044.0 | 1196.0 | 0.8729 | 0.8687 | 843.0 | 855.0 | 1267.0 | 0.6748 | 0.6654 |
| 0.0 | 13.0 | 260 | 1.5999 | 0.0057 | 5712.7643 | 3959.7865 | 1901.0 | 2475.0 | 0.7681 | 1893.0 | 0.7648 | 1034.0 | 1039.0 | 1196.0 | 0.8687 | 0.8645 | 851.0 | 862.0 | 1267.0 | 0.6803 | 0.6717 |
| 0.0 | 14.0 | 280 | 1.5964 | 0.0057 | 5700.2985 | 3951.1459 | 1900.0 | 2475.0 | 0.7677 | 1893.0 | 0.7648 | 1032.0 | 1037.0 | 1196.0 | 0.8671 | 0.8629 | 853.0 | 863.0 | 1267.0 | 0.6811 | 0.6732 |
| 0.0 | 15.0 | 300 | 1.5956 | 0.0057 | 5697.2293 | 3949.0184 | 1900.0 | 2475.0 | 0.7677 | 1893.0 | 0.7648 | 1031.0 | 1036.0 | 1196.0 | 0.8662 | 0.8620 | 854.0 | 864.0 | 1267.0 | 0.6819 | 0.6740 |
| 0.0 | 16.0 | 320 | 1.5949 | 0.0057 | 5694.9231 | 3947.4199 | 1903.0 | 2475.0 | 0.7689 | 1895.0 | 0.7657 | 1031.0 | 1036.0 | 1196.0 | 0.8662 | 0.8620 | 856.0 | 867.0 | 1267.0 | 0.6843 | 0.6756 |
| 0.0 | 17.0 | 340 | 1.5931 | 0.0057 | 5688.4595 | 3942.9397 | 1903.0 | 2475.0 | 0.7689 | 1895.0 | 0.7657 | 1030.0 | 1035.0 | 1196.0 | 0.8654 | 0.8612 | 857.0 | 868.0 | 1267.0 | 0.6851 | 0.6764 |
| 0.0 | 18.0 | 360 | 1.5933 | 0.0057 | 5689.2287 | 3943.4728 | 1898.0 | 2475.0 | 0.7669 | 1892.0 | 0.7644 | 1028.0 | 1033.0 | 1196.0 | 0.8637 | 0.8595 | 856.0 | 865.0 | 1267.0 | 0.6827 | 0.6756 |
| 0.0 | 19.0 | 380 | 1.5946 | 0.0057 | 5693.9438 | 3946.7411 | 1898.0 | 2475.0 | 0.7669 | 1892.0 | 0.7644 | 1028.0 | 1033.0 | 1196.0 | 0.8637 | 0.8595 | 856.0 | 865.0 | 1267.0 | 0.6827 | 0.6756 |
| 0.0 | 20.0 | 400 | 1.5934 | 0.0057 | 5689.5182 | 3943.6735 | 1899.0 | 2475.0 | 0.7673 | 1893.0 | 0.7648 | 1026.0 | 1031.0 | 1196.0 | 0.8620 | 0.8579 | 859.0 | 868.0 | 1267.0 | 0.6851 | 0.6780 |
| 0.0 | 21.0 | 420 | 1.5928 | 0.0057 | 5687.2881 | 3942.1277 | 1900.0 | 2475.0 | 0.7677 | 1893.0 | 0.7648 | 1029.0 | 1033.0 | 1196.0 | 0.8637 | 0.8604 | 856.0 | 867.0 | 1267.0 | 0.6843 | 0.6756 |
| 0.0 | 22.0 | 440 | 1.5908 | 0.0057 | 5680.3600 | 3937.3255 | 1904.0 | 2475.0 | 0.7693 | 1897.0 | 0.7665 | 1028.0 | 1033.0 | 1196.0 | 0.8637 | 0.8595 | 861.0 | 871.0 | 1267.0 | 0.6875 | 0.6796 |
| 0.0 | 23.0 | 460 | 1.5912 | 0.0057 | 5681.5422 | 3938.1450 | 1903.0 | 2475.0 | 0.7689 | 1896.0 | 0.7661 | 1027.0 | 1032.0 | 1196.0 | 0.8629 | 0.8587 | 861.0 | 871.0 | 1267.0 | 0.6875 | 0.6796 |
| 0.0 | 24.0 | 480 | 1.5899 | 0.0057 | 5676.8926 | 3934.9221 | 1902.0 | 2475.0 | 0.7685 | 1895.0 | 0.7657 | 1026.0 | 1031.0 | 1196.0 | 0.8620 | 0.8579 | 861.0 | 871.0 | 1267.0 | 0.6875 | 0.6796 |
| 0.0 | 25.0 | 500 | 1.5925 | 0.0057 | 5686.1677 | 3941.3511 | 1898.0 | 2475.0 | 0.7669 | 1891.0 | 0.7640 | 1023.0 | 1028.0 | 1196.0 | 0.8595 | 0.8554 | 860.0 | 870.0 | 1267.0 | 0.6867 | 0.6788 |
| 0.0 | 26.0 | 520 | 1.5916 | 0.0057 | 5682.9516 | 3939.1219 | 1899.0 | 2475.0 | 0.7673 | 1893.0 | 0.7648 | 1023.0 | 1028.0 | 1196.0 | 0.8595 | 0.8554 | 862.0 | 871.0 | 1267.0 | 0.6875 | 0.6803 |
| 0.0 | 27.0 | 540 | 1.5918 | 0.0057 | 5683.6733 | 3939.6221 | 1906.0 | 2475.0 | 0.7701 | 1900.0 | 0.7677 | 1026.0 | 1031.0 | 1196.0 | 0.8620 | 0.8579 | 866.0 | 875.0 | 1267.0 | 0.6906 | 0.6835 |
| 0.0 | 28.0 | 560 | 1.5920 | 0.0057 | 5684.3380 | 3940.0829 | 1902.0 | 2475.0 | 0.7685 | 1896.0 | 0.7661 | 1024.0 | 1028.0 | 1196.0 | 0.8595 | 0.8562 | 864.0 | 874.0 | 1267.0 | 0.6898 | 0.6819 |
| 0.0 | 29.0 | 580 | 1.5914 | 0.0057 | 5682.5389 | 3938.8358 | 1905.0 | 2475.0 | 0.7697 | 1899.0 | 0.7673 | 1024.0 | 1029.0 | 1196.0 | 0.8604 | 0.8562 | 867.0 | 876.0 | 1267.0 | 0.6914 | 0.6843 |
| 0.6534 | 30.0 | 600 | 1.5917 | 0.0057 | 5683.5174 | 3939.5141 | 1903.0 | 2475.0 | 0.7689 | 1898.0 | 0.7669 | 1024.0 | 1028.0 | 1196.0 | 0.8595 | 0.8562 | 866.0 | 875.0 | 1267.0 | 0.6906 | 0.6835 |
| 0.6535 | 31.0 | 620 | 1.5925 | 0.0057 | 5686.4309 | 3941.5335 | 1904.0 | 2475.0 | 0.7693 | 1899.0 | 0.7673 | 1022.0 | 1027.0 | 1196.0 | 0.8587 | 0.8545 | 869.0 | 877.0 | 1267.0 | 0.6922 | 0.6859 |
| 0.0 | 32.0 | 640 | 1.5930 | 0.0057 | 5688.0845 | 3942.6797 | 1904.0 | 2475.0 | 0.7693 | 1898.0 | 0.7669 | 1024.0 | 1029.0 | 1196.0 | 0.8604 | 0.8562 | 866.0 | 875.0 | 1267.0 | 0.6906 | 0.6835 |
| 0.0 | 33.0 | 660 | 1.5921 | 0.0057 | 5684.7574 | 3940.3736 | 1908.0 | 2475.0 | 0.7709 | 1902.0 | 0.7685 | 1024.0 | 1028.0 | 1196.0 | 0.8595 | 0.8562 | 870.0 | 880.0 | 1267.0 | 0.6946 | 0.6867 |
| 0.0 | 34.0 | 680 | 1.5929 | 0.0057 | 5687.8234 | 3942.4988 | 1904.0 | 2475.0 | 0.7693 | 1898.0 | 0.7669 | 1023.0 | 1028.0 | 1196.0 | 0.8595 | 0.8554 | 867.0 | 876.0 | 1267.0 | 0.6914 | 0.6843 |
| 0.0 | 35.0 | 700 | 1.5931 | 0.0057 | 5688.4900 | 3942.9608 | 1907.0 | 2475.0 | 0.7705 | 1902.0 | 0.7685 | 1024.0 | 1029.0 | 1196.0 | 0.8604 | 0.8562 | 870.0 | 878.0 | 1267.0 | 0.6930 | 0.6867 |
| 0.0 | 36.0 | 720 | 1.5928 | 0.0057 | 5687.3227 | 3942.1517 | 1902.0 | 2475.0 | 0.7685 | 1896.0 | 0.7661 | 1021.0 | 1026.0 | 1196.0 | 0.8579 | 0.8537 | 867.0 | 876.0 | 1267.0 | 0.6914 | 0.6843 |

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1