GSM8K-Binary_Llama-3.2-1B-t1es2358

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unspecified dataset (the model name suggests a binary-answer variant of GSM8K). It achieves the following results on the evaluation set; metrics suffixed with 34192 and 41568 appear to be per-class breakdowns keyed by the two label token IDs, and the sketch after the list shows how the summary numbers relate:

  • Loss: 1.6948
  • Model Preparation Time: 0.0057
  • Mdl: 6051.5793
  • Accumulated Loss: 4194.6351
  • Correct Preds: 1929.0
  • Total Preds: 2475.0
  • Accuracy: 0.7794
  • Correct Gen Preds: 1931.0
  • Gen Accuracy: 0.7802
  • Correct Gen Preds 34192: 1030.0
  • Correct Preds 34192: 1033.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.8637
  • Gen Accuracy 34192: 0.8612
  • Correct Gen Preds 41568: 893.0
  • Correct Preds 41568: 896.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.7072
  • Gen Accuracy 41568: 0.7048
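
The summary numbers above are internally consistent. The following minimal sketch shows how they appear to relate (an inference from the values, not documented by the card): "Loss" looks like the mean negative log-likelihood per prediction in nats, and "Mdl" the same accumulated total expressed in bits.

```python
import math

# Values copied from the evaluation summary above.
accumulated_loss = 4194.6351  # total NLL over the eval set, in nats
total_preds = 2475
correct_preds = 1929

print(accumulated_loss / total_preds)   # ~1.6948 -> matches "Loss"
print(accumulated_loss / math.log(2))   # ~6051.58 -> matches "Mdl" (bits)
print(correct_preds / total_preds)      # ~0.7794 -> matches "Accuracy"
```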

Model description

More information needed

Intended uses & limitations

More information needed
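
Pending an official description, the sketch below shows one plausible way to query the model. It is a hypothetical example, not the author's documented usage: the prompt format is an assumption, and the label token IDs 34192 and 41568 are taken from the per-class metric names above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "donoway/GSM8K-Binary_Llama-3.2-1B-t1es2358"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)
model.eval()

# Hypothetical prompt; the actual training format is undocumented.
prompt = "Question: ...\nProposed answer: ...\nIs the answer correct?"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # next-token logits

# Score only the two label token IDs seen in the evaluation metrics.
label_ids = [34192, 41568]
pred_id = label_ids[int(torch.argmax(logits[label_ids]))]
print(pred_id, tokenizer.decode([pred_id]))
```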

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (mirrored in the hedged sketch after this list):

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
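
For reference, here is a minimal sketch of matching Hugging Face TrainingArguments, assuming the standard Trainer was used (the card does not confirm the training script; output_dir and bf16 are assumptions):

```python
from transformers import TrainingArguments

# A hedged reconstruction of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="GSM8K-Binary_Llama-3.2-1B-t1es2358",  # assumption
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",          # AdamW; betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
    bf16=True,                    # assumption: checkpoint tensors are BF16
)
```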

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No log | 0 | 0 | 1.4656 | 0.0057 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.6955 | 1.0 | 16 | 2.7154 | 0.0057 | 9695.7816 | 6720.6037 | 1267.0 | 2475.0 | 0.5119 | 792.0 | 0.32 | 0.0 | 0.0 | 1196.0 | 0.0 | 0.0 | 784.0 | 1267.0 | 1267.0 | 1.0 | 0.6188 |
| 0.533 | 2.0 | 32 | 0.6027 | 0.0057 | 2152.0001 | 1491.6528 | 1775.0 | 2475.0 | 0.7172 | 61.0 | 0.0246 | 0.0 | 775.0 | 1196.0 | 0.6480 | 0.0 | 53.0 | 1000.0 | 1267.0 | 0.7893 | 0.0418 |
| 1.3715 | 3.0 | 48 | 0.5791 | 0.0057 | 2067.9529 | 1433.3957 | 1822.0 | 2475.0 | 0.7362 | 109.0 | 0.0440 | 0.0 | 833.0 | 1196.0 | 0.6965 | 0.0 | 101.0 | 989.0 | 1267.0 | 0.7806 | 0.0797 |
| 0.2271 | 4.0 | 64 | 0.9926 | 0.0057 | 3544.4020 | 2456.7922 | 1577.0 | 2475.0 | 0.6372 | 768.0 | 0.3103 | 14.0 | 363.0 | 1196.0 | 0.3035 | 0.0117 | 746.0 | 1214.0 | 1267.0 | 0.9582 | 0.5888 |
| 0.2091 | 5.0 | 80 | 0.8711 | 0.0057 | 3110.5432 | 2156.0643 | 1729.0 | 2475.0 | 0.6986 | 514.0 | 0.2077 | 21.0 | 589.0 | 1196.0 | 0.4925 | 0.0176 | 485.0 | 1140.0 | 1267.0 | 0.8998 | 0.3828 |
| 0.2002 | 6.0 | 96 | 0.8098 | 0.0057 | 2891.4579 | 2004.2059 | 1784.0 | 2475.0 | 0.7208 | 758.0 | 0.3063 | 196.0 | 694.0 | 1196.0 | 0.5803 | 0.1639 | 554.0 | 1090.0 | 1267.0 | 0.8603 | 0.4373 |
| 2.1977 | 7.0 | 112 | 0.8017 | 0.0057 | 2862.7237 | 1984.2889 | 1898.0 | 2475.0 | 0.7669 | 1637.0 | 0.6614 | 732.0 | 932.0 | 1196.0 | 0.7793 | 0.6120 | 897.0 | 966.0 | 1267.0 | 0.7624 | 0.7080 |
| 0.0003 | 8.0 | 128 | 1.1076 | 0.0057 | 3954.8343 | 2741.2823 | 1916.0 | 2475.0 | 0.7741 | 1823.0 | 0.7366 | 918.0 | 987.0 | 1196.0 | 0.8253 | 0.7676 | 897.0 | 929.0 | 1267.0 | 0.7332 | 0.7080 |
| 0.0011 | 9.0 | 144 | 1.3468 | 0.0057 | 4808.8412 | 3333.2347 | 1907.0 | 2475.0 | 0.7705 | 1869.0 | 0.7552 | 1035.0 | 1060.0 | 1196.0 | 0.8863 | 0.8654 | 827.0 | 847.0 | 1267.0 | 0.6685 | 0.6527 |
| 0.0001 | 10.0 | 160 | 1.4107 | 0.0057 | 5037.1772 | 3491.5052 | 1923.0 | 2475.0 | 0.7770 | 1910.0 | 0.7717 | 955.0 | 966.0 | 1196.0 | 0.8077 | 0.7985 | 947.0 | 957.0 | 1267.0 | 0.7553 | 0.7474 |
| 0.0001 | 11.0 | 176 | 1.4842 | 0.0057 | 5299.6148 | 3673.4131 | 1868.0 | 2475.0 | 0.7547 | 1862.0 | 0.7523 | 835.0 | 842.0 | 1196.0 | 0.7040 | 0.6982 | 1019.0 | 1026.0 | 1267.0 | 0.8098 | 0.8043 |
| 0.0 | 12.0 | 192 | 1.5446 | 0.0057 | 5515.1040 | 3822.7788 | 1892.0 | 2475.0 | 0.7644 | 1895.0 | 0.7657 | 1065.0 | 1067.0 | 1196.0 | 0.8921 | 0.8905 | 823.0 | 825.0 | 1267.0 | 0.6511 | 0.6496 |
| 0.0 | 13.0 | 208 | 1.5332 | 0.0057 | 5474.7240 | 3794.7895 | 1870.0 | 2475.0 | 0.7556 | 1876.0 | 0.7580 | 792.0 | 792.0 | 1196.0 | 0.6622 | 0.6622 | 1077.0 | 1078.0 | 1267.0 | 0.8508 | 0.8500 |
| 0.0 | 14.0 | 224 | 1.9613 | 0.0057 | 7003.1659 | 4854.2247 | 1848.0 | 2475.0 | 0.7467 | 1840.0 | 0.7434 | 1100.0 | 1110.0 | 1196.0 | 0.9281 | 0.9197 | 733.0 | 738.0 | 1267.0 | 0.5825 | 0.5785 |
| 0.0 | 15.0 | 240 | 1.4236 | 0.0057 | 5083.1904 | 3523.3991 | 1917.0 | 2475.0 | 0.7745 | 1916.0 | 0.7741 | 916.0 | 921.0 | 1196.0 | 0.7701 | 0.7659 | 992.0 | 996.0 | 1267.0 | 0.7861 | 0.7830 |
| 1.0693 | 16.0 | 256 | 2.0197 | 0.0057 | 7211.7473 | 4998.8023 | 1853.0 | 2475.0 | 0.7487 | 1846.0 | 0.7459 | 1106.0 | 1116.0 | 1196.0 | 0.9331 | 0.9247 | 733.0 | 737.0 | 1267.0 | 0.5817 | 0.5785 |
| 0.0 | 17.0 | 272 | 1.7188 | 0.0057 | 6137.2676 | 4254.0297 | 1916.0 | 2475.0 | 0.7741 | 1919.0 | 0.7754 | 1042.0 | 1046.0 | 1196.0 | 0.8746 | 0.8712 | 869.0 | 870.0 | 1267.0 | 0.6867 | 0.6859 |
| 0.0 | 18.0 | 288 | 1.6905 | 0.0057 | 6036.3903 | 4184.1069 | 1928.0 | 2475.0 | 0.7790 | 1930.0 | 0.7798 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 894.0 | 896.0 | 1267.0 | 0.7072 | 0.7056 |
| 0.0 | 19.0 | 304 | 1.6884 | 0.0057 | 6028.7578 | 4178.8164 | 1928.0 | 2475.0 | 0.7790 | 1930.0 | 0.7798 | 1027.0 | 1031.0 | 1196.0 | 0.8620 | 0.8587 | 895.0 | 897.0 | 1267.0 | 0.7080 | 0.7064 |
| 1.0693 | 20.0 | 320 | 1.6903 | 0.0057 | 6035.5522 | 4183.5260 | 1927.0 | 2475.0 | 0.7786 | 1929.0 | 0.7794 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 893.0 | 895.0 | 1267.0 | 0.7064 | 0.7048 |
| 0.0 | 21.0 | 336 | 1.6904 | 0.0057 | 6035.8311 | 4183.7193 | 1926.0 | 2475.0 | 0.7782 | 1928.0 | 0.7790 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 22.0 | 352 | 1.6915 | 0.0057 | 6039.6899 | 4186.3941 | 1927.0 | 2475.0 | 0.7786 | 1929.0 | 0.7794 | 1029.0 | 1033.0 | 1196.0 | 0.8637 | 0.8604 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 23.0 | 368 | 1.6897 | 0.0057 | 6033.4480 | 4182.0675 | 1926.0 | 2475.0 | 0.7782 | 1928.0 | 0.7790 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 24.0 | 384 | 1.6894 | 0.0057 | 6032.2481 | 4181.2357 | 1925.0 | 2475.0 | 0.7778 | 1927.0 | 0.7786 | 1027.0 | 1031.0 | 1196.0 | 0.8620 | 0.8587 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 25.0 | 400 | 1.6915 | 0.0057 | 6039.9660 | 4186.5854 | 1926.0 | 2475.0 | 0.7782 | 1928.0 | 0.7790 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 26.0 | 416 | 1.6913 | 0.0057 | 6039.1520 | 4186.0212 | 1926.0 | 2475.0 | 0.7782 | 1928.0 | 0.7790 | 1027.0 | 1031.0 | 1196.0 | 0.8620 | 0.8587 | 893.0 | 895.0 | 1267.0 | 0.7064 | 0.7048 |
| 0.0 | 27.0 | 432 | 1.6897 | 0.0057 | 6033.4235 | 4182.0505 | 1925.0 | 2475.0 | 0.7778 | 1927.0 | 0.7786 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 891.0 | 893.0 | 1267.0 | 0.7048 | 0.7032 |
| 0.0 | 28.0 | 448 | 1.6935 | 0.0057 | 6046.8524 | 4191.3587 | 1925.0 | 2475.0 | 0.7778 | 1927.0 | 0.7786 | 1027.0 | 1031.0 | 1196.0 | 0.8620 | 0.8587 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 29.0 | 464 | 1.6911 | 0.0057 | 6038.2717 | 4185.4110 | 1927.0 | 2475.0 | 0.7786 | 1928.0 | 0.7790 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 893.0 | 895.0 | 1267.0 | 0.7064 | 0.7048 |
| 0.0 | 30.0 | 480 | 1.6925 | 0.0057 | 6043.4707 | 4189.0147 | 1924.0 | 2475.0 | 0.7774 | 1926.0 | 0.7782 | 1027.0 | 1031.0 | 1196.0 | 0.8620 | 0.8587 | 891.0 | 893.0 | 1267.0 | 0.7048 | 0.7032 |
| 0.0 | 31.0 | 496 | 1.6919 | 0.0057 | 6041.0498 | 4187.3366 | 1927.0 | 2475.0 | 0.7786 | 1929.0 | 0.7794 | 1029.0 | 1033.0 | 1196.0 | 0.8637 | 0.8604 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 32.0 | 512 | 1.6942 | 0.0057 | 6049.2594 | 4193.0271 | 1926.0 | 2475.0 | 0.7782 | 1929.0 | 0.7794 | 1030.0 | 1033.0 | 1196.0 | 0.8637 | 0.8612 | 891.0 | 893.0 | 1267.0 | 0.7048 | 0.7032 |
| 0.0 | 33.0 | 528 | 1.6910 | 0.0057 | 6038.0633 | 4185.2665 | 1925.0 | 2475.0 | 0.7778 | 1927.0 | 0.7786 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 891.0 | 893.0 | 1267.0 | 0.7048 | 0.7032 |
| 0.0 | 34.0 | 544 | 1.6925 | 0.0057 | 6043.4894 | 4189.0276 | 1926.0 | 2475.0 | 0.7782 | 1928.0 | 0.7790 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 35.0 | 560 | 1.6927 | 0.0057 | 6044.0617 | 4189.4243 | 1926.0 | 2475.0 | 0.7782 | 1929.0 | 0.7794 | 1030.0 | 1033.0 | 1196.0 | 0.8637 | 0.8612 | 891.0 | 893.0 | 1267.0 | 0.7048 | 0.7032 |
| 0.0 | 36.0 | 576 | 1.6948 | 0.0057 | 6051.5793 | 4194.6351 | 1929.0 | 2475.0 | 0.7794 | 1931.0 | 0.7802 | 1030.0 | 1033.0 | 1196.0 | 0.8637 | 0.8612 | 893.0 | 896.0 | 1267.0 | 0.7072 | 0.7048 |
| 0.0 | 37.0 | 592 | 1.6934 | 0.0057 | 6046.4448 | 4191.0762 | 1929.0 | 2475.0 | 0.7794 | 1931.0 | 0.7802 | 1030.0 | 1034.0 | 1196.0 | 0.8645 | 0.8612 | 893.0 | 895.0 | 1267.0 | 0.7064 | 0.7048 |
| 1.0693 | 38.0 | 608 | 1.6932 | 0.0057 | 6045.8882 | 4190.6903 | 1929.0 | 2475.0 | 0.7794 | 1931.0 | 0.7802 | 1031.0 | 1035.0 | 1196.0 | 0.8654 | 0.8620 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 39.0 | 624 | 1.6938 | 0.0057 | 6048.0484 | 4192.1877 | 1927.0 | 2475.0 | 0.7786 | 1929.0 | 0.7794 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 893.0 | 895.0 | 1267.0 | 0.7064 | 0.7048 |
| 1.0693 | 40.0 | 640 | 1.6934 | 0.0057 | 6046.4959 | 4191.1116 | 1925.0 | 2475.0 | 0.7778 | 1927.0 | 0.7786 | 1029.0 | 1033.0 | 1196.0 | 0.8637 | 0.8604 | 890.0 | 892.0 | 1267.0 | 0.7040 | 0.7024 |
| 2.1385 | 41.0 | 656 | 1.6937 | 0.0057 | 6047.7154 | 4191.9569 | 1926.0 | 2475.0 | 0.7782 | 1928.0 | 0.7790 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 42.0 | 672 | 1.6953 | 0.0057 | 6053.1946 | 4195.7548 | 1927.0 | 2475.0 | 0.7786 | 1929.0 | 0.7794 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 893.0 | 895.0 | 1267.0 | 0.7064 | 0.7048 |
| 0.0 | 43.0 | 688 | 1.6933 | 0.0057 | 6046.0399 | 4190.7955 | 1926.0 | 2475.0 | 0.7782 | 1928.0 | 0.7790 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 44.0 | 704 | 1.6956 | 0.0057 | 6054.3461 | 4196.5529 | 1924.0 | 2475.0 | 0.7774 | 1926.0 | 0.7782 | 1029.0 | 1033.0 | 1196.0 | 0.8637 | 0.8604 | 889.0 | 891.0 | 1267.0 | 0.7032 | 0.7017 |
| 0.0 | 45.0 | 720 | 1.6941 | 0.0057 | 6049.2096 | 4192.9925 | 1925.0 | 2475.0 | 0.7778 | 1927.0 | 0.7786 | 1027.0 | 1031.0 | 1196.0 | 0.8620 | 0.8587 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 46.0 | 736 | 1.6948 | 0.0057 | 6051.6480 | 4194.6827 | 1923.0 | 2475.0 | 0.7770 | 1926.0 | 0.7782 | 1027.0 | 1030.0 | 1196.0 | 0.8612 | 0.8587 | 891.0 | 893.0 | 1267.0 | 0.7048 | 0.7032 |
| 0.0 | 47.0 | 752 | 1.6936 | 0.0057 | 6047.2010 | 4191.6003 | 1925.0 | 2475.0 | 0.7778 | 1928.0 | 0.7790 | 1029.0 | 1032.0 | 1196.0 | 0.8629 | 0.8604 | 891.0 | 893.0 | 1267.0 | 0.7048 | 0.7032 |
| 0.0 | 48.0 | 768 | 1.6951 | 0.0057 | 6052.5512 | 4195.3088 | 1925.0 | 2475.0 | 0.7778 | 1928.0 | 0.7790 | 1030.0 | 1033.0 | 1196.0 | 0.8637 | 0.8612 | 890.0 | 892.0 | 1267.0 | 0.7040 | 0.7024 |
| 0.0 | 49.0 | 784 | 1.6953 | 0.0057 | 6053.3328 | 4195.8506 | 1928.0 | 2475.0 | 0.7790 | 1930.0 | 0.7798 | 1030.0 | 1034.0 | 1196.0 | 0.8645 | 0.8612 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 1.0693 | 50.0 | 800 | 1.6930 | 0.0057 | 6044.9666 | 4190.0515 | 1929.0 | 2475.0 | 0.7794 | 1931.0 | 0.7802 | 1029.0 | 1033.0 | 1196.0 | 0.8637 | 0.8604 | 894.0 | 896.0 | 1267.0 | 0.7072 | 0.7056 |
| 0.0 | 51.0 | 816 | 1.6963 | 0.0057 | 6057.0501 | 4198.4272 | 1925.0 | 2475.0 | 0.7778 | 1927.0 | 0.7786 | 1030.0 | 1034.0 | 1196.0 | 0.8645 | 0.8612 | 889.0 | 891.0 | 1267.0 | 0.7032 | 0.7017 |
| 1.0693 | 52.0 | 832 | 1.6956 | 0.0057 | 6054.2657 | 4196.4972 | 1929.0 | 2475.0 | 0.7794 | 1932.0 | 0.7806 | 1031.0 | 1034.0 | 1196.0 | 0.8645 | 0.8620 | 893.0 | 895.0 | 1267.0 | 0.7064 | 0.7048 |
| 0.0 | 53.0 | 848 | 1.6938 | 0.0057 | 6048.0179 | 4192.1666 | 1929.0 | 2475.0 | 0.7794 | 1931.0 | 0.7802 | 1030.0 | 1034.0 | 1196.0 | 0.8645 | 0.8612 | 893.0 | 895.0 | 1267.0 | 0.7064 | 0.7048 |
| 1.0693 | 54.0 | 864 | 1.6953 | 0.0057 | 6053.2072 | 4195.7635 | 1927.0 | 2475.0 | 0.7786 | 1928.0 | 0.7790 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 893.0 | 895.0 | 1267.0 | 0.7064 | 0.7048 |
| 0.0 | 55.0 | 880 | 1.6969 | 0.0057 | 6059.0550 | 4199.8169 | 1921.0 | 2475.0 | 0.7762 | 1923.0 | 0.7770 | 1027.0 | 1031.0 | 1196.0 | 0.8620 | 0.8587 | 888.0 | 890.0 | 1267.0 | 0.7024 | 0.7009 |
| 1.0693 | 56.0 | 896 | 1.6933 | 0.0057 | 6046.1288 | 4190.8571 | 1925.0 | 2475.0 | 0.7778 | 1927.0 | 0.7786 | 1029.0 | 1033.0 | 1196.0 | 0.8637 | 0.8604 | 890.0 | 892.0 | 1267.0 | 0.7040 | 0.7024 |
| 0.0 | 57.0 | 912 | 1.6951 | 0.0057 | 6052.4662 | 4195.2499 | 1925.0 | 2475.0 | 0.7778 | 1927.0 | 0.7786 | 1029.0 | 1033.0 | 1196.0 | 0.8637 | 0.8604 | 890.0 | 892.0 | 1267.0 | 0.7040 | 0.7024 |
| 0.0 | 58.0 | 928 | 1.6964 | 0.0057 | 6057.2886 | 4198.5925 | 1925.0 | 2475.0 | 0.7778 | 1927.0 | 0.7786 | 1029.0 | 1033.0 | 1196.0 | 0.8637 | 0.8604 | 890.0 | 892.0 | 1267.0 | 0.7040 | 0.7024 |
| 1.0693 | 59.0 | 944 | 1.6958 | 0.0057 | 6055.0135 | 4197.0155 | 1927.0 | 2475.0 | 0.7786 | 1929.0 | 0.7794 | 1030.0 | 1034.0 | 1196.0 | 0.8645 | 0.8612 | 891.0 | 893.0 | 1267.0 | 0.7048 | 0.7032 |
| 0.0 | 60.0 | 960 | 1.6954 | 0.0057 | 6053.5372 | 4195.9922 | 1927.0 | 2475.0 | 0.7786 | 1930.0 | 0.7798 | 1031.0 | 1034.0 | 1196.0 | 0.8645 | 0.8620 | 891.0 | 893.0 | 1267.0 | 0.7048 | 0.7032 |
| 0.0 | 61.0 | 976 | 1.6950 | 0.0057 | 6052.3097 | 4195.1414 | 1927.0 | 2475.0 | 0.7786 | 1929.0 | 0.7794 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 893.0 | 895.0 | 1267.0 | 0.7064 | 0.7048 |
| 0.0 | 62.0 | 992 | 1.6957 | 0.0057 | 6054.6858 | 4196.7884 | 1926.0 | 2475.0 | 0.7782 | 1928.0 | 0.7790 | 1028.0 | 1032.0 | 1196.0 | 0.8629 | 0.8595 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 63.0 | 1008 | 1.6960 | 0.0057 | 6055.8022 | 4197.5622 | 1923.0 | 2475.0 | 0.7770 | 1925.0 | 0.7778 | 1027.0 | 1031.0 | 1196.0 | 0.8620 | 0.8587 | 890.0 | 892.0 | 1267.0 | 0.7040 | 0.7024 |
| 2.1385 | 64.0 | 1024 | 1.6980 | 0.0057 | 6062.8445 | 4202.4436 | 1925.0 | 2475.0 | 0.7778 | 1927.0 | 0.7786 | 1027.0 | 1031.0 | 1196.0 | 0.8620 | 0.8587 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 65.0 | 1040 | 1.6945 | 0.0057 | 6050.4921 | 4193.8815 | 1927.0 | 2475.0 | 0.7786 | 1929.0 | 0.7794 | 1029.0 | 1033.0 | 1196.0 | 0.8637 | 0.8604 | 892.0 | 894.0 | 1267.0 | 0.7056 | 0.7040 |
| 0.0 | 66.0 | 1056 | 1.6963 | 0.0057 | 6057.0681 | 4198.4397 | 1926.0 | 2475.0 | 0.7782 | 1928.0 | 0.7790 | 1029.0 | 1033.0 | 1196.0 | 0.8637 | 0.8604 | 891.0 | 893.0 | 1267.0 | 0.7048 | 0.7032 |

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1