GSM8K-Binary_Llama-3.2-1B-jevfwxa5

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unspecified dataset (the model name suggests a binary-answer variant of GSM8K). It achieves the following results on the evaluation set, matching the epoch-18 checkpoint in the training results below (a short sketch after the list shows how the accuracy figures follow from the raw counts):

  • Loss: 1.3576
  • Model Preparation Time: 0.0059
  • Mdl: 4847.5727
  • Accumulated Loss: 3360.0814
  • Correct Preds: 1896.0
  • Total Preds: 2475.0
  • Accuracy: 0.7661
  • Correct Gen Preds: 1904.0
  • Gen Accuracy: 0.7693
  • Correct Gen Preds 34192: 961.0
  • Correct Preds 34192: 961.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.8035
  • Gen Accuracy 34192: 0.8035
  • Correct Gen Preds 41568: 935.0
  • Correct Preds 41568: 935.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.7380
  • Gen Accuracy 41568: 0.7380
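
The headline numbers above follow directly from the raw counts. A minimal sketch in plain Python (values copied from the list) that reproduces them:

```python
# Reproduce the reported accuracies from the raw counts listed above.
correct_preds, total_preds = 1896, 2475
correct_gen_preds = 1904

print(f"{correct_preds / total_preds:.4f}")      # 0.7661 (Accuracy)
print(f"{correct_gen_preds / total_preds:.4f}")  # 0.7693 (Gen Accuracy)

# Per-label counts; labels are keyed by token id in the metric names.
correct_34192, total_34192 = 961, 1196
correct_41568, total_41568 = 935, 1267

print(f"{correct_34192 / total_34192:.4f}")      # 0.8035 (Accuracy 34192)
print(f"{correct_41568 / total_41568:.4f}")      # 0.7380 (Accuracy 41568)

# The per-label correct counts sum to the overall correct count.
assert correct_34192 + correct_41568 == correct_preds
```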

Model description

More information needed

Intended uses & limitations

More information needed
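
The author has not documented the intended prompt format or usage. As a placeholder, here is a minimal inference sketch; the Hub repo id, the bfloat16 dtype, and the example prompt are assumptions, not documented behavior:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the checkpoint is hosted under this repo id and loads in bfloat16.
model_id = "donoway/GSM8K-Binary_Llama-3.2-1B-jevfwxa5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

# Placeholder prompt; the expected GSM8K-Binary input format is not documented here.
prompt = "Question: A toy costs $3. How much do 4 toys cost? Proposed answer: 12. Is this correct?"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```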

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
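
These values map onto the standard Hugging Face TrainingArguments fields. A sketch of an equivalent configuration, assuming the usual Trainer setup (the original training script is not provided, so this is illustrative only):

```python
from transformers import TrainingArguments

# Illustrative reconstruction of the hyperparameters listed above; the actual
# training script may set additional options not shown in this card.
training_args = TrainingArguments(
    output_dir="GSM8K-Binary_Llama-3.2-1B-jevfwxa5",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",        # AdamW, PyTorch implementation
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```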

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0059 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.872 | 1.0 | 8 | 0.7247 | 0.0059 | 2587.5709 | 1793.5675 | 1455.0 | 2475.0 | 0.5879 | 8.0 | 0.0032 | 0.0 | 481.0 | 1196.0 | 0.4022 | 0.0 | 0.0 | 974.0 | 1267.0 | 0.7687 | 0.0 |
| 0.6521 | 2.0 | 16 | 0.6294 | 0.0059 | 2247.2598 | 1557.6818 | 1770.0 | 2475.0 | 0.7152 | 10.0 | 0.0040 | 0.0 | 912.0 | 1196.0 | 0.7625 | 0.0 | 2.0 | 858.0 | 1267.0 | 0.6772 | 0.0016 |
| 0.473 | 3.0 | 24 | 0.7213 | 0.0059 | 2575.3764 | 1785.1149 | 1616.0 | 2475.0 | 0.6529 | 14.0 | 0.0057 | 0.0 | 1160.0 | 1196.0 | 0.9699 | 0.0 | 6.0 | 456.0 | 1267.0 | 0.3599 | 0.0047 |
| 0.2653 | 4.0 | 32 | 0.5926 | 0.0059 | 2115.9963 | 1466.6969 | 1824.0 | 2475.0 | 0.7370 | 309.0 | 0.1248 | 70.0 | 1037.0 | 1196.0 | 0.8671 | 0.0585 | 231.0 | 787.0 | 1267.0 | 0.6212 | 0.1823 |
| 1.1097 | 5.0 | 40 | 0.7075 | 0.0059 | 2526.3900 | 1751.1601 | 1843.0 | 2475.0 | 0.7446 | 528.0 | 0.2133 | 39.0 | 823.0 | 1196.0 | 0.6881 | 0.0326 | 481.0 | 1020.0 | 1267.0 | 0.8051 | 0.3796 |
| 0.1011 | 6.0 | 48 | 0.7255 | 0.0059 | 2590.4327 | 1795.5511 | 1866.0 | 2475.0 | 0.7539 | 990.0 | 0.4 | 552.0 | 1056.0 | 1196.0 | 0.8829 | 0.4615 | 430.0 | 810.0 | 1267.0 | 0.6393 | 0.3394 |
| 0.0076 | 7.0 | 56 | 0.9199 | 0.0059 | 3284.7845 | 2276.8391 | 1863.0 | 2475.0 | 0.7527 | 1675.0 | 0.6768 | 923.0 | 998.0 | 1196.0 | 0.8344 | 0.7717 | 744.0 | 865.0 | 1267.0 | 0.6827 | 0.5872 |
| 0.1554 | 8.0 | 64 | 2.4840 | 0.0059 | 8869.4230 | 6147.8155 | 1658.0 | 2475.0 | 0.6699 | 1518.0 | 0.6133 | 1106.0 | 1160.0 | 1196.0 | 0.9699 | 0.9247 | 405.0 | 498.0 | 1267.0 | 0.3931 | 0.3197 |
| 0.0043 | 9.0 | 72 | 1.8407 | 0.0059 | 6572.5429 | 4555.7396 | 1833.0 | 2475.0 | 0.7406 | 1810.0 | 0.7313 | 1104.0 | 1109.0 | 1196.0 | 0.9273 | 0.9231 | 698.0 | 724.0 | 1267.0 | 0.5714 | 0.5509 |
| 0.0012 | 10.0 | 80 | 1.3940 | 0.0059 | 4977.4916 | 3450.1343 | 1856.0 | 2475.0 | 0.7499 | 1834.0 | 0.7410 | 1030.0 | 1040.0 | 1196.0 | 0.8696 | 0.8612 | 796.0 | 816.0 | 1267.0 | 0.6440 | 0.6283 |
| 1.8099 | 11.0 | 88 | 1.1367 | 0.0059 | 4058.7883 | 2813.3377 | 1833.0 | 2475.0 | 0.7406 | 1769.0 | 0.7147 | 754.0 | 784.0 | 1196.0 | 0.6555 | 0.6304 | 1007.0 | 1049.0 | 1267.0 | 0.8279 | 0.7948 |
| 0.0013 | 12.0 | 96 | 1.4702 | 0.0059 | 5249.6676 | 3638.7923 | 1824.0 | 2475.0 | 0.7370 | 1802.0 | 0.7281 | 1073.0 | 1078.0 | 1196.0 | 0.9013 | 0.8972 | 721.0 | 746.0 | 1267.0 | 0.5888 | 0.5691 |
| 0.9049 | 13.0 | 104 | 1.2923 | 0.0059 | 4614.4626 | 3198.5017 | 1882.0 | 2475.0 | 0.7604 | 1890.0 | 0.7636 | 931.0 | 931.0 | 1196.0 | 0.7784 | 0.7784 | 951.0 | 951.0 | 1267.0 | 0.7506 | 0.7506 |
| 0.0 | 14.0 | 112 | 1.3205 | 0.0059 | 4714.9289 | 3268.1397 | 1883.0 | 2475.0 | 0.7608 | 1891.0 | 0.7640 | 909.0 | 909.0 | 1196.0 | 0.7600 | 0.7600 | 974.0 | 974.0 | 1267.0 | 0.7687 | 0.7687 |
| 0.9048 | 15.0 | 120 | 1.3371 | 0.0059 | 4774.4603 | 3309.4037 | 1890.0 | 2475.0 | 0.7636 | 1898.0 | 0.7669 | 940.0 | 940.0 | 1196.0 | 0.7860 | 0.7860 | 950.0 | 950.0 | 1267.0 | 0.7498 | 0.7498 |
| 0.9048 | 16.0 | 128 | 1.3482 | 0.0059 | 4813.9970 | 3336.8084 | 1890.0 | 2475.0 | 0.7636 | 1898.0 | 0.7669 | 948.0 | 948.0 | 1196.0 | 0.7926 | 0.7926 | 942.0 | 942.0 | 1267.0 | 0.7435 | 0.7435 |
| 0.0 | 17.0 | 136 | 1.3548 | 0.0059 | 4837.6611 | 3353.2112 | 1889.0 | 2475.0 | 0.7632 | 1897.0 | 0.7665 | 955.0 | 955.0 | 1196.0 | 0.7985 | 0.7985 | 934.0 | 934.0 | 1267.0 | 0.7372 | 0.7372 |
| 0.9048 | 18.0 | 144 | 1.3576 | 0.0059 | 4847.5727 | 3360.0814 | 1896.0 | 2475.0 | 0.7661 | 1904.0 | 0.7693 | 961.0 | 961.0 | 1196.0 | 0.8035 | 0.8035 | 935.0 | 935.0 | 1267.0 | 0.7380 | 0.7380 |
| 0.0 | 19.0 | 152 | 1.3615 | 0.0059 | 4861.3027 | 3369.5983 | 1890.0 | 2475.0 | 0.7636 | 1898.0 | 0.7669 | 958.0 | 958.0 | 1196.0 | 0.8010 | 0.8010 | 932.0 | 932.0 | 1267.0 | 0.7356 | 0.7356 |
| 0.9048 | 20.0 | 160 | 1.3636 | 0.0059 | 4868.9856 | 3374.9237 | 1890.0 | 2475.0 | 0.7636 | 1898.0 | 0.7669 | 962.0 | 962.0 | 1196.0 | 0.8043 | 0.8043 | 928.0 | 928.0 | 1267.0 | 0.7324 | 0.7324 |
| 0.0 | 21.0 | 168 | 1.3645 | 0.0059 | 4872.2361 | 3377.1767 | 1892.0 | 2475.0 | 0.7644 | 1900.0 | 0.7677 | 963.0 | 963.0 | 1196.0 | 0.8052 | 0.8052 | 929.0 | 929.0 | 1267.0 | 0.7332 | 0.7332 |
| 0.0 | 22.0 | 176 | 1.3666 | 0.0059 | 4879.8524 | 3382.4559 | 1887.0 | 2475.0 | 0.7624 | 1895.0 | 0.7657 | 961.0 | 961.0 | 1196.0 | 0.8035 | 0.8035 | 926.0 | 926.0 | 1267.0 | 0.7309 | 0.7309 |
| 0.0 | 23.0 | 184 | 1.3683 | 0.0059 | 4885.5762 | 3386.4234 | 1890.0 | 2475.0 | 0.7636 | 1898.0 | 0.7669 | 967.0 | 967.0 | 1196.0 | 0.8085 | 0.8085 | 923.0 | 923.0 | 1267.0 | 0.7285 | 0.7285 |
| 0.0 | 24.0 | 192 | 1.3706 | 0.0059 | 4893.9476 | 3392.2260 | 1884.0 | 2475.0 | 0.7612 | 1892.0 | 0.7644 | 966.0 | 966.0 | 1196.0 | 0.8077 | 0.8077 | 918.0 | 918.0 | 1267.0 | 0.7245 | 0.7245 |
| 0.0 | 25.0 | 200 | 1.3715 | 0.0059 | 4897.1641 | 3394.4555 | 1894.0 | 2475.0 | 0.7653 | 1902.0 | 0.7685 | 969.0 | 969.0 | 1196.0 | 0.8102 | 0.8102 | 925.0 | 925.0 | 1267.0 | 0.7301 | 0.7301 |
| 0.0 | 26.0 | 208 | 1.3739 | 0.0059 | 4905.6882 | 3400.3640 | 1889.0 | 2475.0 | 0.7632 | 1897.0 | 0.7665 | 966.0 | 966.0 | 1196.0 | 0.8077 | 0.8077 | 923.0 | 923.0 | 1267.0 | 0.7285 | 0.7285 |
| 0.0 | 27.0 | 216 | 1.3761 | 0.0059 | 4913.4331 | 3405.7323 | 1885.0 | 2475.0 | 0.7616 | 1893.0 | 0.7648 | 967.0 | 967.0 | 1196.0 | 0.8085 | 0.8085 | 918.0 | 918.0 | 1267.0 | 0.7245 | 0.7245 |
| 0.0 | 28.0 | 224 | 1.3770 | 0.0059 | 4916.9617 | 3408.1781 | 1891.0 | 2475.0 | 0.7640 | 1899.0 | 0.7673 | 971.0 | 971.0 | 1196.0 | 0.8119 | 0.8119 | 920.0 | 920.0 | 1267.0 | 0.7261 | 0.7261 |
| 0.0 | 29.0 | 232 | 1.3771 | 0.0059 | 4917.0011 | 3408.2055 | 1887.0 | 2475.0 | 0.7624 | 1895.0 | 0.7657 | 970.0 | 970.0 | 1196.0 | 0.8110 | 0.8110 | 917.0 | 917.0 | 1267.0 | 0.7238 | 0.7238 |
| 0.0 | 30.0 | 240 | 1.3800 | 0.0059 | 4927.4709 | 3415.4626 | 1892.0 | 2475.0 | 0.7644 | 1900.0 | 0.7677 | 975.0 | 975.0 | 1196.0 | 0.8152 | 0.8152 | 917.0 | 917.0 | 1267.0 | 0.7238 | 0.7238 |
| 0.0 | 31.0 | 248 | 1.3791 | 0.0059 | 4924.1567 | 3413.1653 | 1892.0 | 2475.0 | 0.7644 | 1900.0 | 0.7677 | 972.0 | 972.0 | 1196.0 | 0.8127 | 0.8127 | 920.0 | 920.0 | 1267.0 | 0.7261 | 0.7261 |
| 0.0 | 32.0 | 256 | 1.3816 | 0.0059 | 4933.1472 | 3419.3971 | 1889.0 | 2475.0 | 0.7632 | 1897.0 | 0.7665 | 973.0 | 973.0 | 1196.0 | 0.8135 | 0.8135 | 916.0 | 916.0 | 1267.0 | 0.7230 | 0.7230 |
| 0.0 | 33.0 | 264 | 1.3827 | 0.0059 | 4937.3087 | 3422.2816 | 1888.0 | 2475.0 | 0.7628 | 1896.0 | 0.7661 | 975.0 | 975.0 | 1196.0 | 0.8152 | 0.8152 | 913.0 | 913.0 | 1267.0 | 0.7206 | 0.7206 |
| 0.0 | 34.0 | 272 | 1.3832 | 0.0059 | 4938.8121 | 3423.3237 | 1889.0 | 2475.0 | 0.7632 | 1897.0 | 0.7665 | 974.0 | 974.0 | 1196.0 | 0.8144 | 0.8144 | 915.0 | 915.0 | 1267.0 | 0.7222 | 0.7222 |
| 0.0 | 35.0 | 280 | 1.3851 | 0.0059 | 4945.8881 | 3428.2284 | 1891.0 | 2475.0 | 0.7640 | 1899.0 | 0.7673 | 978.0 | 978.0 | 1196.0 | 0.8177 | 0.8177 | 913.0 | 913.0 | 1267.0 | 0.7206 | 0.7206 |
| 0.0 | 36.0 | 288 | 1.3842 | 0.0059 | 4942.6233 | 3425.9654 | 1892.0 | 2475.0 | 0.7644 | 1900.0 | 0.7677 | 977.0 | 977.0 | 1196.0 | 0.8169 | 0.8169 | 915.0 | 915.0 | 1267.0 | 0.7222 | 0.7222 |
| 0.0 | 37.0 | 296 | 1.3861 | 0.0059 | 4949.1341 | 3430.4784 | 1886.0 | 2475.0 | 0.7620 | 1894.0 | 0.7653 | 975.0 | 975.0 | 1196.0 | 0.8152 | 0.8152 | 911.0 | 911.0 | 1267.0 | 0.7190 | 0.7190 |
| 0.0 | 38.0 | 304 | 1.3866 | 0.0059 | 4951.1992 | 3431.9098 | 1891.0 | 2475.0 | 0.7640 | 1899.0 | 0.7673 | 976.0 | 976.0 | 1196.0 | 0.8161 | 0.8161 | 915.0 | 915.0 | 1267.0 | 0.7222 | 0.7222 |
| 0.0 | 39.0 | 312 | 1.3877 | 0.0059 | 4954.8532 | 3434.4425 | 1891.0 | 2475.0 | 0.7640 | 1899.0 | 0.7673 | 979.0 | 979.0 | 1196.0 | 0.8186 | 0.8186 | 912.0 | 912.0 | 1267.0 | 0.7198 | 0.7198 |
| 0.9048 | 40.0 | 320 | 1.3879 | 0.0059 | 4955.7900 | 3435.0919 | 1891.0 | 2475.0 | 0.7640 | 1899.0 | 0.7673 | 978.0 | 978.0 | 1196.0 | 0.8177 | 0.8177 | 913.0 | 913.0 | 1267.0 | 0.7206 | 0.7206 |
| 0.0 | 41.0 | 328 | 1.3896 | 0.0059 | 4961.8652 | 3439.3029 | 1891.0 | 2475.0 | 0.7640 | 1899.0 | 0.7673 | 978.0 | 978.0 | 1196.0 | 0.8177 | 0.8177 | 913.0 | 913.0 | 1267.0 | 0.7206 | 0.7206 |
| 0.0 | 42.0 | 336 | 1.3883 | 0.0059 | 4957.0459 | 3435.9624 | 1891.0 | 2475.0 | 0.7640 | 1899.0 | 0.7673 | 980.0 | 980.0 | 1196.0 | 0.8194 | 0.8194 | 911.0 | 911.0 | 1267.0 | 0.7190 | 0.7190 |
| 0.0 | 43.0 | 344 | 1.3898 | 0.0059 | 4962.4755 | 3439.7259 | 1889.0 | 2475.0 | 0.7632 | 1897.0 | 0.7665 | 979.0 | 979.0 | 1196.0 | 0.8186 | 0.8186 | 910.0 | 910.0 | 1267.0 | 0.7182 | 0.7182 |
| 0.0 | 44.0 | 352 | 1.3905 | 0.0059 | 4965.0345 | 3441.4997 | 1889.0 | 2475.0 | 0.7632 | 1897.0 | 0.7665 | 980.0 | 980.0 | 1196.0 | 0.8194 | 0.8194 | 909.0 | 909.0 | 1267.0 | 0.7174 | 0.7174 |
| 0.0 | 45.0 | 360 | 1.3905 | 0.0059 | 4964.9209 | 3441.4209 | 1891.0 | 2475.0 | 0.7640 | 1899.0 | 0.7673 | 981.0 | 981.0 | 1196.0 | 0.8202 | 0.8202 | 910.0 | 910.0 | 1267.0 | 0.7182 | 0.7182 |
| 0.0 | 46.0 | 368 | 1.3915 | 0.0059 | 4968.4294 | 3443.8529 | 1891.0 | 2475.0 | 0.7640 | 1899.0 | 0.7673 | 980.0 | 980.0 | 1196.0 | 0.8194 | 0.8194 | 911.0 | 911.0 | 1267.0 | 0.7190 | 0.7190 |
| 0.0 | 47.0 | 376 | 1.3918 | 0.0059 | 4969.5671 | 3444.6414 | 1889.0 | 2475.0 | 0.7632 | 1897.0 | 0.7665 | 981.0 | 981.0 | 1196.0 | 0.8202 | 0.8202 | 908.0 | 908.0 | 1267.0 | 0.7167 | 0.7167 |
| 0.0 | 48.0 | 384 | 1.3916 | 0.0059 | 4968.9056 | 3444.1829 | 1890.0 | 2475.0 | 0.7636 | 1897.0 | 0.7665 | 980.0 | 981.0 | 1196.0 | 0.8202 | 0.8194 | 909.0 | 909.0 | 1267.0 | 0.7174 | 0.7174 |
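
The per-label columns are keyed by token id (34192 and 41568), presumably the vocabulary ids of the two answer classes of the binary task. Assuming the checkpoint keeps the base Llama-3.2-1B tokenizer, the ids can be decoded to see which strings they correspond to:

```python
from transformers import AutoTokenizer

# Assumption: the fine-tuned model reuses the base model's tokenizer unchanged.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
for token_id in (34192, 41568):
    print(token_id, repr(tokenizer.decode([token_id])))
```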

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1