GSM8K-Binary_Llama-3.2-1B-kyohy420

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set; a short sketch after the list shows how the headline accuracies follow from the raw counts:

  • Loss: 1.7333
  • Model Preparation Time: 0.0057
  • Mdl: 6189.0935
  • Accumulated Loss: 4289.9527
  • Correct Preds: 1966.0
  • Total Preds: 2475.0
  • Accuracy: 0.7943
  • Correct Gen Preds: 1919.0
  • Gen Accuracy: 0.7754
  • Correct Gen Preds 34192: 1033.0
  • Correct Preds 34192: 1051.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.8788
  • Gen Accuracy 34192: 0.8637
  • Correct Gen Preds 41568: 879.0
  • Correct Preds 41568: 915.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.7222
  • Gen Accuracy 41568: 0.6938
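
As a quick sanity check, the headline numbers above are plain ratios of correct predictions to totals. The sketch below reproduces them; the values are copied from the list above, the variable names are mine, and 34192/41568 simply denote the two label groups the card reports separately.

```python
# Reproduce the reported accuracies from the raw counts in the evaluation list.
correct_preds, total_preds = 1966, 2475
correct_gen_preds = 1919

accuracy = correct_preds / total_preds          # 1966 / 2475 = 0.7943
gen_accuracy = correct_gen_preds / total_preds  # 1919 / 2475 = 0.7754

# Per-label-group breakdown (correct preds, total labels), as reported in the card.
per_group = {
    34192: (1051, 1196),  # -> 0.8788
    41568: (915, 1267),   # -> 0.7222
}
for group, (correct, total) in per_group.items():
    print(f"group {group}: accuracy = {correct / total:.4f}")

print(f"overall: accuracy = {accuracy:.4f}, gen accuracy = {gen_accuracy:.4f}")
```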

Model description

More information needed

Intended uses & limitations

More information needed
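
The card leaves this section empty. Purely as an illustration, a fine-tuned Llama-3.2-1B causal LM like this one can usually be loaded with the standard transformers API. The sketch below is not an official usage example: the Hub id is taken from the model page, the prompt is a made-up placeholder (the expected input format is not documented), and BF16 matches the stored tensor type.

```python
# Hedged loading sketch; not the author's documented usage.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/GSM8K-Binary_Llama-3.2-1B-kyohy420"  # Hub id from the model page
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

# Placeholder prompt: the card does not document the expected input format.
prompt = "Question: Natalia sold 48 clips in April and half as many in May. Did she sell more than 70 clips in total? Answer:"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```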

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a TrainingArguments sketch that mirrors them follows the list:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.001
  • num_epochs: 100
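
A minimal sketch of how these settings map onto transformers' TrainingArguments, assuming a single device and no gradient accumulation; output_dir is a placeholder, and this is a reconstruction, not the author's training script:

```python
# Approximate mapping of the listed hyperparameters to TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gsm8k-binary-llama-3.2-1b",  # placeholder; not specified in the card
    learning_rate=2e-5,
    per_device_train_batch_size=32,   # assumes train_batch_size is per device
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",     # note: a constant schedule ignores warmup_ratio
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```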

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0057 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.5871 | 1.0 | 41 | 0.5846 | 0.0057 | 2087.4184 | 1446.8881 | 1815.0 | 2475.0 | 0.7333 | 8.0 | 0.0032 | 0.0 | 796.0 | 1196.0 | 0.6656 | 0.0 | 0.0 | 1019.0 | 1267.0 | 0.8043 | 0.0 |
| 0.3259 | 2.0 | 82 | 0.5427 | 0.0057 | 1937.7495 | 1343.1456 | 1911.0 | 2475.0 | 0.7721 | 8.0 | 0.0032 | 1.0 | 957.0 | 1196.0 | 0.8002 | 0.0008 | 0.0 | 954.0 | 1267.0 | 0.7530 | 0.0 |
| 0.1383 | 3.0 | 123 | 0.6677 | 0.0057 | 2384.2568 | 1652.6409 | 1892.0 | 2475.0 | 0.7644 | 87.0 | 0.0352 | 19.0 | 845.0 | 1196.0 | 0.7065 | 0.0159 | 60.0 | 1047.0 | 1267.0 | 0.8264 | 0.0474 |
| 0.5615 | 4.0 | 164 | 0.9346 | 0.0057 | 3337.0389 | 2313.0591 | 1909.0 | 2475.0 | 0.7713 | 1042.0 | 0.4210 | 614.0 | 1035.0 | 1196.0 | 0.8654 | 0.5134 | 420.0 | 874.0 | 1267.0 | 0.6898 | 0.3315 |
| 0.0373 | 5.0 | 205 | 0.9718 | 0.0057 | 3470.0702 | 2405.2694 | 1938.0 | 2475.0 | 0.7830 | 1048.0 | 0.4234 | 528.0 | 975.0 | 1196.0 | 0.8152 | 0.4415 | 514.0 | 963.0 | 1267.0 | 0.7601 | 0.4057 |
| 0.0417 | 6.0 | 246 | 1.2990 | 0.0057 | 4638.2735 | 3215.0062 | 1895.0 | 2475.0 | 0.7657 | 1372.0 | 0.5543 | 553.0 | 873.0 | 1196.0 | 0.7299 | 0.4624 | 812.0 | 1022.0 | 1267.0 | 0.8066 | 0.6409 |
| 0.006 | 7.0 | 287 | 1.6147 | 0.0057 | 5765.6029 | 3996.4114 | 1942.0 | 2475.0 | 0.7846 | 1793.0 | 0.7244 | 962.0 | 1013.0 | 1196.0 | 0.8470 | 0.8043 | 823.0 | 929.0 | 1267.0 | 0.7332 | 0.6496 |
| 0.0 | 8.0 | 328 | 1.6999 | 0.0057 | 6069.9536 | 4207.3712 | 1965.0 | 2475.0 | 0.7939 | 1893.0 | 0.7648 | 1030.0 | 1052.0 | 1196.0 | 0.8796 | 0.8612 | 856.0 | 913.0 | 1267.0 | 0.7206 | 0.6756 |
| 0.0 | 9.0 | 369 | 1.7440 | 0.0057 | 6227.1358 | 4316.3216 | 1964.0 | 2475.0 | 0.7935 | 1913.0 | 0.7729 | 1037.0 | 1053.0 | 1196.0 | 0.8804 | 0.8671 | 869.0 | 911.0 | 1267.0 | 0.7190 | 0.6859 |
| 0.0 | 10.0 | 410 | 1.7394 | 0.0057 | 6210.7145 | 4304.9393 | 1963.0 | 2475.0 | 0.7931 | 1915.0 | 0.7737 | 1035.0 | 1052.0 | 1196.0 | 0.8796 | 0.8654 | 873.0 | 911.0 | 1267.0 | 0.7190 | 0.6890 |
| 0.0 | 11.0 | 451 | 1.7371 | 0.0057 | 6202.5019 | 4299.2467 | 1963.0 | 2475.0 | 0.7931 | 1914.0 | 0.7733 | 1033.0 | 1050.0 | 1196.0 | 0.8779 | 0.8637 | 874.0 | 913.0 | 1267.0 | 0.7206 | 0.6898 |
| 0.0 | 12.0 | 492 | 1.7354 | 0.0057 | 6196.6385 | 4295.1825 | 1964.0 | 2475.0 | 0.7935 | 1915.0 | 0.7737 | 1034.0 | 1051.0 | 1196.0 | 0.8788 | 0.8645 | 874.0 | 913.0 | 1267.0 | 0.7206 | 0.6898 |
| 0.0 | 13.0 | 533 | 1.7333 | 0.0057 | 6189.0935 | 4289.9527 | 1966.0 | 2475.0 | 0.7943 | 1919.0 | 0.7754 | 1033.0 | 1051.0 | 1196.0 | 0.8788 | 0.8637 | 879.0 | 915.0 | 1267.0 | 0.7222 | 0.6938 |
| 0.0 | 14.0 | 574 | 1.7293 | 0.0057 | 6174.5892 | 4279.8991 | 1962.0 | 2475.0 | 0.7927 | 1917.0 | 0.7745 | 1031.0 | 1047.0 | 1196.0 | 0.8754 | 0.8620 | 879.0 | 915.0 | 1267.0 | 0.7222 | 0.6938 |
| 0.0 | 15.0 | 615 | 1.7313 | 0.0057 | 6182.0153 | 4285.0465 | 1962.0 | 2475.0 | 0.7927 | 1919.0 | 0.7754 | 1030.0 | 1047.0 | 1196.0 | 0.8754 | 0.8612 | 882.0 | 915.0 | 1267.0 | 0.7222 | 0.6961 |
| 0.0 | 16.0 | 656 | 1.7293 | 0.0057 | 6174.8938 | 4280.1102 | 1964.0 | 2475.0 | 0.7935 | 1919.0 | 0.7754 | 1031.0 | 1048.0 | 1196.0 | 0.8763 | 0.8620 | 881.0 | 916.0 | 1267.0 | 0.7230 | 0.6953 |
| 0.0 | 17.0 | 697 | 1.7291 | 0.0057 | 6174.0165 | 4279.5021 | 1964.0 | 2475.0 | 0.7935 | 1920.0 | 0.7758 | 1032.0 | 1049.0 | 1196.0 | 0.8771 | 0.8629 | 881.0 | 915.0 | 1267.0 | 0.7222 | 0.6953 |
| 0.0 | 18.0 | 738 | 1.7284 | 0.0057 | 6171.6047 | 4277.8304 | 1964.0 | 2475.0 | 0.7935 | 1920.0 | 0.7758 | 1031.0 | 1048.0 | 1196.0 | 0.8763 | 0.8620 | 882.0 | 916.0 | 1267.0 | 0.7230 | 0.6961 |
| 0.0 | 19.0 | 779 | 1.7288 | 0.0057 | 6172.8248 | 4278.6761 | 1965.0 | 2475.0 | 0.7939 | 1922.0 | 0.7766 | 1030.0 | 1047.0 | 1196.0 | 0.8754 | 0.8612 | 885.0 | 918.0 | 1267.0 | 0.7245 | 0.6985 |
| 0.0 | 20.0 | 820 | 1.7275 | 0.0057 | 6168.3684 | 4275.5871 | 1963.0 | 2475.0 | 0.7931 | 1919.0 | 0.7754 | 1030.0 | 1047.0 | 1196.0 | 0.8754 | 0.8612 | 882.0 | 916.0 | 1267.0 | 0.7230 | 0.6961 |
| 0.0 | 21.0 | 861 | 1.7275 | 0.0057 | 6168.3839 | 4275.5979 | 1964.0 | 2475.0 | 0.7935 | 1923.0 | 0.7770 | 1031.0 | 1047.0 | 1196.0 | 0.8754 | 0.8620 | 885.0 | 917.0 | 1267.0 | 0.7238 | 0.6985 |
| 0.0 | 22.0 | 902 | 1.7271 | 0.0057 | 6166.9583 | 4274.6097 | 1964.0 | 2475.0 | 0.7935 | 1924.0 | 0.7774 | 1033.0 | 1047.0 | 1196.0 | 0.8754 | 0.8637 | 884.0 | 917.0 | 1267.0 | 0.7238 | 0.6977 |
| 0.4356 | 23.0 | 943 | 1.7266 | 0.0057 | 6165.0094 | 4273.2589 | 1966.0 | 2475.0 | 0.7943 | 1923.0 | 0.7770 | 1030.0 | 1047.0 | 1196.0 | 0.8754 | 0.8612 | 886.0 | 919.0 | 1267.0 | 0.7253 | 0.6993 |
| 0.8713 | 24.0 | 984 | 1.7272 | 0.0057 | 6167.4313 | 4274.9376 | 1965.0 | 2475.0 | 0.7939 | 1926.0 | 0.7782 | 1032.0 | 1047.0 | 1196.0 | 0.8754 | 0.8629 | 887.0 | 918.0 | 1267.0 | 0.7245 | 0.7001 |
| 0.4356 | 25.0 | 1025 | 1.7285 | 0.0057 | 6171.7803 | 4277.9521 | 1963.0 | 2475.0 | 0.7931 | 1921.0 | 0.7762 | 1030.0 | 1047.0 | 1196.0 | 0.8754 | 0.8612 | 884.0 | 916.0 | 1267.0 | 0.7230 | 0.6977 |

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
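
The listed versions can be checked against the local environment with a small convenience sketch; the "+cu124" build suffix of PyTorch is allowed to differ.

```python
# Compare installed package versions against those listed above.
import importlib.metadata as md

expected = {
    "transformers": "4.51.3",
    "torch": "2.6.0",       # card lists 2.6.0+cu124; local build suffix may differ
    "datasets": "3.5.0",
    "tokenizers": "0.21.1",
}
for pkg, want in expected.items():
    have = md.version(pkg)
    status = "OK" if have.startswith(want) else "MISMATCH"
    print(f"{pkg}: installed {have}, card lists {want} -> {status}")
```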