GSM8K-Binary_Llama-3.2-1B-n45gfm9o

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unspecified dataset; the model name suggests a binary-answer variant of GSM8K. It achieves the following results on the evaluation set (metrics suffixed with 34192 and 41568 appear to be per-class breakdowns keyed by label token ID; a consistency check follows the list):

  • Loss: 1.4701
  • Model Preparation Time: 0.0059
  • Mdl: 5249.0917
  • Accumulated Loss: 3638.3931
  • Correct Preds: 1954.0
  • Total Preds: 2475.0
  • Accuracy: 0.7895
  • Correct Gen Preds: 1866.0
  • Gen Accuracy: 0.7539
  • Correct Gen Preds 34192: 949.0
  • Correct Preds 34192: 1014.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.8478
  • Gen Accuracy 34192: 0.7935
  • Correct Gen Preds 41568: 908.0
  • Correct Preds 41568: 940.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.7419
  • Gen Accuracy 41568: 0.7167
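
The headline numbers are internally consistent: accuracy is correct predictions over total predictions, the accumulated loss is (to rounding) the mean eval loss times the number of predictions, and Mdl appears to be that same accumulated loss expressed in bits. A minimal sketch of the arithmetic; reading Mdl as nats converted to bits is an inference from the numbers, not something the card documents:

```python
import math

# Headline eval numbers copied from the list above.
loss, mdl, acc_loss = 1.4701, 5249.0917, 3638.3931
correct, total = 1954.0, 2475.0

print(correct / total)         # 0.78949... -> matches Accuracy: 0.7895
print(acc_loss / total)        # 1.47005... -> matches Loss: 1.4701
print(acc_loss / math.log(2))  # 5249.09...  -> matches Mdl: 5249.0917 (nats -> bits)
```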

Model description

More information needed

Intended uses & limitations

More information needed
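
No usage guidance is given. Below is a minimal, hypothetical inference sketch with transformers, assuming the Hub repo id donoway/GSM8K-Binary_Llama-3.2-1B-n45gfm9o and that the model answers with a short generated completion; the prompt format actually used in training is not documented here:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "donoway/GSM8K-Binary_Llama-3.2-1B-n45gfm9o"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)

# Placeholder prompt; the real training prompt format is unknown.
prompt = "Question: ...\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:]))
```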

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.001
  • num_epochs: 100
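
For reference, a sketch of the equivalent transformers TrainingArguments; field names follow the transformers API, and anything not listed above (such as output_dir) is a placeholder:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="gsm8k-binary-llama-3.2-1b",  # placeholder, not from the card
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",           # AdamW; betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="constant",  # note: a plain constant schedule ignores warmup_ratio
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```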

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0059 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.774 | 1.0 | 36 | 0.7619 | 0.0059 | 2720.3405 | 1885.5964 | 1654.0 | 2475.0 | 0.6683 | 8.0 | 0.0032 | 0.0 | 565.0 | 1196.0 | 0.4724 | 0.0 | 0.0 | 1089.0 | 1267.0 | 0.8595 | 0.0 |
| 0.558 | 2.0 | 72 | 0.7026 | 0.0059 | 2508.8193 | 1738.9811 | 1704.0 | 2475.0 | 0.6885 | 8.0 | 0.0032 | 0.0 | 527.0 | 1196.0 | 0.4406 | 0.0 | 0.0 | 1177.0 | 1267.0 | 0.9290 | 0.0 |
| 0.3275 | 3.0 | 108 | 0.7695 | 0.0059 | 2747.8047 | 1904.6331 | 1687.0 | 2475.0 | 0.6816 | 16.0 | 0.0065 | 0.0 | 533.0 | 1196.0 | 0.4457 | 0.0 | 8.0 | 1154.0 | 1267.0 | 0.9108 | 0.0063 |
| 0.2039 | 4.0 | 144 | 0.6071 | 0.0059 | 2167.6033 | 1502.4681 | 1932.0 | 2475.0 | 0.7806 | 71.0 | 0.0287 | 2.0 | 916.0 | 1196.0 | 0.7659 | 0.0017 | 61.0 | 1016.0 | 1267.0 | 0.8019 | 0.0481 |
| 0.2072 | 5.0 | 180 | 1.0169 | 0.0059 | 3630.9726 | 2516.7984 | 1924.0 | 2475.0 | 0.7774 | 1267.0 | 0.5119 | 597.0 | 1044.0 | 1196.0 | 0.8729 | 0.4992 | 661.0 | 880.0 | 1267.0 | 0.6946 | 0.5217 |
| 0.066 | 6.0 | 216 | 1.0546 | 0.0059 | 3765.5184 | 2610.0584 | 1893.0 | 2475.0 | 0.7648 | 1017.0 | 0.4109 | 500.0 | 1079.0 | 1196.0 | 0.9022 | 0.4181 | 509.0 | 814.0 | 1267.0 | 0.6425 | 0.4017 |
| 0.0004 | 7.0 | 252 | 1.2220 | 0.0059 | 4363.2817 | 3024.3964 | 1906.0 | 2475.0 | 0.7701 | 1454.0 | 0.5875 | 544.0 | 850.0 | 1196.0 | 0.7107 | 0.4548 | 902.0 | 1056.0 | 1267.0 | 0.8335 | 0.7119 |
| 0.0021 | 8.0 | 288 | 1.7093 | 0.0059 | 6103.2758 | 4230.4684 | 1892.0 | 2475.0 | 0.7644 | 1727.0 | 0.6978 | 1018.0 | 1103.0 | 1196.0 | 0.9222 | 0.8512 | 701.0 | 789.0 | 1267.0 | 0.6227 | 0.5533 |
| 0.0002 | 9.0 | 324 | 1.5981 | 0.0059 | 5706.2934 | 3955.3012 | 1900.0 | 2475.0 | 0.7677 | 1761.0 | 0.7115 | 1001.0 | 1078.0 | 1196.0 | 0.9013 | 0.8370 | 752.0 | 822.0 | 1267.0 | 0.6488 | 0.5935 |
| 0.0001 | 10.0 | 360 | 1.4701 | 0.0059 | 5249.0917 | 3638.3931 | 1954.0 | 2475.0 | 0.7895 | 1866.0 | 0.7539 | 949.0 | 1014.0 | 1196.0 | 0.8478 | 0.7935 | 908.0 | 940.0 | 1267.0 | 0.7419 | 0.7167 |
| 0.0 | 11.0 | 396 | 1.4879 | 0.0059 | 5312.7263 | 3682.5013 | 1950.0 | 2475.0 | 0.7879 | 1869.0 | 0.7552 | 958.0 | 1020.0 | 1196.0 | 0.8528 | 0.8010 | 903.0 | 930.0 | 1267.0 | 0.7340 | 0.7127 |
| 0.0 | 12.0 | 432 | 1.4948 | 0.0059 | 5337.3050 | 3699.5379 | 1948.0 | 2475.0 | 0.7871 | 1867.0 | 0.7543 | 960.0 | 1022.0 | 1196.0 | 0.8545 | 0.8027 | 898.0 | 926.0 | 1267.0 | 0.7309 | 0.7088 |
| 0.0 | 13.0 | 468 | 1.5004 | 0.0059 | 5357.5988 | 3713.6045 | 1946.0 | 2475.0 | 0.7863 | 1866.0 | 0.7539 | 961.0 | 1024.0 | 1196.0 | 0.8562 | 0.8035 | 896.0 | 922.0 | 1267.0 | 0.7277 | 0.7072 |
| 0.7841 | 14.0 | 504 | 1.5063 | 0.0059 | 5378.5450 | 3728.1233 | 1948.0 | 2475.0 | 0.7871 | 1871.0 | 0.7560 | 966.0 | 1026.0 | 1196.0 | 0.8579 | 0.8077 | 896.0 | 922.0 | 1267.0 | 0.7277 | 0.7072 |
| 0.0 | 15.0 | 540 | 1.5092 | 0.0059 | 5388.9284 | 3735.3205 | 1945.0 | 2475.0 | 0.7859 | 1871.0 | 0.7560 | 970.0 | 1026.0 | 1196.0 | 0.8579 | 0.8110 | 893.0 | 919.0 | 1267.0 | 0.7253 | 0.7048 |
| 0.7841 | 16.0 | 576 | 1.5127 | 0.0059 | 5401.2640 | 3743.8709 | 1944.0 | 2475.0 | 0.7855 | 1869.0 | 0.7552 | 968.0 | 1025.0 | 1196.0 | 0.8570 | 0.8094 | 892.0 | 919.0 | 1267.0 | 0.7253 | 0.7040 |
| 0.0 | 17.0 | 612 | 1.5144 | 0.0059 | 5407.4933 | 3748.1887 | 1943.0 | 2475.0 | 0.7851 | 1875.0 | 0.7576 | 973.0 | 1026.0 | 1196.0 | 0.8579 | 0.8135 | 893.0 | 917.0 | 1267.0 | 0.7238 | 0.7048 |
| 0.0 | 18.0 | 648 | 1.5177 | 0.0059 | 5419.3768 | 3756.4257 | 1947.0 | 2475.0 | 0.7867 | 1872.0 | 0.7564 | 972.0 | 1029.0 | 1196.0 | 0.8604 | 0.8127 | 892.0 | 918.0 | 1267.0 | 0.7245 | 0.7040 |
| 0.0 | 19.0 | 684 | 1.5195 | 0.0059 | 5425.7845 | 3760.8672 | 1945.0 | 2475.0 | 0.7859 | 1877.0 | 0.7584 | 976.0 | 1028.0 | 1196.0 | 0.8595 | 0.8161 | 893.0 | 917.0 | 1267.0 | 0.7238 | 0.7048 |
| 0.0 | 20.0 | 720 | 1.5238 | 0.0059 | 5441.1232 | 3771.4992 | 1944.0 | 2475.0 | 0.7855 | 1875.0 | 0.7576 | 972.0 | 1027.0 | 1196.0 | 0.8587 | 0.8127 | 895.0 | 917.0 | 1267.0 | 0.7238 | 0.7064 |
| 0.0 | 21.0 | 756 | 1.5265 | 0.0059 | 5450.7090 | 3778.1436 | 1948.0 | 2475.0 | 0.7871 | 1881.0 | 0.76 | 979.0 | 1031.0 | 1196.0 | 0.8620 | 0.8186 | 893.0 | 917.0 | 1267.0 | 0.7238 | 0.7048 |
| 0.0 | 22.0 | 792 | 1.5266 | 0.0059 | 5450.9628 | 3778.3195 | 1944.0 | 2475.0 | 0.7855 | 1874.0 | 0.7572 | 974.0 | 1029.0 | 1196.0 | 0.8604 | 0.8144 | 891.0 | 915.0 | 1267.0 | 0.7222 | 0.7032 |
| 0.0 | 23.0 | 828 | 1.5306 | 0.0059 | 5465.2671 | 3788.2345 | 1947.0 | 2475.0 | 0.7867 | 1877.0 | 0.7584 | 979.0 | 1032.0 | 1196.0 | 0.8629 | 0.8186 | 890.0 | 915.0 | 1267.0 | 0.7222 | 0.7024 |
| 0.0 | 24.0 | 864 | 1.5319 | 0.0059 | 5469.8094 | 3791.3829 | 1950.0 | 2475.0 | 0.7879 | 1882.0 | 0.7604 | 981.0 | 1033.0 | 1196.0 | 0.8637 | 0.8202 | 893.0 | 917.0 | 1267.0 | 0.7238 | 0.7048 |
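
The headline metrics at the top of this card match the epoch 10 row (step 360), which has the best overall accuracy (0.7895); the best validation loss (0.6071) occurs earlier, at epoch 4, so the reported checkpoint was presumably selected by accuracy rather than loss. Although num_epochs was set to 100, the logged results end at epoch 24.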

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1