GSM8K-Binary_Llama-3.2-1B-trn9haqb

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3861
  • Model Preparation Time: 0.0059
  • Mdl: 4949.1510
  • Accumulated Loss: 3430.4901
  • Correct Preds: 1952.0
  • Total Preds: 2475.0
  • Accuracy: 0.7887
  • Correct Gen Preds: 1959.0
  • Gen Accuracy: 0.7915
  • Correct Gen Preds 34192: 979.0
  • Correct Preds 34192: 980.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.8194
  • Gen Accuracy 34192: 0.8186
  • Correct Gen Preds 41568: 971.0
  • Correct Preds 41568: 972.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.7672
  • Gen Accuracy 41568: 0.7664
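
A quick sanity check of how the derived metrics above relate to the raw counts. This is a sketch under two assumptions not stated in the card: that Accuracy = Correct Preds / Total Preds, and that Mdl is the accumulated cross-entropy loss converted from nats to bits (the 34192/41568 suffixes appear to be the token IDs of the two answer labels):

```python
import math

# Headline evaluation metrics copied from the list above.
correct_preds, total_preds = 1952, 2475
correct_gen_preds = 1959
accumulated_loss_nats = 3430.4901  # "Accumulated Loss", summed cross-entropy in nats

accuracy = correct_preds / total_preds
gen_accuracy = correct_gen_preds / total_preds
mdl_bits = accumulated_loss_nats / math.log(2)  # nats -> bits

print(round(accuracy, 4))      # 0.7887, matches "Accuracy"
print(round(gen_accuracy, 4))  # 0.7915, matches "Gen Accuracy"
print(round(mdl_bits, 2))      # ~4949.15, matches the reported "Mdl"
```

The same arithmetic reproduces the per-label numbers (e.g. 980 / 1196 ≈ 0.8194 for label 34192).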

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.001
  • num_epochs: 100
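
Assuming the standard Hugging Face `Trainer` was used (the card does not say so explicitly), the hyperparameters above correspond roughly to the following `TrainingArguments` sketch. The `output_dir` value is a placeholder, and any argument not listed above is left at its default:

```python
from transformers import TrainingArguments

# Sketch only: reconstructs the reported settings; output_dir and all
# unreported arguments are assumptions, not taken from the model card.
training_args = TrainingArguments(
    output_dir="GSM8K-Binary_Llama-3.2-1B-trn9haqb",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```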

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0059 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.5562 | 1.0 | 33 | 0.5791 | 0.0059 | 2067.7129 | 1433.2294 | 1800.0 | 2475.0 | 0.7273 | 136.0 | 0.0549 | 0.0 | 1001.0 | 1196.0 | 0.8370 | 0.0 | 129.0 | 799.0 | 1267.0 | 0.6306 | 0.1018 |
| 0.2241 | 2.0 | 66 | 0.6917 | 0.0059 | 2469.9161 | 1712.0154 | 1750.0 | 2475.0 | 0.7071 | 25.0 | 0.0101 | 2.0 | 1150.0 | 1196.0 | 0.9615 | 0.0017 | 16.0 | 600.0 | 1267.0 | 0.4736 | 0.0126 |
| 0.2445 | 3.0 | 99 | 0.5807 | 0.0059 | 2073.3593 | 1437.1432 | 1901.0 | 2475.0 | 0.7681 | 672.0 | 0.2715 | 56.0 | 817.0 | 1196.0 | 0.6831 | 0.0468 | 608.0 | 1084.0 | 1267.0 | 0.8556 | 0.4799 |
| 0.3792 | 4.0 | 132 | 1.0107 | 0.0059 | 3608.9732 | 2501.5496 | 1808.0 | 2475.0 | 0.7305 | 752.0 | 0.3038 | 361.0 | 1124.0 | 1196.0 | 0.9398 | 0.3018 | 383.0 | 684.0 | 1267.0 | 0.5399 | 0.3023 |
| 0.5311 | 5.0 | 165 | 1.0453 | 0.0059 | 3732.4069 | 2587.1073 | 1949.0 | 2475.0 | 0.7875 | 1925.0 | 0.7778 | 965.0 | 987.0 | 1196.0 | 0.8253 | 0.8069 | 951.0 | 962.0 | 1267.0 | 0.7593 | 0.7506 |
| 0.0003 | 6.0 | 198 | 1.1808 | 0.0059 | 4216.3306 | 2922.5376 | 1929.0 | 2475.0 | 0.7794 | 1889.0 | 0.7632 | 997.0 | 1034.0 | 1196.0 | 0.8645 | 0.8336 | 885.0 | 895.0 | 1267.0 | 0.7064 | 0.6985 |
| 0.4714 | 7.0 | 231 | 1.7950 | 0.0059 | 6409.2129 | 4442.5278 | 1910.0 | 2475.0 | 0.7717 | 1900.0 | 0.7677 | 1077.0 | 1092.0 | 1196.0 | 0.9130 | 0.9005 | 815.0 | 818.0 | 1267.0 | 0.6456 | 0.6433 |
| 0.0002 | 8.0 | 264 | 1.3861 | 0.0059 | 4949.1510 | 3430.4901 | 1952.0 | 2475.0 | 0.7887 | 1959.0 | 0.7915 | 979.0 | 980.0 | 1196.0 | 0.8194 | 0.8186 | 971.0 | 972.0 | 1267.0 | 0.7672 | 0.7664 |
| 0.0001 | 9.0 | 297 | 1.8078 | 0.0059 | 6455.1510 | 4474.3697 | 1889.0 | 2475.0 | 0.7632 | 1895.0 | 0.7657 | 1088.0 | 1089.0 | 1196.0 | 0.9105 | 0.9097 | 799.0 | 800.0 | 1267.0 | 0.6314 | 0.6306 |
| 0.0 | 10.0 | 330 | 1.6442 | 0.0059 | 5870.8161 | 4069.3396 | 1937.0 | 2475.0 | 0.7826 | 1944.0 | 0.7855 | 1059.0 | 1059.0 | 1196.0 | 0.8855 | 0.8855 | 877.0 | 878.0 | 1267.0 | 0.6930 | 0.6922 |
| 0.0 | 11.0 | 363 | 1.6431 | 0.0059 | 5866.8306 | 4066.5771 | 1938.0 | 2475.0 | 0.7830 | 1946.0 | 0.7863 | 1058.0 | 1058.0 | 1196.0 | 0.8846 | 0.8846 | 880.0 | 880.0 | 1267.0 | 0.6946 | 0.6946 |
| 0.0 | 12.0 | 396 | 1.6410 | 0.0059 | 5859.5168 | 4061.5076 | 1934.0 | 2475.0 | 0.7814 | 1941.0 | 0.7842 | 1055.0 | 1055.0 | 1196.0 | 0.8821 | 0.8821 | 878.0 | 879.0 | 1267.0 | 0.6938 | 0.6930 |
| 0.0 | 13.0 | 429 | 1.6420 | 0.0059 | 5863.0062 | 4063.9262 | 1935.0 | 2475.0 | 0.7818 | 1943.0 | 0.7851 | 1056.0 | 1056.0 | 1196.0 | 0.8829 | 0.8829 | 879.0 | 879.0 | 1267.0 | 0.6938 | 0.6938 |
| 0.0 | 14.0 | 462 | 1.6393 | 0.0059 | 5853.5075 | 4057.3422 | 1936.0 | 2475.0 | 0.7822 | 1944.0 | 0.7855 | 1055.0 | 1055.0 | 1196.0 | 0.8821 | 0.8821 | 881.0 | 881.0 | 1267.0 | 0.6953 | 0.6953 |
| 0.4705 | 15.0 | 495 | 1.6394 | 0.0059 | 5853.9322 | 4057.6366 | 1935.0 | 2475.0 | 0.7818 | 1943.0 | 0.7851 | 1054.0 | 1054.0 | 1196.0 | 0.8813 | 0.8813 | 881.0 | 881.0 | 1267.0 | 0.6953 | 0.6953 |
| 0.0 | 16.0 | 528 | 1.6388 | 0.0059 | 5851.4802 | 4055.9370 | 1936.0 | 2475.0 | 0.7822 | 1944.0 | 0.7855 | 1055.0 | 1055.0 | 1196.0 | 0.8821 | 0.8821 | 881.0 | 881.0 | 1267.0 | 0.6953 | 0.6953 |
| 0.0 | 17.0 | 561 | 1.6396 | 0.0059 | 5854.5643 | 4058.0747 | 1937.0 | 2475.0 | 0.7826 | 1945.0 | 0.7859 | 1054.0 | 1054.0 | 1196.0 | 0.8813 | 0.8813 | 883.0 | 883.0 | 1267.0 | 0.6969 | 0.6969 |
| 0.4705 | 18.0 | 594 | 1.6388 | 0.0059 | 5851.7692 | 4056.1373 | 1937.0 | 2475.0 | 0.7826 | 1945.0 | 0.7859 | 1053.0 | 1053.0 | 1196.0 | 0.8804 | 0.8804 | 884.0 | 884.0 | 1267.0 | 0.6977 | 0.6977 |
| 0.0 | 19.0 | 627 | 1.6396 | 0.0059 | 5854.6347 | 4058.1235 | 1935.0 | 2475.0 | 0.7818 | 1943.0 | 0.7851 | 1052.0 | 1052.0 | 1196.0 | 0.8796 | 0.8796 | 883.0 | 883.0 | 1267.0 | 0.6969 | 0.6969 |
| 0.0 | 20.0 | 660 | 1.6372 | 0.0059 | 5845.9689 | 4052.1169 | 1936.0 | 2475.0 | 0.7822 | 1944.0 | 0.7855 | 1052.0 | 1052.0 | 1196.0 | 0.8796 | 0.8796 | 884.0 | 884.0 | 1267.0 | 0.6977 | 0.6977 |
| 0.0 | 21.0 | 693 | 1.6389 | 0.0059 | 5852.0283 | 4056.3169 | 1935.0 | 2475.0 | 0.7818 | 1943.0 | 0.7851 | 1052.0 | 1052.0 | 1196.0 | 0.8796 | 0.8796 | 883.0 | 883.0 | 1267.0 | 0.6969 | 0.6969 |
| 0.0 | 22.0 | 726 | 1.6393 | 0.0059 | 5853.4144 | 4057.2777 | 1936.0 | 2475.0 | 0.7822 | 1944.0 | 0.7855 | 1053.0 | 1053.0 | 1196.0 | 0.8804 | 0.8804 | 883.0 | 883.0 | 1267.0 | 0.6969 | 0.6969 |
| 0.0 | 23.0 | 759 | 1.6391 | 0.0059 | 5852.5099 | 4056.6507 | 1934.0 | 2475.0 | 0.7814 | 1942.0 | 0.7846 | 1051.0 | 1051.0 | 1196.0 | 0.8788 | 0.8788 | 883.0 | 883.0 | 1267.0 | 0.6969 | 0.6969 |
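
The headline metrics at the top of this card match the epoch-8 row (step 264, validation loss 1.3861), which is also the epoch with the highest overall accuracy. A minimal check over the (epoch, correct preds) pairs transcribed from the table:

```python
# (epoch, correct preds out of 2475) pairs taken from the table above
correct_by_epoch = {
    0: 1196, 1: 1800, 2: 1750, 3: 1901, 4: 1808, 5: 1949, 6: 1929,
    7: 1910, 8: 1952, 9: 1889, 10: 1937, 11: 1938, 12: 1934, 13: 1935,
    14: 1936, 15: 1935, 16: 1936, 17: 1937, 18: 1937, 19: 1935, 20: 1936,
    21: 1935, 22: 1936, 23: 1934,
}
best_epoch = max(correct_by_epoch, key=correct_by_epoch.get)
print(best_epoch)  # 8, i.e. the checkpoint reported at the top of the card
```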

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
Model size: 1B params (Safetensors, BF16)
