GSM8K-Binary_Llama-3.2-1B-kcrbohqy

This model is a fine-tuned version of meta-llama/Llama-3.2-1B. The training dataset is not documented in this card, though the model name suggests a binarized variant of GSM8K. It achieves the following results on the evaluation set (these figures match the epoch-7 checkpoint, step 329, in the training results table below):

  • Loss: 1.5026
  • Model Preparation Time: 0.0059
  • MDL: 5365.4215
  • Accumulated Loss: 3719.0268
  • Correct Preds: 1963.0
  • Total Preds: 2475.0
  • Accuracy: 0.7931
  • Correct Gen Preds: 1691.0
  • Gen Accuracy: 0.6832
  • Correct Gen Preds 34192: 871.0
  • Correct Preds 34192: 1035.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.8654
  • Gen Accuracy 34192: 0.7283
  • Correct Gen Preds 41568: 814.0
  • Correct Preds 41568: 928.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.7324
  • Gen Accuracy 41568: 0.6425
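
On the metric names: MDL appears to be the minimum description length of the evaluation labels in bits, i.e. the accumulated cross-entropy loss (reported in nats) converted to base 2, and the numeric suffixes 34192 and 41568 look like the token IDs of the two answer labels, giving a per-label breakdown; neither reading is documented in the card. The aggregates are at least self-consistent under that interpretation, as this small check shows:

```python
import math

# Reported aggregates from the evaluation set above.
accumulated_loss = 3719.0268   # summed cross-entropy, in nats
total_preds = 2475

print(accumulated_loss / total_preds)   # 1.5026  -> reported Loss
print(accumulated_loss / math.log(2))   # 5365.42 -> reported MDL (bits)
print(1963 / total_preds)               # 0.7931  -> reported Accuracy
print(1691 / total_preds)               # 0.6832  -> reported Gen Accuracy
```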

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.001
  • num_epochs: 100
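
For reference, the list above maps onto Transformers `TrainingArguments` roughly as follows. This is a reconstruction, with `output_dir` as a placeholder rather than anything taken from the actual training run:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="GSM8K-Binary_Llama-3.2-1B-kcrbohqy",  # placeholder name
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",           # AdamW, PyTorch implementation
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",  # a constant schedule applies no warmup,
    warmup_ratio=0.001,            # so this ratio is likely inert
    num_train_epochs=100,
)
```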

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | MDL | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0059 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 1.0257 | 1.0 | 47 | 0.6155 | 0.0059 | 2197.8915 | 1523.4623 | 1717.0 | 2475.0 | 0.6937 | 8.0 | 0.0032 | 0.0 | 1150.0 | 1196.0 | 0.9615 | 0.0 | 0.0 | 567.0 | 1267.0 | 0.4475 | 0.0 |
| 0.2829 | 2.0 | 94 | 0.5306 | 0.0059 | 1894.7016 | 1313.3071 | 1908.0 | 2475.0 | 0.7709 | 7.0 | 0.0028 | 0.0 | 989.0 | 1196.0 | 0.8269 | 0.0 | 0.0 | 919.0 | 1267.0 | 0.7253 | 0.0 |
| 0.0289 | 3.0 | 141 | 0.5964 | 0.0059 | 2129.4510 | 1476.0229 | 1931.0 | 2475.0 | 0.7802 | 7.0 | 0.0028 | 0.0 | 886.0 | 1196.0 | 0.7408 | 0.0 | 0.0 | 1045.0 | 1267.0 | 0.8248 | 0.0 |
| 0.2101 | 4.0 | 188 | 1.0279 | 0.0059 | 3670.3209 | 2544.0726 | 1902.0 | 2475.0 | 0.7685 | 689.0 | 0.2784 | 293.0 | 1091.0 | 1196.0 | 0.9122 | 0.2450 | 388.0 | 811.0 | 1267.0 | 0.6401 | 0.3062 |
| 0.7687 | 5.0 | 235 | 0.9640 | 0.0059 | 3442.2378 | 2385.9775 | 1948.0 | 2475.0 | 0.7871 | 446.0 | 0.1802 | 132.0 | 1004.0 | 1196.0 | 0.8395 | 0.1104 | 307.0 | 944.0 | 1267.0 | 0.7451 | 0.2423 |
| 0.0001 | 6.0 | 282 | 1.8360 | 0.0059 | 6555.6511 | 4544.0311 | 1870.0 | 2475.0 | 0.7556 | 1150.0 | 0.4646 | 673.0 | 1110.0 | 1196.0 | 0.9281 | 0.5627 | 470.0 | 760.0 | 1267.0 | 0.5998 | 0.3710 |
| 0.0114 | 7.0 | 329 | 1.5026 | 0.0059 | 5365.4215 | 3719.0268 | 1963.0 | 2475.0 | 0.7931 | 1691.0 | 0.6832 | 871.0 | 1035.0 | 1196.0 | 0.8654 | 0.7283 | 814.0 | 928.0 | 1267.0 | 0.7324 | 0.6425 |
| 0.0001 | 8.0 | 376 | 1.7366 | 0.0059 | 6200.7876 | 4298.0585 | 1937.0 | 2475.0 | 0.7826 | 1561.0 | 0.6307 | 686.0 | 938.0 | 1196.0 | 0.7843 | 0.5736 | 867.0 | 999.0 | 1267.0 | 0.7885 | 0.6843 |
| 0.0 | 9.0 | 423 | 1.9428 | 0.0059 | 6937.0874 | 4808.4226 | 1919.0 | 2475.0 | 0.7754 | 1722.0 | 0.6958 | 823.0 | 959.0 | 1196.0 | 0.8018 | 0.6881 | 890.0 | 960.0 | 1267.0 | 0.7577 | 0.7024 |
| 0.0 | 10.0 | 470 | 1.9499 | 0.0059 | 6962.3264 | 4825.9169 | 1923.0 | 2475.0 | 0.7770 | 1742.0 | 0.7038 | 785.0 | 911.0 | 1196.0 | 0.7617 | 0.6564 | 948.0 | 1012.0 | 1267.0 | 0.7987 | 0.7482 |
| 0.0 | 11.0 | 517 | 1.9904 | 0.0059 | 7107.2016 | 4926.3368 | 1928.0 | 2475.0 | 0.7790 | 1741.0 | 0.7034 | 869.0 | 1000.0 | 1196.0 | 0.8361 | 0.7266 | 863.0 | 928.0 | 1267.0 | 0.7324 | 0.6811 |
| 0.0 | 12.0 | 564 | 1.9920 | 0.0059 | 7112.9400 | 4930.3143 | 1931.0 | 2475.0 | 0.7802 | 1740.0 | 0.7030 | 871.0 | 1003.0 | 1196.0 | 0.8386 | 0.7283 | 860.0 | 928.0 | 1267.0 | 0.7324 | 0.6788 |
| 0.0 | 13.0 | 611 | 1.9893 | 0.0059 | 7103.2337 | 4923.5864 | 1931.0 | 2475.0 | 0.7802 | 1749.0 | 0.7067 | 872.0 | 1000.0 | 1196.0 | 0.8361 | 0.7291 | 869.0 | 931.0 | 1267.0 | 0.7348 | 0.6859 |
| 0.0 | 14.0 | 658 | 1.9887 | 0.0059 | 7100.8464 | 4921.9317 | 1930.0 | 2475.0 | 0.7798 | 1755.0 | 0.7091 | 874.0 | 1000.0 | 1196.0 | 0.8361 | 0.7308 | 872.0 | 930.0 | 1267.0 | 0.7340 | 0.6882 |
| 0.0 | 15.0 | 705 | 1.9885 | 0.0059 | 7100.2310 | 4921.5051 | 1933.0 | 2475.0 | 0.7810 | 1750.0 | 0.7071 | 873.0 | 1000.0 | 1196.0 | 0.8361 | 0.7299 | 868.0 | 933.0 | 1267.0 | 0.7364 | 0.6851 |
| 0.0 | 16.0 | 752 | 1.9883 | 0.0059 | 7099.4041 | 4920.9320 | 1931.0 | 2475.0 | 0.7802 | 1755.0 | 0.7091 | 876.0 | 999.0 | 1196.0 | 0.8353 | 0.7324 | 870.0 | 932.0 | 1267.0 | 0.7356 | 0.6867 |
| 0.0 | 17.0 | 799 | 1.9886 | 0.0059 | 7100.6278 | 4921.7802 | 1931.0 | 2475.0 | 0.7802 | 1755.0 | 0.7091 | 873.0 | 999.0 | 1196.0 | 0.8353 | 0.7299 | 874.0 | 932.0 | 1267.0 | 0.7356 | 0.6898 |
| 0.0 | 18.0 | 846 | 1.9860 | 0.0059 | 7091.2435 | 4915.2755 | 1934.0 | 2475.0 | 0.7814 | 1759.0 | 0.7107 | 877.0 | 999.0 | 1196.0 | 0.8353 | 0.7333 | 873.0 | 935.0 | 1267.0 | 0.7380 | 0.6890 |

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
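
The checkpoint is stored in bfloat16, so a minimal loading sketch (standard Transformers usage, nothing specific to this model) looks like:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/GSM8K-Binary_Llama-3.2-1B-kcrbohqy"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the stored tensor type
)
```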