GSM8K-Binary_Llama-3.2-1B-ivxh9bjy

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set (a short note on how these metrics fit together follows the list):

  • Loss: 0.7423
  • Model Preparation Time: 0.0054
  • Mdl: 2650.3578
  • Accumulated Loss: 1837.0880
  • Correct Preds: 1688.0
  • Total Preds: 2475.0
  • Accuracy: 0.6820
  • Correct Gen Preds: 7.0
  • Gen Accuracy: 0.0028
  • Correct Gen Preds 34192: 0.0
  • Correct Preds 34192: 783.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.6547
  • Gen Accuracy 34192: 0.0
  • Correct Gen Preds 41568: 0.0
  • Correct Preds 41568: 905.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.7143
  • Gen Accuracy 41568: 0.0
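
The metric names are not documented, but the reported values are mutually consistent if Accuracy is Correct Preds / Total Preds, Accumulated Loss is the per-example evaluation loss summed over all predictions (in nats), and Mdl is that same total converted to bits; the numeric suffixes (34192, 41568) look like per-class breakdowns keyed by label token id. This is an inference from the numbers and naming, not documentation of the evaluation code. A quick arithmetic check:

```python
import math

# Reported evaluation numbers from the list above.
loss = 0.7423                 # mean per-example eval loss (nats)
accumulated_loss = 1837.0880  # reported "Accumulated Loss"
total_preds = 2475.0
correct_preds = 1688.0

print(accumulated_loss / total_preds)  # ~0.7423  -> matches Loss
print(accumulated_loss / math.log(2))  # ~2650.36 -> matches Mdl (bits)
print(correct_preds / total_preds)     # ~0.6820  -> matches Accuracy
```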

Model description

More information needed

Intended uses & limitations

More information needed
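
No usage guidance is provided. As a minimal sketch, the checkpoint loads like any other causal-LM fine-tune with transformers; the repo id below is taken from the model page, while the prompt format is purely hypothetical, since the training data and task framing are not documented:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "donoway/GSM8K-Binary_Llama-3.2-1B-ivxh9bjy"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

# Hypothetical prompt: the exact binary-GSM8K format used in training is unknown.
prompt = "Question: ... Is the proposed answer correct?"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```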

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch that mirrors them appears after the list):

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
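
For reference, these settings map onto a transformers TrainingArguments configuration roughly as follows. This is a sketch reconstructed from the list above, not the actual training script; dataset loading, model initialization, and any custom metrics are omitted, and the bf16 flag is an assumption based on the published BF16 weights.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="GSM8K-Binary_Llama-3.2-1B-ivxh9bjy",
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
    bf16=True,  # assumption: published weights are BF16
)
```

Note that although 100 epochs were scheduled, the results table below stops at epoch 37, and the headline evaluation numbers match the epoch-7 row, which suggests the best checkpoint by validation loss was kept; neither point is documented explicitly.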

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0054 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 1.8926 | 1.0 | 3 | 0.7529 | 0.0054 | 2688.4400 | 1863.4846 | 1326.0 | 2475.0 | 0.5358 | 28.0 | 0.0113 | 10.0 | 888.0 | 1196.0 | 0.7425 | 0.0084 | 10.0 | 438.0 | 1267.0 | 0.3457 | 0.0079 |
| 0.8692 | 2.0 | 6 | 1.1837 | 0.0054 | 4226.7103 | 2929.7323 | 1267.0 | 2475.0 | 0.5119 | 156.0 | 0.0630 | 0.0 | 0.0 | 1196.0 | 0.0 | 0.0 | 148.0 | 1267.0 | 1267.0 | 1.0 | 0.1168 |
| 1.9719 | 3.0 | 9 | 0.7201 | 0.0054 | 2571.1173 | 1782.1627 | 1463.0 | 2475.0 | 0.5911 | 10.0 | 0.0040 | 0.0 | 1128.0 | 1196.0 | 0.9431 | 0.0 | 2.0 | 335.0 | 1267.0 | 0.2644 | 0.0016 |
| 0.5314 | 4.0 | 12 | 0.7689 | 0.0054 | 2745.5918 | 1903.0992 | 1349.0 | 2475.0 | 0.5451 | 20.0 | 0.0081 | 0.0 | 104.0 | 1196.0 | 0.0870 | 0.0 | 12.0 | 1245.0 | 1267.0 | 0.9826 | 0.0095 |
| 0.5337 | 5.0 | 15 | 0.8571 | 0.0054 | 3060.3610 | 2121.2806 | 1424.0 | 2475.0 | 0.5754 | 7.0 | 0.0028 | 0.0 | 1151.0 | 1196.0 | 0.9624 | 0.0 | 0.0 | 273.0 | 1267.0 | 0.2155 | 0.0 |
| 0.9397 | 6.0 | 18 | 0.8323 | 0.0054 | 2971.7447 | 2059.8565 | 1552.0 | 2475.0 | 0.6271 | 8.0 | 0.0032 | 0.0 | 401.0 | 1196.0 | 0.3353 | 0.0 | 0.0 | 1151.0 | 1267.0 | 0.9084 | 0.0 |
| 0.1934 | 7.0 | 21 | 0.7423 | 0.0054 | 2650.3578 | 1837.0880 | 1688.0 | 2475.0 | 0.6820 | 7.0 | 0.0028 | 0.0 | 783.0 | 1196.0 | 0.6547 | 0.0 | 0.0 | 905.0 | 1267.0 | 0.7143 | 0.0 |
| 0.9493 | 8.0 | 24 | 1.3176 | 0.0054 | 4704.6623 | 3261.0234 | 1640.0 | 2475.0 | 0.6626 | 278.0 | 0.1123 | 169.0 | 939.0 | 1196.0 | 0.7851 | 0.1413 | 102.0 | 701.0 | 1267.0 | 0.5533 | 0.0805 |
| 0.3022 | 9.0 | 27 | 1.3117 | 0.0054 | 4683.8108 | 3246.5703 | 1515.0 | 2475.0 | 0.6121 | 772.0 | 0.3119 | 126.0 | 429.0 | 1196.0 | 0.3587 | 0.1054 | 638.0 | 1086.0 | 1267.0 | 0.8571 | 0.5036 |
| 0.4787 | 10.0 | 30 | 1.0910 | 0.0054 | 3895.7073 | 2700.2985 | 1628.0 | 2475.0 | 0.6578 | 823.0 | 0.3325 | 274.0 | 718.0 | 1196.0 | 0.6003 | 0.2291 | 541.0 | 910.0 | 1267.0 | 0.7182 | 0.4270 |
| 0.0038 | 11.0 | 33 | 2.0771 | 0.0054 | 7416.5828 | 5140.7834 | 1584.0 | 2475.0 | 0.64 | 1201.0 | 0.4853 | 468.0 | 681.0 | 1196.0 | 0.5694 | 0.3913 | 726.0 | 903.0 | 1267.0 | 0.7127 | 0.5730 |
| 0.0043 | 12.0 | 36 | 2.2125 | 0.0054 | 7900.0977 | 5475.9304 | 1585.0 | 2475.0 | 0.6404 | 1442.0 | 0.5826 | 603.0 | 662.0 | 1196.0 | 0.5535 | 0.5042 | 832.0 | 923.0 | 1267.0 | 0.7285 | 0.6567 |
| 0.0001 | 13.0 | 39 | 2.2169 | 0.0054 | 7915.9312 | 5486.9054 | 1621.0 | 2475.0 | 0.6549 | 1458.0 | 0.5891 | 751.0 | 818.0 | 1196.0 | 0.6839 | 0.6279 | 700.0 | 803.0 | 1267.0 | 0.6338 | 0.5525 |
| 0.4529 | 14.0 | 42 | 2.5412 | 0.0054 | 9073.8672 | 6289.5255 | 1607.0 | 2475.0 | 0.6493 | 1464.0 | 0.5915 | 614.0 | 677.0 | 1196.0 | 0.5661 | 0.5134 | 843.0 | 930.0 | 1267.0 | 0.7340 | 0.6654 |
| 0.4524 | 15.0 | 45 | 2.9305 | 0.0054 | 10463.8496 | 7252.9878 | 1610.0 | 2475.0 | 0.6505 | 1474.0 | 0.5956 | 683.0 | 741.0 | 1196.0 | 0.6196 | 0.5711 | 784.0 | 869.0 | 1267.0 | 0.6859 | 0.6188 |
| 0.9048 | 16.0 | 48 | 3.2752 | 0.0054 | 11694.7828 | 8106.2057 | 1621.0 | 2475.0 | 0.6549 | 1480.0 | 0.5980 | 788.0 | 854.0 | 1196.0 | 0.7140 | 0.6589 | 685.0 | 767.0 | 1267.0 | 0.6054 | 0.5406 |
| 0.0 | 17.0 | 51 | 3.5192 | 0.0054 | 12565.8354 | 8709.9734 | 1616.0 | 2475.0 | 0.6529 | 1475.0 | 0.5960 | 827.0 | 897.0 | 1196.0 | 0.75 | 0.6915 | 641.0 | 719.0 | 1267.0 | 0.5675 | 0.5059 |
| 0.4524 | 18.0 | 54 | 3.6613 | 0.0054 | 13073.4132 | 9061.7995 | 1603.0 | 2475.0 | 0.6477 | 1468.0 | 0.5931 | 847.0 | 915.0 | 1196.0 | 0.7651 | 0.7082 | 614.0 | 688.0 | 1267.0 | 0.5430 | 0.4846 |
| 0.4524 | 19.0 | 57 | 3.7335 | 0.0054 | 13331.1999 | 9240.4837 | 1605.0 | 2475.0 | 0.6485 | 1470.0 | 0.5939 | 857.0 | 927.0 | 1196.0 | 0.7751 | 0.7166 | 606.0 | 678.0 | 1267.0 | 0.5351 | 0.4783 |
| 0.0 | 20.0 | 60 | 3.7780 | 0.0054 | 13489.9545 | 9350.5239 | 1598.0 | 2475.0 | 0.6457 | 1479.0 | 0.5976 | 868.0 | 928.0 | 1196.0 | 0.7759 | 0.7258 | 604.0 | 670.0 | 1267.0 | 0.5288 | 0.4767 |
| 0.0 | 21.0 | 63 | 3.7912 | 0.0054 | 13537.2531 | 9383.3089 | 1604.0 | 2475.0 | 0.6481 | 1484.0 | 0.5996 | 871.0 | 933.0 | 1196.0 | 0.7801 | 0.7283 | 606.0 | 671.0 | 1267.0 | 0.5296 | 0.4783 |
| 0.4524 | 22.0 | 66 | 3.7968 | 0.0054 | 13557.2370 | 9397.1606 | 1605.0 | 2475.0 | 0.6485 | 1480.0 | 0.5980 | 871.0 | 931.0 | 1196.0 | 0.7784 | 0.7283 | 602.0 | 674.0 | 1267.0 | 0.5320 | 0.4751 |
| 0.9048 | 23.0 | 69 | 3.8063 | 0.0054 | 13591.2194 | 9420.7154 | 1607.0 | 2475.0 | 0.6493 | 1484.0 | 0.5996 | 871.0 | 933.0 | 1196.0 | 0.7801 | 0.7283 | 606.0 | 674.0 | 1267.0 | 0.5320 | 0.4783 |
| 0.4524 | 24.0 | 72 | 3.8033 | 0.0054 | 13580.3724 | 9413.1968 | 1607.0 | 2475.0 | 0.6493 | 1489.0 | 0.6016 | 871.0 | 930.0 | 1196.0 | 0.7776 | 0.7283 | 611.0 | 677.0 | 1267.0 | 0.5343 | 0.4822 |
| 0.4524 | 25.0 | 75 | 3.8088 | 0.0054 | 13599.9088 | 9426.7384 | 1606.0 | 2475.0 | 0.6489 | 1492.0 | 0.6028 | 872.0 | 928.0 | 1196.0 | 0.7759 | 0.7291 | 613.0 | 678.0 | 1267.0 | 0.5351 | 0.4838 |
| 0.9048 | 26.0 | 78 | 3.8077 | 0.0054 | 13595.8914 | 9423.9538 | 1608.0 | 2475.0 | 0.6497 | 1494.0 | 0.6036 | 871.0 | 928.0 | 1196.0 | 0.7759 | 0.7283 | 616.0 | 680.0 | 1267.0 | 0.5367 | 0.4862 |
| 0.4524 | 27.0 | 81 | 3.8064 | 0.0054 | 13591.4585 | 9420.8811 | 1609.0 | 2475.0 | 0.6501 | 1493.0 | 0.6032 | 869.0 | 928.0 | 1196.0 | 0.7759 | 0.7266 | 617.0 | 681.0 | 1267.0 | 0.5375 | 0.4870 |
| 0.0 | 28.0 | 84 | 3.8025 | 0.0054 | 13577.4399 | 9411.1642 | 1608.0 | 2475.0 | 0.6497 | 1495.0 | 0.6040 | 870.0 | 928.0 | 1196.0 | 0.7759 | 0.7274 | 618.0 | 680.0 | 1267.0 | 0.5367 | 0.4878 |
| 0.0 | 29.0 | 87 | 3.8055 | 0.0054 | 13588.0611 | 9418.5263 | 1615.0 | 2475.0 | 0.6525 | 1496.0 | 0.6044 | 869.0 | 931.0 | 1196.0 | 0.7784 | 0.7266 | 620.0 | 684.0 | 1267.0 | 0.5399 | 0.4893 |
| 0.0 | 30.0 | 90 | 3.8028 | 0.0054 | 13578.4494 | 9411.8640 | 1607.0 | 2475.0 | 0.6493 | 1499.0 | 0.6057 | 868.0 | 925.0 | 1196.0 | 0.7734 | 0.7258 | 624.0 | 682.0 | 1267.0 | 0.5383 | 0.4925 |
| 0.4524 | 31.0 | 93 | 3.8007 | 0.0054 | 13571.0659 | 9406.7461 | 1610.0 | 2475.0 | 0.6505 | 1497.0 | 0.6048 | 866.0 | 924.0 | 1196.0 | 0.7726 | 0.7241 | 624.0 | 686.0 | 1267.0 | 0.5414 | 0.4925 |
| 0.4524 | 32.0 | 96 | 3.8028 | 0.0054 | 13578.6452 | 9411.9996 | 1610.0 | 2475.0 | 0.6505 | 1498.0 | 0.6053 | 862.0 | 923.0 | 1196.0 | 0.7717 | 0.7207 | 629.0 | 687.0 | 1267.0 | 0.5422 | 0.4964 |
| 0.4524 | 33.0 | 99 | 3.8003 | 0.0054 | 13569.6111 | 9405.7377 | 1610.0 | 2475.0 | 0.6505 | 1503.0 | 0.6073 | 869.0 | 925.0 | 1196.0 | 0.7734 | 0.7266 | 627.0 | 685.0 | 1267.0 | 0.5406 | 0.4949 |
| 0.4524 | 34.0 | 102 | 3.7994 | 0.0054 | 13566.5818 | 9403.6379 | 1611.0 | 2475.0 | 0.6509 | 1501.0 | 0.6065 | 866.0 | 924.0 | 1196.0 | 0.7726 | 0.7241 | 628.0 | 687.0 | 1267.0 | 0.5422 | 0.4957 |
| 0.0 | 35.0 | 105 | 3.7990 | 0.0054 | 13565.1272 | 9402.6297 | 1610.0 | 2475.0 | 0.6505 | 1501.0 | 0.6065 | 864.0 | 920.0 | 1196.0 | 0.7692 | 0.7224 | 630.0 | 690.0 | 1267.0 | 0.5446 | 0.4972 |
| 0.4524 | 36.0 | 108 | 3.8014 | 0.0054 | 13573.4117 | 9408.3720 | 1610.0 | 2475.0 | 0.6505 | 1499.0 | 0.6057 | 866.0 | 921.0 | 1196.0 | 0.7701 | 0.7241 | 626.0 | 689.0 | 1267.0 | 0.5438 | 0.4941 |
| 0.4524 | 37.0 | 111 | 3.7985 | 0.0054 | 13563.0986 | 9401.2236 | 1608.0 | 2475.0 | 0.6497 | 1503.0 | 0.6073 | 864.0 | 920.0 | 1196.0 | 0.7692 | 0.7224 | 632.0 | 688.0 | 1267.0 | 0.5430 | 0.4988 |

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1