GSM8K-Binary_Llama-3.2-1B-fmntg7h8

This model is a fine-tuned version of meta-llama/Llama-3.2-1B. The training dataset is not documented in this card, although the model name suggests a binary-answer variant of GSM8K. It achieves the following results on the evaluation set, which match the epoch-22 checkpoint in the training results below. Metric names suffixed with 34192 and 41568 are per-class breakdowns, apparently keyed by the token ID of each answer label; a sanity-check sketch follows the list:

  • Loss: 1.1051
  • Model Preparation Time: 0.0056
  • MDL: 3945.9379
  • Accumulated Loss: 2735.1157
  • Correct Preds: 1562.0
  • Total Preds: 2475.0
  • Accuracy: 0.6311
  • Correct Gen Preds: 804.0
  • Gen Accuracy: 0.3248
  • Correct Gen Preds 34192: 398.0
  • Correct Preds 34192: 875.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.7316
  • Gen Accuracy 34192: 0.3328
  • Correct Gen Preds 41568: 399.0
  • Correct Preds 41568: 687.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.5422
  • Gen Accuracy 41568: 0.3149
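
The headline ratios can be verified directly from the counts above. Below is a minimal sanity-check sketch in Python, assuming Accuracy = Correct Preds / Total Preds and that MDL is the accumulated evaluation loss converted from nats to bits (an interpretation consistent with the reported numbers, not stated in the card):

```python
import math

# Headline numbers reported above (epoch-22 checkpoint).
correct_preds, total_preds = 1562, 2475
correct_gen_preds = 804
accumulated_loss_nats = 2735.1157

print(correct_preds / total_preds)      # 0.6311 -> Accuracy
print(correct_gen_preds / total_preds)  # 0.3248 -> Gen Accuracy
print(875 / 1196)                       # 0.7316 -> Accuracy 34192
print(687 / 1267)                       # 0.5422 -> Accuracy 41568

# Assumption: MDL is the accumulated loss in bits, i.e. nats / ln(2).
print(accumulated_loss_nats / math.log(2))  # 3945.94 -> MDL
```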

Model description

More information needed

Intended uses & limitations

More information needed
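
Pending details from the authors, the checkpoint loads like any transformers causal LM. The following is a minimal usage sketch; the repository id comes from this card, but the prompt format is a guess, since the template used for the binary task is not documented:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "donoway/GSM8K-Binary_Llama-3.2-1B-fmntg7h8"
tokenizer = AutoTokenizer.from_pretrained(repo)
# The published checkpoint is stored in bfloat16 safetensors.
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)

# Hypothetical prompt; the actual training template is not documented.
prompt = "Question: Natalia sold 48 clips in April and half as many in May. Did she sell 72 clips in total? Answer:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```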

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (mirrored in the sketch after this list):

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
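
Expressed as transformers TrainingArguments, the settings above would look roughly like this sketch (output_dir is a placeholder; bf16 is an assumption based on the bfloat16 checkpoint):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the listed hyperparameters.
training_args = TrainingArguments(
    output_dir="GSM8K-Binary_Llama-3.2-1B-fmntg7h8",  # placeholder
    learning_rate=2e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
    bf16=True,  # assumption: the published checkpoint is bfloat16
)
```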

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | MDL | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0056 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 1.2894 | 1.0 | 1 | 1.4656 | 0.0056 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 1.2894 | 2.0 | 2 | 5.0031 | 0.0056 | 17864.2446 | 12382.5508 | 1267.0 | 2475.0 | 0.5119 | 1274.0 | 0.5147 | 0.0 | 0.0 | 1196.0 | 0.0 | 0.0 | 1266.0 | 1267.0 | 1267.0 | 1.0 | 0.9992 |
| 5.3725 | 3.0 | 3 | 1.0418 | 0.0056 | 3719.9628 | 2578.4817 | 1267.0 | 2475.0 | 0.5119 | 8.0 | 0.0032 | 0.0 | 0.0 | 1196.0 | 0.0 | 0.0 | 0.0 | 1267.0 | 1267.0 | 1.0 | 0.0 |
| 1.0334 | 4.0 | 4 | 1.9141 | 0.0056 | 6834.5336 | 4737.3377 | 1196.0 | 2475.0 | 0.4832 | 8.0 | 0.0032 | 0.0 | 1196.0 | 1196.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 1.7152 | 5.0 | 5 | 0.8599 | 0.0056 | 3070.4544 | 2128.2768 | 1201.0 | 2475.0 | 0.4853 | 7.0 | 0.0028 | 0.0 | 1188.0 | 1196.0 | 0.9933 | 0.0 | 0.0 | 13.0 | 1267.0 | 0.0103 | 0.0 |
| 0.676 | 6.0 | 6 | 0.9139 | 0.0056 | 3263.1526 | 2261.8450 | 1275.0 | 2475.0 | 0.5152 | 7.0 | 0.0028 | 0.0 | 17.0 | 1196.0 | 0.0142 | 0.0 | 0.0 | 1258.0 | 1267.0 | 0.9929 | 0.0 |
| 0.8111 | 7.0 | 7 | 0.7598 | 0.0056 | 2712.8353 | 1880.3942 | 1494.0 | 2475.0 | 0.6036 | 7.0 | 0.0028 | 0.0 | 1026.0 | 1196.0 | 0.8579 | 0.0 | 0.0 | 468.0 | 1267.0 | 0.3694 | 0.0 |
| 0.5625 | 8.0 | 8 | 0.7617 | 0.0056 | 2719.8767 | 1885.2748 | 1367.0 | 2475.0 | 0.5523 | 7.0 | 0.0028 | 0.0 | 296.0 | 1196.0 | 0.2475 | 0.0 | 0.0 | 1071.0 | 1267.0 | 0.8453 | 0.0 |
| 0.4464 | 9.0 | 9 | 0.7365 | 0.0056 | 2629.9084 | 1822.9136 | 1462.0 | 2475.0 | 0.5907 | 7.0 | 0.0028 | 0.0 | 817.0 | 1196.0 | 0.6831 | 0.0 | 0.0 | 645.0 | 1267.0 | 0.5091 | 0.0 |
| 0.295 | 10.0 | 10 | 0.8989 | 0.0056 | 3209.6418 | 2224.7542 | 1361.0 | 2475.0 | 0.5499 | 7.0 | 0.0028 | 0.0 | 1160.0 | 1196.0 | 0.9699 | 0.0 | 0.0 | 201.0 | 1267.0 | 0.1586 | 0.0 |
| 0.2414 | 11.0 | 11 | 0.7902 | 0.0056 | 2821.4112 | 1955.6532 | 1484.0 | 2475.0 | 0.5996 | 7.0 | 0.0028 | 0.0 | 1042.0 | 1196.0 | 0.8712 | 0.0 | 0.0 | 442.0 | 1267.0 | 0.3489 | 0.0 |
| 0.0982 | 12.0 | 12 | 0.7771 | 0.0056 | 2774.9207 | 1923.4284 | 1463.0 | 2475.0 | 0.5911 | 7.0 | 0.0028 | 0.0 | 554.0 | 1196.0 | 0.4632 | 0.0 | 0.0 | 909.0 | 1267.0 | 0.7174 | 0.0 |
| 0.068 | 13.0 | 13 | 0.7808 | 0.0056 | 2788.0061 | 1932.4986 | 1521.0 | 2475.0 | 0.6145 | 7.0 | 0.0028 | 0.0 | 752.0 | 1196.0 | 0.6288 | 0.0 | 0.0 | 769.0 | 1267.0 | 0.6069 | 0.0 |
| 0.015 | 14.0 | 14 | 1.1364 | 0.0056 | 4057.6446 | 2812.5449 | 1476.0 | 2475.0 | 0.5964 | 7.0 | 0.0028 | 0.0 | 1116.0 | 1196.0 | 0.9331 | 0.0 | 0.0 | 360.0 | 1267.0 | 0.2841 | 0.0 |
| 0.0068 | 15.0 | 15 | 1.2663 | 0.0056 | 4521.6005 | 3134.1346 | 1470.0 | 2475.0 | 0.5939 | 10.0 | 0.0040 | 0.0 | 1108.0 | 1196.0 | 0.9264 | 0.0 | 3.0 | 362.0 | 1267.0 | 0.2857 | 0.0024 |
| 0.0009 | 16.0 | 16 | 1.2510 | 0.0056 | 4467.0043 | 3096.2915 | 1493.0 | 2475.0 | 0.6032 | 40.0 | 0.0162 | 16.0 | 1072.0 | 1196.0 | 0.8963 | 0.0134 | 17.0 | 421.0 | 1267.0 | 0.3323 | 0.0134 |
| 0.0002 | 17.0 | 17 | 1.2050 | 0.0056 | 4302.6121 | 2982.3435 | 1527.0 | 2475.0 | 0.6170 | 154.0 | 0.0622 | 70.0 | 1041.0 | 1196.0 | 0.8704 | 0.0585 | 77.0 | 486.0 | 1267.0 | 0.3836 | 0.0608 |
| 0.0001 | 18.0 | 18 | 1.1676 | 0.0056 | 4169.0566 | 2889.7698 | 1536.0 | 2475.0 | 0.6206 | 317.0 | 0.1281 | 153.0 | 999.0 | 1196.0 | 0.8353 | 0.1279 | 157.0 | 537.0 | 1267.0 | 0.4238 | 0.1239 |
| 0.0001 | 19.0 | 19 | 1.1388 | 0.0056 | 4066.2817 | 2818.5317 | 1550.0 | 2475.0 | 0.6263 | 485.0 | 0.1960 | 235.0 | 960.0 | 1196.0 | 0.8027 | 0.1965 | 243.0 | 590.0 | 1267.0 | 0.4657 | 0.1918 |
| 0.0 | 20.0 | 20 | 1.1190 | 0.0056 | 3995.7100 | 2769.6151 | 1552.0 | 2475.0 | 0.6271 | 606.0 | 0.2448 | 307.0 | 922.0 | 1196.0 | 0.7709 | 0.2567 | 292.0 | 630.0 | 1267.0 | 0.4972 | 0.2305 |
| 0.0 | 21.0 | 21 | 1.1097 | 0.0056 | 3962.2472 | 2746.4205 | 1556.0 | 2475.0 | 0.6287 | 723.0 | 0.2921 | 360.0 | 894.0 | 1196.0 | 0.7475 | 0.3010 | 356.0 | 662.0 | 1267.0 | 0.5225 | 0.2810 |
| 0.0 | 22.0 | 22 | 1.1051 | 0.0056 | 3945.9379 | 2735.1157 | 1562.0 | 2475.0 | 0.6311 | 804.0 | 0.3248 | 398.0 | 875.0 | 1196.0 | 0.7316 | 0.3328 | 399.0 | 687.0 | 1267.0 | 0.5422 | 0.3149 |
| 0.0 | 23.0 | 23 | 1.1066 | 0.0056 | 3951.2043 | 2738.7661 | 1556.0 | 2475.0 | 0.6287 | 867.0 | 0.3503 | 434.0 | 861.0 | 1196.0 | 0.7199 | 0.3629 | 426.0 | 695.0 | 1267.0 | 0.5485 | 0.3362 |
| 0.0 | 24.0 | 24 | 1.1099 | 0.0056 | 3963.2388 | 2747.1078 | 1551.0 | 2475.0 | 0.6267 | 919.0 | 0.3713 | 463.0 | 848.0 | 1196.0 | 0.7090 | 0.3871 | 449.0 | 703.0 | 1267.0 | 0.5549 | 0.3544 |
| 0.0 | 25.0 | 25 | 1.1150 | 0.0056 | 3981.1612 | 2759.5307 | 1546.0 | 2475.0 | 0.6246 | 967.0 | 0.3907 | 488.0 | 841.0 | 1196.0 | 0.7032 | 0.4080 | 472.0 | 705.0 | 1267.0 | 0.5564 | 0.3725 |
| 0.0 | 26.0 | 26 | 1.1229 | 0.0056 | 4009.6006 | 2779.2434 | 1538.0 | 2475.0 | 0.6214 | 1000.0 | 0.4040 | 512.0 | 841.0 | 1196.0 | 0.7032 | 0.4281 | 481.0 | 697.0 | 1267.0 | 0.5501 | 0.3796 |
| 0.0 | 27.0 | 27 | 1.1302 | 0.0056 | 4035.6677 | 2797.3117 | 1546.0 | 2475.0 | 0.6246 | 1038.0 | 0.4194 | 535.0 | 844.0 | 1196.0 | 0.7057 | 0.4473 | 496.0 | 702.0 | 1267.0 | 0.5541 | 0.3915 |
| 0.0 | 28.0 | 28 | 1.1359 | 0.0056 | 4056.0096 | 2811.4116 | 1543.0 | 2475.0 | 0.6234 | 1062.0 | 0.4291 | 556.0 | 844.0 | 1196.0 | 0.7057 | 0.4649 | 499.0 | 699.0 | 1267.0 | 0.5517 | 0.3938 |
| 0.0 | 29.0 | 29 | 1.1418 | 0.0056 | 4076.9828 | 2825.9492 | 1539.0 | 2475.0 | 0.6218 | 1078.0 | 0.4356 | 569.0 | 845.0 | 1196.0 | 0.7065 | 0.4758 | 502.0 | 694.0 | 1267.0 | 0.5478 | 0.3962 |
| 0.0 | 30.0 | 30 | 1.1483 | 0.0056 | 4100.0299 | 2841.9241 | 1532.0 | 2475.0 | 0.6190 | 1096.0 | 0.4428 | 581.0 | 843.0 | 1196.0 | 0.7048 | 0.4858 | 508.0 | 689.0 | 1267.0 | 0.5438 | 0.4009 |
| 0.0 | 31.0 | 31 | 1.1516 | 0.0056 | 4111.8632 | 2850.1264 | 1542.0 | 2475.0 | 0.6230 | 1104.0 | 0.4461 | 590.0 | 852.0 | 1196.0 | 0.7124 | 0.4933 | 507.0 | 690.0 | 1267.0 | 0.5446 | 0.4002 |
| 0.0 | 32.0 | 32 | 1.1572 | 0.0056 | 4132.1376 | 2864.1795 | 1536.0 | 2475.0 | 0.6206 | 1118.0 | 0.4517 | 600.0 | 851.0 | 1196.0 | 0.7115 | 0.5017 | 511.0 | 685.0 | 1267.0 | 0.5406 | 0.4033 |
| 0.0 | 33.0 | 33 | 1.1625 | 0.0056 | 4150.7724 | 2877.0962 | 1540.0 | 2475.0 | 0.6222 | 1124.0 | 0.4541 | 605.0 | 854.0 | 1196.0 | 0.7140 | 0.5059 | 512.0 | 686.0 | 1267.0 | 0.5414 | 0.4041 |
| 0.0 | 34.0 | 34 | 1.1650 | 0.0056 | 4159.6856 | 2883.2743 | 1543.0 | 2475.0 | 0.6234 | 1132.0 | 0.4574 | 614.0 | 860.0 | 1196.0 | 0.7191 | 0.5134 | 511.0 | 683.0 | 1267.0 | 0.5391 | 0.4033 |
| 0.0 | 35.0 | 35 | 1.1686 | 0.0056 | 4172.7025 | 2892.2970 | 1546.0 | 2475.0 | 0.6246 | 1144.0 | 0.4622 | 622.0 | 861.0 | 1196.0 | 0.7199 | 0.5201 | 515.0 | 685.0 | 1267.0 | 0.5406 | 0.4065 |
| 0.0 | 36.0 | 36 | 1.1715 | 0.0056 | 4182.9415 | 2899.3941 | 1541.0 | 2475.0 | 0.6226 | 1139.0 | 0.4602 | 619.0 | 860.0 | 1196.0 | 0.7191 | 0.5176 | 513.0 | 681.0 | 1267.0 | 0.5375 | 0.4049 |
| 0.0 | 37.0 | 37 | 1.1733 | 0.0056 | 4189.4789 | 2903.9255 | 1541.0 | 2475.0 | 0.6226 | 1140.0 | 0.4606 | 621.0 | 860.0 | 1196.0 | 0.7191 | 0.5192 | 512.0 | 681.0 | 1267.0 | 0.5375 | 0.4041 |
| 0.0 | 38.0 | 38 | 1.1751 | 0.0056 | 4195.7990 | 2908.3062 | 1546.0 | 2475.0 | 0.6246 | 1141.0 | 0.4610 | 624.0 | 864.0 | 1196.0 | 0.7224 | 0.5217 | 510.0 | 682.0 | 1267.0 | 0.5383 | 0.4025 |
| 0.0 | 39.0 | 39 | 1.1772 | 0.0056 | 4203.2968 | 2913.5034 | 1542.0 | 2475.0 | 0.6230 | 1151.0 | 0.4651 | 629.0 | 863.0 | 1196.0 | 0.7216 | 0.5259 | 515.0 | 679.0 | 1267.0 | 0.5359 | 0.4065 |
| 0.0 | 40.0 | 40 | 1.1783 | 0.0056 | 4207.4070 | 2916.3523 | 1545.0 | 2475.0 | 0.6242 | 1153.0 | 0.4659 | 632.0 | 867.0 | 1196.0 | 0.7249 | 0.5284 | 514.0 | 678.0 | 1267.0 | 0.5351 | 0.4057 |
| 0.0 | 41.0 | 41 | 1.1797 | 0.0056 | 4212.1617 | 2919.6480 | 1546.0 | 2475.0 | 0.6246 | 1156.0 | 0.4671 | 637.0 | 869.0 | 1196.0 | 0.7266 | 0.5326 | 512.0 | 677.0 | 1267.0 | 0.5343 | 0.4041 |
| 0.0 | 42.0 | 42 | 1.1807 | 0.0056 | 4215.7863 | 2922.1604 | 1549.0 | 2475.0 | 0.6259 | 1152.0 | 0.4655 | 635.0 | 871.0 | 1196.0 | 0.7283 | 0.5309 | 510.0 | 678.0 | 1267.0 | 0.5351 | 0.4025 |
| 0.0 | 43.0 | 43 | 1.1811 | 0.0056 | 4217.3027 | 2923.2115 | 1546.0 | 2475.0 | 0.6246 | 1153.0 | 0.4659 | 638.0 | 870.0 | 1196.0 | 0.7274 | 0.5334 | 508.0 | 676.0 | 1267.0 | 0.5335 | 0.4009 |
| 0.0 | 44.0 | 44 | 1.1826 | 0.0056 | 4222.7328 | 2926.9753 | 1543.0 | 2475.0 | 0.6234 | 1159.0 | 0.4683 | 638.0 | 867.0 | 1196.0 | 0.7249 | 0.5334 | 514.0 | 676.0 | 1267.0 | 0.5335 | 0.4057 |
| 0.0 | 45.0 | 45 | 1.1820 | 0.0056 | 4220.4729 | 2925.4089 | 1549.0 | 2475.0 | 0.6259 | 1163.0 | 0.4699 | 642.0 | 872.0 | 1196.0 | 0.7291 | 0.5368 | 514.0 | 677.0 | 1267.0 | 0.5343 | 0.4057 |
| 0.0 | 46.0 | 46 | 1.1830 | 0.0056 | 4223.9636 | 2927.8285 | 1543.0 | 2475.0 | 0.6234 | 1156.0 | 0.4671 | 640.0 | 871.0 | 1196.0 | 0.7283 | 0.5351 | 509.0 | 672.0 | 1267.0 | 0.5304 | 0.4017 |
| 0.0 | 47.0 | 47 | 1.1842 | 0.0056 | 4228.2689 | 2930.8126 | 1546.0 | 2475.0 | 0.6246 | 1159.0 | 0.4683 | 641.0 | 871.0 | 1196.0 | 0.7283 | 0.5360 | 511.0 | 675.0 | 1267.0 | 0.5328 | 0.4033 |
| 0.0 | 48.0 | 48 | 1.1843 | 0.0056 | 4228.6913 | 2931.1054 | 1541.0 | 2475.0 | 0.6226 | 1157.0 | 0.4675 | 639.0 | 868.0 | 1196.0 | 0.7258 | 0.5343 | 511.0 | 673.0 | 1267.0 | 0.5312 | 0.4033 |
| 0.0 | 49.0 | 49 | 1.1849 | 0.0056 | 4230.8653 | 2932.6124 | 1548.0 | 2475.0 | 0.6255 | 1156.0 | 0.4671 | 638.0 | 874.0 | 1196.0 | 0.7308 | 0.5334 | 511.0 | 674.0 | 1267.0 | 0.5320 | 0.4033 |
| 0.0 | 50.0 | 50 | 1.1857 | 0.0056 | 4233.7065 | 2934.5817 | 1542.0 | 2475.0 | 0.6230 | 1158.0 | 0.4679 | 640.0 | 871.0 | 1196.0 | 0.7283 | 0.5351 | 511.0 | 671.0 | 1267.0 | 0.5296 | 0.4033 |
| 0.0 | 51.0 | 51 | 1.1861 | 0.0056 | 4235.1580 | 2935.5879 | 1545.0 | 2475.0 | 0.6242 | 1161.0 | 0.4691 | 639.0 | 869.0 | 1196.0 | 0.7266 | 0.5343 | 515.0 | 676.0 | 1267.0 | 0.5335 | 0.4065 |
| 0.0 | 52.0 | 52 | 1.1867 | 0.0056 | 4237.4739 | 2937.1931 | 1538.0 | 2475.0 | 0.6214 | 1159.0 | 0.4683 | 639.0 | 867.0 | 1196.0 | 0.7249 | 0.5343 | 513.0 | 671.0 | 1267.0 | 0.5296 | 0.4049 |

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
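
To check that a local environment matches these versions:

```python
# Print installed versions to compare against the list above.
import datasets
import tokenizers
import torch
import transformers

for name, module in [
    ("Transformers", transformers),
    ("PyTorch", torch),
    ("Datasets", datasets),
    ("Tokenizers", tokenizers),
]:
    print(f"{name}: {module.__version__}")
```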