# GSM8K-Binary_Llama-3.2-1B-ivxh9bjy
This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.7423
- Model Preparation Time: 0.0054
- Mdl: 2650.3578
- Accumulated Loss: 1837.0880
- Correct Preds: 1688.0
- Total Preds: 2475.0
- Accuracy: 0.6820
- Correct Gen Preds: 7.0
- Gen Accuracy: 0.0028
- Correct Gen Preds 34192: 0.0
- Correct Preds 34192: 783.0
- Total Labels 34192: 1196.0
- Accuracy 34192: 0.6547
- Gen Accuracy 34192: 0.0
- Correct Gen Preds 41568: 0.0
- Correct Preds 41568: 905.0
- Total Labels 41568: 1267.0
- Accuracy 41568: 0.7143
- Gen Accuracy 41568: 0.0
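The headline numbers above are internally consistent: overall accuracy is correct predictions over total predictions, and the two per-label blocks sum to the overall correct count. (The `34192` and `41568` suffixes look like the vocabulary IDs of the two binary answer tokens, though the card does not say so.) A quick sanity check:

```python
# Sanity-check the reported evaluation metrics against each other.
# The per-label groups ("34192" and "41568") are assumed to be the
# token IDs of the two binary answer labels; the card does not state
# this explicitly.

correct_preds = 1688
total_preds = 2475
correct_34192, total_34192 = 783, 1196
correct_41568, total_41568 = 905, 1267

# Overall accuracy: 1688 / 2475 ~ 0.6820
print(round(correct_preds / total_preds, 4))  # 0.682

# Per-label accuracies match the reported 0.6547 and 0.7143
print(round(correct_34192 / total_34192, 4))  # 0.6547
print(round(correct_41568 / total_41568, 4))  # 0.7143

# Per-label correct counts sum to the overall correct count
print(correct_34192 + correct_41568 == correct_preds)  # True
```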
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
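For reference, `lr_scheduler_type: cosine` with `warmup_ratio: 0.01` means the learning rate ramps linearly to the peak of 2e-05 over the first 1% of training steps, then decays to zero along a half-cosine. A minimal plain-Python sketch of that schedule (an illustration, not the Trainer's own implementation):

```python
import math

def lr_at_step(step, total_steps, peak_lr=2e-05, warmup_ratio=0.01):
    """Cosine learning-rate schedule with linear warmup.

    Mirrors the hyperparameters above: warmup over the first
    `warmup_ratio` fraction of steps, then cosine decay to zero.
    """
    warmup_steps = int(warmup_ratio * total_steps)
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from 0 to peak_lr
        return peak_lr * step / warmup_steps
    # Half-cosine decay from peak_lr down to 0
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Example with 400 total steps: 4 warmup steps, peak at step 4, ~0 at the end
print(lr_at_step(0, 400))    # 0.0
print(lr_at_step(4, 400))    # 2e-05 (peak)
```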
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0054 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 1.8926 | 1.0 | 3 | 0.7529 | 0.0054 | 2688.4400 | 1863.4846 | 1326.0 | 2475.0 | 0.5358 | 28.0 | 0.0113 | 10.0 | 888.0 | 1196.0 | 0.7425 | 0.0084 | 10.0 | 438.0 | 1267.0 | 0.3457 | 0.0079 |
| 0.8692 | 2.0 | 6 | 1.1837 | 0.0054 | 4226.7103 | 2929.7323 | 1267.0 | 2475.0 | 0.5119 | 156.0 | 0.0630 | 0.0 | 0.0 | 1196.0 | 0.0 | 0.0 | 148.0 | 1267.0 | 1267.0 | 1.0 | 0.1168 |
| 1.9719 | 3.0 | 9 | 0.7201 | 0.0054 | 2571.1173 | 1782.1627 | 1463.0 | 2475.0 | 0.5911 | 10.0 | 0.0040 | 0.0 | 1128.0 | 1196.0 | 0.9431 | 0.0 | 2.0 | 335.0 | 1267.0 | 0.2644 | 0.0016 |
| 0.5314 | 4.0 | 12 | 0.7689 | 0.0054 | 2745.5918 | 1903.0992 | 1349.0 | 2475.0 | 0.5451 | 20.0 | 0.0081 | 0.0 | 104.0 | 1196.0 | 0.0870 | 0.0 | 12.0 | 1245.0 | 1267.0 | 0.9826 | 0.0095 |
| 0.5337 | 5.0 | 15 | 0.8571 | 0.0054 | 3060.3610 | 2121.2806 | 1424.0 | 2475.0 | 0.5754 | 7.0 | 0.0028 | 0.0 | 1151.0 | 1196.0 | 0.9624 | 0.0 | 0.0 | 273.0 | 1267.0 | 0.2155 | 0.0 |
| 0.9397 | 6.0 | 18 | 0.8323 | 0.0054 | 2971.7447 | 2059.8565 | 1552.0 | 2475.0 | 0.6271 | 8.0 | 0.0032 | 0.0 | 401.0 | 1196.0 | 0.3353 | 0.0 | 0.0 | 1151.0 | 1267.0 | 0.9084 | 0.0 |
| 0.1934 | 7.0 | 21 | 0.7423 | 0.0054 | 2650.3578 | 1837.0880 | 1688.0 | 2475.0 | 0.6820 | 7.0 | 0.0028 | 0.0 | 783.0 | 1196.0 | 0.6547 | 0.0 | 0.0 | 905.0 | 1267.0 | 0.7143 | 0.0 |
| 0.9493 | 8.0 | 24 | 1.3176 | 0.0054 | 4704.6623 | 3261.0234 | 1640.0 | 2475.0 | 0.6626 | 278.0 | 0.1123 | 169.0 | 939.0 | 1196.0 | 0.7851 | 0.1413 | 102.0 | 701.0 | 1267.0 | 0.5533 | 0.0805 |
| 0.3022 | 9.0 | 27 | 1.3117 | 0.0054 | 4683.8108 | 3246.5703 | 1515.0 | 2475.0 | 0.6121 | 772.0 | 0.3119 | 126.0 | 429.0 | 1196.0 | 0.3587 | 0.1054 | 638.0 | 1086.0 | 1267.0 | 0.8571 | 0.5036 |
| 0.4787 | 10.0 | 30 | 1.0910 | 0.0054 | 3895.7073 | 2700.2985 | 1628.0 | 2475.0 | 0.6578 | 823.0 | 0.3325 | 274.0 | 718.0 | 1196.0 | 0.6003 | 0.2291 | 541.0 | 910.0 | 1267.0 | 0.7182 | 0.4270 |
| 0.0038 | 11.0 | 33 | 2.0771 | 0.0054 | 7416.5828 | 5140.7834 | 1584.0 | 2475.0 | 0.6400 | 1201.0 | 0.4853 | 468.0 | 681.0 | 1196.0 | 0.5694 | 0.3913 | 726.0 | 903.0 | 1267.0 | 0.7127 | 0.5730 |
| 0.0043 | 12.0 | 36 | 2.2125 | 0.0054 | 7900.0977 | 5475.9304 | 1585.0 | 2475.0 | 0.6404 | 1442.0 | 0.5826 | 603.0 | 662.0 | 1196.0 | 0.5535 | 0.5042 | 832.0 | 923.0 | 1267.0 | 0.7285 | 0.6567 |
| 0.0001 | 13.0 | 39 | 2.2169 | 0.0054 | 7915.9312 | 5486.9054 | 1621.0 | 2475.0 | 0.6549 | 1458.0 | 0.5891 | 751.0 | 818.0 | 1196.0 | 0.6839 | 0.6279 | 700.0 | 803.0 | 1267.0 | 0.6338 | 0.5525 |
| 0.4529 | 14.0 | 42 | 2.5412 | 0.0054 | 9073.8672 | 6289.5255 | 1607.0 | 2475.0 | 0.6493 | 1464.0 | 0.5915 | 614.0 | 677.0 | 1196.0 | 0.5661 | 0.5134 | 843.0 | 930.0 | 1267.0 | 0.7340 | 0.6654 |
| 0.4524 | 15.0 | 45 | 2.9305 | 0.0054 | 10463.8496 | 7252.9878 | 1610.0 | 2475.0 | 0.6505 | 1474.0 | 0.5956 | 683.0 | 741.0 | 1196.0 | 0.6196 | 0.5711 | 784.0 | 869.0 | 1267.0 | 0.6859 | 0.6188 |
| 0.9048 | 16.0 | 48 | 3.2752 | 0.0054 | 11694.7828 | 8106.2057 | 1621.0 | 2475.0 | 0.6549 | 1480.0 | 0.5980 | 788.0 | 854.0 | 1196.0 | 0.7140 | 0.6589 | 685.0 | 767.0 | 1267.0 | 0.6054 | 0.5406 |
| 0.0 | 17.0 | 51 | 3.5192 | 0.0054 | 12565.8354 | 8709.9734 | 1616.0 | 2475.0 | 0.6529 | 1475.0 | 0.5960 | 827.0 | 897.0 | 1196.0 | 0.7500 | 0.6915 | 641.0 | 719.0 | 1267.0 | 0.5675 | 0.5059 |
| 0.4524 | 18.0 | 54 | 3.6613 | 0.0054 | 13073.4132 | 9061.7995 | 1603.0 | 2475.0 | 0.6477 | 1468.0 | 0.5931 | 847.0 | 915.0 | 1196.0 | 0.7651 | 0.7082 | 614.0 | 688.0 | 1267.0 | 0.5430 | 0.4846 |
| 0.4524 | 19.0 | 57 | 3.7335 | 0.0054 | 13331.1999 | 9240.4837 | 1605.0 | 2475.0 | 0.6485 | 1470.0 | 0.5939 | 857.0 | 927.0 | 1196.0 | 0.7751 | 0.7166 | 606.0 | 678.0 | 1267.0 | 0.5351 | 0.4783 |
| 0.0 | 20.0 | 60 | 3.7780 | 0.0054 | 13489.9545 | 9350.5239 | 1598.0 | 2475.0 | 0.6457 | 1479.0 | 0.5976 | 868.0 | 928.0 | 1196.0 | 0.7759 | 0.7258 | 604.0 | 670.0 | 1267.0 | 0.5288 | 0.4767 |
| 0.0 | 21.0 | 63 | 3.7912 | 0.0054 | 13537.2531 | 9383.3089 | 1604.0 | 2475.0 | 0.6481 | 1484.0 | 0.5996 | 871.0 | 933.0 | 1196.0 | 0.7801 | 0.7283 | 606.0 | 671.0 | 1267.0 | 0.5296 | 0.4783 |
| 0.4524 | 22.0 | 66 | 3.7968 | 0.0054 | 13557.2370 | 9397.1606 | 1605.0 | 2475.0 | 0.6485 | 1480.0 | 0.5980 | 871.0 | 931.0 | 1196.0 | 0.7784 | 0.7283 | 602.0 | 674.0 | 1267.0 | 0.5320 | 0.4751 |
| 0.9048 | 23.0 | 69 | 3.8063 | 0.0054 | 13591.2194 | 9420.7154 | 1607.0 | 2475.0 | 0.6493 | 1484.0 | 0.5996 | 871.0 | 933.0 | 1196.0 | 0.7801 | 0.7283 | 606.0 | 674.0 | 1267.0 | 0.5320 | 0.4783 |
| 0.4524 | 24.0 | 72 | 3.8033 | 0.0054 | 13580.3724 | 9413.1968 | 1607.0 | 2475.0 | 0.6493 | 1489.0 | 0.6016 | 871.0 | 930.0 | 1196.0 | 0.7776 | 0.7283 | 611.0 | 677.0 | 1267.0 | 0.5343 | 0.4822 |
| 0.4524 | 25.0 | 75 | 3.8088 | 0.0054 | 13599.9088 | 9426.7384 | 1606.0 | 2475.0 | 0.6489 | 1492.0 | 0.6028 | 872.0 | 928.0 | 1196.0 | 0.7759 | 0.7291 | 613.0 | 678.0 | 1267.0 | 0.5351 | 0.4838 |
| 0.9048 | 26.0 | 78 | 3.8077 | 0.0054 | 13595.8914 | 9423.9538 | 1608.0 | 2475.0 | 0.6497 | 1494.0 | 0.6036 | 871.0 | 928.0 | 1196.0 | 0.7759 | 0.7283 | 616.0 | 680.0 | 1267.0 | 0.5367 | 0.4862 |
| 0.4524 | 27.0 | 81 | 3.8064 | 0.0054 | 13591.4585 | 9420.8811 | 1609.0 | 2475.0 | 0.6501 | 1493.0 | 0.6032 | 869.0 | 928.0 | 1196.0 | 0.7759 | 0.7266 | 617.0 | 681.0 | 1267.0 | 0.5375 | 0.4870 |
| 0.0 | 28.0 | 84 | 3.8025 | 0.0054 | 13577.4399 | 9411.1642 | 1608.0 | 2475.0 | 0.6497 | 1495.0 | 0.6040 | 870.0 | 928.0 | 1196.0 | 0.7759 | 0.7274 | 618.0 | 680.0 | 1267.0 | 0.5367 | 0.4878 |
| 0.0 | 29.0 | 87 | 3.8055 | 0.0054 | 13588.0611 | 9418.5263 | 1615.0 | 2475.0 | 0.6525 | 1496.0 | 0.6044 | 869.0 | 931.0 | 1196.0 | 0.7784 | 0.7266 | 620.0 | 684.0 | 1267.0 | 0.5399 | 0.4893 |
| 0.0 | 30.0 | 90 | 3.8028 | 0.0054 | 13578.4494 | 9411.8640 | 1607.0 | 2475.0 | 0.6493 | 1499.0 | 0.6057 | 868.0 | 925.0 | 1196.0 | 0.7734 | 0.7258 | 624.0 | 682.0 | 1267.0 | 0.5383 | 0.4925 |
| 0.4524 | 31.0 | 93 | 3.8007 | 0.0054 | 13571.0659 | 9406.7461 | 1610.0 | 2475.0 | 0.6505 | 1497.0 | 0.6048 | 866.0 | 924.0 | 1196.0 | 0.7726 | 0.7241 | 624.0 | 686.0 | 1267.0 | 0.5414 | 0.4925 |
| 0.4524 | 32.0 | 96 | 3.8028 | 0.0054 | 13578.6452 | 9411.9996 | 1610.0 | 2475.0 | 0.6505 | 1498.0 | 0.6053 | 862.0 | 923.0 | 1196.0 | 0.7717 | 0.7207 | 629.0 | 687.0 | 1267.0 | 0.5422 | 0.4964 |
| 0.4524 | 33.0 | 99 | 3.8003 | 0.0054 | 13569.6111 | 9405.7377 | 1610.0 | 2475.0 | 0.6505 | 1503.0 | 0.6073 | 869.0 | 925.0 | 1196.0 | 0.7734 | 0.7266 | 627.0 | 685.0 | 1267.0 | 0.5406 | 0.4949 |
| 0.4524 | 34.0 | 102 | 3.7994 | 0.0054 | 13566.5818 | 9403.6379 | 1611.0 | 2475.0 | 0.6509 | 1501.0 | 0.6065 | 866.0 | 924.0 | 1196.0 | 0.7726 | 0.7241 | 628.0 | 687.0 | 1267.0 | 0.5422 | 0.4957 |
| 0.0 | 35.0 | 105 | 3.7990 | 0.0054 | 13565.1272 | 9402.6297 | 1610.0 | 2475.0 | 0.6505 | 1501.0 | 0.6065 | 864.0 | 920.0 | 1196.0 | 0.7692 | 0.7224 | 630.0 | 690.0 | 1267.0 | 0.5446 | 0.4972 |
| 0.4524 | 36.0 | 108 | 3.8014 | 0.0054 | 13573.4117 | 9408.3720 | 1610.0 | 2475.0 | 0.6505 | 1499.0 | 0.6057 | 866.0 | 921.0 | 1196.0 | 0.7701 | 0.7241 | 626.0 | 689.0 | 1267.0 | 0.5438 | 0.4941 |
| 0.4524 | 37.0 | 111 | 3.7985 | 0.0054 | 13563.0986 | 9401.2236 | 1608.0 | 2475.0 | 0.6497 | 1503.0 | 0.6073 | 864.0 | 920.0 | 1196.0 | 0.7692 | 0.7224 | 632.0 | 688.0 | 1267.0 | 0.5430 | 0.4988 |
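The Mdl and Accumulated Loss columns track each other by a fixed factor: every row satisfies Mdl ≈ Accumulated Loss / ln 2, and Validation Loss ≈ Accumulated Loss / Total Preds. This suggests the accumulated loss is the summed per-example negative log-likelihood in nats, with Mdl the same quantity expressed in bits; that reading is an inference from the numbers, not something the card states. A check against the epoch-7 row (the checkpoint reported at the top of the card):

```python
import math

# Epoch-7 row of the training results table
accumulated_loss = 1837.0880   # summed NLL, apparently in nats
total_preds = 2475
validation_loss = 0.7423
mdl = 2650.3578

# Mean loss per prediction reproduces the reported validation loss
assert abs(accumulated_loss / total_preds - validation_loss) < 1e-4

# Converting nats to bits (dividing by ln 2) reproduces the reported MDL
assert abs(accumulated_loss / math.log(2) - mdl) < 1e-2
```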
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
## Model tree for donoway/GSM8K-Binary_Llama-3.2-1B-ivxh9bjy

Base model: meta-llama/Llama-3.2-1B