# GSM8K-Binary_Llama-3.2-1B-kyohy420
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.7333
- Model Preparation Time: 0.0057
- Mdl: 6189.0935
- Accumulated Loss: 4289.9527
- Correct Preds: 1966.0
- Total Preds: 2475.0
- Accuracy: 0.7943
- Correct Gen Preds: 1919.0
- Gen Accuracy: 0.7754
- Correct Gen Preds 34192: 1033.0
- Correct Preds 34192: 1051.0
- Total Labels 34192: 1196.0
- Accuracy 34192: 0.8788
- Gen Accuracy 34192: 0.8637
- Correct Gen Preds 41568: 879.0
- Correct Preds 41568: 915.0
- Total Labels 41568: 1267.0
- Accuracy 41568: 0.7222
- Gen Accuracy 41568: 0.6938
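The metrics suffixed with 34192 and 41568 are per-class breakdowns: their Total Labels (1196 + 1267) sum to the 2475 Total Preds, so the suffixes are presumably the token IDs of the two class labels. The "Gen" variants presumably score freely generated answers rather than label likelihoods, though this is not documented. The summary numbers are internally consistent; the sketch below checks the apparent relations between them (the nats-to-bits interpretation of Mdl is an assumption, not stated in the card):

```python
import math

# Reported evaluation metrics (these match the epoch-13 / step-533 row below).
loss = 1.7333            # mean cross-entropy per example, in nats (assumed)
total_preds = 2475.0
accumulated_loss = 4289.9527
mdl = 6189.0935          # description length in bits (assumed)

# Accumulated Loss looks like the per-example loss summed over the eval set.
assert math.isclose(loss * total_preds, accumulated_loss, rel_tol=1e-4)

# Mdl appears to be the accumulated loss converted from nats to bits.
assert math.isclose(accumulated_loss / math.log(2), mdl, rel_tol=1e-4)

# The accuracy fields are plain ratios of correct to total predictions.
assert math.isclose(1966.0 / total_preds, 0.7943, abs_tol=5e-5)  # Accuracy
assert math.isclose(1919.0 / total_preds, 0.7754, abs_tol=5e-5)  # Gen Accuracy
```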
## Model description
More information needed
## Intended uses & limitations
More information needed
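No usage instructions are provided. As a minimal sketch, the checkpoint should load like any Llama-family causal LM on the Hub; the prompt below is a placeholder, since the expected input format is not documented:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id taken from the model card title.
model_id = "donoway/GSM8K-Binary_Llama-3.2-1B-kyohy420"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative input only; the real prompt format is undocumented.
inputs = tokenizer("Question: ...", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```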
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
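The training script itself is not included. The list above maps onto a transformers `TrainingArguments` configuration roughly as follows; this is a reconstruction, not the authors' code, and fields not listed (such as `output_dir`) are placeholders:

```python
from transformers import TrainingArguments

# Reconstructed from the hyperparameter list above; field names follow the
# transformers 4.51 API used for this run.
training_args = TrainingArguments(
    output_dir="gsm8k-binary-llama-3.2-1b",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```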
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0057 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.5871 | 1.0 | 41 | 0.5846 | 0.0057 | 2087.4184 | 1446.8881 | 1815.0 | 2475.0 | 0.7333 | 8.0 | 0.0032 | 0.0 | 796.0 | 1196.0 | 0.6656 | 0.0 | 0.0 | 1019.0 | 1267.0 | 0.8043 | 0.0 |
| 0.3259 | 2.0 | 82 | 0.5427 | 0.0057 | 1937.7495 | 1343.1456 | 1911.0 | 2475.0 | 0.7721 | 8.0 | 0.0032 | 1.0 | 957.0 | 1196.0 | 0.8002 | 0.0008 | 0.0 | 954.0 | 1267.0 | 0.7530 | 0.0 |
| 0.1383 | 3.0 | 123 | 0.6677 | 0.0057 | 2384.2568 | 1652.6409 | 1892.0 | 2475.0 | 0.7644 | 87.0 | 0.0352 | 19.0 | 845.0 | 1196.0 | 0.7065 | 0.0159 | 60.0 | 1047.0 | 1267.0 | 0.8264 | 0.0474 |
| 0.5615 | 4.0 | 164 | 0.9346 | 0.0057 | 3337.0389 | 2313.0591 | 1909.0 | 2475.0 | 0.7713 | 1042.0 | 0.4210 | 614.0 | 1035.0 | 1196.0 | 0.8654 | 0.5134 | 420.0 | 874.0 | 1267.0 | 0.6898 | 0.3315 |
| 0.0373 | 5.0 | 205 | 0.9718 | 0.0057 | 3470.0702 | 2405.2694 | 1938.0 | 2475.0 | 0.7830 | 1048.0 | 0.4234 | 528.0 | 975.0 | 1196.0 | 0.8152 | 0.4415 | 514.0 | 963.0 | 1267.0 | 0.7601 | 0.4057 |
| 0.0417 | 6.0 | 246 | 1.2990 | 0.0057 | 4638.2735 | 3215.0062 | 1895.0 | 2475.0 | 0.7657 | 1372.0 | 0.5543 | 553.0 | 873.0 | 1196.0 | 0.7299 | 0.4624 | 812.0 | 1022.0 | 1267.0 | 0.8066 | 0.6409 |
| 0.006 | 7.0 | 287 | 1.6147 | 0.0057 | 5765.6029 | 3996.4114 | 1942.0 | 2475.0 | 0.7846 | 1793.0 | 0.7244 | 962.0 | 1013.0 | 1196.0 | 0.8470 | 0.8043 | 823.0 | 929.0 | 1267.0 | 0.7332 | 0.6496 |
| 0.0 | 8.0 | 328 | 1.6999 | 0.0057 | 6069.9536 | 4207.3712 | 1965.0 | 2475.0 | 0.7939 | 1893.0 | 0.7648 | 1030.0 | 1052.0 | 1196.0 | 0.8796 | 0.8612 | 856.0 | 913.0 | 1267.0 | 0.7206 | 0.6756 |
| 0.0 | 9.0 | 369 | 1.7440 | 0.0057 | 6227.1358 | 4316.3216 | 1964.0 | 2475.0 | 0.7935 | 1913.0 | 0.7729 | 1037.0 | 1053.0 | 1196.0 | 0.8804 | 0.8671 | 869.0 | 911.0 | 1267.0 | 0.7190 | 0.6859 |
| 0.0 | 10.0 | 410 | 1.7394 | 0.0057 | 6210.7145 | 4304.9393 | 1963.0 | 2475.0 | 0.7931 | 1915.0 | 0.7737 | 1035.0 | 1052.0 | 1196.0 | 0.8796 | 0.8654 | 873.0 | 911.0 | 1267.0 | 0.7190 | 0.6890 |
| 0.0 | 11.0 | 451 | 1.7371 | 0.0057 | 6202.5019 | 4299.2467 | 1963.0 | 2475.0 | 0.7931 | 1914.0 | 0.7733 | 1033.0 | 1050.0 | 1196.0 | 0.8779 | 0.8637 | 874.0 | 913.0 | 1267.0 | 0.7206 | 0.6898 |
| 0.0 | 12.0 | 492 | 1.7354 | 0.0057 | 6196.6385 | 4295.1825 | 1964.0 | 2475.0 | 0.7935 | 1915.0 | 0.7737 | 1034.0 | 1051.0 | 1196.0 | 0.8788 | 0.8645 | 874.0 | 913.0 | 1267.0 | 0.7206 | 0.6898 |
| 0.0 | 13.0 | 533 | 1.7333 | 0.0057 | 6189.0935 | 4289.9527 | 1966.0 | 2475.0 | 0.7943 | 1919.0 | 0.7754 | 1033.0 | 1051.0 | 1196.0 | 0.8788 | 0.8637 | 879.0 | 915.0 | 1267.0 | 0.7222 | 0.6938 |
| 0.0 | 14.0 | 574 | 1.7293 | 0.0057 | 6174.5892 | 4279.8991 | 1962.0 | 2475.0 | 0.7927 | 1917.0 | 0.7745 | 1031.0 | 1047.0 | 1196.0 | 0.8754 | 0.8620 | 879.0 | 915.0 | 1267.0 | 0.7222 | 0.6938 |
| 0.0 | 15.0 | 615 | 1.7313 | 0.0057 | 6182.0153 | 4285.0465 | 1962.0 | 2475.0 | 0.7927 | 1919.0 | 0.7754 | 1030.0 | 1047.0 | 1196.0 | 0.8754 | 0.8612 | 882.0 | 915.0 | 1267.0 | 0.7222 | 0.6961 |
| 0.0 | 16.0 | 656 | 1.7293 | 0.0057 | 6174.8938 | 4280.1102 | 1964.0 | 2475.0 | 0.7935 | 1919.0 | 0.7754 | 1031.0 | 1048.0 | 1196.0 | 0.8763 | 0.8620 | 881.0 | 916.0 | 1267.0 | 0.7230 | 0.6953 |
| 0.0 | 17.0 | 697 | 1.7291 | 0.0057 | 6174.0165 | 4279.5021 | 1964.0 | 2475.0 | 0.7935 | 1920.0 | 0.7758 | 1032.0 | 1049.0 | 1196.0 | 0.8771 | 0.8629 | 881.0 | 915.0 | 1267.0 | 0.7222 | 0.6953 |
| 0.0 | 18.0 | 738 | 1.7284 | 0.0057 | 6171.6047 | 4277.8304 | 1964.0 | 2475.0 | 0.7935 | 1920.0 | 0.7758 | 1031.0 | 1048.0 | 1196.0 | 0.8763 | 0.8620 | 882.0 | 916.0 | 1267.0 | 0.7230 | 0.6961 |
| 0.0 | 19.0 | 779 | 1.7288 | 0.0057 | 6172.8248 | 4278.6761 | 1965.0 | 2475.0 | 0.7939 | 1922.0 | 0.7766 | 1030.0 | 1047.0 | 1196.0 | 0.8754 | 0.8612 | 885.0 | 918.0 | 1267.0 | 0.7245 | 0.6985 |
| 0.0 | 20.0 | 820 | 1.7275 | 0.0057 | 6168.3684 | 4275.5871 | 1963.0 | 2475.0 | 0.7931 | 1919.0 | 0.7754 | 1030.0 | 1047.0 | 1196.0 | 0.8754 | 0.8612 | 882.0 | 916.0 | 1267.0 | 0.7230 | 0.6961 |
| 0.0 | 21.0 | 861 | 1.7275 | 0.0057 | 6168.3839 | 4275.5979 | 1964.0 | 2475.0 | 0.7935 | 1923.0 | 0.7770 | 1031.0 | 1047.0 | 1196.0 | 0.8754 | 0.8620 | 885.0 | 917.0 | 1267.0 | 0.7238 | 0.6985 |
| 0.0 | 22.0 | 902 | 1.7271 | 0.0057 | 6166.9583 | 4274.6097 | 1964.0 | 2475.0 | 0.7935 | 1924.0 | 0.7774 | 1033.0 | 1047.0 | 1196.0 | 0.8754 | 0.8637 | 884.0 | 917.0 | 1267.0 | 0.7238 | 0.6977 |
| 0.4356 | 23.0 | 943 | 1.7266 | 0.0057 | 6165.0094 | 4273.2589 | 1966.0 | 2475.0 | 0.7943 | 1923.0 | 0.7770 | 1030.0 | 1047.0 | 1196.0 | 0.8754 | 0.8612 | 886.0 | 919.0 | 1267.0 | 0.7253 | 0.6993 |
| 0.8713 | 24.0 | 984 | 1.7272 | 0.0057 | 6167.4313 | 4274.9376 | 1965.0 | 2475.0 | 0.7939 | 1926.0 | 0.7782 | 1032.0 | 1047.0 | 1196.0 | 0.8754 | 0.8629 | 887.0 | 918.0 | 1267.0 | 0.7245 | 0.7001 |
| 0.4356 | 25.0 | 1025 | 1.7285 | 0.0057 | 6171.7803 | 4277.9521 | 1963.0 | 2475.0 | 0.7931 | 1921.0 | 0.7762 | 1030.0 | 1047.0 | 1196.0 | 0.8754 | 0.8612 | 884.0 | 916.0 | 1267.0 | 0.7230 | 0.6977 |
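The training loss collapses to 0.0 by epoch 8 while the validation loss climbs from its epoch-1 minimum of 0.5846 to around 1.73, a typical overfitting pattern, though accuracy plateaus near 0.79 rather than degrading. The headline metrics at the top of this card match the epoch-13 row (step 533). Although num_epochs was set to 100, the log ends at epoch 25, suggesting the run was stopped early.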
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1