# GSM8K-Binary_Llama-3.2-1B-n45gfm9o
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.4701
- Model Preparation Time: 0.0059
- Mdl: 5249.0917
- Accumulated Loss: 3638.3931
- Correct Preds: 1954.0
- Total Preds: 2475.0
- Accuracy: 0.7895
- Correct Gen Preds: 1866.0
- Gen Accuracy: 0.7539
- Correct Gen Preds 34192: 949.0
- Correct Preds 34192: 1014.0
- Total Labels 34192: 1196.0
- Accuracy 34192: 0.8478
- Gen Accuracy 34192: 0.7935
- Correct Gen Preds 41568: 908.0
- Correct Preds 41568: 940.0
- Total Labels 41568: 1267.0
- Accuracy 41568: 0.7419
- Gen Accuracy 41568: 0.7167
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
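The configuration above can be summarized as a plain dictionary, together with one derived sanity check on the training-set size. The results table shows 36 optimizer steps per epoch; assuming a single device and no gradient accumulation (neither is stated in the card), this implies a training set of roughly 1,152 examples. Note also that although `num_epochs` is 100, logged results stop at epoch 24, which suggests early stopping.

```python
# Hyperparameters as reported in the card.
hparams = {
    "learning_rate": 2e-5,
    "train_batch_size": 32,
    "eval_batch_size": 64,
    "seed": 42,
    "optimizer": "adamw_torch",  # betas=(0.9, 0.999), epsilon=1e-08
    "lr_scheduler_type": "constant",
    "lr_scheduler_warmup_ratio": 0.001,
    "num_epochs": 100,
}

# 36 steps per epoch * batch size 32 => ~1152 training examples,
# assuming one device and no gradient accumulation.
steps_per_epoch = 36
approx_train_examples = steps_per_epoch * hparams["train_batch_size"]
assert approx_train_examples == 1152
```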
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0059 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.774 | 1.0 | 36 | 0.7619 | 0.0059 | 2720.3405 | 1885.5964 | 1654.0 | 2475.0 | 0.6683 | 8.0 | 0.0032 | 0.0 | 565.0 | 1196.0 | 0.4724 | 0.0 | 0.0 | 1089.0 | 1267.0 | 0.8595 | 0.0 |
| 0.558 | 2.0 | 72 | 0.7026 | 0.0059 | 2508.8193 | 1738.9811 | 1704.0 | 2475.0 | 0.6885 | 8.0 | 0.0032 | 0.0 | 527.0 | 1196.0 | 0.4406 | 0.0 | 0.0 | 1177.0 | 1267.0 | 0.9290 | 0.0 |
| 0.3275 | 3.0 | 108 | 0.7695 | 0.0059 | 2747.8047 | 1904.6331 | 1687.0 | 2475.0 | 0.6816 | 16.0 | 0.0065 | 0.0 | 533.0 | 1196.0 | 0.4457 | 0.0 | 8.0 | 1154.0 | 1267.0 | 0.9108 | 0.0063 |
| 0.2039 | 4.0 | 144 | 0.6071 | 0.0059 | 2167.6033 | 1502.4681 | 1932.0 | 2475.0 | 0.7806 | 71.0 | 0.0287 | 2.0 | 916.0 | 1196.0 | 0.7659 | 0.0017 | 61.0 | 1016.0 | 1267.0 | 0.8019 | 0.0481 |
| 0.2072 | 5.0 | 180 | 1.0169 | 0.0059 | 3630.9726 | 2516.7984 | 1924.0 | 2475.0 | 0.7774 | 1267.0 | 0.5119 | 597.0 | 1044.0 | 1196.0 | 0.8729 | 0.4992 | 661.0 | 880.0 | 1267.0 | 0.6946 | 0.5217 |
| 0.066 | 6.0 | 216 | 1.0546 | 0.0059 | 3765.5184 | 2610.0584 | 1893.0 | 2475.0 | 0.7648 | 1017.0 | 0.4109 | 500.0 | 1079.0 | 1196.0 | 0.9022 | 0.4181 | 509.0 | 814.0 | 1267.0 | 0.6425 | 0.4017 |
| 0.0004 | 7.0 | 252 | 1.2220 | 0.0059 | 4363.2817 | 3024.3964 | 1906.0 | 2475.0 | 0.7701 | 1454.0 | 0.5875 | 544.0 | 850.0 | 1196.0 | 0.7107 | 0.4548 | 902.0 | 1056.0 | 1267.0 | 0.8335 | 0.7119 |
| 0.0021 | 8.0 | 288 | 1.7093 | 0.0059 | 6103.2758 | 4230.4684 | 1892.0 | 2475.0 | 0.7644 | 1727.0 | 0.6978 | 1018.0 | 1103.0 | 1196.0 | 0.9222 | 0.8512 | 701.0 | 789.0 | 1267.0 | 0.6227 | 0.5533 |
| 0.0002 | 9.0 | 324 | 1.5981 | 0.0059 | 5706.2934 | 3955.3012 | 1900.0 | 2475.0 | 0.7677 | 1761.0 | 0.7115 | 1001.0 | 1078.0 | 1196.0 | 0.9013 | 0.8370 | 752.0 | 822.0 | 1267.0 | 0.6488 | 0.5935 |
| 0.0001 | 10.0 | 360 | 1.4701 | 0.0059 | 5249.0917 | 3638.3931 | 1954.0 | 2475.0 | 0.7895 | 1866.0 | 0.7539 | 949.0 | 1014.0 | 1196.0 | 0.8478 | 0.7935 | 908.0 | 940.0 | 1267.0 | 0.7419 | 0.7167 |
| 0.0 | 11.0 | 396 | 1.4879 | 0.0059 | 5312.7263 | 3682.5013 | 1950.0 | 2475.0 | 0.7879 | 1869.0 | 0.7552 | 958.0 | 1020.0 | 1196.0 | 0.8528 | 0.8010 | 903.0 | 930.0 | 1267.0 | 0.7340 | 0.7127 |
| 0.0 | 12.0 | 432 | 1.4948 | 0.0059 | 5337.3050 | 3699.5379 | 1948.0 | 2475.0 | 0.7871 | 1867.0 | 0.7543 | 960.0 | 1022.0 | 1196.0 | 0.8545 | 0.8027 | 898.0 | 926.0 | 1267.0 | 0.7309 | 0.7088 |
| 0.0 | 13.0 | 468 | 1.5004 | 0.0059 | 5357.5988 | 3713.6045 | 1946.0 | 2475.0 | 0.7863 | 1866.0 | 0.7539 | 961.0 | 1024.0 | 1196.0 | 0.8562 | 0.8035 | 896.0 | 922.0 | 1267.0 | 0.7277 | 0.7072 |
| 0.7841 | 14.0 | 504 | 1.5063 | 0.0059 | 5378.5450 | 3728.1233 | 1948.0 | 2475.0 | 0.7871 | 1871.0 | 0.7560 | 966.0 | 1026.0 | 1196.0 | 0.8579 | 0.8077 | 896.0 | 922.0 | 1267.0 | 0.7277 | 0.7072 |
| 0.0 | 15.0 | 540 | 1.5092 | 0.0059 | 5388.9284 | 3735.3205 | 1945.0 | 2475.0 | 0.7859 | 1871.0 | 0.7560 | 970.0 | 1026.0 | 1196.0 | 0.8579 | 0.8110 | 893.0 | 919.0 | 1267.0 | 0.7253 | 0.7048 |
| 0.7841 | 16.0 | 576 | 1.5127 | 0.0059 | 5401.2640 | 3743.8709 | 1944.0 | 2475.0 | 0.7855 | 1869.0 | 0.7552 | 968.0 | 1025.0 | 1196.0 | 0.8570 | 0.8094 | 892.0 | 919.0 | 1267.0 | 0.7253 | 0.7040 |
| 0.0 | 17.0 | 612 | 1.5144 | 0.0059 | 5407.4933 | 3748.1887 | 1943.0 | 2475.0 | 0.7851 | 1875.0 | 0.7576 | 973.0 | 1026.0 | 1196.0 | 0.8579 | 0.8135 | 893.0 | 917.0 | 1267.0 | 0.7238 | 0.7048 |
| 0.0 | 18.0 | 648 | 1.5177 | 0.0059 | 5419.3768 | 3756.4257 | 1947.0 | 2475.0 | 0.7867 | 1872.0 | 0.7564 | 972.0 | 1029.0 | 1196.0 | 0.8604 | 0.8127 | 892.0 | 918.0 | 1267.0 | 0.7245 | 0.7040 |
| 0.0 | 19.0 | 684 | 1.5195 | 0.0059 | 5425.7845 | 3760.8672 | 1945.0 | 2475.0 | 0.7859 | 1877.0 | 0.7584 | 976.0 | 1028.0 | 1196.0 | 0.8595 | 0.8161 | 893.0 | 917.0 | 1267.0 | 0.7238 | 0.7048 |
| 0.0 | 20.0 | 720 | 1.5238 | 0.0059 | 5441.1232 | 3771.4992 | 1944.0 | 2475.0 | 0.7855 | 1875.0 | 0.7576 | 972.0 | 1027.0 | 1196.0 | 0.8587 | 0.8127 | 895.0 | 917.0 | 1267.0 | 0.7238 | 0.7064 |
| 0.0 | 21.0 | 756 | 1.5265 | 0.0059 | 5450.7090 | 3778.1436 | 1948.0 | 2475.0 | 0.7871 | 1881.0 | 0.76 | 979.0 | 1031.0 | 1196.0 | 0.8620 | 0.8186 | 893.0 | 917.0 | 1267.0 | 0.7238 | 0.7048 |
| 0.0 | 22.0 | 792 | 1.5266 | 0.0059 | 5450.9628 | 3778.3195 | 1944.0 | 2475.0 | 0.7855 | 1874.0 | 0.7572 | 974.0 | 1029.0 | 1196.0 | 0.8604 | 0.8144 | 891.0 | 915.0 | 1267.0 | 0.7222 | 0.7032 |
| 0.0 | 23.0 | 828 | 1.5306 | 0.0059 | 5465.2671 | 3788.2345 | 1947.0 | 2475.0 | 0.7867 | 1877.0 | 0.7584 | 979.0 | 1032.0 | 1196.0 | 0.8629 | 0.8186 | 890.0 | 915.0 | 1267.0 | 0.7222 | 0.7024 |
| 0.0 | 24.0 | 864 | 1.5319 | 0.0059 | 5469.8094 | 3791.3829 | 1950.0 | 2475.0 | 0.7879 | 1882.0 | 0.7604 | 981.0 | 1033.0 | 1196.0 | 0.8637 | 0.8202 | 893.0 | 917.0 | 1267.0 | 0.7238 | 0.7048 |
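The headline metrics at the top of the card match the epoch-10 row of the table, i.e. the checkpoint with the best validation accuracy (not the lowest validation loss). A short sketch of that selection, using the (epoch, validation loss, accuracy) triples transcribed from the table:

```python
# (epoch, validation_loss, accuracy) from the results table above.
rows = [
    (0, 1.4656, 0.4832), (1, 0.7619, 0.6683), (2, 0.7026, 0.6885),
    (3, 0.7695, 0.6816), (4, 0.6071, 0.7806), (5, 1.0169, 0.7774),
    (6, 1.0546, 0.7648), (7, 1.2220, 0.7701), (8, 1.7093, 0.7644),
    (9, 1.5981, 0.7677), (10, 1.4701, 0.7895), (11, 1.4879, 0.7879),
    (12, 1.4948, 0.7871), (13, 1.5004, 0.7863), (14, 1.5063, 0.7871),
    (15, 1.5092, 0.7859), (16, 1.5127, 0.7855), (17, 1.5144, 0.7851),
    (18, 1.5177, 0.7867), (19, 1.5195, 0.7859), (20, 1.5238, 0.7855),
    (21, 1.5265, 0.7871), (22, 1.5266, 0.7855), (23, 1.5306, 0.7867),
    (24, 1.5319, 0.7879),
]

# Best checkpoint by validation accuracy.
best_epoch, best_loss, best_acc = max(rows, key=lambda r: r[2])
assert best_epoch == 10
assert best_acc == 0.7895
assert best_loss == 1.4701  # matches the reported final loss
```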
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1