# GSM8K-Binary_Llama-3.2-1B-kcrbohqy
This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset (presumably a binary-answer variant of GSM8K, given the model name). It achieves the following results on the evaluation set (a consistency check of these numbers follows the list):
- Loss: 1.5026
- Model Preparation Time: 0.0059
- Mdl: 5365.4215
- Accumulated Loss: 3719.0268
- Correct Preds: 1963.0
- Total Preds: 2475.0
- Accuracy: 0.7931
- Correct Gen Preds: 1691.0
- Gen Accuracy: 0.6832
- Correct Gen Preds 34192: 871.0
- Correct Preds 34192: 1035.0
- Total Labels 34192: 1196.0
- Accuracy 34192: 0.8654
- Gen Accuracy 34192: 0.7283
- Correct Gen Preds 41568: 814.0
- Correct Preds 41568: 928.0
- Total Labels 41568: 1267.0
- Accuracy 41568: 0.7324
- Gen Accuracy 41568: 0.6425
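These headline numbers correspond to the epoch-7 checkpoint in the training results table below. The metrics suffixed `34192` and `41568` are per-label breakdowns; the suffixes are presumably the token IDs of the two answer labels. The figures are internally consistent, and `Mdl` appears to be the accumulated cross-entropy loss converted from nats to bits. A quick check (the nats-to-bits reading of `Mdl` is an inference from the arithmetic, not something the card states):

```python
import math

# Reported evaluation metrics (the epoch-7 checkpoint).
accumulated_loss_nats = 3719.0268   # summed cross-entropy over the eval set, in nats
mdl_bits = 5365.4215                # reported "Mdl"
correct_preds, correct_gen_preds, total_preds = 1963, 1691, 2475

# Mdl matches the accumulated loss converted from nats to bits.
assert abs(accumulated_loss_nats / math.log(2) - mdl_bits) < 0.01

# Accuracy and Gen Accuracy are plain ratios of the reported counts.
assert round(correct_preds / total_preds, 4) == 0.7931
assert round(correct_gen_preds / total_preds, 4) == 0.6832
```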
## Model description
More information needed
## Intended uses & limitations
More information needed
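No usage notes are provided, but the checkpoint can be loaded like any Hub causal LM. A minimal sketch, assuming the standard `transformers` auto classes (the prompt format the fine-tune expects is not documented here, so the prompt below is a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/GSM8K-Binary_Llama-3.2-1B-kcrbohqy"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = "..."  # placeholder: the task-specific prompt format is undocumented
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=1, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:]))
```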
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
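A minimal `TrainingArguments` sketch matching these settings; the `output_dir` is a placeholder, and everything not listed above is assumed to stay at its `transformers` 4.51 default (including the AdamW betas and epsilon, which match the values listed):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="GSM8K-Binary_Llama-3.2-1B-kcrbohqy",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",           # AdamW; betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="constant",
    warmup_ratio=0.001,            # as listed, though a constant schedule applies no warmup
    num_train_epochs=100,
)
```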
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0059 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 1.0257 | 1.0 | 47 | 0.6155 | 0.0059 | 2197.8915 | 1523.4623 | 1717.0 | 2475.0 | 0.6937 | 8.0 | 0.0032 | 0.0 | 1150.0 | 1196.0 | 0.9615 | 0.0 | 0.0 | 567.0 | 1267.0 | 0.4475 | 0.0 |
| 0.2829 | 2.0 | 94 | 0.5306 | 0.0059 | 1894.7016 | 1313.3071 | 1908.0 | 2475.0 | 0.7709 | 7.0 | 0.0028 | 0.0 | 989.0 | 1196.0 | 0.8269 | 0.0 | 0.0 | 919.0 | 1267.0 | 0.7253 | 0.0 |
| 0.0289 | 3.0 | 141 | 0.5964 | 0.0059 | 2129.4510 | 1476.0229 | 1931.0 | 2475.0 | 0.7802 | 7.0 | 0.0028 | 0.0 | 886.0 | 1196.0 | 0.7408 | 0.0 | 0.0 | 1045.0 | 1267.0 | 0.8248 | 0.0 |
| 0.2101 | 4.0 | 188 | 1.0279 | 0.0059 | 3670.3209 | 2544.0726 | 1902.0 | 2475.0 | 0.7685 | 689.0 | 0.2784 | 293.0 | 1091.0 | 1196.0 | 0.9122 | 0.2450 | 388.0 | 811.0 | 1267.0 | 0.6401 | 0.3062 |
| 0.7687 | 5.0 | 235 | 0.9640 | 0.0059 | 3442.2378 | 2385.9775 | 1948.0 | 2475.0 | 0.7871 | 446.0 | 0.1802 | 132.0 | 1004.0 | 1196.0 | 0.8395 | 0.1104 | 307.0 | 944.0 | 1267.0 | 0.7451 | 0.2423 |
| 0.0001 | 6.0 | 282 | 1.8360 | 0.0059 | 6555.6511 | 4544.0311 | 1870.0 | 2475.0 | 0.7556 | 1150.0 | 0.4646 | 673.0 | 1110.0 | 1196.0 | 0.9281 | 0.5627 | 470.0 | 760.0 | 1267.0 | 0.5998 | 0.3710 |
| 0.0114 | 7.0 | 329 | 1.5026 | 0.0059 | 5365.4215 | 3719.0268 | 1963.0 | 2475.0 | 0.7931 | 1691.0 | 0.6832 | 871.0 | 1035.0 | 1196.0 | 0.8654 | 0.7283 | 814.0 | 928.0 | 1267.0 | 0.7324 | 0.6425 |
| 0.0001 | 8.0 | 376 | 1.7366 | 0.0059 | 6200.7876 | 4298.0585 | 1937.0 | 2475.0 | 0.7826 | 1561.0 | 0.6307 | 686.0 | 938.0 | 1196.0 | 0.7843 | 0.5736 | 867.0 | 999.0 | 1267.0 | 0.7885 | 0.6843 |
| 0.0 | 9.0 | 423 | 1.9428 | 0.0059 | 6937.0874 | 4808.4226 | 1919.0 | 2475.0 | 0.7754 | 1722.0 | 0.6958 | 823.0 | 959.0 | 1196.0 | 0.8018 | 0.6881 | 890.0 | 960.0 | 1267.0 | 0.7577 | 0.7024 |
| 0.0 | 10.0 | 470 | 1.9499 | 0.0059 | 6962.3264 | 4825.9169 | 1923.0 | 2475.0 | 0.7770 | 1742.0 | 0.7038 | 785.0 | 911.0 | 1196.0 | 0.7617 | 0.6564 | 948.0 | 1012.0 | 1267.0 | 0.7987 | 0.7482 |
| 0.0 | 11.0 | 517 | 1.9904 | 0.0059 | 7107.2016 | 4926.3368 | 1928.0 | 2475.0 | 0.7790 | 1741.0 | 0.7034 | 869.0 | 1000.0 | 1196.0 | 0.8361 | 0.7266 | 863.0 | 928.0 | 1267.0 | 0.7324 | 0.6811 |
| 0.0 | 12.0 | 564 | 1.9920 | 0.0059 | 7112.9400 | 4930.3143 | 1931.0 | 2475.0 | 0.7802 | 1740.0 | 0.7030 | 871.0 | 1003.0 | 1196.0 | 0.8386 | 0.7283 | 860.0 | 928.0 | 1267.0 | 0.7324 | 0.6788 |
| 0.0 | 13.0 | 611 | 1.9893 | 0.0059 | 7103.2337 | 4923.5864 | 1931.0 | 2475.0 | 0.7802 | 1749.0 | 0.7067 | 872.0 | 1000.0 | 1196.0 | 0.8361 | 0.7291 | 869.0 | 931.0 | 1267.0 | 0.7348 | 0.6859 |
| 0.0 | 14.0 | 658 | 1.9887 | 0.0059 | 7100.8464 | 4921.9317 | 1930.0 | 2475.0 | 0.7798 | 1755.0 | 0.7091 | 874.0 | 1000.0 | 1196.0 | 0.8361 | 0.7308 | 872.0 | 930.0 | 1267.0 | 0.7340 | 0.6882 |
| 0.0 | 15.0 | 705 | 1.9885 | 0.0059 | 7100.2310 | 4921.5051 | 1933.0 | 2475.0 | 0.7810 | 1750.0 | 0.7071 | 873.0 | 1000.0 | 1196.0 | 0.8361 | 0.7299 | 868.0 | 933.0 | 1267.0 | 0.7364 | 0.6851 |
| 0.0 | 16.0 | 752 | 1.9883 | 0.0059 | 7099.4041 | 4920.9320 | 1931.0 | 2475.0 | 0.7802 | 1755.0 | 0.7091 | 876.0 | 999.0 | 1196.0 | 0.8353 | 0.7324 | 870.0 | 932.0 | 1267.0 | 0.7356 | 0.6867 |
| 0.0 | 17.0 | 799 | 1.9886 | 0.0059 | 7100.6278 | 4921.7802 | 1931.0 | 2475.0 | 0.7802 | 1755.0 | 0.7091 | 873.0 | 999.0 | 1196.0 | 0.8353 | 0.7299 | 874.0 | 932.0 | 1267.0 | 0.7356 | 0.6898 |
| 0.0 | 18.0 | 846 | 1.9860 | 0.0059 | 7091.2435 | 4915.2755 | 1934.0 | 2475.0 | 0.7814 | 1759.0 | 0.7107 | 877.0 | 999.0 | 1196.0 | 0.8353 | 0.7333 | 873.0 | 935.0 | 1267.0 | 0.7380 | 0.6890 |
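Note the gap between `Accuracy` and `Gen Accuracy`, sharpest in epochs 1-3, where accuracy is 0.69-0.78 while generation accuracy is near zero. The pattern suggests two scoring modes: `Correct Preds` comparing the model's likelihood of the candidate label tokens, and `Correct Gen Preds` checking whether unconstrained greedy decoding emits the correct label. A sketch of that distinction, assuming single-token labels with IDs taken from the per-label metric names (the card does not define these metrics, so this is an interpretation):

```python
import torch

# Assumed single-token label IDs, taken from the per-label metric names above.
LABEL_IDS = [34192, 41568]

def score_example(model, tokenizer, prompt, gold_id):
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # next-token logits

    # Likelihood-based prediction: argmax restricted to the two label tokens.
    pred_id = max(LABEL_IDS, key=lambda i: logits[i].item())

    # Generation-based prediction: unrestricted greedy next token.
    gen_id = int(logits.argmax())

    return pred_id == gold_id, gen_id == gold_id  # (Correct Pred, Correct Gen Pred)
```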
### Framework versions
- Transformers 4.51.3
- PyTorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1