# GSM8K-Binary_Llama-3.2-1B-trn9haqb
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unspecified dataset (the model name suggests a binary-labeled variant of GSM8K). It achieves the following results on the evaluation set, which correspond to the epoch-8 checkpoint (step 264) in the training results table below:
- Loss: 1.3861
- Model Preparation Time: 0.0059
- Mdl: 4949.1510
- Accumulated Loss: 3430.4901
- Correct Preds: 1952.0
- Total Preds: 2475.0
- Accuracy: 0.7887
- Correct Gen Preds: 1959.0
- Gen Accuracy: 0.7915
- Correct Gen Preds 34192: 979.0
- Correct Preds 34192: 980.0
- Total Labels 34192: 1196.0
- Accuracy 34192: 0.8194
- Gen Accuracy 34192: 0.8186
- Correct Gen Preds 41568: 971.0
- Correct Preds 41568: 972.0
- Total Labels 41568: 1267.0
- Accuracy 41568: 0.7672
- Gen Accuracy 41568: 0.7664
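`Mdl` and `Accumulated Loss` are not standard `Trainer` metrics. Judging from the reported numbers, `Accumulated Loss` is the evaluation loss summed over all predictions (in nats) and `Mdl` is the same total converted to bits (an MDL-style codelength); the metrics suffixed with 34192 and 41568 appear to be per-class breakdowns keyed by label token ID, and the `Gen`-prefixed counts presumably score generated outputs rather than ranked label likelihoods. A quick sanity check of those relationships, using only the figures above:

```python
import math

# Headline evaluation numbers copied from the card (epoch-8 checkpoint).
mean_loss = 1.3861          # mean eval loss, nats per prediction
total_preds = 2475
correct_preds = 1952
accumulated_loss = 3430.4901

# Accumulated loss ~ mean loss summed over all predictions;
# the small drift comes from rounding of the reported mean.
print(mean_loss * total_preds)         # ~3430.60 vs. reported 3430.4901

# Mdl looks like the same total converted from nats to bits.
print(accumulated_loss / math.log(2))  # ~4949.15 vs. reported 4949.1510

# Accuracy is simply correct / total.
print(correct_preds / total_preds)     # ~0.7887
```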
## Model description
More information needed
## Intended uses & limitations
More information needed
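No usage notes are documented, so the following is only a minimal loading sketch. It assumes the checkpoint is hosted on the Hub under `donoway/GSM8K-Binary_Llama-3.2-1B-trn9haqb`; the prompt is a placeholder, since the expected input format is not described anywhere in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "donoway/GSM8K-Binary_Llama-3.2-1B-trn9haqb"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

# Hypothetical prompt; the card does not document the training format.
prompt = "Question: ..."
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:]))
```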
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a matching `TrainingArguments` sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
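A hedged reconstruction of these settings as `TrainingArguments` for transformers 4.51.3; only the fields listed above come from the card, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="gsm8k-binary-llama-3.2-1b",  # placeholder, not from the card
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    warmup_ratio=0.001,  # listed in the card; ignored by a plain constant schedule
    num_train_epochs=100,
)
```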
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0059 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.5562 | 1.0 | 33 | 0.5791 | 0.0059 | 2067.7129 | 1433.2294 | 1800.0 | 2475.0 | 0.7273 | 136.0 | 0.0549 | 0.0 | 1001.0 | 1196.0 | 0.8370 | 0.0 | 129.0 | 799.0 | 1267.0 | 0.6306 | 0.1018 |
| 0.2241 | 2.0 | 66 | 0.6917 | 0.0059 | 2469.9161 | 1712.0154 | 1750.0 | 2475.0 | 0.7071 | 25.0 | 0.0101 | 2.0 | 1150.0 | 1196.0 | 0.9615 | 0.0017 | 16.0 | 600.0 | 1267.0 | 0.4736 | 0.0126 |
| 0.2445 | 3.0 | 99 | 0.5807 | 0.0059 | 2073.3593 | 1437.1432 | 1901.0 | 2475.0 | 0.7681 | 672.0 | 0.2715 | 56.0 | 817.0 | 1196.0 | 0.6831 | 0.0468 | 608.0 | 1084.0 | 1267.0 | 0.8556 | 0.4799 |
| 0.3792 | 4.0 | 132 | 1.0107 | 0.0059 | 3608.9732 | 2501.5496 | 1808.0 | 2475.0 | 0.7305 | 752.0 | 0.3038 | 361.0 | 1124.0 | 1196.0 | 0.9398 | 0.3018 | 383.0 | 684.0 | 1267.0 | 0.5399 | 0.3023 |
| 0.5311 | 5.0 | 165 | 1.0453 | 0.0059 | 3732.4069 | 2587.1073 | 1949.0 | 2475.0 | 0.7875 | 1925.0 | 0.7778 | 965.0 | 987.0 | 1196.0 | 0.8253 | 0.8069 | 951.0 | 962.0 | 1267.0 | 0.7593 | 0.7506 |
| 0.0003 | 6.0 | 198 | 1.1808 | 0.0059 | 4216.3306 | 2922.5376 | 1929.0 | 2475.0 | 0.7794 | 1889.0 | 0.7632 | 997.0 | 1034.0 | 1196.0 | 0.8645 | 0.8336 | 885.0 | 895.0 | 1267.0 | 0.7064 | 0.6985 |
| 0.4714 | 7.0 | 231 | 1.7950 | 0.0059 | 6409.2129 | 4442.5278 | 1910.0 | 2475.0 | 0.7717 | 1900.0 | 0.7677 | 1077.0 | 1092.0 | 1196.0 | 0.9130 | 0.9005 | 815.0 | 818.0 | 1267.0 | 0.6456 | 0.6433 |
| 0.0002 | 8.0 | 264 | 1.3861 | 0.0059 | 4949.1510 | 3430.4901 | 1952.0 | 2475.0 | 0.7887 | 1959.0 | 0.7915 | 979.0 | 980.0 | 1196.0 | 0.8194 | 0.8186 | 971.0 | 972.0 | 1267.0 | 0.7672 | 0.7664 |
| 0.0001 | 9.0 | 297 | 1.8078 | 0.0059 | 6455.1510 | 4474.3697 | 1889.0 | 2475.0 | 0.7632 | 1895.0 | 0.7657 | 1088.0 | 1089.0 | 1196.0 | 0.9105 | 0.9097 | 799.0 | 800.0 | 1267.0 | 0.6314 | 0.6306 |
| 0.0 | 10.0 | 330 | 1.6442 | 0.0059 | 5870.8161 | 4069.3396 | 1937.0 | 2475.0 | 0.7826 | 1944.0 | 0.7855 | 1059.0 | 1059.0 | 1196.0 | 0.8855 | 0.8855 | 877.0 | 878.0 | 1267.0 | 0.6930 | 0.6922 |
| 0.0 | 11.0 | 363 | 1.6431 | 0.0059 | 5866.8306 | 4066.5771 | 1938.0 | 2475.0 | 0.7830 | 1946.0 | 0.7863 | 1058.0 | 1058.0 | 1196.0 | 0.8846 | 0.8846 | 880.0 | 880.0 | 1267.0 | 0.6946 | 0.6946 |
| 0.0 | 12.0 | 396 | 1.6410 | 0.0059 | 5859.5168 | 4061.5076 | 1934.0 | 2475.0 | 0.7814 | 1941.0 | 0.7842 | 1055.0 | 1055.0 | 1196.0 | 0.8821 | 0.8821 | 878.0 | 879.0 | 1267.0 | 0.6938 | 0.6930 |
| 0.0 | 13.0 | 429 | 1.6420 | 0.0059 | 5863.0062 | 4063.9262 | 1935.0 | 2475.0 | 0.7818 | 1943.0 | 0.7851 | 1056.0 | 1056.0 | 1196.0 | 0.8829 | 0.8829 | 879.0 | 879.0 | 1267.0 | 0.6938 | 0.6938 |
| 0.0 | 14.0 | 462 | 1.6393 | 0.0059 | 5853.5075 | 4057.3422 | 1936.0 | 2475.0 | 0.7822 | 1944.0 | 0.7855 | 1055.0 | 1055.0 | 1196.0 | 0.8821 | 0.8821 | 881.0 | 881.0 | 1267.0 | 0.6953 | 0.6953 |
| 0.4705 | 15.0 | 495 | 1.6394 | 0.0059 | 5853.9322 | 4057.6366 | 1935.0 | 2475.0 | 0.7818 | 1943.0 | 0.7851 | 1054.0 | 1054.0 | 1196.0 | 0.8813 | 0.8813 | 881.0 | 881.0 | 1267.0 | 0.6953 | 0.6953 |
| 0.0 | 16.0 | 528 | 1.6388 | 0.0059 | 5851.4802 | 4055.9370 | 1936.0 | 2475.0 | 0.7822 | 1944.0 | 0.7855 | 1055.0 | 1055.0 | 1196.0 | 0.8821 | 0.8821 | 881.0 | 881.0 | 1267.0 | 0.6953 | 0.6953 |
| 0.0 | 17.0 | 561 | 1.6396 | 0.0059 | 5854.5643 | 4058.0747 | 1937.0 | 2475.0 | 0.7826 | 1945.0 | 0.7859 | 1054.0 | 1054.0 | 1196.0 | 0.8813 | 0.8813 | 883.0 | 883.0 | 1267.0 | 0.6969 | 0.6969 |
| 0.4705 | 18.0 | 594 | 1.6388 | 0.0059 | 5851.7692 | 4056.1373 | 1937.0 | 2475.0 | 0.7826 | 1945.0 | 0.7859 | 1053.0 | 1053.0 | 1196.0 | 0.8804 | 0.8804 | 884.0 | 884.0 | 1267.0 | 0.6977 | 0.6977 |
| 0.0 | 19.0 | 627 | 1.6396 | 0.0059 | 5854.6347 | 4058.1235 | 1935.0 | 2475.0 | 0.7818 | 1943.0 | 0.7851 | 1052.0 | 1052.0 | 1196.0 | 0.8796 | 0.8796 | 883.0 | 883.0 | 1267.0 | 0.6969 | 0.6969 |
| 0.0 | 20.0 | 660 | 1.6372 | 0.0059 | 5845.9689 | 4052.1169 | 1936.0 | 2475.0 | 0.7822 | 1944.0 | 0.7855 | 1052.0 | 1052.0 | 1196.0 | 0.8796 | 0.8796 | 884.0 | 884.0 | 1267.0 | 0.6977 | 0.6977 |
| 0.0 | 21.0 | 693 | 1.6389 | 0.0059 | 5852.0283 | 4056.3169 | 1935.0 | 2475.0 | 0.7818 | 1943.0 | 0.7851 | 1052.0 | 1052.0 | 1196.0 | 0.8796 | 0.8796 | 883.0 | 883.0 | 1267.0 | 0.6969 | 0.6969 |
| 0.0 | 22.0 | 726 | 1.6393 | 0.0059 | 5853.4144 | 4057.2777 | 1936.0 | 2475.0 | 0.7822 | 1944.0 | 0.7855 | 1053.0 | 1053.0 | 1196.0 | 0.8804 | 0.8804 | 883.0 | 883.0 | 1267.0 | 0.6969 | 0.6969 |
| 0.0 | 23.0 | 759 | 1.6391 | 0.0059 | 5852.5099 | 4056.6507 | 1934.0 | 2475.0 | 0.7814 | 1942.0 | 0.7846 | 1051.0 | 1051.0 | 1196.0 | 0.8788 | 0.8788 | 883.0 | 883.0 | 1267.0 | 0.6969 | 0.6969 |
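The two per-class column groups are keyed by token IDs 34192 and 41568, presumably the tokens for the two binary labels. With the base tokenizer (access to `meta-llama/Llama-3.2-1B` is gated) one can check what they decode to:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
for token_id in (34192, 41568):
    print(token_id, repr(tokenizer.decode([token_id])))
```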
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1