# GSM8K-Binary_Llama-3.2-1B-eb5nfly3
This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unspecified dataset; the model name suggests a binary-labeled variant of GSM8K. It achieves the following results on the evaluation set, which match the epoch-4 checkpoint in the training results table below (an inference sketch follows the list):
- Loss: 0.7169
- Model Preparation Time: 0.0058
- Mdl: 2559.7869
- Accumulated Loss: 1774.3091
- Correct Preds: 1961.0
- Total Preds: 2475.0
- Accuracy: 0.7923
- Correct Gen Preds: 731.0
- Gen Accuracy: 0.2954
- Correct Gen Preds 34192: 204.0
- Correct Preds 34192: 933.0
- Total Labels 34192: 1196.0
- Accuracy 34192: 0.7801
- Gen Accuracy 34192: 0.1706
- Correct Gen Preds 41568: 518.0
- Correct Preds 41568: 1028.0
- Total Labels 41568: 1267.0
- Accuracy 41568: 0.8114
- Gen Accuracy 41568: 0.4088
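The per-label metric suffixes 34192 and 41568 are not documented; they appear to be the token IDs of the two answer tokens in this binary task, and "Mdl" appears to be the accumulated loss converted from nats to bits (1774.3091 / ln 2 ≈ 2559.79). Under those assumptions, here is a minimal inference sketch that scores the two candidate label tokens directly. The prompt format is a guess, since the training data is not documented:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/GSM8K-Binary_Llama-3.2-1B-eb5nfly3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

# Assumption: 34192 and 41568 are the token IDs of the two binary labels
# reported in the per-label metrics above.
label_ids = [34192, 41568]

prompt = "..."  # placeholder; the expected prompt format is not documented
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]

# Pick whichever of the two label tokens the model scores higher.
pred_id = max(label_ids, key=lambda i: next_token_logits[i].item())
print(pred_id, tokenizer.decode([pred_id]))
```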
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a TrainingArguments sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
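As a rough sketch, these settings correspond to the following transformers TrainingArguments; output_dir is a placeholder, and anything not listed above is assumed to be left at its default:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="GSM8K-Binary_Llama-3.2-1B-eb5nfly3",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",          # AdamW via torch
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```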
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0058 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.4267 | 1.0 | 44 | 0.6381 | 0.0058 | 2278.4224 | 1579.2820 | 1751.0 | 2475.0 | 0.7075 | 9.0 | 0.0036 | 0.0 | 626.0 | 1196.0 | 0.5234 | 0.0 | 0.0 | 1125.0 | 1267.0 | 0.8879 | 0.0 |
| 0.2951 | 2.0 | 88 | 0.5564 | 0.0058 | 1986.8987 | 1377.2132 | 1855.0 | 2475.0 | 0.7495 | 8.0 | 0.0032 | 0.0 | 1112.0 | 1196.0 | 0.9298 | 0.0 | 0.0 | 743.0 | 1267.0 | 0.5864 | 0.0 |
| 0.2168 | 3.0 | 132 | 0.5809 | 0.0058 | 2074.3206 | 1437.8095 | 1900.0 | 2475.0 | 0.7677 | 8.0 | 0.0032 | 0.0 | 793.0 | 1196.0 | 0.6630 | 0.0 | 0.0 | 1107.0 | 1267.0 | 0.8737 | 0.0 |
| 0.1239 | 4.0 | 176 | 0.7169 | 0.0058 | 2559.7869 | 1774.3091 | 1961.0 | 2475.0 | 0.7923 | 731.0 | 0.2954 | 204.0 | 933.0 | 1196.0 | 0.7801 | 0.1706 | 518.0 | 1028.0 | 1267.0 | 0.8114 | 0.4088 |
| 0.2313 | 5.0 | 220 | 0.7906 | 0.0058 | 2823.1453 | 1956.8552 | 1944.0 | 2475.0 | 0.7855 | 442.0 | 0.1786 | 84.0 | 955.0 | 1196.0 | 0.7985 | 0.0702 | 349.0 | 989.0 | 1267.0 | 0.7806 | 0.2755 |
| 0.0149 | 6.0 | 264 | 1.4312 | 0.0058 | 5110.2939 | 3542.1858 | 1910.0 | 2475.0 | 0.7717 | 1166.0 | 0.4711 | 395.0 | 905.0 | 1196.0 | 0.7567 | 0.3303 | 763.0 | 1005.0 | 1267.0 | 0.7932 | 0.6022 |
| 0.0 | 7.0 | 308 | 1.8767 | 0.0058 | 6701.2420 | 4644.9470 | 1920.0 | 2475.0 | 0.7758 | 1771.0 | 0.7156 | 984.0 | 1065.0 | 1196.0 | 0.8905 | 0.8227 | 780.0 | 855.0 | 1267.0 | 0.6748 | 0.6156 |
| 0.0 | 8.0 | 352 | 1.8809 | 0.0058 | 6716.1771 | 4655.2992 | 1940.0 | 2475.0 | 0.7838 | 1924.0 | 0.7774 | 980.0 | 996.0 | 1196.0 | 0.8328 | 0.8194 | 936.0 | 944.0 | 1267.0 | 0.7451 | 0.7388 |
| 0.0 | 9.0 | 396 | 1.8805 | 0.0058 | 6714.6723 | 4654.2562 | 1952.0 | 2475.0 | 0.7887 | 1934.0 | 0.7814 | 1020.0 | 1026.0 | 1196.0 | 0.8579 | 0.8528 | 906.0 | 926.0 | 1267.0 | 0.7309 | 0.7151 |
| 0.0 | 10.0 | 440 | 2.1020 | 0.0058 | 7505.4300 | 5202.3676 | 1942.0 | 2475.0 | 0.7846 | 1917.0 | 0.7745 | 1035.0 | 1043.0 | 1196.0 | 0.8721 | 0.8654 | 874.0 | 899.0 | 1267.0 | 0.7096 | 0.6898 |
| 0.0 | 11.0 | 484 | 2.2177 | 0.0058 | 7918.5831 | 5488.7435 | 1945.0 | 2475.0 | 0.7859 | 1917.0 | 0.7745 | 1048.0 | 1058.0 | 1196.0 | 0.8846 | 0.8763 | 861.0 | 887.0 | 1267.0 | 0.7001 | 0.6796 |
| 0.0 | 12.0 | 528 | 2.2185 | 0.0058 | 7921.6693 | 5490.8827 | 1943.0 | 2475.0 | 0.7851 | 1916.0 | 0.7741 | 1049.0 | 1058.0 | 1196.0 | 0.8846 | 0.8771 | 859.0 | 885.0 | 1267.0 | 0.6985 | 0.6780 |
| 0.0 | 13.0 | 572 | 2.2196 | 0.0058 | 7925.2973 | 5493.3975 | 1939.0 | 2475.0 | 0.7834 | 1912.0 | 0.7725 | 1046.0 | 1056.0 | 1196.0 | 0.8829 | 0.8746 | 858.0 | 883.0 | 1267.0 | 0.6969 | 0.6772 |
| 0.0 | 14.0 | 616 | 2.2197 | 0.0058 | 7925.9213 | 5493.8300 | 1941.0 | 2475.0 | 0.7842 | 1913.0 | 0.7729 | 1046.0 | 1056.0 | 1196.0 | 0.8829 | 0.8746 | 859.0 | 885.0 | 1267.0 | 0.6985 | 0.6780 |
| 0.6919 | 15.0 | 660 | 2.2190 | 0.0058 | 7923.3928 | 5492.0774 | 1944.0 | 2475.0 | 0.7855 | 1915.0 | 0.7737 | 1046.0 | 1056.0 | 1196.0 | 0.8829 | 0.8746 | 861.0 | 888.0 | 1267.0 | 0.7009 | 0.6796 |
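Training was configured for 100 epochs, but the logged results stop at epoch 15. The headline metrics at the top of this card correspond to the epoch-4 row; the quick check below (a sketch of how the summary numbers appear to relate, not a documented formula) confirms that row's internal consistency:

```python
import math

# Values from the epoch-4 row of the training results table.
accumulated_loss = 1774.3091           # summed eval loss, in nats
total_preds = 2475
correct_preds = 1961

print(accumulated_loss / total_preds)  # ~0.7169 -> reported validation loss
print(accumulated_loss / math.log(2))  # ~2559.79 -> reported "Mdl" (bits)
print(correct_preds / total_preds)     # ~0.7923 -> reported accuracy
```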
### Framework versions
- Transformers 4.51.3
- PyTorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1