# GSM8K-Binary_Llama-3.2-1B-f8096090
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset (the model name suggests a binarized GSM8K task). It achieves the following results on the evaluation set:
- Loss: 0.6336
- Model Preparation Time: 0.0058
- MDL: 2262.3589
- Accumulated Loss: 1568.1477
- Correct Preds: 1973.0
- Total Preds: 2475.0
- Accuracy: 0.7972
- Correct Gen Preds: 369.0
- Gen Accuracy: 0.1491
- Per-label results (the suffixes 34192 and 41568 in the original metric names appear to be the token IDs of the two answer labels):
  - Label 34192: Correct Preds: 974.0, Total Labels: 1196.0, Accuracy: 0.8144, Correct Gen Preds: 0.0, Gen Accuracy: 0.0
  - Label 41568: Correct Preds: 999.0, Total Labels: 1267.0, Accuracy: 0.7885, Correct Gen Preds: 362.0, Gen Accuracy: 0.2857
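The relationships among these metrics are not documented in the card, but the reported numbers are internally consistent: the accumulated loss divided by the number of predictions gives the reported loss, and the MDL value equals the accumulated loss (in nats) converted to bits. These figures match the epoch 4 row of the training-results table below. A minimal sketch of those checks, assuming the metric names mean what they appear to:

```python
import math

# Reported evaluation metrics, copied from the list above.
correct_preds, total_preds = 1973, 2475
accumulated_loss_nats = 1568.1477

# Accuracy: correct predictions over total predictions (~0.7972).
accuracy = correct_preds / total_preds

# Loss: the accumulated loss averaged per prediction (~0.6336).
mean_loss = accumulated_loss_nats / total_preds

# MDL: the accumulated loss converted from nats to bits (~2262.36).
mdl_bits = accumulated_loss_nats / math.log(2)

print(f"accuracy={accuracy:.4f}  loss={mean_loss:.4f}  mdl={mdl_bits:.2f}")
```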
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
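The training script itself is not included in this card, but the hyperparameters above map directly onto `transformers.TrainingArguments`. A minimal, hypothetical reconstruction (the `output_dir` and any arguments not listed above are placeholders, not the author's actual configuration):

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration listed above.
training_args = TrainingArguments(
    output_dir="GSM8K-Binary_Llama-3.2-1B-f8096090",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```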
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | MDL | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0058 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.5859 | 1.0 | 52 | 0.5818 | 0.0058 | 2077.5047 | 1440.0165 | 1847.0 | 2475.0 | 0.7463 | 8.0 | 0.0032 | 0.0 | 857.0 | 1196.0 | 0.7166 | 0.0 | 0.0 | 990.0 | 1267.0 | 0.7814 | 0.0 |
| 0.6145 | 2.0 | 104 | 0.5168 | 0.0058 | 1845.2524 | 1279.0315 | 1948.0 | 2475.0 | 0.7871 | 69.0 | 0.0279 | 0.0 | 1063.0 | 1196.0 | 0.8888 | 0.0 | 61.0 | 885.0 | 1267.0 | 0.6985 | 0.0481 |
| 0.2879 | 3.0 | 156 | 0.5778 | 0.0058 | 2063.1398 | 1430.0595 | 1868.0 | 2475.0 | 0.7547 | 53.0 | 0.0214 | 0.0 | 1106.0 | 1196.0 | 0.9247 | 0.0 | 46.0 | 762.0 | 1267.0 | 0.6014 | 0.0363 |
| 0.0501 | 4.0 | 208 | 0.6336 | 0.0058 | 2262.3589 | 1568.1477 | 1973.0 | 2475.0 | 0.7972 | 369.0 | 0.1491 | 0.0 | 974.0 | 1196.0 | 0.8144 | 0.0 | 362.0 | 999.0 | 1267.0 | 0.7885 | 0.2857 |
| 0.3604 | 5.0 | 260 | 1.7321 | 0.0058 | 6184.7525 | 4286.9438 | 1864.0 | 2475.0 | 0.7531 | 1135.0 | 0.4586 | 634.0 | 1105.0 | 1196.0 | 0.9239 | 0.5301 | 494.0 | 759.0 | 1267.0 | 0.5991 | 0.3899 |
| 0.0662 | 6.0 | 312 | 1.2469 | 0.0058 | 4452.3018 | 3086.1004 | 1972.0 | 2475.0 | 0.7968 | 1028.0 | 0.4154 | 359.0 | 1028.0 | 1196.0 | 0.8595 | 0.3002 | 661.0 | 944.0 | 1267.0 | 0.7451 | 0.5217 |
| 0.0 | 7.0 | 364 | 1.4682 | 0.0058 | 5242.5624 | 3633.8673 | 1970.0 | 2475.0 | 0.7960 | 1223.0 | 0.4941 | 464.0 | 1033.0 | 1196.0 | 0.8637 | 0.3880 | 751.0 | 937.0 | 1267.0 | 0.7395 | 0.5927 |
| 0.0003 | 8.0 | 416 | 1.9052 | 0.0058 | 6802.8127 | 4715.3504 | 1925.0 | 2475.0 | 0.7778 | 1504.0 | 0.6077 | 583.0 | 948.0 | 1196.0 | 0.7926 | 0.4875 | 914.0 | 977.0 | 1267.0 | 0.7711 | 0.7214 |
| 0.5881 | 9.0 | 468 | 1.9828 | 0.0058 | 7079.8847 | 4907.4021 | 1957.0 | 2475.0 | 0.7907 | 1879.0 | 0.7592 | 920.0 | 983.0 | 1196.0 | 0.8219 | 0.7692 | 952.0 | 974.0 | 1267.0 | 0.7687 | 0.7514 |
| 0.0 | 10.0 | 520 | 1.9968 | 0.0058 | 7129.8865 | 4942.0607 | 1957.0 | 2475.0 | 0.7907 | 1886.0 | 0.7620 | 913.0 | 972.0 | 1196.0 | 0.8127 | 0.7634 | 966.0 | 985.0 | 1267.0 | 0.7774 | 0.7624 |
| 0.5881 | 11.0 | 572 | 2.0014 | 0.0058 | 7146.2344 | 4953.3922 | 1959.0 | 2475.0 | 0.7915 | 1892.0 | 0.7644 | 918.0 | 972.0 | 1196.0 | 0.8127 | 0.7676 | 967.0 | 987.0 | 1267.0 | 0.7790 | 0.7632 |
| 0.0 | 12.0 | 624 | 2.0068 | 0.0058 | 7165.7013 | 4966.8857 | 1959.0 | 2475.0 | 0.7915 | 1890.0 | 0.7636 | 916.0 | 972.0 | 1196.0 | 0.8127 | 0.7659 | 967.0 | 987.0 | 1267.0 | 0.7790 | 0.7632 |
| 0.5882 | 13.0 | 676 | 2.0059 | 0.0058 | 7162.3520 | 4964.5641 | 1959.0 | 2475.0 | 0.7915 | 1893.0 | 0.7648 | 919.0 | 973.0 | 1196.0 | 0.8135 | 0.7684 | 967.0 | 986.0 | 1267.0 | 0.7782 | 0.7632 |
| 0.0 | 14.0 | 728 | 2.0106 | 0.0058 | 7179.0242 | 4976.1204 | 1958.0 | 2475.0 | 0.7911 | 1891.0 | 0.7640 | 918.0 | 972.0 | 1196.0 | 0.8127 | 0.7676 | 966.0 | 986.0 | 1267.0 | 0.7782 | 0.7624 |
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
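A minimal inference sketch using the versions pinned above, assuming the checkpoint is hosted under the repository name in the title; the prompt format used during fine-tuning is not documented, so the input below is a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "donoway/GSM8K-Binary_Llama-3.2-1B-f8096090"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Placeholder GSM8K-style binary prompt; the actual prompt format
# expected by this checkpoint is not documented in the card.
inputs = tokenizer("Is 17 + 25 equal to 42? Answer:", return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```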