# GSM8K-Binary_Llama-3.2-1B-qpuz1des

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.5202
- Model Preparation Time: 0.0056
- Mdl: 5428.0840
- Accumulated Loss: 3762.4611
- Correct Preds: 1781.0
- Total Preds: 2475.0
- Accuracy: 0.7196
- Correct Gen Preds: 1626.0
- Gen Accuracy: 0.6570
- Correct Gen Preds 34192: 823.0
- Correct Preds 34192: 922.0
- Total Labels 34192: 1196.0
- Accuracy 34192: 0.7709
- Gen Accuracy 34192: 0.6881
- Correct Gen Preds 41568: 795.0
- Correct Preds 41568: 859.0
- Total Labels 41568: 1267.0
- Accuracy 41568: 0.6780
- Gen Accuracy 41568: 0.6275
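The headline numbers above are mutually consistent. As a quick sanity check (a sketch, assuming Mdl is the accumulated cross-entropy converted from nats to bits, and the reported eval loss is the accumulated loss averaged over all predictions):

```python
import math

# Figures copied from the evaluation summary above.
accumulated_loss = 3762.4611  # summed eval cross-entropy, in nats
total_preds = 2475
correct_preds = 1781
correct_gen_preds = 1626

accuracy = correct_preds / total_preds          # matches the reported 0.7196
gen_accuracy = correct_gen_preds / total_preds  # matches the reported 0.6570

# Assumption: Mdl is the accumulated loss expressed in bits, and the
# eval loss is the per-prediction mean of the accumulated loss.
mdl = accumulated_loss / math.log(2)        # ~5428.08 bits
mean_loss = accumulated_loss / total_preds  # ~1.5202
```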
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
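These hyperparameters map onto a Transformers `TrainingArguments` roughly as follows. This is a sketch, not the actual training script: `output_dir` and every argument not listed in the card are assumptions, and with 4 optimizer steps per epoch (as the step column in the results table suggests) the 1% warmup ratio amounts to only a handful of steps.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the training configuration; only the
# values listed in the card above are taken from the actual run.
args = TrainingArguments(
    output_dir="gsm8k-binary-llama-3.2-1b",  # hypothetical path
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",          # AdamW with default betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```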
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0056 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.7213 | 1.0 | 4 | 1.6676 | 0.0056 | 5954.2770 | 4127.1903 | 1196.0 | 2475.0 | 0.4832 | 659.0 | 0.2663 | 651.0 | 1196.0 | 1196.0 | 1.0 | 0.5443 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 1.531 | 2.0 | 8 | 1.8297 | 0.0056 | 6533.4012 | 4528.6086 | 1196.0 | 2475.0 | 0.4832 | 8.0 | 0.0032 | 0.0 | 1196.0 | 1196.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.6658 | 3.0 | 12 | 0.8914 | 0.0056 | 3182.8541 | 2206.1864 | 1196.0 | 2475.0 | 0.4832 | 8.0 | 0.0032 | 0.0 | 1196.0 | 1196.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.6605 | 4.0 | 16 | 0.9261 | 0.0056 | 3306.8536 | 2292.1362 | 1287.0 | 2475.0 | 0.52 | 81.0 | 0.0327 | 0.0 | 23.0 | 1196.0 | 0.0192 | 0.0 | 73.0 | 1264.0 | 1267.0 | 0.9976 | 0.0576 |
| 0.4214 | 5.0 | 20 | 0.6675 | 0.0056 | 2383.2606 | 1651.9503 | 1609.0 | 2475.0 | 0.6501 | 16.0 | 0.0065 | 0.0 | 576.0 | 1196.0 | 0.4816 | 0.0 | 8.0 | 1033.0 | 1267.0 | 0.8153 | 0.0063 |
| 0.2547 | 6.0 | 24 | 0.9045 | 0.0056 | 3229.5997 | 2238.5879 | 1563.0 | 2475.0 | 0.6315 | 88.0 | 0.0356 | 0.0 | 509.0 | 1196.0 | 0.4256 | 0.0 | 80.0 | 1054.0 | 1267.0 | 0.8319 | 0.0631 |
| 0.7195 | 7.0 | 28 | 0.8937 | 0.0056 | 3191.0282 | 2211.8522 | 1726.0 | 2475.0 | 0.6974 | 206.0 | 0.0832 | 26.0 | 938.0 | 1196.0 | 0.7843 | 0.0217 | 172.0 | 788.0 | 1267.0 | 0.6219 | 0.1358 |
| 0.0823 | 8.0 | 32 | 1.1088 | 0.0056 | 3959.1638 | 2744.2832 | 1662.0 | 2475.0 | 0.6715 | 458.0 | 0.1851 | 42.0 | 664.0 | 1196.0 | 0.5552 | 0.0351 | 408.0 | 998.0 | 1267.0 | 0.7877 | 0.3220 |
| 0.0095 | 9.0 | 36 | 1.3722 | 0.0056 | 4899.6036 | 3396.1464 | 1597.0 | 2475.0 | 0.6453 | 496.0 | 0.2004 | 123.0 | 701.0 | 1196.0 | 0.5861 | 0.1028 | 365.0 | 896.0 | 1267.0 | 0.7072 | 0.2881 |
| 0.6595 | 10.0 | 40 | 1.5864 | 0.0056 | 5664.5748 | 3926.3841 | 1596.0 | 2475.0 | 0.6448 | 1038.0 | 0.4194 | 235.0 | 484.0 | 1196.0 | 0.4047 | 0.1965 | 795.0 | 1112.0 | 1267.0 | 0.8777 | 0.6275 |
| 0.0032 | 11.0 | 44 | 2.6428 | 0.0056 | 9436.7009 | 6541.0226 | 1607.0 | 2475.0 | 0.6493 | 1448.0 | 0.5851 | 1061.0 | 1143.0 | 1196.0 | 0.9557 | 0.8871 | 379.0 | 464.0 | 1267.0 | 0.3662 | 0.2991 |
| 0.0073 | 12.0 | 48 | 1.2794 | 0.0056 | 4568.1914 | 3166.4290 | 1676.0 | 2475.0 | 0.6772 | 1246.0 | 0.5034 | 386.0 | 669.0 | 1196.0 | 0.5594 | 0.3227 | 852.0 | 1007.0 | 1267.0 | 0.7948 | 0.6725 |
| 0.6595 | 13.0 | 52 | 1.7627 | 0.0056 | 6293.8866 | 4362.5897 | 1689.0 | 2475.0 | 0.6824 | 724.0 | 0.2925 | 326.0 | 1020.0 | 1196.0 | 0.8528 | 0.2726 | 390.0 | 669.0 | 1267.0 | 0.5280 | 0.3078 |
| 0.0004 | 14.0 | 56 | 1.6203 | 0.0056 | 5785.5413 | 4010.2316 | 1772.0 | 2475.0 | 0.7160 | 1422.0 | 0.5745 | 705.0 | 933.0 | 1196.0 | 0.7801 | 0.5895 | 709.0 | 839.0 | 1267.0 | 0.6622 | 0.5596 |
| 0.0 | 15.0 | 60 | 1.5202 | 0.0056 | 5428.0840 | 3762.4611 | 1781.0 | 2475.0 | 0.7196 | 1626.0 | 0.6570 | 823.0 | 922.0 | 1196.0 | 0.7709 | 0.6881 | 795.0 | 859.0 | 1267.0 | 0.6780 | 0.6275 |
| 0.0 | 16.0 | 64 | 1.6540 | 0.0056 | 5905.7779 | 4093.5733 | 1745.0 | 2475.0 | 0.7051 | 1648.0 | 0.6659 | 880.0 | 937.0 | 1196.0 | 0.7834 | 0.7358 | 760.0 | 808.0 | 1267.0 | 0.6377 | 0.5998 |
| 0.0 | 17.0 | 68 | 1.9777 | 0.0056 | 7061.7979 | 4894.8653 | 1734.0 | 2475.0 | 0.7006 | 1657.0 | 0.6695 | 977.0 | 1019.0 | 1196.0 | 0.8520 | 0.8169 | 672.0 | 715.0 | 1267.0 | 0.5643 | 0.5304 |
| 0.0 | 18.0 | 72 | 2.0402 | 0.0056 | 7284.7708 | 5049.4183 | 1728.0 | 2475.0 | 0.6982 | 1671.0 | 0.6752 | 974.0 | 1009.0 | 1196.0 | 0.8436 | 0.8144 | 689.0 | 719.0 | 1267.0 | 0.5675 | 0.5438 |
| 0.6534 | 19.0 | 76 | 2.0422 | 0.0056 | 7292.1840 | 5054.5568 | 1724.0 | 2475.0 | 0.6966 | 1682.0 | 0.6796 | 965.0 | 995.0 | 1196.0 | 0.8319 | 0.8069 | 709.0 | 729.0 | 1267.0 | 0.5754 | 0.5596 |
| 0.0 | 20.0 | 80 | 2.0333 | 0.0056 | 7260.3879 | 5032.5174 | 1724.0 | 2475.0 | 0.6966 | 1686.0 | 0.6812 | 954.0 | 983.0 | 1196.0 | 0.8219 | 0.7977 | 724.0 | 741.0 | 1267.0 | 0.5848 | 0.5714 |
| 1.3069 | 21.0 | 84 | 2.0283 | 0.0056 | 7242.2577 | 5019.9505 | 1717.0 | 2475.0 | 0.6937 | 1682.0 | 0.6796 | 945.0 | 972.0 | 1196.0 | 0.8127 | 0.7901 | 729.0 | 745.0 | 1267.0 | 0.5880 | 0.5754 |
| 0.0 | 22.0 | 88 | 2.0281 | 0.0056 | 7241.5495 | 5019.4596 | 1718.0 | 2475.0 | 0.6941 | 1685.0 | 0.6808 | 943.0 | 968.0 | 1196.0 | 0.8094 | 0.7885 | 734.0 | 750.0 | 1267.0 | 0.5919 | 0.5793 |
| 0.0 | 23.0 | 92 | 2.0230 | 0.0056 | 7223.3657 | 5006.8555 | 1718.0 | 2475.0 | 0.6941 | 1684.0 | 0.6804 | 941.0 | 967.0 | 1196.0 | 0.8085 | 0.7868 | 735.0 | 751.0 | 1267.0 | 0.5927 | 0.5801 |
| 0.0 | 24.0 | 96 | 2.0215 | 0.0056 | 7218.1816 | 5003.2623 | 1719.0 | 2475.0 | 0.6945 | 1688.0 | 0.6820 | 943.0 | 966.0 | 1196.0 | 0.8077 | 0.7885 | 737.0 | 753.0 | 1267.0 | 0.5943 | 0.5817 |
| 0.6534 | 25.0 | 100 | 2.0177 | 0.0056 | 7204.5124 | 4993.7874 | 1720.0 | 2475.0 | 0.6949 | 1689.0 | 0.6824 | 940.0 | 963.0 | 1196.0 | 0.8052 | 0.7860 | 741.0 | 757.0 | 1267.0 | 0.5975 | 0.5848 |
| 0.0 | 26.0 | 104 | 2.0187 | 0.0056 | 7208.0183 | 4996.2176 | 1712.0 | 2475.0 | 0.6917 | 1679.0 | 0.6784 | 932.0 | 956.0 | 1196.0 | 0.7993 | 0.7793 | 739.0 | 756.0 | 1267.0 | 0.5967 | 0.5833 |
| 0.6534 | 27.0 | 108 | 2.0148 | 0.0056 | 7194.3356 | 4986.7335 | 1714.0 | 2475.0 | 0.6925 | 1684.0 | 0.6804 | 935.0 | 958.0 | 1196.0 | 0.8010 | 0.7818 | 741.0 | 756.0 | 1267.0 | 0.5967 | 0.5848 |
| 0.0 | 28.0 | 112 | 2.0173 | 0.0056 | 7203.1509 | 4992.8437 | 1719.0 | 2475.0 | 0.6945 | 1689.0 | 0.6824 | 933.0 | 955.0 | 1196.0 | 0.7985 | 0.7801 | 748.0 | 764.0 | 1267.0 | 0.6030 | 0.5904 |
| 0.0 | 29.0 | 116 | 2.0147 | 0.0056 | 7193.9318 | 4986.4536 | 1721.0 | 2475.0 | 0.6954 | 1687.0 | 0.6816 | 933.0 | 958.0 | 1196.0 | 0.8010 | 0.7801 | 746.0 | 763.0 | 1267.0 | 0.6022 | 0.5888 |
| 0.0 | 30.0 | 120 | 2.0154 | 0.0056 | 7196.4785 | 4988.2188 | 1718.0 | 2475.0 | 0.6941 | 1687.0 | 0.6816 | 930.0 | 954.0 | 1196.0 | 0.7977 | 0.7776 | 749.0 | 764.0 | 1267.0 | 0.6030 | 0.5912 |
| 0.0 | 31.0 | 124 | 2.0145 | 0.0056 | 7193.1358 | 4985.9018 | 1712.0 | 2475.0 | 0.6917 | 1680.0 | 0.6788 | 927.0 | 950.0 | 1196.0 | 0.7943 | 0.7751 | 745.0 | 762.0 | 1267.0 | 0.6014 | 0.5880 |
| 0.0 | 32.0 | 128 | 2.0136 | 0.0056 | 7189.8018 | 4983.5909 | 1715.0 | 2475.0 | 0.6929 | 1686.0 | 0.6812 | 931.0 | 952.0 | 1196.0 | 0.7960 | 0.7784 | 747.0 | 763.0 | 1267.0 | 0.6022 | 0.5896 |
| 0.0 | 33.0 | 132 | 2.0152 | 0.0056 | 7195.4569 | 4987.5107 | 1713.0 | 2475.0 | 0.6921 | 1683.0 | 0.68 | 928.0 | 951.0 | 1196.0 | 0.7952 | 0.7759 | 747.0 | 762.0 | 1267.0 | 0.6014 | 0.5896 |
| 0.6534 | 34.0 | 136 | 2.0129 | 0.0056 | 7187.5236 | 4982.0117 | 1715.0 | 2475.0 | 0.6929 | 1683.0 | 0.68 | 926.0 | 948.0 | 1196.0 | 0.7926 | 0.7742 | 749.0 | 767.0 | 1267.0 | 0.6054 | 0.5912 |
| 0.0 | 35.0 | 140 | 2.0173 | 0.0056 | 7203.1617 | 4992.8512 | 1717.0 | 2475.0 | 0.6937 | 1686.0 | 0.6812 | 924.0 | 947.0 | 1196.0 | 0.7918 | 0.7726 | 754.0 | 770.0 | 1267.0 | 0.6077 | 0.5951 |
| 0.6534 | 36.0 | 144 | 2.0156 | 0.0056 | 7197.1919 | 4988.7132 | 1717.0 | 2475.0 | 0.6937 | 1689.0 | 0.6824 | 930.0 | 951.0 | 1196.0 | 0.7952 | 0.7776 | 751.0 | 766.0 | 1267.0 | 0.6046 | 0.5927 |
| 0.6534 | 37.0 | 148 | 2.0148 | 0.0056 | 7194.3476 | 4986.7418 | 1716.0 | 2475.0 | 0.6933 | 1687.0 | 0.6816 | 928.0 | 949.0 | 1196.0 | 0.7935 | 0.7759 | 751.0 | 767.0 | 1267.0 | 0.6054 | 0.5927 |
| 0.6534 | 38.0 | 152 | 2.0180 | 0.0056 | 7205.7094 | 4994.6171 | 1718.0 | 2475.0 | 0.6941 | 1687.0 | 0.6816 | 926.0 | 948.0 | 1196.0 | 0.7926 | 0.7742 | 753.0 | 770.0 | 1267.0 | 0.6077 | 0.5943 |
| 0.6534 | 39.0 | 156 | 2.0146 | 0.0056 | 7193.5903 | 4986.2168 | 1719.0 | 2475.0 | 0.6945 | 1691.0 | 0.6832 | 929.0 | 949.0 | 1196.0 | 0.7935 | 0.7768 | 754.0 | 770.0 | 1267.0 | 0.6077 | 0.5951 |
| 0.0 | 40.0 | 160 | 2.0167 | 0.0056 | 7201.0656 | 4991.3983 | 1717.0 | 2475.0 | 0.6937 | 1687.0 | 0.6816 | 928.0 | 950.0 | 1196.0 | 0.7943 | 0.7759 | 751.0 | 767.0 | 1267.0 | 0.6054 | 0.5927 |
| 0.6535 | 41.0 | 164 | 2.0178 | 0.0056 | 7205.0332 | 4994.1485 | 1718.0 | 2475.0 | 0.6941 | 1685.0 | 0.6808 | 926.0 | 948.0 | 1196.0 | 0.7926 | 0.7742 | 751.0 | 770.0 | 1267.0 | 0.6077 | 0.5927 |
| 0.0 | 42.0 | 168 | 2.0133 | 0.0056 | 7188.8401 | 4982.9242 | 1718.0 | 2475.0 | 0.6941 | 1688.0 | 0.6820 | 927.0 | 949.0 | 1196.0 | 0.7935 | 0.7751 | 753.0 | 769.0 | 1267.0 | 0.6069 | 0.5943 |
| 0.0 | 43.0 | 172 | 2.0141 | 0.0056 | 7191.6229 | 4984.8531 | 1715.0 | 2475.0 | 0.6929 | 1685.0 | 0.6808 | 926.0 | 948.0 | 1196.0 | 0.7926 | 0.7742 | 751.0 | 767.0 | 1267.0 | 0.6054 | 0.5927 |
| 0.6534 | 44.0 | 176 | 2.0150 | 0.0056 | 7194.8591 | 4987.0963 | 1717.0 | 2475.0 | 0.6937 | 1687.0 | 0.6816 | 926.0 | 948.0 | 1196.0 | 0.7926 | 0.7742 | 753.0 | 769.0 | 1267.0 | 0.6069 | 0.5943 |
| 0.0 | 45.0 | 180 | 2.0143 | 0.0056 | 7192.3952 | 4985.3885 | 1723.0 | 2475.0 | 0.6962 | 1694.0 | 0.6844 | 927.0 | 949.0 | 1196.0 | 0.7935 | 0.7751 | 759.0 | 774.0 | 1267.0 | 0.6109 | 0.5991 |
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1