GSM8K-Binary_Llama-3.2-1B-qpuz1des

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch and notes on the metric names follow the list):

  • Loss: 1.5202
  • Model Preparation Time: 0.0056
  • Mdl: 5428.0840
  • Accumulated Loss: 3762.4611
  • Correct Preds: 1781.0
  • Total Preds: 2475.0
  • Accuracy: 0.7196
  • Correct Gen Preds: 1626.0
  • Gen Accuracy: 0.6570
  • Correct Gen Preds 34192: 823.0
  • Correct Preds 34192: 922.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.7709
  • Gen Accuracy 34192: 0.6881
  • Correct Gen Preds 41568: 795.0
  • Correct Preds 41568: 859.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.6780
  • Gen Accuracy 41568: 0.6275
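In these metrics, Accuracy is Correct Preds / Total Preds (1781 / 2475 ≈ 0.7196) and Gen Accuracy is Correct Gen Preds / Total Preds (1626 / 2475 ≈ 0.6570); the suffixed entries (34192, 41568) appear to be per-class breakdowns keyed by label token id, consistent with a binary task. Since the card lacks usage instructions, here is a minimal loading sketch with Transformers; the repo id donoway/GSM8K-Binary_Llama-3.2-1B-qpuz1des and the prompt format are assumptions, not documented by the author.

```python
# Minimal sketch: load the checkpoint and generate a short answer.
# The repo id and prompt template below are assumptions; neither is
# documented in this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "donoway/GSM8K-Binary_Llama-3.2-1B-qpuz1des"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
model.eval()

prompt = "Question: ...\nAnswer:"  # placeholder; the fine-tuning template is unknown
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```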

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
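
These settings map onto Transformers `TrainingArguments` roughly as below; this is a hedged sketch assuming a standard Trainer setup, not the author's actual training script.

```python
# Hedged sketch of TrainingArguments matching the listed hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="GSM8K-Binary_Llama-3.2-1B-qpuz1des",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",          # AdamW; betas=(0.9, 0.999) and eps=1e-8 are the defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```

Note that although num_epochs is 100, the results table below ends at epoch 45 (180 steps), so training appears to have stopped early; the stopping criterion is not documented.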

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0056 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.7213 | 1.0 | 4 | 1.6676 | 0.0056 | 5954.2770 | 4127.1903 | 1196.0 | 2475.0 | 0.4832 | 659.0 | 0.2663 | 651.0 | 1196.0 | 1196.0 | 1.0 | 0.5443 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 1.531 | 2.0 | 8 | 1.8297 | 0.0056 | 6533.4012 | 4528.6086 | 1196.0 | 2475.0 | 0.4832 | 8.0 | 0.0032 | 0.0 | 1196.0 | 1196.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.6658 | 3.0 | 12 | 0.8914 | 0.0056 | 3182.8541 | 2206.1864 | 1196.0 | 2475.0 | 0.4832 | 8.0 | 0.0032 | 0.0 | 1196.0 | 1196.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.6605 | 4.0 | 16 | 0.9261 | 0.0056 | 3306.8536 | 2292.1362 | 1287.0 | 2475.0 | 0.52 | 81.0 | 0.0327 | 0.0 | 23.0 | 1196.0 | 0.0192 | 0.0 | 73.0 | 1264.0 | 1267.0 | 0.9976 | 0.0576 |
| 0.4214 | 5.0 | 20 | 0.6675 | 0.0056 | 2383.2606 | 1651.9503 | 1609.0 | 2475.0 | 0.6501 | 16.0 | 0.0065 | 0.0 | 576.0 | 1196.0 | 0.4816 | 0.0 | 8.0 | 1033.0 | 1267.0 | 0.8153 | 0.0063 |
| 0.2547 | 6.0 | 24 | 0.9045 | 0.0056 | 3229.5997 | 2238.5879 | 1563.0 | 2475.0 | 0.6315 | 88.0 | 0.0356 | 0.0 | 509.0 | 1196.0 | 0.4256 | 0.0 | 80.0 | 1054.0 | 1267.0 | 0.8319 | 0.0631 |
| 0.7195 | 7.0 | 28 | 0.8937 | 0.0056 | 3191.0282 | 2211.8522 | 1726.0 | 2475.0 | 0.6974 | 206.0 | 0.0832 | 26.0 | 938.0 | 1196.0 | 0.7843 | 0.0217 | 172.0 | 788.0 | 1267.0 | 0.6219 | 0.1358 |
| 0.0823 | 8.0 | 32 | 1.1088 | 0.0056 | 3959.1638 | 2744.2832 | 1662.0 | 2475.0 | 0.6715 | 458.0 | 0.1851 | 42.0 | 664.0 | 1196.0 | 0.5552 | 0.0351 | 408.0 | 998.0 | 1267.0 | 0.7877 | 0.3220 |
| 0.0095 | 9.0 | 36 | 1.3722 | 0.0056 | 4899.6036 | 3396.1464 | 1597.0 | 2475.0 | 0.6453 | 496.0 | 0.2004 | 123.0 | 701.0 | 1196.0 | 0.5861 | 0.1028 | 365.0 | 896.0 | 1267.0 | 0.7072 | 0.2881 |
| 0.6595 | 10.0 | 40 | 1.5864 | 0.0056 | 5664.5748 | 3926.3841 | 1596.0 | 2475.0 | 0.6448 | 1038.0 | 0.4194 | 235.0 | 484.0 | 1196.0 | 0.4047 | 0.1965 | 795.0 | 1112.0 | 1267.0 | 0.8777 | 0.6275 |
| 0.0032 | 11.0 | 44 | 2.6428 | 0.0056 | 9436.7009 | 6541.0226 | 1607.0 | 2475.0 | 0.6493 | 1448.0 | 0.5851 | 1061.0 | 1143.0 | 1196.0 | 0.9557 | 0.8871 | 379.0 | 464.0 | 1267.0 | 0.3662 | 0.2991 |
| 0.0073 | 12.0 | 48 | 1.2794 | 0.0056 | 4568.1914 | 3166.4290 | 1676.0 | 2475.0 | 0.6772 | 1246.0 | 0.5034 | 386.0 | 669.0 | 1196.0 | 0.5594 | 0.3227 | 852.0 | 1007.0 | 1267.0 | 0.7948 | 0.6725 |
| 0.6595 | 13.0 | 52 | 1.7627 | 0.0056 | 6293.8866 | 4362.5897 | 1689.0 | 2475.0 | 0.6824 | 724.0 | 0.2925 | 326.0 | 1020.0 | 1196.0 | 0.8528 | 0.2726 | 390.0 | 669.0 | 1267.0 | 0.5280 | 0.3078 |
| 0.0004 | 14.0 | 56 | 1.6203 | 0.0056 | 5785.5413 | 4010.2316 | 1772.0 | 2475.0 | 0.7160 | 1422.0 | 0.5745 | 705.0 | 933.0 | 1196.0 | 0.7801 | 0.5895 | 709.0 | 839.0 | 1267.0 | 0.6622 | 0.5596 |
| 0.0 | 15.0 | 60 | 1.5202 | 0.0056 | 5428.0840 | 3762.4611 | 1781.0 | 2475.0 | 0.7196 | 1626.0 | 0.6570 | 823.0 | 922.0 | 1196.0 | 0.7709 | 0.6881 | 795.0 | 859.0 | 1267.0 | 0.6780 | 0.6275 |
| 0.0 | 16.0 | 64 | 1.6540 | 0.0056 | 5905.7779 | 4093.5733 | 1745.0 | 2475.0 | 0.7051 | 1648.0 | 0.6659 | 880.0 | 937.0 | 1196.0 | 0.7834 | 0.7358 | 760.0 | 808.0 | 1267.0 | 0.6377 | 0.5998 |
| 0.0 | 17.0 | 68 | 1.9777 | 0.0056 | 7061.7979 | 4894.8653 | 1734.0 | 2475.0 | 0.7006 | 1657.0 | 0.6695 | 977.0 | 1019.0 | 1196.0 | 0.8520 | 0.8169 | 672.0 | 715.0 | 1267.0 | 0.5643 | 0.5304 |
| 0.0 | 18.0 | 72 | 2.0402 | 0.0056 | 7284.7708 | 5049.4183 | 1728.0 | 2475.0 | 0.6982 | 1671.0 | 0.6752 | 974.0 | 1009.0 | 1196.0 | 0.8436 | 0.8144 | 689.0 | 719.0 | 1267.0 | 0.5675 | 0.5438 |
| 0.6534 | 19.0 | 76 | 2.0422 | 0.0056 | 7292.1840 | 5054.5568 | 1724.0 | 2475.0 | 0.6966 | 1682.0 | 0.6796 | 965.0 | 995.0 | 1196.0 | 0.8319 | 0.8069 | 709.0 | 729.0 | 1267.0 | 0.5754 | 0.5596 |
| 0.0 | 20.0 | 80 | 2.0333 | 0.0056 | 7260.3879 | 5032.5174 | 1724.0 | 2475.0 | 0.6966 | 1686.0 | 0.6812 | 954.0 | 983.0 | 1196.0 | 0.8219 | 0.7977 | 724.0 | 741.0 | 1267.0 | 0.5848 | 0.5714 |
| 1.3069 | 21.0 | 84 | 2.0283 | 0.0056 | 7242.2577 | 5019.9505 | 1717.0 | 2475.0 | 0.6937 | 1682.0 | 0.6796 | 945.0 | 972.0 | 1196.0 | 0.8127 | 0.7901 | 729.0 | 745.0 | 1267.0 | 0.5880 | 0.5754 |
| 0.0 | 22.0 | 88 | 2.0281 | 0.0056 | 7241.5495 | 5019.4596 | 1718.0 | 2475.0 | 0.6941 | 1685.0 | 0.6808 | 943.0 | 968.0 | 1196.0 | 0.8094 | 0.7885 | 734.0 | 750.0 | 1267.0 | 0.5919 | 0.5793 |
| 0.0 | 23.0 | 92 | 2.0230 | 0.0056 | 7223.3657 | 5006.8555 | 1718.0 | 2475.0 | 0.6941 | 1684.0 | 0.6804 | 941.0 | 967.0 | 1196.0 | 0.8085 | 0.7868 | 735.0 | 751.0 | 1267.0 | 0.5927 | 0.5801 |
| 0.0 | 24.0 | 96 | 2.0215 | 0.0056 | 7218.1816 | 5003.2623 | 1719.0 | 2475.0 | 0.6945 | 1688.0 | 0.6820 | 943.0 | 966.0 | 1196.0 | 0.8077 | 0.7885 | 737.0 | 753.0 | 1267.0 | 0.5943 | 0.5817 |
| 0.6534 | 25.0 | 100 | 2.0177 | 0.0056 | 7204.5124 | 4993.7874 | 1720.0 | 2475.0 | 0.6949 | 1689.0 | 0.6824 | 940.0 | 963.0 | 1196.0 | 0.8052 | 0.7860 | 741.0 | 757.0 | 1267.0 | 0.5975 | 0.5848 |
| 0.0 | 26.0 | 104 | 2.0187 | 0.0056 | 7208.0183 | 4996.2176 | 1712.0 | 2475.0 | 0.6917 | 1679.0 | 0.6784 | 932.0 | 956.0 | 1196.0 | 0.7993 | 0.7793 | 739.0 | 756.0 | 1267.0 | 0.5967 | 0.5833 |
| 0.6534 | 27.0 | 108 | 2.0148 | 0.0056 | 7194.3356 | 4986.7335 | 1714.0 | 2475.0 | 0.6925 | 1684.0 | 0.6804 | 935.0 | 958.0 | 1196.0 | 0.8010 | 0.7818 | 741.0 | 756.0 | 1267.0 | 0.5967 | 0.5848 |
| 0.0 | 28.0 | 112 | 2.0173 | 0.0056 | 7203.1509 | 4992.8437 | 1719.0 | 2475.0 | 0.6945 | 1689.0 | 0.6824 | 933.0 | 955.0 | 1196.0 | 0.7985 | 0.7801 | 748.0 | 764.0 | 1267.0 | 0.6030 | 0.5904 |
| 0.0 | 29.0 | 116 | 2.0147 | 0.0056 | 7193.9318 | 4986.4536 | 1721.0 | 2475.0 | 0.6954 | 1687.0 | 0.6816 | 933.0 | 958.0 | 1196.0 | 0.8010 | 0.7801 | 746.0 | 763.0 | 1267.0 | 0.6022 | 0.5888 |
| 0.0 | 30.0 | 120 | 2.0154 | 0.0056 | 7196.4785 | 4988.2188 | 1718.0 | 2475.0 | 0.6941 | 1687.0 | 0.6816 | 930.0 | 954.0 | 1196.0 | 0.7977 | 0.7776 | 749.0 | 764.0 | 1267.0 | 0.6030 | 0.5912 |
| 0.0 | 31.0 | 124 | 2.0145 | 0.0056 | 7193.1358 | 4985.9018 | 1712.0 | 2475.0 | 0.6917 | 1680.0 | 0.6788 | 927.0 | 950.0 | 1196.0 | 0.7943 | 0.7751 | 745.0 | 762.0 | 1267.0 | 0.6014 | 0.5880 |
| 0.0 | 32.0 | 128 | 2.0136 | 0.0056 | 7189.8018 | 4983.5909 | 1715.0 | 2475.0 | 0.6929 | 1686.0 | 0.6812 | 931.0 | 952.0 | 1196.0 | 0.7960 | 0.7784 | 747.0 | 763.0 | 1267.0 | 0.6022 | 0.5896 |
| 0.0 | 33.0 | 132 | 2.0152 | 0.0056 | 7195.4569 | 4987.5107 | 1713.0 | 2475.0 | 0.6921 | 1683.0 | 0.68 | 928.0 | 951.0 | 1196.0 | 0.7952 | 0.7759 | 747.0 | 762.0 | 1267.0 | 0.6014 | 0.5896 |
| 0.6534 | 34.0 | 136 | 2.0129 | 0.0056 | 7187.5236 | 4982.0117 | 1715.0 | 2475.0 | 0.6929 | 1683.0 | 0.68 | 926.0 | 948.0 | 1196.0 | 0.7926 | 0.7742 | 749.0 | 767.0 | 1267.0 | 0.6054 | 0.5912 |
| 0.0 | 35.0 | 140 | 2.0173 | 0.0056 | 7203.1617 | 4992.8512 | 1717.0 | 2475.0 | 0.6937 | 1686.0 | 0.6812 | 924.0 | 947.0 | 1196.0 | 0.7918 | 0.7726 | 754.0 | 770.0 | 1267.0 | 0.6077 | 0.5951 |
| 0.6534 | 36.0 | 144 | 2.0156 | 0.0056 | 7197.1919 | 4988.7132 | 1717.0 | 2475.0 | 0.6937 | 1689.0 | 0.6824 | 930.0 | 951.0 | 1196.0 | 0.7952 | 0.7776 | 751.0 | 766.0 | 1267.0 | 0.6046 | 0.5927 |
| 0.6534 | 37.0 | 148 | 2.0148 | 0.0056 | 7194.3476 | 4986.7418 | 1716.0 | 2475.0 | 0.6933 | 1687.0 | 0.6816 | 928.0 | 949.0 | 1196.0 | 0.7935 | 0.7759 | 751.0 | 767.0 | 1267.0 | 0.6054 | 0.5927 |
| 0.6534 | 38.0 | 152 | 2.0180 | 0.0056 | 7205.7094 | 4994.6171 | 1718.0 | 2475.0 | 0.6941 | 1687.0 | 0.6816 | 926.0 | 948.0 | 1196.0 | 0.7926 | 0.7742 | 753.0 | 770.0 | 1267.0 | 0.6077 | 0.5943 |
| 0.6534 | 39.0 | 156 | 2.0146 | 0.0056 | 7193.5903 | 4986.2168 | 1719.0 | 2475.0 | 0.6945 | 1691.0 | 0.6832 | 929.0 | 949.0 | 1196.0 | 0.7935 | 0.7768 | 754.0 | 770.0 | 1267.0 | 0.6077 | 0.5951 |
| 0.0 | 40.0 | 160 | 2.0167 | 0.0056 | 7201.0656 | 4991.3983 | 1717.0 | 2475.0 | 0.6937 | 1687.0 | 0.6816 | 928.0 | 950.0 | 1196.0 | 0.7943 | 0.7759 | 751.0 | 767.0 | 1267.0 | 0.6054 | 0.5927 |
| 0.6535 | 41.0 | 164 | 2.0178 | 0.0056 | 7205.0332 | 4994.1485 | 1718.0 | 2475.0 | 0.6941 | 1685.0 | 0.6808 | 926.0 | 948.0 | 1196.0 | 0.7926 | 0.7742 | 751.0 | 770.0 | 1267.0 | 0.6077 | 0.5927 |
| 0.0 | 42.0 | 168 | 2.0133 | 0.0056 | 7188.8401 | 4982.9242 | 1718.0 | 2475.0 | 0.6941 | 1688.0 | 0.6820 | 927.0 | 949.0 | 1196.0 | 0.7935 | 0.7751 | 753.0 | 769.0 | 1267.0 | 0.6069 | 0.5943 |
| 0.0 | 43.0 | 172 | 2.0141 | 0.0056 | 7191.6229 | 4984.8531 | 1715.0 | 2475.0 | 0.6929 | 1685.0 | 0.6808 | 926.0 | 948.0 | 1196.0 | 0.7926 | 0.7742 | 751.0 | 767.0 | 1267.0 | 0.6054 | 0.5927 |
| 0.6534 | 44.0 | 176 | 2.0150 | 0.0056 | 7194.8591 | 4987.0963 | 1717.0 | 2475.0 | 0.6937 | 1687.0 | 0.6816 | 926.0 | 948.0 | 1196.0 | 0.7926 | 0.7742 | 753.0 | 769.0 | 1267.0 | 0.6069 | 0.5943 |
| 0.0 | 45.0 | 180 | 2.0143 | 0.0056 | 7192.3952 | 4985.3885 | 1723.0 | 2475.0 | 0.6962 | 1694.0 | 0.6844 | 927.0 | 949.0 | 1196.0 | 0.7935 | 0.7751 | 759.0 | 774.0 | 1267.0 | 0.6109 | 0.5991 |
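
The reported evaluation results at the top of this card match the epoch-15 row (validation loss 1.5202), suggesting the best checkpoint was restored at the end of training. The loss columns are also internally consistent if Accumulated Loss is the summed cross-entropy over the evaluation set in nats and Mdl (presumably minimum description length) is the same quantity in bits; a quick check under that assumption:

```python
# Consistency check on the epoch-15 row, assuming Accumulated Loss is the
# summed cross-entropy in nats and Mdl is the same sum converted to bits.
import math

val_loss, total_preds = 1.5202, 2475
accumulated_loss = 3762.4611

print(val_loss * total_preds)          # ≈ 3762.5, matches Accumulated Loss up to rounding
print(accumulated_loss / math.log(2))  # ≈ 5428.084, matches the reported Mdl
```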

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1