GSM8K-Binary_Llama-3.2-1B-8kwse8de

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unspecified dataset (the model name suggests a binary-answer variant of GSM8K). It achieves the following results on the evaluation set; metrics suffixed with 34192 or 41568 are per-label breakdowns, the suffixes apparently being the token IDs of the two answer labels:

  • Loss: 1.4787
  • Model Preparation Time: 0.0059
  • MDL: 5279.8389
  • Accumulated Loss: 3659.7055
  • Correct Preds: 1822.0
  • Total Preds: 2475.0
  • Accuracy: 0.7362
  • Correct Gen Preds: 1743.0
  • Gen Accuracy: 0.7042
  • Correct Gen Preds 34192: 834.0
  • Correct Preds 34192: 870.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.7274
  • Gen Accuracy 34192: 0.6973
  • Correct Gen Preds 41568: 900.0
  • Correct Preds 41568: 952.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.7514
  • Gen Accuracy 41568: 0.7103
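
For reference, a minimal loading/inference sketch. Assumptions: the checkpoint lives in the Hub repo donoway/GSM8K-Binary_Llama-3.2-1B-8kwse8de, and since the prompt format and answer-label tokens used during fine-tuning are not documented here, the prompt below is a placeholder:

```python
# Minimal sketch, not the documented usage: the fine-tuning prompt format and
# label tokens are unknown, so treat this as illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/GSM8K-Binary_Llama-3.2-1B-8kwse8de"  # Hub repo (assumed)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = "..."  # a GSM8K-style question, formatted as during training (format unknown)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=8, do_sample=False)
print(tokenizer.decode(output_ids[0, inputs["input_ids"].shape[1]:]))
```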

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
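
As a rough guide only, here is how these settings would map onto transformers.TrainingArguments; the output_dir and all dataset/Trainer wiring are assumptions, not taken from the actual training script:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above (Transformers 4.51-era argument
# names); output_dir is a placeholder, and the per_device_* values assume
# single-device training so that they equal the listed batch sizes.
training_args = TrainingArguments(
    output_dir="GSM8K-Binary_Llama-3.2-1B-8kwse8de",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",          # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```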

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | MDL | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No log | 0 | 0 | 1.4656 | 0.0059 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.7404 | 1.0 | 5 | 0.7519 | 0.0059 | 2684.8085 | 1860.9674 | 1301.0 | 2475.0 | 0.5257 | 9.0 | 0.0036 | 0.0 | 97.0 | 1196.0 | 0.0811 | 0.0 | 1.0 | 1204.0 | 1267.0 | 0.9503 | 0.0008 |
| 1.4345 | 2.0 | 10 | 0.6475 | 0.0059 | 2312.1054 | 1602.6293 | 1678.0 | 2475.0 | 0.6780 | 8.0 | 0.0032 | 0.0 | 926.0 | 1196.0 | 0.7742 | 0.0 | 0.0 | 752.0 | 1267.0 | 0.5935 | 0.0 |
| 0.3056 | 3.0 | 15 | 0.6249 | 0.0059 | 2231.2720 | 1546.5999 | 1767.0 | 2475.0 | 0.7139 | 9.0 | 0.0036 | 1.0 | 906.0 | 1196.0 | 0.7575 | 0.0008 | 0.0 | 861.0 | 1267.0 | 0.6796 | 0.0 |
| 0.3324 | 4.0 | 20 | 0.6716 | 0.0059 | 2398.2346 | 1662.3295 | 1794.0 | 2475.0 | 0.7248 | 125.0 | 0.0505 | 9.0 | 831.0 | 1196.0 | 0.6948 | 0.0075 | 108.0 | 963.0 | 1267.0 | 0.7601 | 0.0852 |
| 0.7534 | 5.0 | 25 | 1.2676 | 0.0059 | 4526.0621 | 3137.2272 | 1499.0 | 2475.0 | 0.6057 | 932.0 | 0.3766 | 82.0 | 267.0 | 1196.0 | 0.2232 | 0.0686 | 842.0 | 1232.0 | 1267.0 | 0.9724 | 0.6646 |
| 0.2081 | 6.0 | 30 | 1.5980 | 0.0059 | 5705.7968 | 3954.9570 | 1505.0 | 2475.0 | 0.6081 | 703.0 | 0.2840 | 618.0 | 1175.0 | 1196.0 | 0.9824 | 0.5167 | 77.0 | 330.0 | 1267.0 | 0.2605 | 0.0608 |
| 0.082 | 7.0 | 35 | 1.1486 | 0.0059 | 4101.3733 | 2842.8553 | 1612.0 | 2475.0 | 0.6513 | 992.0 | 0.4008 | 120.0 | 449.0 | 1196.0 | 0.3754 | 0.1003 | 863.0 | 1163.0 | 1267.0 | 0.9179 | 0.6811 |
| 0.6616 | 8.0 | 40 | 1.2311 | 0.0059 | 4395.8751 | 3046.9884 | 1779.0 | 2475.0 | 0.7188 | 1492.0 | 0.6028 | 826.0 | 1015.0 | 1196.0 | 0.8487 | 0.6906 | 657.0 | 764.0 | 1267.0 | 0.6030 | 0.5185 |
| 0.0017 | 9.0 | 45 | 1.6432 | 0.0059 | 5867.3174 | 4066.9145 | 1756.0 | 2475.0 | 0.7095 | 1610.0 | 0.6505 | 923.0 | 1023.0 | 1196.0 | 0.8554 | 0.7717 | 678.0 | 733.0 | 1267.0 | 0.5785 | 0.5351 |
| 0.0001 | 10.0 | 50 | 2.1381 | 0.0059 | 7634.3190 | 5291.7067 | 1718.0 | 2475.0 | 0.6941 | 1546.0 | 0.6246 | 983.0 | 1082.0 | 1196.0 | 0.9047 | 0.8219 | 554.0 | 636.0 | 1267.0 | 0.5020 | 0.4373 |
| 0.0001 | 11.0 | 55 | 1.4472 | 0.0059 | 5167.3448 | 3581.7305 | 1813.0 | 2475.0 | 0.7325 | 1610.0 | 0.6505 | 792.0 | 898.0 | 1196.0 | 0.7508 | 0.6622 | 809.0 | 915.0 | 1267.0 | 0.7222 | 0.6385 |
| 0.0 | 12.0 | 60 | 1.4471 | 0.0059 | 5167.1333 | 3581.5839 | 1815.0 | 2475.0 | 0.7333 | 1670.0 | 0.6747 | 770.0 | 835.0 | 1196.0 | 0.6982 | 0.6438 | 891.0 | 980.0 | 1267.0 | 0.7735 | 0.7032 |
| 0.0 | 13.0 | 65 | 1.4645 | 0.0059 | 5229.3257 | 3624.6924 | 1820.0 | 2475.0 | 0.7354 | 1726.0 | 0.6974 | 812.0 | 852.0 | 1196.0 | 0.7124 | 0.6789 | 905.0 | 968.0 | 1267.0 | 0.7640 | 0.7143 |
| 1.3069 | 14.0 | 70 | 1.4787 | 0.0059 | 5279.8389 | 3659.7055 | 1822.0 | 2475.0 | 0.7362 | 1743.0 | 0.7042 | 834.0 | 870.0 | 1196.0 | 0.7274 | 0.6973 | 900.0 | 952.0 | 1267.0 | 0.7514 | 0.7103 |
| 0.6534 | 15.0 | 75 | 1.4931 | 0.0059 | 5331.4229 | 3695.4608 | 1820.0 | 2475.0 | 0.7354 | 1757.0 | 0.7099 | 859.0 | 888.0 | 1196.0 | 0.7425 | 0.7182 | 889.0 | 932.0 | 1267.0 | 0.7356 | 0.7017 |
| 0.6535 | 16.0 | 80 | 1.5030 | 0.0059 | 5366.7260 | 3719.9310 | 1818.0 | 2475.0 | 0.7345 | 1766.0 | 0.7135 | 869.0 | 893.0 | 1196.0 | 0.7467 | 0.7266 | 888.0 | 925.0 | 1267.0 | 0.7301 | 0.7009 |
| 0.0 | 17.0 | 85 | 1.5122 | 0.0059 | 5399.3942 | 3742.5749 | 1820.0 | 2475.0 | 0.7354 | 1767.0 | 0.7139 | 874.0 | 898.0 | 1196.0 | 0.7508 | 0.7308 | 884.0 | 922.0 | 1267.0 | 0.7277 | 0.6977 |
| 0.0 | 18.0 | 90 | 1.5168 | 0.0059 | 5415.9772 | 3754.0693 | 1822.0 | 2475.0 | 0.7362 | 1772.0 | 0.7160 | 879.0 | 902.0 | 1196.0 | 0.7542 | 0.7349 | 884.0 | 920.0 | 1267.0 | 0.7261 | 0.6977 |
| 0.0 | 19.0 | 95 | 1.5232 | 0.0059 | 5438.9175 | 3769.9703 | 1822.0 | 2475.0 | 0.7362 | 1774.0 | 0.7168 | 881.0 | 903.0 | 1196.0 | 0.7550 | 0.7366 | 884.0 | 919.0 | 1267.0 | 0.7253 | 0.6977 |
| 0.0 | 20.0 | 100 | 1.5241 | 0.0059 | 5442.2286 | 3772.2654 | 1819.0 | 2475.0 | 0.7349 | 1771.0 | 0.7156 | 884.0 | 905.0 | 1196.0 | 0.7567 | 0.7391 | 878.0 | 914.0 | 1267.0 | 0.7214 | 0.6930 |
| 0.0 | 21.0 | 105 | 1.5278 | 0.0059 | 5455.2160 | 3781.2676 | 1821.0 | 2475.0 | 0.7358 | 1778.0 | 0.7184 | 884.0 | 905.0 | 1196.0 | 0.7567 | 0.7391 | 885.0 | 916.0 | 1267.0 | 0.7230 | 0.6985 |
| 0.6535 | 22.0 | 110 | 1.5296 | 0.0059 | 5461.6471 | 3785.7253 | 1819.0 | 2475.0 | 0.7349 | 1776.0 | 0.7176 | 887.0 | 907.0 | 1196.0 | 0.7584 | 0.7416 | 880.0 | 912.0 | 1267.0 | 0.7198 | 0.6946 |
| 0.0 | 23.0 | 115 | 1.5328 | 0.0059 | 5473.0012 | 3793.5954 | 1821.0 | 2475.0 | 0.7358 | 1782.0 | 0.72 | 888.0 | 907.0 | 1196.0 | 0.7584 | 0.7425 | 885.0 | 914.0 | 1267.0 | 0.7214 | 0.6985 |
| 0.0 | 24.0 | 120 | 1.5339 | 0.0059 | 5477.0890 | 3796.4288 | 1821.0 | 2475.0 | 0.7358 | 1778.0 | 0.7184 | 889.0 | 910.0 | 1196.0 | 0.7609 | 0.7433 | 880.0 | 911.0 | 1267.0 | 0.7190 | 0.6946 |
| 1.3069 | 25.0 | 125 | 1.5357 | 0.0059 | 5483.3601 | 3800.7756 | 1818.0 | 2475.0 | 0.7345 | 1777.0 | 0.7180 | 886.0 | 907.0 | 1196.0 | 0.7584 | 0.7408 | 882.0 | 911.0 | 1267.0 | 0.7190 | 0.6961 |
| 0.0 | 26.0 | 130 | 1.5390 | 0.0059 | 5495.1006 | 3808.9135 | 1820.0 | 2475.0 | 0.7354 | 1779.0 | 0.7188 | 888.0 | 909.0 | 1196.0 | 0.7600 | 0.7425 | 882.0 | 911.0 | 1267.0 | 0.7190 | 0.6961 |
| 0.6534 | 27.0 | 135 | 1.5373 | 0.0059 | 5489.3342 | 3804.9165 | 1820.0 | 2475.0 | 0.7354 | 1782.0 | 0.72 | 889.0 | 908.0 | 1196.0 | 0.7592 | 0.7433 | 884.0 | 912.0 | 1267.0 | 0.7198 | 0.6977 |
| 0.0 | 28.0 | 140 | 1.5419 | 0.0059 | 5505.6494 | 3816.2253 | 1822.0 | 2475.0 | 0.7362 | 1780.0 | 0.7192 | 890.0 | 911.0 | 1196.0 | 0.7617 | 0.7441 | 881.0 | 911.0 | 1267.0 | 0.7190 | 0.6953 |
| 0.0 | 29.0 | 145 | 1.5433 | 0.0059 | 5510.5924 | 3819.6516 | 1821.0 | 2475.0 | 0.7358 | 1779.0 | 0.7188 | 889.0 | 910.0 | 1196.0 | 0.7609 | 0.7433 | 881.0 | 911.0 | 1267.0 | 0.7190 | 0.6953 |
| 0.0 | 30.0 | 150 | 1.5439 | 0.0059 | 5512.6644 | 3821.0878 | 1819.0 | 2475.0 | 0.7349 | 1777.0 | 0.7180 | 889.0 | 909.0 | 1196.0 | 0.7600 | 0.7433 | 879.0 | 910.0 | 1267.0 | 0.7182 | 0.6938 |
| 0.0 | 31.0 | 155 | 1.5443 | 0.0059 | 5514.1591 | 3822.1238 | 1820.0 | 2475.0 | 0.7354 | 1781.0 | 0.7196 | 890.0 | 911.0 | 1196.0 | 0.7617 | 0.7441 | 882.0 | 909.0 | 1267.0 | 0.7174 | 0.6961 |
| 0.6534 | 32.0 | 160 | 1.5471 | 0.0059 | 5524.2001 | 3829.0837 | 1820.0 | 2475.0 | 0.7354 | 1776.0 | 0.7176 | 891.0 | 912.0 | 1196.0 | 0.7625 | 0.7450 | 876.0 | 908.0 | 1267.0 | 0.7167 | 0.6914 |
| 0.0 | 33.0 | 165 | 1.5472 | 0.0059 | 5524.7178 | 3829.4426 | 1821.0 | 2475.0 | 0.7358 | 1778.0 | 0.7184 | 891.0 | 912.0 | 1196.0 | 0.7625 | 0.7450 | 878.0 | 909.0 | 1267.0 | 0.7174 | 0.6930 |
| 0.0 | 34.0 | 170 | 1.5496 | 0.0059 | 5533.2649 | 3835.3670 | 1817.0 | 2475.0 | 0.7341 | 1777.0 | 0.7180 | 890.0 | 911.0 | 1196.0 | 0.7617 | 0.7441 | 878.0 | 906.0 | 1267.0 | 0.7151 | 0.6930 |
| 0.0 | 35.0 | 175 | 1.5519 | 0.0059 | 5541.3527 | 3840.9730 | 1820.0 | 2475.0 | 0.7354 | 1780.0 | 0.7192 | 890.0 | 910.0 | 1196.0 | 0.7609 | 0.7441 | 881.0 | 910.0 | 1267.0 | 0.7182 | 0.6953 |
| 0.0 | 36.0 | 180 | 1.5514 | 0.0059 | 5539.5094 | 3839.6954 | 1820.0 | 2475.0 | 0.7354 | 1781.0 | 0.7196 | 891.0 | 912.0 | 1196.0 | 0.7625 | 0.7450 | 881.0 | 908.0 | 1267.0 | 0.7167 | 0.6953 |
| 0.0 | 37.0 | 185 | 1.5539 | 0.0059 | 5548.5974 | 3845.9946 | 1819.0 | 2475.0 | 0.7349 | 1780.0 | 0.7192 | 891.0 | 912.0 | 1196.0 | 0.7625 | 0.7450 | 880.0 | 907.0 | 1267.0 | 0.7159 | 0.6946 |
| 0.0 | 38.0 | 190 | 1.5534 | 0.0059 | 5546.8413 | 3844.7774 | 1819.0 | 2475.0 | 0.7349 | 1781.0 | 0.7196 | 892.0 | 912.0 | 1196.0 | 0.7625 | 0.7458 | 880.0 | 907.0 | 1267.0 | 0.7159 | 0.6946 |
| 0.0 | 39.0 | 195 | 1.5541 | 0.0059 | 5549.1300 | 3846.3638 | 1820.0 | 2475.0 | 0.7354 | 1781.0 | 0.7196 | 892.0 | 912.0 | 1196.0 | 0.7625 | 0.7458 | 880.0 | 908.0 | 1267.0 | 0.7167 | 0.6946 |
| 0.0 | 40.0 | 200 | 1.5561 | 0.0059 | 5556.3793 | 3851.3886 | 1821.0 | 2475.0 | 0.7358 | 1785.0 | 0.7212 | 894.0 | 914.0 | 1196.0 | 0.7642 | 0.7475 | 882.0 | 907.0 | 1267.0 | 0.7159 | 0.6961 |
| 0.6534 | 41.0 | 205 | 1.5581 | 0.0059 | 5563.5837 | 3856.3823 | 1815.0 | 2475.0 | 0.7333 | 1778.0 | 0.7184 | 891.0 | 910.0 | 1196.0 | 0.7609 | 0.7450 | 878.0 | 905.0 | 1267.0 | 0.7143 | 0.6930 |
| 0.6534 | 42.0 | 210 | 1.5582 | 0.0059 | 5563.8211 | 3856.5469 | 1819.0 | 2475.0 | 0.7349 | 1784.0 | 0.7208 | 893.0 | 913.0 | 1196.0 | 0.7634 | 0.7467 | 882.0 | 906.0 | 1267.0 | 0.7151 | 0.6961 |
| 0.6534 | 43.0 | 215 | 1.5591 | 0.0059 | 5566.9433 | 3858.7111 | 1819.0 | 2475.0 | 0.7349 | 1784.0 | 0.7208 | 895.0 | 915.0 | 1196.0 | 0.7651 | 0.7483 | 880.0 | 904.0 | 1267.0 | 0.7135 | 0.6946 |
| 0.0 | 44.0 | 220 | 1.5600 | 0.0059 | 5570.2078 | 3860.9738 | 1818.0 | 2475.0 | 0.7345 | 1779.0 | 0.7188 | 893.0 | 913.0 | 1196.0 | 0.7634 | 0.7467 | 877.0 | 905.0 | 1267.0 | 0.7143 | 0.6922 |
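
The accuracy columns are plain ratios of correct to total predictions and can be verified directly; for example, against the epoch-14 row, the checkpoint whose metrics match the headline evaluation results above:

```python
# Epoch-14 row: overall and per-label accuracy as correct / total.
print(round(1822 / 2475, 4))  # 0.7362  (Accuracy)
print(round(870 / 1196, 4))   # 0.7274  (Accuracy 34192)
print(round(952 / 1267, 4))   # 0.7514  (Accuracy 41568)
```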

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1