GSM8K-Binary_Llama-3.2-1B-3bfons7b

This model is a fine-tuned version of meta-llama/Llama-3.2-1B, evidently on a binary (two-class) variant of GSM8K per the model name; the training data is not otherwise documented. It achieves the following results on the evaluation set (these match the epoch-12 checkpoint in the training table below):

  • Loss: 0.9066
  • Model Preparation Time: 0.0057
  • Mdl: 3237.1033
  • Accumulated Loss: 2243.7890
  • Correct Preds: 1438.0
  • Total Preds: 2475.0
  • Accuracy: 0.5810
  • Correct Gen Preds: 35.0
  • Gen Accuracy: 0.0141
  • Correct Gen Preds 34192: 0.0
  • Correct Preds 34192: 831.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.6948
  • Gen Accuracy 34192: 0.0
  • Correct Gen Preds 41568: 28.0
  • Correct Preds 41568: 607.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.4791
  • Gen Accuracy 41568: 0.0221
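
Two of these numbers can be sanity-checked directly: Accuracy is Correct Preds / Total Preds, and Mdl appears to be the Accumulated Loss converted from nats to bits (the two differ by a factor of ln 2 in every row of the table below). The 34192 and 41568 suffixes look like the token IDs of the two class labels, giving a per-class breakdown of the same metrics; that reading is an inference from the values, not something this card states. A minimal check in plain Python, using only values copied from the list above:

```python
import math

# Headline metrics copied from the evaluation list above.
accumulated_loss_nats = 2243.7890  # summed cross-entropy over the eval set, in nats
correct, total = 1438.0, 2475.0    # Correct Preds / Total Preds

# Accuracy is simply correct / total.
print(round(correct / total, 4))                      # 0.581

# "Mdl" matches the accumulated loss converted from nats to bits.
print(round(accumulated_loss_nats / math.log(2), 4))  # ~3237.1033, the reported Mdl
```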

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
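
A rough reconstruction of how these settings map onto `TrainingArguments` is sketched below. It is not the actual training script: the output directory is a placeholder, `train_batch_size` is assumed to correspond to `per_device_train_batch_size`, and the listed AdamW betas and epsilon are the `adamw_torch` defaults rather than extra arguments.

```python
from transformers import TrainingArguments

# Sketch of the configuration implied by the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="gsm8k-binary-llama-3.2-1b",  # hypothetical path
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",           # AdamW; betas=(0.9, 0.999), eps=1e-08 are the defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```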

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.4656 | 0.0057 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 1.2184 | 1.0 | 1 | 1.4656 | 0.0057 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 1.2181 | 2.0 | 2 | 4.9971 | 0.0057 | 17843.0063 | 12367.8295 | 1267.0 | 2475.0 | 0.5119 | 1274.0 | 0.5147 | 0.0 | 0.0 | 1196.0 | 0.0 | 0.0 | 1266.0 | 1267.0 | 1267.0 | 1.0 | 0.9992 |
| 5.5269 | 3.0 | 3 | 1.0091 | 0.0057 | 3603.0026 | 2497.4111 | 1267.0 | 2475.0 | 0.5119 | 7.0 | 0.0028 | 0.0 | 0.0 | 1196.0 | 0.0 | 0.0 | 0.0 | 1267.0 | 1267.0 | 1.0 | 0.0 |
| 1.0263 | 4.0 | 4 | 1.9729 | 0.0057 | 7044.4372 | 4882.8318 | 1196.0 | 2475.0 | 0.4832 | 8.0 | 0.0032 | 0.0 | 1196.0 | 1196.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 1.6974 | 5.0 | 5 | 1.0763 | 0.0057 | 3843.2050 | 2663.9067 | 1195.0 | 2475.0 | 0.4828 | 7.0 | 0.0028 | 0.0 | 1195.0 | 1196.0 | 0.9992 | 0.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.8356 | 6.0 | 6 | 0.8186 | 0.0057 | 2922.9166 | 2026.0114 | 1265.0 | 2475.0 | 0.5111 | 7.0 | 0.0028 | 0.0 | 60.0 | 1196.0 | 0.0502 | 0.0 | 0.0 | 1205.0 | 1267.0 | 0.9511 | 0.0 |
| 0.7306 | 7.0 | 7 | 0.7934 | 0.0057 | 2833.1042 | 1963.7582 | 1334.0 | 2475.0 | 0.5390 | 7.0 | 0.0028 | 0.0 | 949.0 | 1196.0 | 0.7935 | 0.0 | 0.0 | 385.0 | 1267.0 | 0.3039 | 0.0 |
| 0.5692 | 8.0 | 8 | 0.7837 | 0.0057 | 2798.4240 | 1939.7197 | 1320.0 | 2475.0 | 0.5333 | 7.0 | 0.0028 | 0.0 | 1114.0 | 1196.0 | 0.9314 | 0.0 | 0.0 | 206.0 | 1267.0 | 0.1626 | 0.0 |
| 0.3958 | 9.0 | 9 | 0.7704 | 0.0057 | 2750.8055 | 1906.7131 | 1410.0 | 2475.0 | 0.5697 | 7.0 | 0.0028 | 0.0 | 1008.0 | 1196.0 | 0.8428 | 0.0 | 0.0 | 402.0 | 1267.0 | 0.3173 | 0.0 |
| 0.1932 | 10.0 | 10 | 1.0048 | 0.0057 | 3587.9800 | 2486.9982 | 1330.0 | 2475.0 | 0.5374 | 7.0 | 0.0028 | 0.0 | 1142.0 | 1196.0 | 0.9548 | 0.0 | 0.0 | 188.0 | 1267.0 | 0.1484 | 0.0 |
| 0.0723 | 11.0 | 11 | 0.9755 | 0.0057 | 3483.1862 | 2414.3607 | 1351.0 | 2475.0 | 0.5459 | 8.0 | 0.0032 | 0.0 | 1075.0 | 1196.0 | 0.8988 | 0.0 | 1.0 | 276.0 | 1267.0 | 0.2178 | 0.0008 |
| 0.0181 | 12.0 | 12 | 0.9066 | 0.0057 | 3237.1033 | 2243.7890 | 1438.0 | 2475.0 | 0.5810 | 35.0 | 0.0141 | 0.0 | 831.0 | 1196.0 | 0.6948 | 0.0 | 28.0 | 607.0 | 1267.0 | 0.4791 | 0.0221 |
| 0.0044 | 13.0 | 13 | 1.4092 | 0.0057 | 5031.9556 | 3487.8858 | 1386.0 | 2475.0 | 0.5600 | 100.0 | 0.0404 | 28.0 | 1040.0 | 1196.0 | 0.8696 | 0.0234 | 65.0 | 346.0 | 1267.0 | 0.2731 | 0.0513 |
| 0.0006 | 14.0 | 14 | 2.0506 | 0.0057 | 7321.8602 | 5075.1267 | 1376.0 | 2475.0 | 0.5560 | 335.0 | 0.1354 | 228.0 | 1132.0 | 1196.0 | 0.9465 | 0.1906 | 100.0 | 244.0 | 1267.0 | 0.1926 | 0.0789 |
| 0.0002 | 15.0 | 15 | 2.6608 | 0.0057 | 9500.8931 | 6585.5173 | 1357.0 | 2475.0 | 0.5483 | 678.0 | 0.2739 | 557.0 | 1173.0 | 1196.0 | 0.9808 | 0.4657 | 114.0 | 184.0 | 1267.0 | 0.1452 | 0.0900 |
| 0.0001 | 16.0 | 16 | 3.1937 | 0.0057 | 11403.7545 | 7904.4803 | 1343.0 | 2475.0 | 0.5426 | 981.0 | 0.3964 | 856.0 | 1185.0 | 1196.0 | 0.9908 | 0.7157 | 118.0 | 158.0 | 1267.0 | 0.1247 | 0.0931 |
| 0.0001 | 17.0 | 17 | 3.6210 | 0.0057 | 12929.2533 | 8961.8755 | 1331.0 | 2475.0 | 0.5378 | 1158.0 | 0.4679 | 1030.0 | 1186.0 | 1196.0 | 0.9916 | 0.8612 | 121.0 | 145.0 | 1267.0 | 0.1144 | 0.0955 |
| 0.0001 | 18.0 | 18 | 3.9355 | 0.0057 | 14052.2862 | 9740.3026 | 1324.0 | 2475.0 | 0.5349 | 1247.0 | 0.5038 | 1122.0 | 1187.0 | 1196.0 | 0.9925 | 0.9381 | 118.0 | 137.0 | 1267.0 | 0.1081 | 0.0931 |
| 0.0 | 19.0 | 19 | 4.1504 | 0.0057 | 14819.8661 | 10272.3484 | 1325.0 | 2475.0 | 0.5354 | 1291.0 | 0.5216 | 1159.0 | 1187.0 | 1196.0 | 0.9925 | 0.9691 | 124.0 | 138.0 | 1267.0 | 0.1089 | 0.0979 |
| 0.0 | 20.0 | 20 | 4.2889 | 0.0057 | 15314.2863 | 10615.0543 | 1323.0 | 2475.0 | 0.5345 | 1301.0 | 0.5257 | 1169.0 | 1187.0 | 1196.0 | 0.9925 | 0.9774 | 124.0 | 136.0 | 1267.0 | 0.1073 | 0.0979 |
| 0.0 | 21.0 | 21 | 4.3808 | 0.0057 | 15642.3416 | 10842.4450 | 1323.0 | 2475.0 | 0.5345 | 1311.0 | 0.5297 | 1178.0 | 1187.0 | 1196.0 | 0.9925 | 0.9849 | 126.0 | 136.0 | 1267.0 | 0.1073 | 0.0994 |
| 0.0 | 22.0 | 22 | 4.4451 | 0.0057 | 15871.9519 | 11001.5987 | 1326.0 | 2475.0 | 0.5358 | 1323.0 | 0.5345 | 1183.0 | 1187.0 | 1196.0 | 0.9925 | 0.9891 | 131.0 | 139.0 | 1267.0 | 0.1097 | 0.1034 |
| 0.0 | 23.0 | 23 | 4.4902 | 0.0057 | 16033.0571 | 11113.2683 | 1327.0 | 2475.0 | 0.5362 | 1327.0 | 0.5362 | 1183.0 | 1187.0 | 1196.0 | 0.9925 | 0.9891 | 135.0 | 140.0 | 1267.0 | 0.1105 | 0.1066 |
| 0.0 | 24.0 | 24 | 4.5216 | 0.0057 | 16145.1152 | 11190.9411 | 1329.0 | 2475.0 | 0.5370 | 1330.0 | 0.5374 | 1183.0 | 1187.0 | 1196.0 | 0.9925 | 0.9891 | 138.0 | 142.0 | 1267.0 | 0.1121 | 0.1089 |
| 0.0 | 25.0 | 25 | 4.5424 | 0.0057 | 16219.4313 | 11242.4531 | 1330.0 | 2475.0 | 0.5374 | 1331.0 | 0.5378 | 1183.0 | 1187.0 | 1196.0 | 0.9925 | 0.9891 | 139.0 | 143.0 | 1267.0 | 0.1129 | 0.1097 |
| 0.0 | 26.0 | 26 | 4.5522 | 0.0057 | 16254.5512 | 11266.7964 | 1331.0 | 2475.0 | 0.5378 | 1332.0 | 0.5382 | 1183.0 | 1187.0 | 1196.0 | 0.9925 | 0.9891 | 140.0 | 144.0 | 1267.0 | 0.1137 | 0.1105 |
| 0.0 | 27.0 | 27 | 4.5629 | 0.0057 | 16292.5871 | 11293.1608 | 1332.0 | 2475.0 | 0.5382 | 1334.0 | 0.5390 | 1182.0 | 1186.0 | 1196.0 | 0.9916 | 0.9883 | 143.0 | 146.0 | 1267.0 | 0.1152 | 0.1129 |
| 0.0 | 28.0 | 28 | 4.5677 | 0.0057 | 16309.7312 | 11305.0442 | 1332.0 | 2475.0 | 0.5382 | 1334.0 | 0.5390 | 1183.0 | 1187.0 | 1196.0 | 0.9925 | 0.9891 | 142.0 | 145.0 | 1267.0 | 0.1144 | 0.1121 |
| 0.0 | 29.0 | 29 | 4.5732 | 0.0057 | 16329.2607 | 11318.5810 | 1333.0 | 2475.0 | 0.5386 | 1335.0 | 0.5394 | 1183.0 | 1187.0 | 1196.0 | 0.9925 | 0.9891 | 143.0 | 146.0 | 1267.0 | 0.1152 | 0.1129 |
| 0.0 | 30.0 | 30 | 4.5757 | 0.0057 | 16338.4857 | 11324.9753 | 1333.0 | 2475.0 | 0.5386 | 1336.0 | 0.5398 | 1183.0 | 1186.0 | 1196.0 | 0.9916 | 0.9891 | 144.0 | 147.0 | 1267.0 | 0.1160 | 0.1137 |
| 0.0 | 31.0 | 31 | 4.5783 | 0.0057 | 16347.4855 | 11331.2135 | 1333.0 | 2475.0 | 0.5386 | 1336.0 | 0.5398 | 1183.0 | 1186.0 | 1196.0 | 0.9916 | 0.9891 | 144.0 | 147.0 | 1267.0 | 0.1160 | 0.1137 |
| 0.0 | 32.0 | 32 | 4.5766 | 0.0057 | 16341.7025 | 11327.2050 | 1334.0 | 2475.0 | 0.5390 | 1337.0 | 0.5402 | 1183.0 | 1186.0 | 1196.0 | 0.9916 | 0.9891 | 145.0 | 148.0 | 1267.0 | 0.1168 | 0.1144 |
| 0.0 | 33.0 | 33 | 4.5759 | 0.0057 | 16338.9663 | 11325.3084 | 1335.0 | 2475.0 | 0.5394 | 1339.0 | 0.5410 | 1184.0 | 1186.0 | 1196.0 | 0.9916 | 0.9900 | 146.0 | 149.0 | 1267.0 | 0.1176 | 0.1152 |
| 0.0 | 34.0 | 34 | 4.5764 | 0.0057 | 16340.7048 | 11326.5134 | 1332.0 | 2475.0 | 0.5382 | 1336.0 | 0.5398 | 1183.0 | 1185.0 | 1196.0 | 0.9908 | 0.9891 | 144.0 | 147.0 | 1267.0 | 0.1160 | 0.1137 |
| 0.0 | 35.0 | 35 | 4.5782 | 0.0057 | 16347.3521 | 11331.1210 | 1333.0 | 2475.0 | 0.5386 | 1339.0 | 0.5410 | 1185.0 | 1186.0 | 1196.0 | 0.9916 | 0.9908 | 145.0 | 147.0 | 1267.0 | 0.1160 | 0.1144 |
| 0.0 | 36.0 | 36 | 4.5762 | 0.0057 | 16340.1836 | 11326.1522 | 1333.0 | 2475.0 | 0.5386 | 1337.0 | 0.5402 | 1184.0 | 1186.0 | 1196.0 | 0.9916 | 0.9900 | 144.0 | 147.0 | 1267.0 | 0.1160 | 0.1137 |
| 0.0 | 37.0 | 37 | 4.5778 | 0.0057 | 16345.7491 | 11330.0099 | 1333.0 | 2475.0 | 0.5386 | 1337.0 | 0.5402 | 1183.0 | 1185.0 | 1196.0 | 0.9908 | 0.9891 | 145.0 | 148.0 | 1267.0 | 0.1168 | 0.1144 |
| 0.0 | 38.0 | 38 | 4.5758 | 0.0057 | 16338.7762 | 11325.1766 | 1335.0 | 2475.0 | 0.5394 | 1339.0 | 0.5410 | 1183.0 | 1186.0 | 1196.0 | 0.9916 | 0.9891 | 147.0 | 149.0 | 1267.0 | 0.1176 | 0.1160 |
| 0.0 | 39.0 | 39 | 4.5744 | 0.0057 | 16333.6144 | 11321.5988 | 1334.0 | 2475.0 | 0.5390 | 1338.0 | 0.5406 | 1183.0 | 1185.0 | 1196.0 | 0.9908 | 0.9891 | 146.0 | 149.0 | 1267.0 | 0.1176 | 0.1152 |
| 0.0 | 40.0 | 40 | 4.5746 | 0.0057 | 16334.3188 | 11322.0870 | 1334.0 | 2475.0 | 0.5390 | 1337.0 | 0.5402 | 1183.0 | 1186.0 | 1196.0 | 0.9916 | 0.9891 | 145.0 | 148.0 | 1267.0 | 0.1168 | 0.1144 |
| 0.0 | 41.0 | 41 | 4.5741 | 0.0057 | 16332.7662 | 11321.0109 | 1335.0 | 2475.0 | 0.5394 | 1339.0 | 0.5410 | 1183.0 | 1185.0 | 1196.0 | 0.9908 | 0.9891 | 147.0 | 150.0 | 1267.0 | 0.1184 | 0.1160 |
| 0.0 | 42.0 | 42 | 4.5751 | 0.0057 | 16336.1639 | 11323.3659 | 1334.0 | 2475.0 | 0.5390 | 1338.0 | 0.5406 | 1183.0 | 1185.0 | 1196.0 | 0.9908 | 0.9891 | 146.0 | 149.0 | 1267.0 | 0.1176 | 0.1152 |

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
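
Given these versions (or newer), the checkpoint should load through the standard `transformers` API. A minimal usage sketch follows; the repo id comes from this card's model name, and the prompt is a hypothetical stand-in, since the card does not document the prompt format used for the binary GSM8K task.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "donoway/GSM8K-Binary_Llama-3.2-1B-3bfons7b"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

# Hypothetical prompt; the exact binary-GSM8K template is not documented here.
prompt = "Question: ..."
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```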