GSM8K-Binary_Llama-3.2-1B-f8096090

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6336
  • Model Preparation Time: 0.0058
  • Mdl: 2262.3589
  • Accumulated Loss: 1568.1477
  • Correct Preds: 1973.0
  • Total Preds: 2475.0
  • Accuracy: 0.7972
  • Correct Gen Preds: 369.0
  • Gen Accuracy: 0.1491
  • Correct Gen Preds 34192: 0.0
  • Correct Preds 34192: 974.0
  • Total Labels 34192: 1196.0
  • Accuracy 34192: 0.8144
  • Gen Accuracy 34192: 0.0
  • Correct Gen Preds 41568: 362.0
  • Correct Preds 41568: 999.0
  • Total Labels 41568: 1267.0
  • Accuracy 41568: 0.7885
  • Gen Accuracy 41568: 0.2857
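The ratio metrics above follow directly from the raw counts. A minimal sketch of how they relate, assuming Accuracy = Correct Preds / Total Preds and that Mdl is the accumulated cross-entropy loss converted from nats to bits (an assumption consistent with the numbers, not documented by the card):

```python
import math

# Reported raw counts from the evaluation set (epoch-4 checkpoint).
correct_preds, total_preds = 1973, 2475
correct_gen_preds = 369
accumulated_loss_nats = 1568.1477  # summed cross-entropy over the eval set

# Accuracy = correct predictions / total predictions.
accuracy = correct_preds / total_preds          # ≈ 0.7972
gen_accuracy = correct_gen_preds / total_preds  # ≈ 0.1491

# MDL in bits = accumulated loss in nats / ln(2).
mdl_bits = accumulated_loss_nats / math.log(2)  # ≈ 2262.36

# Per-label accuracies, keyed by the two answer-token ids (34192, 41568).
acc_34192 = 974 / 1196   # ≈ 0.8144
acc_41568 = 999 / 1267   # ≈ 0.7885
```

The per-label breakdown shows the 2475 evaluation examples split into 1196 + 1267 + 12 unaccounted labels; the card does not explain the remainder.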

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.001
  • num_epochs: 100
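These hyperparameters map onto the standard `transformers` Trainer configuration. A sketch of the equivalent `TrainingArguments` (the output directory name is a placeholder; requires `transformers` installed):

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; "output" is a placeholder path.
args = TrainingArguments(
    output_dir="output",
    learning_rate=2e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```

Note that although num_epochs was set to 100, the results table below stops at epoch 14, suggesting training ended early (e.g. via early stopping or manual interruption); the card does not say which.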

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 34192 | Correct Preds 34192 | Total Labels 34192 | Accuracy 34192 | Gen Accuracy 34192 | Correct Gen Preds 41568 | Correct Preds 41568 | Total Labels 41568 | Accuracy 41568 | Gen Accuracy 41568 |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| No log | 0 | 0 | 1.4656 | 0.0058 | 5233.1723 | 3627.3586 | 1196.0 | 2475.0 | 0.4832 | 1204.0 | 0.4865 | 1196.0 | 1196.0 | 1196.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1267.0 | 0.0 | 0.0 |
| 0.5859 | 1.0 | 52 | 0.5818 | 0.0058 | 2077.5047 | 1440.0165 | 1847.0 | 2475.0 | 0.7463 | 8.0 | 0.0032 | 0.0 | 857.0 | 1196.0 | 0.7166 | 0.0 | 0.0 | 990.0 | 1267.0 | 0.7814 | 0.0 |
| 0.6145 | 2.0 | 104 | 0.5168 | 0.0058 | 1845.2524 | 1279.0315 | 1948.0 | 2475.0 | 0.7871 | 69.0 | 0.0279 | 0.0 | 1063.0 | 1196.0 | 0.8888 | 0.0 | 61.0 | 885.0 | 1267.0 | 0.6985 | 0.0481 |
| 0.2879 | 3.0 | 156 | 0.5778 | 0.0058 | 2063.1398 | 1430.0595 | 1868.0 | 2475.0 | 0.7547 | 53.0 | 0.0214 | 0.0 | 1106.0 | 1196.0 | 0.9247 | 0.0 | 46.0 | 762.0 | 1267.0 | 0.6014 | 0.0363 |
| 0.0501 | 4.0 | 208 | 0.6336 | 0.0058 | 2262.3589 | 1568.1477 | 1973.0 | 2475.0 | 0.7972 | 369.0 | 0.1491 | 0.0 | 974.0 | 1196.0 | 0.8144 | 0.0 | 362.0 | 999.0 | 1267.0 | 0.7885 | 0.2857 |
| 0.3604 | 5.0 | 260 | 1.7321 | 0.0058 | 6184.7525 | 4286.9438 | 1864.0 | 2475.0 | 0.7531 | 1135.0 | 0.4586 | 634.0 | 1105.0 | 1196.0 | 0.9239 | 0.5301 | 494.0 | 759.0 | 1267.0 | 0.5991 | 0.3899 |
| 0.0662 | 6.0 | 312 | 1.2469 | 0.0058 | 4452.3018 | 3086.1004 | 1972.0 | 2475.0 | 0.7968 | 1028.0 | 0.4154 | 359.0 | 1028.0 | 1196.0 | 0.8595 | 0.3002 | 661.0 | 944.0 | 1267.0 | 0.7451 | 0.5217 |
| 0.0 | 7.0 | 364 | 1.4682 | 0.0058 | 5242.5624 | 3633.8673 | 1970.0 | 2475.0 | 0.7960 | 1223.0 | 0.4941 | 464.0 | 1033.0 | 1196.0 | 0.8637 | 0.3880 | 751.0 | 937.0 | 1267.0 | 0.7395 | 0.5927 |
| 0.0003 | 8.0 | 416 | 1.9052 | 0.0058 | 6802.8127 | 4715.3504 | 1925.0 | 2475.0 | 0.7778 | 1504.0 | 0.6077 | 583.0 | 948.0 | 1196.0 | 0.7926 | 0.4875 | 914.0 | 977.0 | 1267.0 | 0.7711 | 0.7214 |
| 0.5881 | 9.0 | 468 | 1.9828 | 0.0058 | 7079.8847 | 4907.4021 | 1957.0 | 2475.0 | 0.7907 | 1879.0 | 0.7592 | 920.0 | 983.0 | 1196.0 | 0.8219 | 0.7692 | 952.0 | 974.0 | 1267.0 | 0.7687 | 0.7514 |
| 0.0 | 10.0 | 520 | 1.9968 | 0.0058 | 7129.8865 | 4942.0607 | 1957.0 | 2475.0 | 0.7907 | 1886.0 | 0.7620 | 913.0 | 972.0 | 1196.0 | 0.8127 | 0.7634 | 966.0 | 985.0 | 1267.0 | 0.7774 | 0.7624 |
| 0.5881 | 11.0 | 572 | 2.0014 | 0.0058 | 7146.2344 | 4953.3922 | 1959.0 | 2475.0 | 0.7915 | 1892.0 | 0.7644 | 918.0 | 972.0 | 1196.0 | 0.8127 | 0.7676 | 967.0 | 987.0 | 1267.0 | 0.7790 | 0.7632 |
| 0.0 | 12.0 | 624 | 2.0068 | 0.0058 | 7165.7013 | 4966.8857 | 1959.0 | 2475.0 | 0.7915 | 1890.0 | 0.7636 | 916.0 | 972.0 | 1196.0 | 0.8127 | 0.7659 | 967.0 | 987.0 | 1267.0 | 0.7790 | 0.7632 |
| 0.5882 | 13.0 | 676 | 2.0059 | 0.0058 | 7162.3520 | 4964.5641 | 1959.0 | 2475.0 | 0.7915 | 1893.0 | 0.7648 | 919.0 | 973.0 | 1196.0 | 0.8135 | 0.7684 | 967.0 | 986.0 | 1267.0 | 0.7782 | 0.7632 |
| 0.0 | 14.0 | 728 | 2.0106 | 0.0058 | 7179.0242 | 4976.1204 | 1958.0 | 2475.0 | 0.7911 | 1891.0 | 0.7640 | 918.0 | 972.0 | 1196.0 | 0.8127 | 0.7676 | 966.0 | 986.0 | 1267.0 | 0.7782 | 0.7624 |
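The headline results at the top of this card correspond to the epoch-4 checkpoint, which has the highest classification accuracy in the table. A minimal sketch of that selection, with the (epoch, validation loss, accuracy) columns transcribed from the table above:

```python
# (Epoch, Validation Loss, Accuracy) triples from the training results table.
history = [
    (0, 1.4656, 0.4832), (1, 0.5818, 0.7463), (2, 0.5168, 0.7871),
    (3, 0.5778, 0.7547), (4, 0.6336, 0.7972), (5, 1.7321, 0.7531),
    (6, 1.2469, 0.7968), (7, 1.4682, 0.7960), (8, 1.9052, 0.7778),
    (9, 1.9828, 0.7907), (10, 1.9968, 0.7907), (11, 2.0014, 0.7915),
    (12, 2.0068, 0.7915), (13, 2.0059, 0.7915), (14, 2.0106, 0.7911),
]

# Pick the checkpoint with the highest accuracy: epoch 4 (loss 0.6336,
# accuracy 0.7972), matching the evaluation results reported above.
best_epoch, best_loss, best_acc = max(history, key=lambda row: row[2])
```

Note that validation loss bottoms out earlier (0.5168 at epoch 2) and climbs steadily after epoch 4, so the reported checkpoint trades some calibration for accuracy.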

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1