BoolQ_Llama-3.2-1B-50ztosrf

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unspecified dataset (the model name and the 3,270-example evaluation set suggest BoolQ). It achieves the following results on the evaluation set:

  • Loss: 0.9785
  • Model Preparation Time: 0.0058
  • Mdl: 4616.0341
  • Accumulated Loss: 3199.5911
  • Correct Preds: 2776.0
  • Total Preds: 3270.0
  • Accuracy: 0.8489
  • Correct Gen Preds: 2773.0
  • Gen Accuracy: 0.8480
  • Correct Gen Preds 9642: 1791.0
  • Correct Preds 9642: 1798.0
  • Total Labels 9642: 2026.0
  • Accuracy 9642: 0.8875
  • Gen Accuracy 9642: 0.8840
  • Correct Gen Preds 2822: 973.0
  • Correct Preds 2822: 978.0
  • Total Labels 2822: 1231.0
  • Accuracy 2822: 0.7945
  • Gen Accuracy 2822: 0.7904
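The headline figures above are internally consistent, which a few lines of arithmetic can confirm: Accuracy is Correct Preds over Total Preds, and Mdl times ln 2 recovers Accumulated Loss, suggesting (though the card does not state it) that Mdl is a description length in bits and Accumulated Loss the same quantity in nats.

```python
import math

# Figures copied from the evaluation summary above.
correct_preds, total_preds = 2776.0, 3270.0
correct_gen_preds = 2773.0
mdl = 4616.0341               # "Mdl"
accumulated_loss = 3199.5911  # "Accumulated Loss"

accuracy = correct_preds / total_preds
gen_accuracy = correct_gen_preds / total_preds
print(round(accuracy, 4))      # 0.8489
print(round(gen_accuracy, 4))  # 0.848

# Mdl * ln(2) ~= Accumulated Loss, consistent with bits vs. nats
# (an inference from the numbers, not stated in the card).
print(round(mdl * math.log(2), 2))
```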

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 120
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_ratio: 0.001
  • num_epochs: 100
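The step counts in the results table below follow from these hyperparameters: 295 optimizer steps per epoch at batch size 32 implies roughly 9,400 training examples, which matches the size of BoolQ's train split (9,427) — further hinting, though not confirming, that BoolQ was the training set. A quick check of that arithmetic:

```python
import math

train_batch_size = 32   # from the hyperparameters above
steps_per_epoch = 295   # from the training-results table (step 295 at epoch 1.0)
boolq_train_size = 9427 # BoolQ train split size (assumption: this was the dataset)

# ceil(9427 / 32) = 295, matching the logged steps per epoch.
print(math.ceil(boolq_train_size / train_batch_size))
```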

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0058 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.3209 | 1.0 | 295 | 0.4495 | 0.0058 | 2120.4374 | 1469.7752 | 2653.0 | 3270.0 | 0.8113 | 2047.0 | 0.6260 | 1160.0 | 1557.0 | 2026.0 | 0.7685 | 0.5726 | 877.0 | 1096.0 | 1231.0 | 0.8903 | 0.7124 |
| 0.2469 | 2.0 | 590 | 0.4858 | 0.0058 | 2291.6124 | 1588.4247 | 2711.0 | 3270.0 | 0.8291 | 2070.0 | 0.6330 | 1100.0 | 1638.0 | 2026.0 | 0.8085 | 0.5429 | 963.0 | 1073.0 | 1231.0 | 0.8716 | 0.7823 |
| 0.001 | 3.0 | 885 | 0.9723 | 0.0058 | 4586.9576 | 3179.4368 | 2740.0 | 3270.0 | 0.8379 | 2711.0 | 0.8291 | 1752.0 | 1781.0 | 2026.0 | 0.8791 | 0.8648 | 950.0 | 959.0 | 1231.0 | 0.7790 | 0.7717 |
| 0.0001 | 4.0 | 1180 | 0.9785 | 0.0058 | 4616.0341 | 3199.5911 | 2776.0 | 3270.0 | 0.8489 | 2773.0 | 0.8480 | 1791.0 | 1798.0 | 2026.0 | 0.8875 | 0.8840 | 973.0 | 978.0 | 1231.0 | 0.7945 | 0.7904 |
| 0.0 | 5.0 | 1475 | 1.0825 | 0.0058 | 5106.9664 | 3539.8794 | 2764.0 | 3270.0 | 0.8453 | 2752.0 | 0.8416 | 1812.0 | 1828.0 | 2026.0 | 0.9023 | 0.8944 | 931.0 | 936.0 | 1231.0 | 0.7604 | 0.7563 |
| 0.0 | 6.0 | 1770 | 1.1417 | 0.0058 | 5385.9903 | 3733.2840 | 2768.0 | 3270.0 | 0.8465 | 2745.0 | 0.8394 | 1811.0 | 1832.0 | 2026.0 | 0.9042 | 0.8939 | 927.0 | 936.0 | 1231.0 | 0.7604 | 0.7530 |
| 0.0 | 7.0 | 2065 | 1.2458 | 0.0058 | 5877.3016 | 4073.8351 | 2762.0 | 3270.0 | 0.8446 | 2744.0 | 0.8391 | 1787.0 | 1807.0 | 2026.0 | 0.8919 | 0.8820 | 950.0 | 955.0 | 1231.0 | 0.7758 | 0.7717 |
| 0.0 | 8.0 | 2360 | 1.2477 | 0.0058 | 5886.0491 | 4079.8983 | 2762.0 | 3270.0 | 0.8446 | 2744.0 | 0.8391 | 1802.0 | 1821.0 | 2026.0 | 0.8988 | 0.8894 | 935.0 | 941.0 | 1231.0 | 0.7644 | 0.7595 |
| 0.6191 | 9.0 | 2655 | 1.2608 | 0.0058 | 5947.9186 | 4122.7830 | 2768.0 | 3270.0 | 0.8465 | 2749.0 | 0.8407 | 1805.0 | 1824.0 | 2026.0 | 0.9003 | 0.8909 | 937.0 | 944.0 | 1231.0 | 0.7669 | 0.7612 |
| 0.0 | 10.0 | 2950 | 1.2320 | 0.0058 | 5812.1306 | 4028.6620 | 2765.0 | 3270.0 | 0.8456 | 2746.0 | 0.8398 | 1807.0 | 1826.0 | 2026.0 | 0.9013 | 0.8919 | 932.0 | 939.0 | 1231.0 | 0.7628 | 0.7571 |
| 0.0001 | 11.0 | 3245 | 1.2418 | 0.0058 | 5858.3967 | 4060.7312 | 2767.0 | 3270.0 | 0.8462 | 2745.0 | 0.8394 | 1807.0 | 1829.0 | 2026.0 | 0.9028 | 0.8919 | 931.0 | 938.0 | 1231.0 | 0.7620 | 0.7563 |
| 0.0 | 12.0 | 3540 | 1.2715 | 0.0058 | 5998.6521 | 4157.9488 | 2765.0 | 3270.0 | 0.8456 | 2750.0 | 0.8410 | 1800.0 | 1817.0 | 2026.0 | 0.8968 | 0.8885 | 943.0 | 948.0 | 1231.0 | 0.7701 | 0.7660 |
| 0.0 | 13.0 | 3835 | 1.2808 | 0.0058 | 6042.3338 | 4188.2267 | 2760.0 | 3270.0 | 0.8440 | 2745.0 | 0.8394 | 1799.0 | 1816.0 | 2026.0 | 0.8963 | 0.8880 | 939.0 | 944.0 | 1231.0 | 0.7669 | 0.7628 |
| 0.0 | 14.0 | 4130 | 1.2792 | 0.0058 | 6034.8824 | 4183.0618 | 2765.0 | 3270.0 | 0.8456 | 2751.0 | 0.8413 | 1800.0 | 1818.0 | 2026.0 | 0.8973 | 0.8885 | 944.0 | 947.0 | 1231.0 | 0.7693 | 0.7669 |
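The evaluation summary at the top of this card corresponds to the epoch-4 checkpoint, which has the highest overall accuracy in the table. A quick scan over the accuracy column confirms this:

```python
# (epoch, overall accuracy) pairs taken from the training-results table above.
accuracy_by_epoch = {
    0: 0.6214, 1: 0.8113, 2: 0.8291, 3: 0.8379, 4: 0.8489,
    5: 0.8453, 6: 0.8465, 7: 0.8446, 8: 0.8446, 9: 0.8465,
    10: 0.8456, 11: 0.8462, 12: 0.8456, 13: 0.8440, 14: 0.8456,
}

# Epoch 4 maximizes accuracy, matching the reported checkpoint
# (loss 0.9785, accuracy 0.8489).
best_epoch = max(accuracy_by_epoch, key=accuracy_by_epoch.get)
print(best_epoch, accuracy_by_epoch[best_epoch])  # 4 0.8489
```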

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1