# BoolQ_Llama-3.2-1B-xemwi1ki

This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset (the model name suggests BoolQ). It achieves the following results on the evaluation set:
- Loss: 1.1526
- Model Preparation Time: 0.0056
- Mdl: 5437.5984
- Accumulated Loss: 3769.0560
- Correct Preds: 2788.0
- Total Preds: 3270.0
- Accuracy: 0.8526
- Correct Gen Preds: 2792.0
- Gen Accuracy: 0.8538
- Correct Gen Preds 9642: 1819.0
- Correct Preds 9642: 1819.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.8978
- Gen Accuracy 9642: 0.8978
- Correct Gen Preds 2822: 969.0
- Correct Preds 2822: 969.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.7872
- Gen Accuracy 2822: 0.7872
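The headline figures above are internally consistent. As a quick sanity check (a sketch using only numbers reported in this card): the evaluation loss is the accumulated loss (summed cross-entropy in nats) averaged over predictions, the MDL figure is that same sum converted to bits, and the overall correct-prediction count is the sum of the two per-label counts:

```python
import math

# Figures copied from the evaluation summary above.
accumulated_loss = 3769.0560   # summed cross-entropy, in nats
total_preds = 3270
correct_preds = 2788
correct_preds_9642 = 1819      # correct predictions for label 9642
correct_preds_2822 = 969       # correct predictions for label 2822

# Mean loss per prediction matches the reported Loss of 1.1526.
print(round(accumulated_loss / total_preds, 4))    # -> 1.1526

# MDL is the accumulated loss converted from nats to bits.
print(round(accumulated_loss / math.log(2), 4))    # -> 5437.5984

# Overall correct predictions are the per-label counts summed.
print(correct_preds_9642 + correct_preds_2822)     # -> 2788

# Accuracy matches the reported 0.8526.
print(round(correct_preds / total_preds, 4))       # -> 0.8526
```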
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
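For reproduction, the hyperparameters above map onto a `transformers` `TrainingArguments` configuration roughly as follows. This is a hedged sketch, not the training script: `output_dir` and every argument not listed in this card are assumptions.

```python
from transformers import TrainingArguments

# Sketch of the reported configuration. "output_dir" is a placeholder,
# and any argument not listed in the card is left at its default.
training_args = TrainingArguments(
    output_dir="BoolQ_Llama-3.2-1B-xemwi1ki",  # assumed name
    learning_rate=2e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="constant",
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```

Note that with `lr_scheduler_type="constant"` the warmup ratio has no effect in `transformers`; a warmup phase would require `constant_with_warmup`.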
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0056 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.3269 | 1.0 | 280 | 0.4252 | 0.0056 | 2005.8025 | 1390.3164 | 2742.0 | 3270.0 | 0.8385 | 2747.0 | 0.8401 | 1852.0 | 1855.0 | 2026.0 | 0.9156 | 0.9141 | 885.0 | 887.0 | 1231.0 | 0.7206 | 0.7189 |
| 0.0893 | 2.0 | 560 | 0.4500 | 0.0056 | 2123.1104 | 1471.6280 | 2759.0 | 3270.0 | 0.8437 | 2597.0 | 0.7942 | 1714.0 | 1823.0 | 2026.0 | 0.8998 | 0.8460 | 875.0 | 936.0 | 1231.0 | 0.7604 | 0.7108 |
| 0.001 | 3.0 | 840 | 0.8164 | 0.0056 | 3851.5423 | 2669.6857 | 2759.0 | 3270.0 | 0.8437 | 2739.0 | 0.8376 | 1764.0 | 1785.0 | 2026.0 | 0.8810 | 0.8707 | 967.0 | 974.0 | 1231.0 | 0.7912 | 0.7855 |
| 0.0101 | 4.0 | 1120 | 0.9761 | 0.0056 | 4604.8533 | 3191.8411 | 2776.0 | 3270.0 | 0.8489 | 2781.0 | 0.8505 | 1869.0 | 1870.0 | 2026.0 | 0.9230 | 0.9225 | 906.0 | 906.0 | 1231.0 | 0.7360 | 0.7360 |
| 0.0 | 5.0 | 1400 | 1.1266 | 0.0056 | 5314.7413 | 3683.8980 | 2771.0 | 3270.0 | 0.8474 | 2774.0 | 0.8483 | 1790.0 | 1791.0 | 2026.0 | 0.8840 | 0.8835 | 980.0 | 980.0 | 1231.0 | 0.7961 | 0.7961 |
| 0.0 | 6.0 | 1680 | 1.0680 | 0.0056 | 5038.6131 | 3492.5005 | 2772.0 | 3270.0 | 0.8477 | 2776.0 | 0.8489 | 1855.0 | 1855.0 | 2026.0 | 0.9156 | 0.9156 | 917.0 | 917.0 | 1231.0 | 0.7449 | 0.7449 |
| 0.0001 | 7.0 | 1960 | 1.1526 | 0.0056 | 5437.5984 | 3769.0560 | 2788.0 | 3270.0 | 0.8526 | 2792.0 | 0.8538 | 1819.0 | 1819.0 | 2026.0 | 0.8978 | 0.8978 | 969.0 | 969.0 | 1231.0 | 0.7872 | 0.7872 |
| 0.0 | 8.0 | 2240 | 1.1742 | 0.0056 | 5539.3681 | 3839.5974 | 2783.0 | 3270.0 | 0.8511 | 2788.0 | 0.8526 | 1819.0 | 1819.0 | 2026.0 | 0.8978 | 0.8978 | 964.0 | 964.0 | 1231.0 | 0.7831 | 0.7831 |
| 0.0 | 9.0 | 2520 | 1.1807 | 0.0056 | 5570.2856 | 3861.0278 | 2786.0 | 3270.0 | 0.8520 | 2791.0 | 0.8535 | 1820.0 | 1820.0 | 2026.0 | 0.8983 | 0.8983 | 966.0 | 966.0 | 1231.0 | 0.7847 | 0.7847 |
| 0.0 | 10.0 | 2800 | 1.2039 | 0.0056 | 5679.6573 | 3936.8384 | 2783.0 | 3270.0 | 0.8511 | 2787.0 | 0.8523 | 1818.0 | 1818.0 | 2026.0 | 0.8973 | 0.8973 | 965.0 | 965.0 | 1231.0 | 0.7839 | 0.7839 |
| 0.0 | 11.0 | 3080 | 1.2093 | 0.0056 | 5704.9015 | 3954.3364 | 2784.0 | 3270.0 | 0.8514 | 2789.0 | 0.8529 | 1821.0 | 1821.0 | 2026.0 | 0.8988 | 0.8988 | 963.0 | 963.0 | 1231.0 | 0.7823 | 0.7823 |
| 0.0 | 12.0 | 3360 | 1.2143 | 0.0056 | 5728.6013 | 3970.7638 | 2785.0 | 3270.0 | 0.8517 | 2790.0 | 0.8532 | 1820.0 | 1820.0 | 2026.0 | 0.8983 | 0.8983 | 965.0 | 965.0 | 1231.0 | 0.7839 | 0.7839 |
| 0.0 | 13.0 | 3640 | 1.2183 | 0.0056 | 5747.3020 | 3983.7262 | 2785.0 | 3270.0 | 0.8517 | 2789.0 | 0.8529 | 1820.0 | 1820.0 | 2026.0 | 0.8983 | 0.8983 | 965.0 | 965.0 | 1231.0 | 0.7839 | 0.7839 |
| 0.0001 | 14.0 | 3920 | 1.2219 | 0.0056 | 5764.5638 | 3995.6911 | 2786.0 | 3270.0 | 0.8520 | 2791.0 | 0.8535 | 1821.0 | 1821.0 | 2026.0 | 0.8988 | 0.8988 | 965.0 | 965.0 | 1231.0 | 0.7839 | 0.7839 |
| 0.0 | 15.0 | 4200 | 1.2225 | 0.0056 | 5767.2744 | 3997.5700 | 2786.0 | 3270.0 | 0.8520 | 2791.0 | 0.8535 | 1820.0 | 1820.0 | 2026.0 | 0.8983 | 0.8983 | 966.0 | 966.0 | 1231.0 | 0.7847 | 0.7847 |
| 0.0 | 16.0 | 4480 | 1.2261 | 0.0056 | 5784.1350 | 4009.2569 | 2786.0 | 3270.0 | 0.8520 | 2791.0 | 0.8535 | 1819.0 | 1819.0 | 2026.0 | 0.8978 | 0.8978 | 967.0 | 967.0 | 1231.0 | 0.7855 | 0.7855 |
| 0.0 | 17.0 | 4760 | 1.2237 | 0.0056 | 5772.7257 | 4001.3485 | 2785.0 | 3270.0 | 0.8517 | 2790.0 | 0.8532 | 1820.0 | 1820.0 | 2026.0 | 0.8983 | 0.8983 | 965.0 | 965.0 | 1231.0 | 0.7839 | 0.7839 |
## Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
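A minimal usage sketch for the checkpoint `donoway/BoolQ_Llama-3.2-1B-xemwi1ki`. The prompt template and the one-token answer decoding below are assumptions; the card does not document how BoolQ examples were formatted during fine-tuning, so adjust both to match the actual training format.

```python
def build_prompt(passage: str, question: str) -> str:
    # Assumed BoolQ-style template -- not documented in the card.
    return f"{passage}\nQuestion: {question}\nAnswer:"


def predict(passage: str, question: str) -> str:
    # Imported lazily so build_prompt works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "donoway/BoolQ_Llama-3.2-1B-xemwi1ki"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(passage, question), return_tensors="pt")
    with torch.no_grad():
        # Assumes the fine-tuned model emits its yes/no answer as the
        # next generated token.
        out = model.generate(**inputs, max_new_tokens=1)
    return tokenizer.decode(out[0, inputs["input_ids"].shape[1]:])
```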