BoolQ_Llama-3.2-1B-6vpqysw0

This model is a fine-tuned version of meta-llama/Llama-3.2-1B, most likely on the BoolQ dataset (inferred from the model name; see Training and evaluation data below). It achieves the following results on the evaluation set, with notes on the metric names after the list:

  • Loss: 1.2092
  • Model Preparation Time: 0.0059
  • Mdl: 5704.7597
  • Accumulated Loss: 3954.2381
  • Correct Preds: 2727.0
  • Total Preds: 3270.0
  • Accuracy: 0.8339
  • Correct Gen Preds: 2725.0
  • Gen Accuracy: 0.8333
  • Correct Gen Preds 9642: 1785.0
  • Correct Preds 9642: 1793.0
  • Total Labels 9642: 2026.0
  • Accuracy 9642: 0.8850
  • Gen Accuracy 9642: 0.8810
  • Correct Gen Preds 2822: 930.0
  • Correct Preds 2822: 934.0
  • Total Labels 2822: 1231.0
  • Accuracy 2822: 0.7587
  • Gen Accuracy 2822: 0.7555
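
A few of the metric names are worth unpacking. Mdl and Accumulated Loss appear to be the same quantity in different units: the accumulated loss is the summed evaluation cross-entropy in nats, and Mdl is that sum in bits (divided by ln 2). This reading is an assumption, but the arithmetic is consistent. The 9642 and 2822 suffixes look like the tokenizer IDs of the two answer tokens, which would make those entries per-class breakdowns. A quick check of the values above:

```python
import math

# Final evaluation values reported above.
accumulated_loss_nats = 3954.2381  # assumed unit: nats (summed cross-entropy)

# nats -> bits: divide by ln(2).
print(accumulated_loss_nats / math.log(2))  # ~5704.76, matches Mdl

# Accuracy = Correct Preds / Total Preds.
print(2727.0 / 3270.0)  # ~0.8339, matches Accuracy
```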

Model description

More information needed

Intended uses & limitations

More information needed
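
No usage example was provided. Below is a minimal inference sketch, assuming the checkpoint loads as a standard causal LM from the repo named in this card; the prompt template is a guess, since the training format is not documented.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/BoolQ_Llama-3.2-1B-6vpqysw0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# The prompt format below is an assumption; the card does not document the template.
prompt = (
    "Passage: The Leaning Tower of Pisa is a freestanding bell tower in Pisa, Italy.\n"
    "Question: is the leaning tower of pisa in italy?\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=3)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```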

Training and evaluation data

More information needed
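
The evaluation numbers above hint at the data even though the card leaves it unspecified: Total Preds is 3270, which is exactly the size of the BoolQ validation split, and the model name also points to BoolQ. A loading sketch under that assumption:

```python
from datasets import load_dataset

# Assumption: the standard BoolQ splits were used (9,427 train / 3,270 validation).
dataset = load_dataset("boolq")
print(dataset["validation"].num_rows)  # 3270, matching Total Preds above
print(dataset["validation"][0])        # fields: question, answer, passage
```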

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 120
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 100
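
The hyperparameters above map onto the Hugging Face Trainer configuration roughly as follows. This is a minimal sketch of the reported settings, not the released training script (which is not available); output_dir and bf16 are assumptions.

```python
from transformers import TrainingArguments

# Sketch of the configuration reported above; unlisted settings keep their defaults.
training_args = TrainingArguments(
    output_dir="BoolQ_Llama-3.2-1B-6vpqysw0",  # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
    bf16=True,  # assumption, based on the BF16 checkpoint weights
)
```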

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0059 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.8113 | 1.0 | 182 | 0.4900 | 0.0059 | 2311.6028 | 1602.2809 | 2620.0 | 3270.0 | 0.8012 | 2628.0 | 0.8037 | 1864.0 | 1864.0 | 2026.0 | 0.9200 | 0.9200 | 755.0 | 756.0 | 1231.0 | 0.6141 | 0.6133 |
| 0.2213 | 2.0 | 364 | 0.4649 | 0.0059 | 2193.4330 | 1520.3719 | 2703.0 | 3270.0 | 0.8266 | 2713.0 | 0.8297 | 1866.0 | 1866.0 | 2026.0 | 0.9210 | 0.9210 | 837.0 | 837.0 | 1231.0 | 0.6799 | 0.6799 |
| 0.0057 | 3.0 | 546 | 0.7800 | 0.0059 | 3679.7187 | 2550.5866 | 2717.0 | 3270.0 | 0.8309 | 2675.0 | 0.8180 | 1764.0 | 1790.0 | 2026.0 | 0.8835 | 0.8707 | 901.0 | 927.0 | 1231.0 | 0.7530 | 0.7319 |
| 0.0028 | 4.0 | 728 | 0.8445 | 0.0059 | 3984.2582 | 2761.6773 | 2717.0 | 3270.0 | 0.8309 | 2705.0 | 0.8272 | 1749.0 | 1764.0 | 2026.0 | 0.8707 | 0.8633 | 946.0 | 953.0 | 1231.0 | 0.7742 | 0.7685 |
| 0.0001 | 5.0 | 910 | 1.2163 | 0.0059 | 5737.9374 | 3977.2352 | 2696.0 | 3270.0 | 0.8245 | 2693.0 | 0.8235 | 1713.0 | 1725.0 | 2026.0 | 0.8514 | 0.8455 | 970.0 | 971.0 | 1231.0 | 0.7888 | 0.7880 |
| 0.0001 | 6.0 | 1092 | 1.2338 | 0.0059 | 5820.5512 | 4034.4986 | 2693.0 | 3270.0 | 0.8235 | 2702.0 | 0.8263 | 1704.0 | 1704.0 | 2026.0 | 0.8411 | 0.8411 | 988.0 | 989.0 | 1231.0 | 0.8034 | 0.8026 |
| 0.0001 | 7.0 | 1274 | 1.2092 | 0.0059 | 5704.7597 | 3954.2381 | 2727.0 | 3270.0 | 0.8339 | 2725.0 | 0.8333 | 1785.0 | 1793.0 | 2026.0 | 0.8850 | 0.8810 | 930.0 | 934.0 | 1231.0 | 0.7587 | 0.7555 |
| 0.0 | 8.0 | 1456 | 1.2871 | 0.0059 | 6071.8908 | 4208.7140 | 2714.0 | 3270.0 | 0.8300 | 2719.0 | 0.8315 | 1759.0 | 1763.0 | 2026.0 | 0.8702 | 0.8682 | 950.0 | 951.0 | 1231.0 | 0.7725 | 0.7717 |
| 0.0 | 9.0 | 1638 | 1.3767 | 0.0059 | 6494.8751 | 4501.9043 | 2711.0 | 3270.0 | 0.8291 | 2718.0 | 0.8312 | 1719.0 | 1721.0 | 2026.0 | 0.8495 | 0.8485 | 989.0 | 990.0 | 1231.0 | 0.8042 | 0.8034 |
| 0.0 | 10.0 | 1820 | 1.4464 | 0.0059 | 6823.7329 | 4729.8512 | 2710.0 | 3270.0 | 0.8287 | 2718.0 | 0.8312 | 1729.0 | 1729.0 | 2026.0 | 0.8534 | 0.8534 | 980.0 | 981.0 | 1231.0 | 0.7969 | 0.7961 |
| 0.0 | 11.0 | 2002 | 1.4349 | 0.0059 | 6769.4073 | 4692.1956 | 2711.0 | 3270.0 | 0.8291 | 2721.0 | 0.8321 | 1745.0 | 1745.0 | 2026.0 | 0.8613 | 0.8613 | 966.0 | 966.0 | 1231.0 | 0.7847 | 0.7847 |
| 0.0 | 12.0 | 2184 | 1.4399 | 0.0059 | 6792.7164 | 4708.3522 | 2711.0 | 3270.0 | 0.8291 | 2721.0 | 0.8321 | 1747.0 | 1747.0 | 2026.0 | 0.8623 | 0.8623 | 964.0 | 964.0 | 1231.0 | 0.7831 | 0.7831 |
| 0.0 | 13.0 | 2366 | 1.4395 | 0.0059 | 6790.8932 | 4707.0885 | 2711.0 | 3270.0 | 0.8291 | 2721.0 | 0.8321 | 1747.0 | 1747.0 | 2026.0 | 0.8623 | 0.8623 | 964.0 | 964.0 | 1231.0 | 0.7831 | 0.7831 |
| 0.0 | 14.0 | 2548 | 1.4433 | 0.0059 | 6808.7297 | 4719.4518 | 2712.0 | 3270.0 | 0.8294 | 2721.0 | 0.8321 | 1747.0 | 1747.0 | 2026.0 | 0.8623 | 0.8623 | 965.0 | 965.0 | 1231.0 | 0.7839 | 0.7839 |
| 0.9048 | 15.0 | 2730 | 1.4455 | 0.0059 | 6819.4436 | 4726.8781 | 2711.0 | 3270.0 | 0.8291 | 2721.0 | 0.8321 | 1747.0 | 1747.0 | 2026.0 | 0.8623 | 0.8623 | 964.0 | 964.0 | 1231.0 | 0.7831 | 0.7831 |
| 0.0 | 16.0 | 2912 | 1.4467 | 0.0059 | 6825.1244 | 4730.8158 | 2712.0 | 3270.0 | 0.8294 | 2722.0 | 0.8324 | 1745.0 | 1745.0 | 2026.0 | 0.8613 | 0.8613 | 967.0 | 967.0 | 1231.0 | 0.7855 | 0.7855 |
| 0.0 | 17.0 | 3094 | 1.4464 | 0.0059 | 6823.4547 | 4729.6584 | 2712.0 | 3270.0 | 0.8294 | 2722.0 | 0.8324 | 1748.0 | 1748.0 | 2026.0 | 0.8628 | 0.8628 | 964.0 | 964.0 | 1231.0 | 0.7831 | 0.7831 |
| 0.9048 | 18.0 | 3276 | 1.4450 | 0.0059 | 6816.8819 | 4725.1025 | 2715.0 | 3270.0 | 0.8303 | 2725.0 | 0.8333 | 1748.0 | 1748.0 | 2026.0 | 0.8628 | 0.8628 | 967.0 | 967.0 | 1231.0 | 0.7855 | 0.7855 |
| 0.0 | 19.0 | 3458 | 1.4473 | 0.0059 | 6828.0096 | 4732.8156 | 2709.0 | 3270.0 | 0.8284 | 2719.0 | 0.8315 | 1746.0 | 1746.0 | 2026.0 | 0.8618 | 0.8618 | 963.0 | 963.0 | 1231.0 | 0.7823 | 0.7823 |
| 0.0 | 20.0 | 3640 | 1.4481 | 0.0059 | 6831.3665 | 4735.1424 | 2715.0 | 3270.0 | 0.8303 | 2725.0 | 0.8333 | 1749.0 | 1749.0 | 2026.0 | 0.8633 | 0.8633 | 966.0 | 966.0 | 1231.0 | 0.7847 | 0.7847 |
| 0.0 | 21.0 | 3822 | 1.4496 | 0.0059 | 6838.7001 | 4740.2257 | 2711.0 | 3270.0 | 0.8291 | 2721.0 | 0.8321 | 1747.0 | 1747.0 | 2026.0 | 0.8623 | 0.8623 | 964.0 | 964.0 | 1231.0 | 0.7831 | 0.7831 |
| 0.0 | 22.0 | 4004 | 1.4488 | 0.0059 | 6834.8013 | 4737.5233 | 2711.0 | 3270.0 | 0.8291 | 2721.0 | 0.8321 | 1747.0 | 1747.0 | 2026.0 | 0.8623 | 0.8623 | 964.0 | 964.0 | 1231.0 | 0.7831 | 0.7831 |
| 0.0 | 23.0 | 4186 | 1.4504 | 0.0059 | 6842.4888 | 4742.8518 | 2712.0 | 3270.0 | 0.8294 | 2722.0 | 0.8324 | 1748.0 | 1748.0 | 2026.0 | 0.8628 | 0.8628 | 964.0 | 964.0 | 1231.0 | 0.7831 | 0.7831 |
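
Two things stand out in the table: training stopped after 23 epochs even though num_epochs was set to 100, which suggests early stopping, and the headline results at the top of this card match the epoch 7 row (validation loss 1.2092, accuracy 0.8339), so the best checkpoint by validation loss appears to be the one that was kept.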

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1