# BoolQ_Llama-3.2-1B-n5s6b4x8

This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset (the model name suggests BoolQ). It achieves the following results on the evaluation set:
- Loss: 1.4568
- Model Preparation Time: 0.0058
- Mdl: 6872.6248
- Accumulated Loss: 4763.7405
- Correct Preds: 2760.0
- Total Preds: 3270.0
- Accuracy: 0.8440
- Correct Gen Preds: 2757.0
- Gen Accuracy: 0.8431
- Correct Gen Preds 9642: 1814.0
- Correct Preds 9642: 1823.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.8998
- Gen Accuracy 9642: 0.8954
- Correct Gen Preds 2822: 934.0
- Correct Preds 2822: 937.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.7612
- Gen Accuracy 2822: 0.7587
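The headline accuracies above follow directly from the raw counts; a minimal sanity check using only the numbers reported in this card:

```python
# Verify the reported accuracies from the raw prediction counts above.
def accuracy(correct, total):
    return correct / total

overall = accuracy(2760, 3270)      # reported Accuracy: 0.8440
gen_overall = accuracy(2757, 3270)  # reported Gen Accuracy: 0.8431
label_9642 = accuracy(1823, 2026)   # reported Accuracy 9642: 0.8998
label_2822 = accuracy(937, 1231)    # reported Accuracy 2822: 0.7612

print(round(overall, 4), round(label_9642, 4), round(label_2822, 4))
```

Note the per-label gap: the model is markedly stronger on label 9642 (~0.90) than on label 2822 (~0.76).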
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
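As a sketch, these hyperparameters map onto `transformers.TrainingArguments` keyword arguments roughly as follows (an assumed mapping based on the standard `TrainingArguments` API, not a confirmed reconstruction of the original training script):

```python
# Hyperparameters from this card, expressed as the keyword arguments one
# would typically pass to transformers.TrainingArguments (assumed mapping).
training_args = dict(
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",       # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```

Although `num_train_epochs` was set to 100, the results table below stops at epoch 26.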
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0058 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.3324 | 1.0 | 232 | 0.4426 | 0.0058 | 2088.1525 | 1447.3970 | 2676.0 | 3270.0 | 0.8183 | 2679.0 | 0.8193 | 1808.0 | 1809.0 | 2026.0 | 0.8929 | 0.8924 | 862.0 | 867.0 | 1231.0 | 0.7043 | 0.7002 |
| 0.4917 | 2.0 | 464 | 0.4887 | 0.0058 | 2305.5456 | 1598.0824 | 2697.0 | 3270.0 | 0.8248 | 2690.0 | 0.8226 | 1887.0 | 1893.0 | 2026.0 | 0.9344 | 0.9314 | 797.0 | 804.0 | 1231.0 | 0.6531 | 0.6474 |
| 0.0028 | 3.0 | 696 | 0.7594 | 0.0058 | 3582.3567 | 2483.1004 | 2676.0 | 3270.0 | 0.8183 | 2659.0 | 0.8131 | 1641.0 | 1663.0 | 2026.0 | 0.8208 | 0.8100 | 1009.0 | 1013.0 | 1231.0 | 0.8229 | 0.8197 |
| 0.0011 | 4.0 | 928 | 0.9795 | 0.0058 | 4620.9576 | 3203.0038 | 2731.0 | 3270.0 | 0.8352 | 2723.0 | 0.8327 | 1788.0 | 1799.0 | 2026.0 | 0.8880 | 0.8825 | 927.0 | 932.0 | 1231.0 | 0.7571 | 0.7530 |
| 0.1389 | 5.0 | 1160 | 1.0611 | 0.0058 | 5005.9836 | 3469.8834 | 2739.0 | 3270.0 | 0.8376 | 2737.0 | 0.8370 | 1847.0 | 1853.0 | 2026.0 | 0.9146 | 0.9116 | 882.0 | 886.0 | 1231.0 | 0.7197 | 0.7165 |
| 0.0002 | 6.0 | 1392 | 1.1056 | 0.0058 | 5215.9426 | 3615.4159 | 2749.0 | 3270.0 | 0.8407 | 2751.0 | 0.8413 | 1881.0 | 1885.0 | 2026.0 | 0.9304 | 0.9284 | 862.0 | 864.0 | 1231.0 | 0.7019 | 0.7002 |
| 0.0001 | 7.0 | 1624 | 1.2332 | 0.0058 | 5817.6850 | 4032.5120 | 2754.0 | 3270.0 | 0.8422 | 2738.0 | 0.8373 | 1806.0 | 1824.0 | 2026.0 | 0.9003 | 0.8914 | 923.0 | 930.0 | 1231.0 | 0.7555 | 0.7498 |
| 0.0 | 8.0 | 1856 | 1.2209 | 0.0058 | 5759.8281 | 3992.4086 | 2754.0 | 3270.0 | 0.8422 | 2753.0 | 0.8419 | 1788.0 | 1798.0 | 2026.0 | 0.8875 | 0.8825 | 956.0 | 956.0 | 1231.0 | 0.7766 | 0.7766 |
| 0.0 | 9.0 | 2088 | 1.4452 | 0.0058 | 6817.7274 | 4725.6885 | 2750.0 | 3270.0 | 0.8410 | 2746.0 | 0.8398 | 1815.0 | 1825.0 | 2026.0 | 0.9008 | 0.8959 | 922.0 | 925.0 | 1231.0 | 0.7514 | 0.7490 |
| 0.0 | 10.0 | 2320 | 1.4119 | 0.0058 | 6660.5648 | 4616.7517 | 2752.0 | 3270.0 | 0.8416 | 2749.0 | 0.8407 | 1797.0 | 1807.0 | 2026.0 | 0.8919 | 0.8870 | 943.0 | 945.0 | 1231.0 | 0.7677 | 0.7660 |
| 0.0 | 11.0 | 2552 | 1.4389 | 0.0058 | 6788.4022 | 4705.3618 | 2753.0 | 3270.0 | 0.8419 | 2751.0 | 0.8413 | 1813.0 | 1822.0 | 2026.0 | 0.8993 | 0.8949 | 929.0 | 931.0 | 1231.0 | 0.7563 | 0.7547 |
| 0.0 | 12.0 | 2784 | 1.4300 | 0.0058 | 6746.3247 | 4676.1959 | 2755.0 | 3270.0 | 0.8425 | 2752.0 | 0.8416 | 1812.0 | 1821.0 | 2026.0 | 0.8988 | 0.8944 | 931.0 | 934.0 | 1231.0 | 0.7587 | 0.7563 |
| 0.0 | 13.0 | 3016 | 1.4335 | 0.0058 | 6762.4940 | 4687.4036 | 2756.0 | 3270.0 | 0.8428 | 2750.0 | 0.8410 | 1806.0 | 1819.0 | 2026.0 | 0.8978 | 0.8914 | 935.0 | 937.0 | 1231.0 | 0.7612 | 0.7595 |
| 0.0 | 14.0 | 3248 | 1.4568 | 0.0058 | 6872.6248 | 4763.7405 | 2760.0 | 3270.0 | 0.8440 | 2757.0 | 0.8431 | 1814.0 | 1823.0 | 2026.0 | 0.8998 | 0.8954 | 934.0 | 937.0 | 1231.0 | 0.7612 | 0.7587 |
| 0.0 | 15.0 | 3480 | 1.4631 | 0.0058 | 6902.2813 | 4784.2968 | 2750.0 | 3270.0 | 0.8410 | 2739.0 | 0.8376 | 1792.0 | 1809.0 | 2026.0 | 0.8929 | 0.8845 | 938.0 | 941.0 | 1231.0 | 0.7644 | 0.7620 |
| 0.0 | 16.0 | 3712 | 1.4765 | 0.0058 | 6965.4556 | 4828.0859 | 2754.0 | 3270.0 | 0.8422 | 2743.0 | 0.8388 | 1797.0 | 1814.0 | 2026.0 | 0.8954 | 0.8870 | 937.0 | 940.0 | 1231.0 | 0.7636 | 0.7612 |
| 0.0 | 17.0 | 3944 | 1.4796 | 0.0058 | 6980.1585 | 4838.2772 | 2751.0 | 3270.0 | 0.8413 | 2745.0 | 0.8394 | 1799.0 | 1812.0 | 2026.0 | 0.8944 | 0.8880 | 937.0 | 939.0 | 1231.0 | 0.7628 | 0.7612 |
| 0.0 | 18.0 | 4176 | 1.4793 | 0.0058 | 6978.7939 | 4837.3313 | 2755.0 | 3270.0 | 0.8425 | 2748.0 | 0.8404 | 1799.0 | 1813.0 | 2026.0 | 0.8949 | 0.8880 | 940.0 | 942.0 | 1231.0 | 0.7652 | 0.7636 |
| 0.0 | 19.0 | 4408 | 1.4822 | 0.0058 | 6992.2377 | 4846.6498 | 2752.0 | 3270.0 | 0.8416 | 2742.0 | 0.8385 | 1798.0 | 1815.0 | 2026.0 | 0.8959 | 0.8875 | 935.0 | 937.0 | 1231.0 | 0.7612 | 0.7595 |
| 0.0 | 20.0 | 4640 | 1.4798 | 0.0058 | 6980.8944 | 4838.7873 | 2753.0 | 3270.0 | 0.8419 | 2745.0 | 0.8394 | 1798.0 | 1812.0 | 2026.0 | 0.8944 | 0.8875 | 938.0 | 941.0 | 1231.0 | 0.7644 | 0.7620 |
| 0.0 | 21.0 | 4872 | 1.4847 | 0.0058 | 7004.0401 | 4854.8307 | 2755.0 | 3270.0 | 0.8425 | 2748.0 | 0.8404 | 1801.0 | 1815.0 | 2026.0 | 0.8959 | 0.8889 | 938.0 | 940.0 | 1231.0 | 0.7636 | 0.7620 |
| 0.0 | 22.0 | 5104 | 1.4801 | 0.0058 | 6982.3382 | 4839.7880 | 2754.0 | 3270.0 | 0.8422 | 2746.0 | 0.8398 | 1797.0 | 1812.0 | 2026.0 | 0.8944 | 0.8870 | 940.0 | 942.0 | 1231.0 | 0.7652 | 0.7636 |
| 0.0 | 23.0 | 5336 | 1.4791 | 0.0058 | 6977.9730 | 4836.7623 | 2756.0 | 3270.0 | 0.8428 | 2747.0 | 0.8401 | 1801.0 | 1816.0 | 2026.0 | 0.8963 | 0.8889 | 937.0 | 940.0 | 1231.0 | 0.7636 | 0.7612 |
| 0.0 | 24.0 | 5568 | 1.4821 | 0.0058 | 6991.9891 | 4846.4775 | 2751.0 | 3270.0 | 0.8413 | 2743.0 | 0.8388 | 1797.0 | 1812.0 | 2026.0 | 0.8944 | 0.8870 | 937.0 | 939.0 | 1231.0 | 0.7628 | 0.7612 |
| 0.0 | 25.0 | 5800 | 1.4844 | 0.0058 | 7003.0013 | 4854.1106 | 2754.0 | 3270.0 | 0.8422 | 2746.0 | 0.8398 | 1799.0 | 1812.0 | 2026.0 | 0.8944 | 0.8880 | 938.0 | 942.0 | 1231.0 | 0.7652 | 0.7620 |
| 0.0 | 26.0 | 6032 | 1.4848 | 0.0058 | 7004.8082 | 4855.3631 | 2760.0 | 3270.0 | 0.8440 | 2750.0 | 0.8410 | 1800.0 | 1816.0 | 2026.0 | 0.8963 | 0.8885 | 941.0 | 944.0 | 1231.0 | 0.7669 | 0.7644 |
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1