# BoolQ_Llama-3.2-1B-131yj8sj
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset (the model name suggests BoolQ). It achieves the following results on the evaluation set:
- Loss: 1.4452
- Model Preparation Time: 0.0057
- Mdl: 6818.1174
- Accumulated Loss: 4725.9588
- Correct Preds: 2702.0
- Total Preds: 3270.0
- Accuracy: 0.8263
- Correct Gen Preds: 2701.0
- Gen Accuracy: 0.8260
- Correct Gen Preds 9642: 1791.0
- Correct Preds 9642: 1798.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.8875
- Gen Accuracy 9642: 0.8840
- Correct Gen Preds 2822: 901.0
- Correct Preds 2822: 904.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.7344
- Gen Accuracy 2822: 0.7319

The metrics suffixed with 9642 and 2822 appear to be per-label breakdowns of the aggregate numbers (the suffixes are most likely the token IDs of the two answer labels).
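The aggregate numbers above are internally consistent under a few simple relations. As a hedged reading (not confirmed by the card): accuracy is correct over total predictions, the accumulated loss is roughly the mean loss times the number of predictions, and Mdl looks like the accumulated loss converted from nats to bits:

```python
import math

# Reported evaluation metrics, copied from the list above.
loss = 1.4452                 # mean eval loss (nats per example)
accumulated_loss = 4725.9588
mdl = 6818.1174
correct_preds, total_preds = 2702, 3270

# Accuracy = correct predictions / total predictions.
accuracy = correct_preds / total_preds
print(round(accuracy, 4))                        # 0.8263

# Accumulated loss ~ mean loss x number of predictions.
print(round(loss * total_preds, 1))              # ~4725.8

# Mdl (minimum description length, in bits) ~ accumulated loss / ln(2).
print(round(accumulated_loss / math.log(2), 1))  # ~6818.1
```

The same check reproduces the per-label accuracies: 1798/2026 ≈ 0.8875 and 904/1231 ≈ 0.7344.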
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
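The cosine schedule with a 0.01 warmup ratio can be sketched in plain Python. The total step count of 14300 is an assumption inferred from the results table (143 optimizer steps per epoch times `num_epochs: 100`; training logs below stop at epoch 26, but the schedule would have been laid out over the full run):

```python
import math

def lr_at_step(step, total_steps, base_lr=2e-5, warmup_ratio=0.01):
    """Cosine decay with linear warmup, mirroring the listed
    learning_rate, lr_scheduler_type, and lr_scheduler_warmup_ratio."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the base learning rate.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 14300  # assumed: 143 steps/epoch (from the table) x 100 epochs
print(lr_at_step(0, total))      # 0.0 (start of warmup)
print(lr_at_step(143, total))    # 2e-05 (peak, end of warmup)
print(lr_at_step(total, total))  # ~0.0 (fully decayed)
```

This mirrors the behavior of the scheduler used by the `transformers` Trainer for `lr_scheduler_type: cosine`; it is an illustrative sketch, not the exact library implementation.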
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0057 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.2476 | 1.0 | 143 | 0.4988 | 0.0057 | 2353.0385 | 1631.0020 | 2591.0 | 3270.0 | 0.7924 | 2599.0 | 0.7948 | 1843.0 | 1843.0 | 2026.0 | 0.9097 | 0.9097 | 747.0 | 748.0 | 1231.0 | 0.6076 | 0.6068 |
| 0.0885 | 2.0 | 286 | 0.5426 | 0.0057 | 2559.9190 | 1774.4006 | 2626.0 | 3270.0 | 0.8031 | 2626.0 | 0.8031 | 1900.0 | 1906.0 | 2026.0 | 0.9408 | 0.9378 | 717.0 | 720.0 | 1231.0 | 0.5849 | 0.5825 |
| 0.0086 | 3.0 | 429 | 0.7471 | 0.0057 | 3524.5342 | 2443.0209 | 2655.0 | 3270.0 | 0.8119 | 2625.0 | 0.8028 | 1638.0 | 1667.0 | 2026.0 | 0.8228 | 0.8085 | 978.0 | 988.0 | 1231.0 | 0.8026 | 0.7945 |
| 0.0002 | 4.0 | 572 | 1.1866 | 0.0057 | 5597.8044 | 3880.1023 | 2662.0 | 3270.0 | 0.8141 | 2663.0 | 0.8144 | 1703.0 | 1707.0 | 2026.0 | 0.8425 | 0.8406 | 953.0 | 955.0 | 1231.0 | 0.7758 | 0.7742 |
| 0.0115 | 5.0 | 715 | 1.3058 | 0.0057 | 6160.2400 | 4269.9530 | 2673.0 | 3270.0 | 0.8174 | 2664.0 | 0.8147 | 1791.0 | 1797.0 | 2026.0 | 0.8870 | 0.8840 | 864.0 | 876.0 | 1231.0 | 0.7116 | 0.7019 |
| 0.0 | 6.0 | 858 | 1.4452 | 0.0057 | 6818.1174 | 4725.9588 | 2702.0 | 3270.0 | 0.8263 | 2701.0 | 0.8260 | 1791.0 | 1798.0 | 2026.0 | 0.8875 | 0.8840 | 901.0 | 904.0 | 1231.0 | 0.7344 | 0.7319 |
| 0.0 | 7.0 | 1001 | 1.4433 | 0.0057 | 6808.9128 | 4719.5787 | 2698.0 | 3270.0 | 0.8251 | 2704.0 | 0.8269 | 1812.0 | 1814.0 | 2026.0 | 0.8954 | 0.8944 | 883.0 | 884.0 | 1231.0 | 0.7181 | 0.7173 |
| 0.0 | 8.0 | 1144 | 1.3856 | 0.0057 | 6536.7240 | 4530.9118 | 2691.0 | 3270.0 | 0.8229 | 2694.0 | 0.8239 | 1768.0 | 1772.0 | 2026.0 | 0.8746 | 0.8727 | 917.0 | 919.0 | 1231.0 | 0.7465 | 0.7449 |
| 0.9802 | 9.0 | 1287 | 1.4773 | 0.0057 | 6969.2721 | 4830.7313 | 2692.0 | 3270.0 | 0.8232 | 2698.0 | 0.8251 | 1793.0 | 1795.0 | 2026.0 | 0.8860 | 0.8850 | 897.0 | 897.0 | 1231.0 | 0.7287 | 0.7287 |
| 0.0 | 10.0 | 1430 | 1.5437 | 0.0057 | 7282.6372 | 5047.9395 | 2695.0 | 3270.0 | 0.8242 | 2701.0 | 0.8260 | 1775.0 | 1777.0 | 2026.0 | 0.8771 | 0.8761 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 11.0 | 1573 | 1.5490 | 0.0057 | 7307.5108 | 5065.1805 | 2690.0 | 3270.0 | 0.8226 | 2696.0 | 0.8245 | 1771.0 | 1773.0 | 2026.0 | 0.8751 | 0.8741 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 12.0 | 1716 | 1.5529 | 0.0057 | 7325.9736 | 5077.9779 | 2692.0 | 3270.0 | 0.8232 | 2697.0 | 0.8248 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 13.0 | 1859 | 1.5565 | 0.0057 | 7343.1664 | 5089.8951 | 2691.0 | 3270.0 | 0.8229 | 2696.0 | 0.8245 | 1771.0 | 1773.0 | 2026.0 | 0.8751 | 0.8741 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 14.0 | 2002 | 1.5552 | 0.0057 | 7336.7036 | 5085.4154 | 2692.0 | 3270.0 | 0.8232 | 2697.0 | 0.8248 | 1772.0 | 1774.0 | 2026.0 | 0.8756 | 0.8746 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.9802 | 15.0 | 2145 | 1.5579 | 0.0057 | 7349.6490 | 5094.3885 | 2695.0 | 3270.0 | 0.8242 | 2700.0 | 0.8257 | 1774.0 | 1776.0 | 2026.0 | 0.8766 | 0.8756 | 918.0 | 919.0 | 1231.0 | 0.7465 | 0.7457 |
| 0.0 | 16.0 | 2288 | 1.5570 | 0.0057 | 7345.2574 | 5091.3444 | 2689.0 | 3270.0 | 0.8223 | 2694.0 | 0.8239 | 1770.0 | 1772.0 | 2026.0 | 0.8746 | 0.8736 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 17.0 | 2431 | 1.5594 | 0.0057 | 7356.5874 | 5099.1978 | 2693.0 | 3270.0 | 0.8235 | 2699.0 | 0.8254 | 1772.0 | 1774.0 | 2026.0 | 0.8756 | 0.8746 | 918.0 | 919.0 | 1231.0 | 0.7465 | 0.7457 |
| 0.0 | 18.0 | 2574 | 1.5588 | 0.0057 | 7354.0051 | 5097.4079 | 2693.0 | 3270.0 | 0.8235 | 2699.0 | 0.8254 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 19.0 | 2717 | 1.5574 | 0.0057 | 7347.1134 | 5092.6310 | 2694.0 | 3270.0 | 0.8239 | 2700.0 | 0.8257 | 1775.0 | 1777.0 | 2026.0 | 0.8771 | 0.8761 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 20.0 | 2860 | 1.5598 | 0.0057 | 7358.7582 | 5100.7025 | 2694.0 | 3270.0 | 0.8239 | 2699.0 | 0.8254 | 1776.0 | 1778.0 | 2026.0 | 0.8776 | 0.8766 | 915.0 | 916.0 | 1231.0 | 0.7441 | 0.7433 |
| 0.0 | 21.0 | 3003 | 1.5610 | 0.0057 | 7364.2419 | 5104.5035 | 2693.0 | 3270.0 | 0.8235 | 2699.0 | 0.8254 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 22.0 | 3146 | 1.5590 | 0.0057 | 7354.8963 | 5098.0257 | 2695.0 | 3270.0 | 0.8242 | 2700.0 | 0.8257 | 1775.0 | 1777.0 | 2026.0 | 0.8771 | 0.8761 | 917.0 | 918.0 | 1231.0 | 0.7457 | 0.7449 |
| 0.0 | 23.0 | 3289 | 1.5609 | 0.0057 | 7363.6331 | 5104.0815 | 2692.0 | 3270.0 | 0.8232 | 2698.0 | 0.8251 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 24.0 | 3432 | 1.5620 | 0.0057 | 7368.7476 | 5107.6266 | 2694.0 | 3270.0 | 0.8239 | 2699.0 | 0.8254 | 1775.0 | 1777.0 | 2026.0 | 0.8771 | 0.8761 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 25.0 | 3575 | 1.5613 | 0.0057 | 7365.4606 | 5105.3482 | 2693.0 | 3270.0 | 0.8235 | 2699.0 | 0.8254 | 1774.0 | 1776.0 | 2026.0 | 0.8766 | 0.8756 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
| 0.0 | 26.0 | 3718 | 1.5604 | 0.0057 | 7361.4952 | 5102.5996 | 2692.0 | 3270.0 | 0.8232 | 2697.0 | 0.8248 | 1773.0 | 1775.0 | 2026.0 | 0.8761 | 0.8751 | 916.0 | 917.0 | 1231.0 | 0.7449 | 0.7441 |
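The headline metrics at the top of this card match the epoch-6 row of the table (validation loss 1.4452, accuracy 0.8263), which suggests the reported checkpoint was selected by best accuracy rather than lowest loss. A minimal sketch of that selection over a few rows copied from the table:

```python
# (epoch, validation_loss, accuracy) for a few rows of the table above.
rows = [
    (1, 0.4988, 0.7924),
    (3, 0.7471, 0.8119),
    (6, 1.4452, 0.8263),
    (7, 1.4433, 0.8251),
    (26, 1.5604, 0.8232),
]

# Pick the best checkpoint by accuracy (the apparent selection criterion).
best = max(rows, key=lambda r: r[2])
print(best)  # (6, 1.4452, 0.8263)
```

Note that validation loss keeps rising after epoch 1 while accuracy plateaus around 0.82, a typical overfitting pattern for a small model fine-tuned for many epochs.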
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1