# BoolQ_Llama-3.2-1B-ql30kat9
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.0151
- Model Preparation Time: 0.0058
- Mdl: 4788.6201
- Accumulated Loss: 3319.2185
- Correct Preds: 2773.0
- Total Preds: 3270.0
- Accuracy: 0.8480
- Correct Gen Preds: 2627.0
- Gen Accuracy: 0.8034
- Correct Gen Preds 9642: 1742.0
- Correct Preds 9642: 1840.0
- Total Labels 9642: 2026.0
- Accuracy 9642: 0.9082
- Gen Accuracy 9642: 0.8598
- Correct Gen Preds 2822: 876.0
- Correct Preds 2822: 933.0
- Total Labels 2822: 1231.0
- Accuracy 2822: 0.7579
- Gen Accuracy 2822: 0.7116
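The headline numbers above are internally consistent: the reported Loss is the accumulated negative log-likelihood (in nats) averaged over predictions, Mdl is that same total converted to bits, and Accuracy is correct over total predictions. A minimal sketch checking this, with the values copied from the list above (the suffixed metrics, e.g. `9642` and `2822`, appear to be per-label breakdowns, likely keyed by answer token ID):

```python
import math

# Values copied from the evaluation summary above.
accumulated_loss = 3319.2185  # total negative log-likelihood, in nats
total_preds = 3270
correct_preds = 2773

mean_loss = accumulated_loss / total_preds  # per-prediction loss (reported Loss)
mdl = accumulated_loss / math.log(2)        # same total in bits (reported Mdl)
accuracy = correct_preds / total_preds      # reported Accuracy

print(round(mean_loss, 4), round(mdl, 4), round(accuracy, 4))
```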
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 120
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
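Assuming the training used the standard Hugging Face `Trainer` workflow (which the card does not confirm), the hyperparameters above would map onto `TrainingArguments` roughly as follows; `output_dir` is a placeholder, not taken from the card:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported settings, not the author's actual script.
args = TrainingArguments(
    output_dir="BoolQ_Llama-3.2-1B-ql30kat9",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=120,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```

Note that with `lr_scheduler_type="constant"` the small `warmup_ratio` has no effect in current `transformers`; a warmup phase would require `constant_with_warmup`.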
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 9642 | Correct Preds 9642 | Total Labels 9642 | Accuracy 9642 | Gen Accuracy 9642 | Correct Gen Preds 2822 | Correct Preds 2822 | Total Labels 2822 | Accuracy 2822 | Gen Accuracy 2822 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 0.7080 | 0.0058 | 3339.8933 | 2315.0376 | 2032.0 | 3270.0 | 0.6214 | 2040.0 | 0.6239 | 2007.0 | 2008.0 | 2026.0 | 0.9911 | 0.9906 | 24.0 | 24.0 | 1231.0 | 0.0195 | 0.0195 |
| 0.4686 | 1.0 | 264 | 0.4596 | 0.0058 | 2168.4484 | 1503.0539 | 2651.0 | 3270.0 | 0.8107 | 357.0 | 0.1092 | 141.0 | 1569.0 | 2026.0 | 0.7744 | 0.0696 | 209.0 | 1082.0 | 1231.0 | 0.8790 | 0.1698 |
| 0.6637 | 2.0 | 528 | 0.4609 | 0.0058 | 2174.1167 | 1506.9829 | 2767.0 | 3270.0 | 0.8462 | 1779.0 | 0.5440 | 1210.0 | 1820.0 | 2026.0 | 0.8983 | 0.5972 | 561.0 | 947.0 | 1231.0 | 0.7693 | 0.4557 |
| 0.255 | 3.0 | 792 | 0.7320 | 0.0058 | 3453.2028 | 2393.5778 | 2711.0 | 3270.0 | 0.8291 | 2235.0 | 0.6835 | 1399.0 | 1771.0 | 2026.0 | 0.8741 | 0.6905 | 828.0 | 940.0 | 1231.0 | 0.7636 | 0.6726 |
| 0.0004 | 4.0 | 1056 | 0.9760 | 0.0058 | 4604.4062 | 3191.5312 | 2741.0 | 3270.0 | 0.8382 | 2540.0 | 0.7768 | 1736.0 | 1864.0 | 2026.0 | 0.9200 | 0.8569 | 795.0 | 877.0 | 1231.0 | 0.7124 | 0.6458 |
| 0.0001 | 5.0 | 1320 | 1.0151 | 0.0058 | 4788.6201 | 3319.2185 | 2773.0 | 3270.0 | 0.8480 | 2627.0 | 0.8034 | 1742.0 | 1840.0 | 2026.0 | 0.9082 | 0.8598 | 876.0 | 933.0 | 1231.0 | 0.7579 | 0.7116 |
| 0.0 | 6.0 | 1584 | 1.1466 | 0.0058 | 5409.4124 | 3749.5189 | 2763.0 | 3270.0 | 0.8450 | 2640.0 | 0.8073 | 1697.0 | 1792.0 | 2026.0 | 0.8845 | 0.8376 | 935.0 | 971.0 | 1231.0 | 0.7888 | 0.7595 |
| 0.0001 | 7.0 | 1848 | 1.1966 | 0.0058 | 5644.9637 | 3912.7906 | 2758.0 | 3270.0 | 0.8434 | 2661.0 | 0.8138 | 1712.0 | 1784.0 | 2026.0 | 0.8806 | 0.8450 | 941.0 | 974.0 | 1231.0 | 0.7912 | 0.7644 |
| 0.0 | 8.0 | 2112 | 1.2063 | 0.0058 | 5690.8587 | 3944.6026 | 2765.0 | 3270.0 | 0.8456 | 2653.0 | 0.8113 | 1706.0 | 1792.0 | 2026.0 | 0.8845 | 0.8421 | 939.0 | 973.0 | 1231.0 | 0.7904 | 0.7628 |
| 0.0 | 9.0 | 2376 | 1.2089 | 0.0058 | 5703.1426 | 3953.1172 | 2760.0 | 3270.0 | 0.8440 | 2647.0 | 0.8095 | 1691.0 | 1780.0 | 2026.0 | 0.8786 | 0.8346 | 948.0 | 980.0 | 1231.0 | 0.7961 | 0.7701 |
| 0.0 | 10.0 | 2640 | 1.2415 | 0.0058 | 5856.9258 | 4059.7116 | 2759.0 | 3270.0 | 0.8437 | 2657.0 | 0.8125 | 1703.0 | 1783.0 | 2026.0 | 0.8801 | 0.8406 | 946.0 | 976.0 | 1231.0 | 0.7929 | 0.7685 |
| 0.4201 | 11.0 | 2904 | 1.2627 | 0.0058 | 5956.7524 | 4128.9061 | 2768.0 | 3270.0 | 0.8465 | 2670.0 | 0.8165 | 1715.0 | 1789.0 | 2026.0 | 0.8830 | 0.8465 | 947.0 | 979.0 | 1231.0 | 0.7953 | 0.7693 |
| 0.0001 | 12.0 | 3168 | 1.2632 | 0.0058 | 5959.0585 | 4130.5046 | 2761.0 | 3270.0 | 0.8443 | 2656.0 | 0.8122 | 1702.0 | 1785.0 | 2026.0 | 0.8810 | 0.8401 | 946.0 | 976.0 | 1231.0 | 0.7929 | 0.7685 |
| 0.0 | 13.0 | 3432 | 1.2741 | 0.0058 | 6010.4943 | 4166.1572 | 2764.0 | 3270.0 | 0.8453 | 2676.0 | 0.8183 | 1726.0 | 1795.0 | 2026.0 | 0.8860 | 0.8519 | 942.0 | 969.0 | 1231.0 | 0.7872 | 0.7652 |
| 0.0 | 14.0 | 3696 | 1.2920 | 0.0058 | 6095.3461 | 4224.9719 | 2764.0 | 3270.0 | 0.8453 | 2684.0 | 0.8208 | 1737.0 | 1798.0 | 2026.0 | 0.8875 | 0.8574 | 939.0 | 966.0 | 1231.0 | 0.7847 | 0.7628 |
| 0.0 | 15.0 | 3960 | 1.3017 | 0.0058 | 6140.9792 | 4256.6024 | 2765.0 | 3270.0 | 0.8456 | 2684.0 | 0.8208 | 1723.0 | 1783.0 | 2026.0 | 0.8801 | 0.8504 | 953.0 | 982.0 | 1231.0 | 0.7977 | 0.7742 |
| 0.0 | 16.0 | 4224 | 1.3052 | 0.0058 | 6157.3538 | 4267.9524 | 2765.0 | 3270.0 | 0.8456 | 2681.0 | 0.8199 | 1724.0 | 1786.0 | 2026.0 | 0.8815 | 0.8509 | 949.0 | 979.0 | 1231.0 | 0.7953 | 0.7709 |
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1