BoolQ_Llama-3.2-1B-50ztosrf

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set (the values match the epoch 4, step 1180 row of the training results):

Loss: 0.9785
Model Preparation Time: 0.0058
Mdl: 4616.0341
Accumulated Loss: 3199.5911
Correct Preds: 2776.0
Total Preds: 3270.0
Accuracy: 0.8489
Correct Gen Preds: 2773.0
Gen Accuracy: 0.8480
Correct Gen Preds 9642: 1791.0
Correct Preds 9642: 1798.0
Total Labels 9642: 2026.0
Accuracy 9642: 0.8875
Gen Accuracy 9642: 0.8840
Correct Gen Preds 2822: 973.0
Correct Preds 2822: 978.0
Total Labels 2822: 1231.0
Accuracy 2822: 0.7945
Gen Accuracy 2822: 0.7904
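
The aggregate figures in this list appear to be related by simple arithmetic. The sketch below checks those relationships using only numbers copied from the list. Reading "Mdl" as the accumulated loss converted from nats to bits is an inference from the numbers, not something the card states, and the variable names are illustrative.

```python
import math

# Values copied from the evaluation results in this card.
accumulated_loss = 3199.5911
total_preds = 3270
correct_preds = 2776
correct_gen_preds = 2773
mdl = 4616.0341

# Per-label counts, keyed by the label id used in the card.
per_label = {
    "9642": {"correct": 1798, "total": 2026},
    "2822": {"correct": 978, "total": 1231},
}

# Mean loss per evaluated example.
mean_loss = accumulated_loss / total_preds

# Overall accuracies are correct counts divided by total predictions.
accuracy = correct_preds / total_preds
gen_accuracy = correct_gen_preds / total_preds

# Assumption: "Mdl" is the accumulated loss expressed in bits (nats / ln 2).
loss_in_bits = accumulated_loss / math.log(2)

# Per-label accuracy is correct predictions divided by labels of that class.
label_accuracy = {k: v["correct"] / v["total"] for k, v in per_label.items()}

print(f"mean loss     {mean_loss:.4f}")
print(f"accuracy      {accuracy:.4f}")
print(f"gen accuracy  {gen_accuracy:.4f}")
print(f"loss in bits  {loss_in_bits:.2f} (card reports Mdl = {mdl})")
for label, acc in label_accuracy.items():
    print(f"accuracy {label} {acc:.4f}")
```

The per-label correct counts sum to the overall Correct Preds (1798 + 978 = 2776), while the two Total Labels values sum to 3257, which is 13 fewer than Total Preds. The card does not explain the difference.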

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 32
eval_batch_size: 120
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: constant
lr_scheduler_warmup_ratio: 0.001
num_epochs: 100
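
As a hedged sketch, the listed values could be written as keyword arguments for `transformers.TrainingArguments`. The mapping of `train_batch_size` to `per_device_train_batch_size` assumes a single device with no gradient accumulation, which the card does not state, and this is not the original training script. The 295 steps per epoch come from the Step column of the training results.

```python
import math

# Hyperparameters from the card, keyed by the TrainingArguments keyword
# names they most plausibly correspond to (an assumed mapping).
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,  # assumes one device, no accumulation
    "per_device_eval_batch_size": 120,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "warmup_ratio": 0.001,
    "num_train_epochs": 100,
}

# One epoch ends at step 295 in the training results table.
steps_per_epoch = 295
batch_size = training_kwargs["per_device_train_batch_size"]

# Steps the run would take if all configured epochs were completed.
planned_steps = steps_per_epoch * training_kwargs["num_train_epochs"]

# Warmup steps implied by the ratio (rounded up).
warmup_steps = math.ceil(planned_steps * training_kwargs["warmup_ratio"])

# Range of training-set sizes consistent with 295 batches of 32,
# assuming the final partial batch is kept rather than dropped.
min_examples = (steps_per_epoch - 1) * batch_size + 1
max_examples = steps_per_epoch * batch_size

print(planned_steps, warmup_steps, min_examples, max_examples)
```

These keyword arguments could be passed as `TrainingArguments(output_dir="out", **training_kwargs)`, where `"out"` is a placeholder path. The plain `constant` schedule in Transformers does not take warmup steps, so the warmup ratio likely has no effect on this run. The training results stop at epoch 14 even though `num_epochs` is 100, and the card does not say why.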

Training results

Training Loss	Epoch	Step	Validation Loss	Model Preparation Time	Mdl	Accumulated Loss	Correct Preds	Total Preds	Accuracy	Correct Gen Preds	Gen Accuracy	Correct Gen Preds 9642	Correct Preds 9642	Total Labels 9642	Accuracy 9642	Gen Accuracy 9642	Correct Gen Preds 2822	Correct Preds 2822	Total Labels 2822	Accuracy 2822	Gen Accuracy 2822
No log	0	0	0.7080	0.0058	3339.8933	2315.0376	2032.0	3270.0	0.6214	2040.0	0.6239	2007.0	2008.0	2026.0	0.9911	0.9906	24.0	24.0	1231.0	0.0195	0.0195
0.3209	1.0	295	0.4495	0.0058	2120.4374	1469.7752	2653.0	3270.0	0.8113	2047.0	0.6260	1160.0	1557.0	2026.0	0.7685	0.5726	877.0	1096.0	1231.0	0.8903	0.7124
0.2469	2.0	590	0.4858	0.0058	2291.6124	1588.4247	2711.0	3270.0	0.8291	2070.0	0.6330	1100.0	1638.0	2026.0	0.8085	0.5429	963.0	1073.0	1231.0	0.8716	0.7823
0.001	3.0	885	0.9723	0.0058	4586.9576	3179.4368	2740.0	3270.0	0.8379	2711.0	0.8291	1752.0	1781.0	2026.0	0.8791	0.8648	950.0	959.0	1231.0	0.7790	0.7717
0.0001	4.0	1180	0.9785	0.0058	4616.0341	3199.5911	2776.0	3270.0	0.8489	2773.0	0.8480	1791.0	1798.0	2026.0	0.8875	0.8840	973.0	978.0	1231.0	0.7945	0.7904
0.0	5.0	1475	1.0825	0.0058	5106.9664	3539.8794	2764.0	3270.0	0.8453	2752.0	0.8416	1812.0	1828.0	2026.0	0.9023	0.8944	931.0	936.0	1231.0	0.7604	0.7563
0.0	6.0	1770	1.1417	0.0058	5385.9903	3733.2840	2768.0	3270.0	0.8465	2745.0	0.8394	1811.0	1832.0	2026.0	0.9042	0.8939	927.0	936.0	1231.0	0.7604	0.7530
0.0	7.0	2065	1.2458	0.0058	5877.3016	4073.8351	2762.0	3270.0	0.8446	2744.0	0.8391	1787.0	1807.0	2026.0	0.8919	0.8820	950.0	955.0	1231.0	0.7758	0.7717
0.0	8.0	2360	1.2477	0.0058	5886.0491	4079.8983	2762.0	3270.0	0.8446	2744.0	0.8391	1802.0	1821.0	2026.0	0.8988	0.8894	935.0	941.0	1231.0	0.7644	0.7595
0.6191	9.0	2655	1.2608	0.0058	5947.9186	4122.7830	2768.0	3270.0	0.8465	2749.0	0.8407	1805.0	1824.0	2026.0	0.9003	0.8909	937.0	944.0	1231.0	0.7669	0.7612
0.0	10.0	2950	1.2320	0.0058	5812.1306	4028.6620	2765.0	3270.0	0.8456	2746.0	0.8398	1807.0	1826.0	2026.0	0.9013	0.8919	932.0	939.0	1231.0	0.7628	0.7571
0.0001	11.0	3245	1.2418	0.0058	5858.3967	4060.7312	2767.0	3270.0	0.8462	2745.0	0.8394	1807.0	1829.0	2026.0	0.9028	0.8919	931.0	938.0	1231.0	0.7620	0.7563
0.0	12.0	3540	1.2715	0.0058	5998.6521	4157.9488	2765.0	3270.0	0.8456	2750.0	0.8410	1800.0	1817.0	2026.0	0.8968	0.8885	943.0	948.0	1231.0	0.7701	0.7660
0.0	13.0	3835	1.2808	0.0058	6042.3338	4188.2267	2760.0	3270.0	0.8440	2745.0	0.8394	1799.0	1816.0	2026.0	0.8963	0.8880	939.0	944.0	1231.0	0.7669	0.7628
0.0	14.0	4130	1.2792	0.0058	6034.8824	4183.0618	2765.0	3270.0	0.8456	2751.0	0.8413	1800.0	1818.0	2026.0	0.8973	0.8885	944.0	947.0	1231.0	0.7693	0.7669
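
To make the table easier to scan, the sketch below picks out the rows with the highest accuracy and the lowest validation loss. It uses three columns copied from the table (Epoch, Validation Loss, Accuracy), and the variable names are illustrative.

```python
# (epoch, validation_loss, accuracy) copied from the training results table.
rows = [
    (0, 0.7080, 0.6214),
    (1, 0.4495, 0.8113),
    (2, 0.4858, 0.8291),
    (3, 0.9723, 0.8379),
    (4, 0.9785, 0.8489),
    (5, 1.0825, 0.8453),
    (6, 1.1417, 0.8465),
    (7, 1.2458, 0.8446),
    (8, 1.2477, 0.8446),
    (9, 1.2608, 0.8465),
    (10, 1.2320, 0.8456),
    (11, 1.2418, 0.8462),
    (12, 1.2715, 0.8456),
    (13, 1.2808, 0.8440),
    (14, 1.2792, 0.8456),
]

# Row with the highest evaluation accuracy.
best_by_accuracy = max(rows, key=lambda r: r[2])

# Row with the lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[1])

print("best accuracy:", best_by_accuracy)
print("lowest validation loss:", best_by_loss)
```

The two criteria disagree: validation loss is lowest at epoch 1, while accuracy peaks at epoch 4, which is the row reported as the evaluation result. Validation loss keeps rising afterwards as training loss approaches zero, which may indicate increasingly overconfident predictions rather than a drop in classification quality.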

Framework versions

Transformers 4.51.3
Pytorch 2.6.0+cu124
Datasets 3.5.0
Tokenizers 0.21.1

Safetensors

Model size: 1B params
Tensor type: BF16

Model tree for donoway/BoolQ_Llama-3.2-1B-50ztosrf

Base model: meta-llama/Llama-3.2-1B