ARC-Easy_Llama-3.2-1B-4fpnn1i5

This model is a fine-tuned version of meta-llama/Llama-3.2-1B; the training dataset is not recorded in this card. On the evaluation set it achieves the following results, which match the epoch 9 (step 81) row of the training results table:

Loss: 3.2479
Model Preparation Time: 0.0058
Mdl: 2670.8884
Accumulated Loss: 1851.3188
Correct Preds: 412.0
Total Preds: 570.0
Accuracy: 0.7228
Correct Gen Preds: 404.0
Gen Accuracy: 0.7088
Correct Gen Preds 32: 112.0
Correct Preds 32: 116.0
Total Labels 32: 158.0
Accuracy 32: 0.7342
Gen Accuracy 32: 0.7089
Correct Gen Preds 33: 108.0
Correct Preds 33: 109.0
Total Labels 33: 152.0
Accuracy 33: 0.7171
Gen Accuracy 33: 0.7105
Correct Gen Preds 34: 105.0
Correct Preds 34: 106.0
Total Labels 34: 142.0
Accuracy 34: 0.7465
Gen Accuracy 34: 0.7394
Correct Gen Preds 35: 79.0
Correct Preds 35: 81.0
Total Labels 35: 118.0
Accuracy 35: 0.6864
Gen Accuracy 35: 0.6695
Correct Gen Preds 36: 0.0
Correct Preds 36: 0.0
Total Labels 36: 0.0
Accuracy 36: 0.0
Gen Accuracy 36: 0.0
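
The ratios in the list above can be recomputed from the raw counts. The sketch below does that in Python and also checks two relationships that are only inferred from the numbers, not stated anywhere in this card: that Accumulated Loss is approximately Loss times Total Preds, and that Mdl is approximately Accumulated Loss divided by ln 2 (the same quantity expressed in bits). The helper name `ratio` is illustrative.

```python
import math

# Values copied from the evaluation summary above.
loss = 3.2479
accumulated_loss = 1851.3188
mdl = 2670.8884
correct_preds, correct_gen_preds, total_preds = 412, 404, 570

# label -> (correct preds, correct gen preds, total labels)
per_label = {
    32: (116, 112, 158),
    33: (109, 108, 152),
    34: (106, 105, 142),
    35: (81, 79, 118),
    36: (0, 0, 0),
}


def ratio(correct, total):
    """Return correct / total, or 0.0 when the label never occurs."""
    return correct / total if total else 0.0


accuracy = round(ratio(correct_preds, total_preds), 4)
gen_accuracy = round(ratio(correct_gen_preds, total_preds), 4)
label_accuracy = {k: round(ratio(c, t), 4) for k, (c, _, t) in per_label.items()}
label_gen_accuracy = {k: round(ratio(g, t), 4) for k, (_, g, t) in per_label.items()}

# Assumption inferred from the numbers: accumulated loss is the summed
# per-example loss, and Mdl is that sum converted from nats to bits.
loss_times_n = loss * total_preds
accumulated_loss_in_bits = accumulated_loss / math.log(2)

print(accuracy, gen_accuracy)
print(label_accuracy)
print(round(loss_times_n, 2), round(accumulated_loss_in_bits, 2))
```

The per-label totals sum to Total Preds (570), and the per-label correct counts sum to Correct Preds (412) and Correct Gen Preds (404). Label 36 has no examples in the evaluation set, so its accuracies are reported as 0.0 rather than left undefined.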

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 64
eval_batch_size: 112
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: constant
lr_scheduler_warmup_ratio: 0.001
num_epochs: 100 (the training results table records evaluations only through epoch 13)
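
As a rough guide to reproducing this configuration, the values above can be mapped onto keyword arguments of `transformers.TrainingArguments`. This is a minimal sketch, not the original training script: treating the batch sizes as per-device values, the `output_dir` path, and both helper functions are assumptions made for illustration. With `lr_scheduler_type` set to `constant`, the warmup ratio probably has no effect, since the warmup variant in transformers is named `constant_with_warmup`.

```python
# Hyperparameters from the list above, keyed by TrainingArguments parameter names.
# Assumption: the card's batch sizes are per-device values (single-device run).
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 64,
    "per_device_eval_batch_size": 112,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "warmup_ratio": 0.001,
    "num_train_epochs": 100,
}


def build_training_arguments(output_dir="./arc-easy-run"):
    """Build TrainingArguments; output_dir is a hypothetical path."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **training_kwargs)


def example_count_range(steps_per_epoch, batch_size):
    """Range of dataset sizes that yields this many steps per epoch.

    Assumes one device, no gradient accumulation, and that the last
    partial batch is kept.
    """
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


# The training results table shows 9 optimizer steps per epoch.
print(example_count_range(9, training_kwargs["per_device_train_batch_size"]))
```

Under those assumptions, 9 steps per epoch at batch size 64 would correspond to a training set of between 513 and 576 examples.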

Training results

Training Loss	Epoch	Step	Validation Loss	Model Preparation Time	Mdl	Accumulated Loss	Correct Preds	Total Preds	Accuracy	Correct Gen Preds	Gen Accuracy	Correct Gen Preds 32	Correct Preds 32	Total Labels 32	Accuracy 32	Gen Accuracy 32	Correct Gen Preds 33	Correct Preds 33	Total Labels 33	Accuracy 33	Gen Accuracy 33	Correct Gen Preds 34	Correct Preds 34	Total Labels 34	Accuracy 34	Gen Accuracy 34	Correct Gen Preds 35	Correct Preds 35	Total Labels 35	Accuracy 35	Gen Accuracy 35
No log	0	0	1.5354	0.0058	1262.6022	875.1692	172.0	570.0	0.3018	170.0	0.2982	154.0	154.0	158.0	0.9747	0.9747	0.0	0.0	152.0	0.0	0.0	15.0	17.0	142.0	0.1197	0.1056	1.0	1.0	118.0	0.0085	0.0085
0.809	1.0	9	0.9572	0.0058	787.1470	545.6087	378.0	570.0	0.6632	377.0	0.6614	86.0	87.0	158.0	0.5506	0.5443	104.0	104.0	152.0	0.6842	0.6842	98.0	98.0	142.0	0.6901	0.6901	89.0	89.0	118.0	0.7542	0.7542
0.5794	2.0	18	0.9376	0.0058	771.0268	534.4351	391.0	570.0	0.6860	391.0	0.6860	90.0	90.0	158.0	0.5696	0.5696	107.0	107.0	152.0	0.7039	0.7039	115.0	115.0	142.0	0.8099	0.8099	79.0	79.0	118.0	0.6695	0.6695
0.3539	3.0	27	1.0844	0.0058	891.7207	618.0937	397.0	570.0	0.6965	393.0	0.6895	112.0	114.0	158.0	0.7215	0.7089	99.0	100.0	152.0	0.6579	0.6513	102.0	102.0	142.0	0.7183	0.7183	80.0	81.0	118.0	0.6864	0.6780
0.0158	4.0	36	1.3047	0.0058	1072.8846	743.6669	406.0	570.0	0.7123	399.0	0.7	109.0	111.0	158.0	0.7025	0.6899	101.0	104.0	152.0	0.6842	0.6645	109.0	109.0	142.0	0.7676	0.7676	80.0	82.0	118.0	0.6949	0.6780
0.1421	5.0	45	2.0123	0.0058	1654.7928	1147.0150	408.0	570.0	0.7158	405.0	0.7105	100.0	101.0	158.0	0.6392	0.6329	105.0	106.0	152.0	0.6974	0.6908	118.0	118.0	142.0	0.8310	0.8310	82.0	83.0	118.0	0.7034	0.6949
0.0005	6.0	54	2.1961	0.0058	1805.8913	1251.7484	400.0	570.0	0.7018	360.0	0.6316	95.0	116.0	158.0	0.7342	0.6013	92.0	96.0	152.0	0.6316	0.6053	102.0	108.0	142.0	0.7606	0.7183	71.0	80.0	118.0	0.6780	0.6017
0.0001	7.0	63	2.9584	0.0058	2432.8057	1686.2924	409.0	570.0	0.7175	394.0	0.6912	106.0	116.0	158.0	0.7342	0.6709	107.0	108.0	152.0	0.7105	0.7039	108.0	110.0	142.0	0.7746	0.7606	73.0	75.0	118.0	0.6356	0.6186
0.0	8.0	72	3.1749	0.0058	2610.8691	1809.7166	410.0	570.0	0.7193	402.0	0.7053	112.0	116.0	158.0	0.7342	0.7089	107.0	108.0	152.0	0.7105	0.7039	107.0	108.0	142.0	0.7606	0.7535	76.0	78.0	118.0	0.6610	0.6441
0.0	9.0	81	3.2479	0.0058	2670.8884	1851.3188	412.0	570.0	0.7228	404.0	0.7088	112.0	116.0	158.0	0.7342	0.7089	108.0	109.0	152.0	0.7171	0.7105	105.0	106.0	142.0	0.7465	0.7394	79.0	81.0	118.0	0.6864	0.6695
0.0	10.0	90	3.2412	0.0058	2665.3613	1847.4877	412.0	570.0	0.7228	403.0	0.7070	112.0	116.0	158.0	0.7342	0.7089	107.0	108.0	152.0	0.7105	0.7039	106.0	107.0	142.0	0.7535	0.7465	78.0	81.0	118.0	0.6864	0.6610
0.0	11.0	99	3.2733	0.0058	2691.7663	1865.7903	409.0	570.0	0.7175	400.0	0.7018	112.0	116.0	158.0	0.7342	0.7089	107.0	108.0	152.0	0.7105	0.7039	104.0	105.0	142.0	0.7394	0.7324	77.0	80.0	118.0	0.6780	0.6525
0.0	12.0	108	3.2483	0.0058	2671.1685	1851.5129	411.0	570.0	0.7211	405.0	0.7105	114.0	116.0	158.0	0.7342	0.7215	108.0	109.0	152.0	0.7171	0.7105	105.0	106.0	142.0	0.7465	0.7394	78.0	80.0	118.0	0.6780	0.6610
0.0	13.0	117	3.2543	0.0058	2676.1586	1854.9718	412.0	570.0	0.7228	404.0	0.7088	113.0	116.0	158.0	0.7342	0.7152	108.0	109.0	152.0	0.7171	0.7105	105.0	106.0	142.0	0.7465	0.7394	78.0	81.0	118.0	0.6864	0.6610
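
The table shows validation loss and accuracy moving in different directions after the first few epochs, so the choice of checkpoint depends on the selection metric. The sketch below copies three columns from the table (epoch, validation loss, accuracy) and finds the best epoch under each criterion; the variable names are illustrative.

```python
# (epoch, validation loss, accuracy) copied from the training results table.
rows = [
    (0, 1.5354, 0.3018),
    (1, 0.9572, 0.6632),
    (2, 0.9376, 0.6860),
    (3, 1.0844, 0.6965),
    (4, 1.3047, 0.7123),
    (5, 2.0123, 0.7158),
    (6, 2.1961, 0.7018),
    (7, 2.9584, 0.7175),
    (8, 3.1749, 0.7193),
    (9, 3.2479, 0.7228),
    (10, 3.2412, 0.7228),
    (11, 3.2733, 0.7175),
    (12, 3.2483, 0.7211),
    (13, 3.2543, 0.7228),
]

# Epoch with the lowest validation loss.
best_loss_epoch = min(rows, key=lambda r: r[1])[0]

# All epochs that tie for the highest accuracy.
top_accuracy = max(r[2] for r in rows)
best_accuracy_epochs = [r[0] for r in rows if r[2] == top_accuracy]

print(best_loss_epoch, top_accuracy, best_accuracy_epochs)
```

Validation loss is lowest at epoch 2 (0.9376), while accuracy peaks at 0.7228 in epochs 9, 10, and 13. The headline results in this card correspond to epoch 9, the first epoch to reach that accuracy.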

Framework versions

Transformers 4.51.3
Pytorch 2.6.0+cu124
Datasets 3.5.0
Tokenizers 0.21.1

Safetensors

Model size: 1B params
Tensor type: BF16

Model tree for donoway/ARC-Easy_Llama-3.2-1B-4fpnn1i5

Base model: meta-llama/Llama-3.2-1B
This model: fine-tuned from the base model
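
For readers who want to try the checkpoint, the following is a minimal loading sketch using the standard `transformers` auto classes. It assumes that the repository includes tokenizer files and that the weights load in bfloat16, matching the tensor type listed above. The prompt format used during fine-tuning is not documented in this card, so the `prompt` argument is left to the caller, and both function names are illustrative.

```python
MODEL_ID = "donoway/ARC-Easy_Llama-3.2-1B-4fpnn1i5"  # repository name from this card


def load_model_and_tokenizer(model_id=MODEL_ID):
    """Download and load the checkpoint in bfloat16 (needs network access)."""
    # Imported lazily so this module can be read without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    model.eval()
    return model, tokenizer


def generate(model, tokenizer, prompt, max_new_tokens=8):
    """Greedy-decode a short continuation and return only the new text."""
    import torch

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    # Drop the prompt tokens so only the generated continuation is decoded.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A typical call would be `model, tokenizer = load_model_and_tokenizer()` followed by `generate(model, tokenizer, prompt)` with a prompt of the caller's choosing.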