ARC-Easy_Llama-3.2-1B-vwg1l84h

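This model is a fine-tuned version of meta-llama/Llama-3.2-1B. The training dataset is not recorded in this card (the model name suggests ARC-Easy, but that is not confirmed here). The evaluation-set results below match the epoch 3 (step 18) row of the training results table: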

Loss: 1.1988
Model Preparation Time: 0.0059
Mdl: 985.8352
Accumulated Loss: 683.3289
Correct Preds: 401.0
Total Preds: 570.0
Accuracy: 0.7035
Correct Gen Preds: 401.0
Gen Accuracy: 0.7035
Correct Gen Preds 32: 112.0
Correct Preds 32: 112.0
Total Labels 32: 158.0
Accuracy 32: 0.7089
Gen Accuracy 32: 0.7089
Correct Gen Preds 33: 119.0
Correct Preds 33: 119.0
Total Labels 33: 152.0
Accuracy 33: 0.7829
Gen Accuracy 33: 0.7829
Correct Gen Preds 34: 92.0
Correct Preds 34: 92.0
Total Labels 34: 142.0
Accuracy 34: 0.6479
Gen Accuracy 34: 0.6479
Correct Gen Preds 35: 78.0
Correct Preds 35: 78.0
Total Labels 35: 118.0
Accuracy 35: 0.6610
Gen Accuracy 35: 0.6610
Correct Gen Preds 36: 0.0
Correct Preds 36: 0.0
Total Labels 36: 0.0
Accuracy 36: 0.0
Gen Accuracy 36: 0.0
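
The aggregate figures above can be cross-checked against each other. The sketch below assumes that Loss is the mean per-example cross-entropy in nats and that Mdl is the accumulated loss converted to bits; neither definition is stated in this card, but the reported numbers are consistent with both. The variable and function names are illustrative only.

```python
import math

# Values copied from the evaluation results listed above.
accumulated_loss = 683.3289   # assumed: summed cross-entropy in nats
total_preds = 570
correct_preds = 401

# Assumed relationships (not documented in the card):
mean_loss = accumulated_loss / total_preds      # compare with Loss: 1.1988
mdl_bits = accumulated_loss / math.log(2)       # compare with Mdl: 985.8352
accuracy = correct_preds / total_preds          # compare with Accuracy: 0.7035

# Per-label (correct, total) counts; label 36 has no examples.
per_label = {32: (112, 158), 33: (119, 152), 34: (92, 142), 35: (78, 118), 36: (0, 0)}

def label_accuracy(correct, total):
    """Return correct/total, or 0.0 when the label never occurs."""
    return correct / total if total else 0.0

label_acc = {label: round(label_accuracy(c, t), 4) for label, (c, t) in per_label.items()}

print(round(mean_loss, 4), round(mdl_bits, 4), round(accuracy, 4))
print(label_acc)
```

The per-label correct counts sum to 401 and the per-label totals sum to 570, which agrees with the aggregate Correct Preds and Total Preds.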

Model description

More information needed

Intended uses & limitations

More information needed
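
Although intended uses are not documented, the checkpoint can presumably be loaded like any other causal language model for inspection. The following is a minimal sketch, assuming the repository id donoway/ARC-Easy_Llama-3.2-1B-vwg1l84h and the BF16 tensor type listed elsewhere in this card; the helper name `load_model` is hypothetical, and the prompt format used during fine-tuning is not documented here.

```python
MODEL_ID = "donoway/ARC-Easy_Llama-3.2-1B-vwg1l84h"  # repository id as listed in this card

def load_model(model_id=MODEL_ID):
    """Load the tokenizer and model in bfloat16 (hypothetical helper, not from the card)."""
    # Imported lazily so the module can be read without transformers/torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `load_model()` downloads the weights from the Hugging Face Hub, so it needs network access and the `transformers` and `torch` packages.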

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 64
eval_batch_size: 112
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: constant
lr_scheduler_warmup_ratio: 0.001
num_epochs: 100
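
For readers who want to approximate this configuration, the hyperparameters above can be mapped onto `transformers.TrainingArguments` keyword arguments. This is a sketch under assumptions, not the author's training script: it assumes the reported batch sizes are per-device values, and `output_dir` and the helper name are placeholders that do not come from the card.

```python
# Keyword arguments mirroring the hyperparameter list above.
training_kwargs = {
    "output_dir": "arc-easy-llama-3.2-1b",   # placeholder path, not from the card
    "learning_rate": 2e-05,
    "per_device_train_batch_size": 64,       # assumes train_batch_size is per device
    "per_device_eval_batch_size": 112,       # assumes eval_batch_size is per device
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant",
    "warmup_ratio": 0.001,                   # likely unused by a plain constant schedule
    "num_train_epochs": 100,
}

def build_training_arguments(kwargs=None):
    """Create TrainingArguments from the dict above (hypothetical helper)."""
    # Imported lazily so the dict can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(**dict(kwargs or training_kwargs))
```

Note that the training results table in this card ends at epoch 13, well short of the configured 100 epochs; the card does not say why.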

Training results

Training Loss	Epoch	Step	Validation Loss	Model Preparation Time	Mdl	Accumulated Loss	Correct Preds	Total Preds	Accuracy	Correct Gen Preds	Gen Accuracy	Correct Gen Preds 32	Correct Preds 32	Total Labels 32	Accuracy 32	Gen Accuracy 32	Correct Gen Preds 33	Correct Preds 33	Total Labels 33	Accuracy 33	Gen Accuracy 33	Correct Gen Preds 34	Correct Preds 34	Total Labels 34	Accuracy 34	Gen Accuracy 34	Correct Gen Preds 35	Correct Preds 35	Total Labels 35	Accuracy 35	Gen Accuracy 35
No log	0	0	1.5354	0.0059	1262.6022	875.1692	172.0	570.0	0.3018	170.0	0.2982	154.0	154.0	158.0	0.9747	0.9747	0.0	0.0	152.0	0.0	0.0	15.0	17.0	142.0	0.1197	0.1056	1.0	1.0	118.0	0.0085	0.0085
1.0683	1.0	6	0.9749	0.0059	801.7082	555.7018	373.0	570.0	0.6544	228.0	0.4	48.0	107.0	158.0	0.6772	0.3038	48.0	85.0	152.0	0.5592	0.3158	71.0	97.0	142.0	0.6831	0.5	61.0	84.0	118.0	0.7119	0.5169
0.5277	2.0	12	1.1013	0.0059	905.6689	627.7619	395.0	570.0	0.6930	391.0	0.6860	95.0	97.0	158.0	0.6139	0.6013	120.0	122.0	152.0	0.8026	0.7895	98.0	98.0	142.0	0.6901	0.6901	78.0	78.0	118.0	0.6610	0.6610
0.3604	3.0	18	1.1988	0.0059	985.8352	683.3289	401.0	570.0	0.7035	401.0	0.7035	112.0	112.0	158.0	0.7089	0.7089	119.0	119.0	152.0	0.7829	0.7829	92.0	92.0	142.0	0.6479	0.6479	78.0	78.0	118.0	0.6610	0.6610
0.1435	4.0	24	1.7570	0.0059	1444.8662	1001.5049	388.0	570.0	0.6807	341.0	0.5982	70.0	93.0	158.0	0.5886	0.4430	113.0	126.0	152.0	0.8289	0.7434	81.0	87.0	142.0	0.6127	0.5704	77.0	82.0	118.0	0.6949	0.6525
0.0169	5.0	30	2.2097	0.0059	1817.1203	1259.5318	395.0	570.0	0.6930	340.0	0.5965	79.0	102.0	158.0	0.6456	0.5	99.0	121.0	152.0	0.7961	0.6513	88.0	94.0	142.0	0.6620	0.6197	74.0	78.0	118.0	0.6610	0.6271
0.0068	6.0	36	2.5137	0.0059	2067.1196	1432.8181	394.0	570.0	0.6912	391.0	0.6860	102.0	105.0	158.0	0.6646	0.6456	116.0	116.0	152.0	0.7632	0.7632	99.0	99.0	142.0	0.6972	0.6972	74.0	74.0	118.0	0.6271	0.6271
0.0001	7.0	42	3.2012	0.0059	2632.4308	1824.6620	399.0	570.0	0.7	368.0	0.6456	85.0	116.0	158.0	0.7342	0.5380	122.0	122.0	152.0	0.8026	0.8026	97.0	97.0	142.0	0.6831	0.6831	64.0	64.0	118.0	0.5424	0.5424
0.0005	8.0	48	3.2132	0.0059	2642.3392	1831.5300	400.0	570.0	0.7018	283.0	0.4965	15.0	110.0	158.0	0.6962	0.0949	109.0	125.0	152.0	0.8224	0.7171	97.0	101.0	142.0	0.7113	0.6831	62.0	64.0	118.0	0.5424	0.5254
0.0	9.0	54	3.6815	0.0059	3027.4329	2098.4566	395.0	570.0	0.6930	395.0	0.6930	107.0	107.0	158.0	0.6772	0.6772	122.0	122.0	152.0	0.8026	0.8026	101.0	101.0	142.0	0.7113	0.7113	65.0	65.0	118.0	0.5508	0.5508
0.0	10.0	60	3.7049	0.0059	3046.6547	2111.7801	394.0	570.0	0.6912	394.0	0.6912	106.0	106.0	158.0	0.6709	0.6709	121.0	121.0	152.0	0.7961	0.7961	102.0	102.0	142.0	0.7183	0.7183	65.0	65.0	118.0	0.5508	0.5508
0.0	11.0	66	3.6993	0.0059	3042.0740	2108.6050	392.0	570.0	0.6877	392.0	0.6877	106.0	106.0	158.0	0.6709	0.6709	121.0	121.0	152.0	0.7961	0.7961	101.0	101.0	142.0	0.7113	0.7113	64.0	64.0	118.0	0.5424	0.5424
0.0	12.0	72	3.7184	0.0059	3057.7976	2119.5038	393.0	570.0	0.6895	393.0	0.6895	104.0	104.0	158.0	0.6582	0.6582	122.0	122.0	152.0	0.8026	0.8026	102.0	102.0	142.0	0.7183	0.7183	65.0	65.0	118.0	0.5508	0.5508
0.0	13.0	78	3.7193	0.0059	3058.5522	2120.0269	394.0	570.0	0.6912	394.0	0.6912	106.0	106.0	158.0	0.6709	0.6709	120.0	120.0	152.0	0.7895	0.7895	103.0	103.0	142.0	0.7254	0.7254	65.0	65.0	118.0	0.5508	0.5508
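
The table shows training loss falling to 0.0 while validation loss keeps rising, so the choice of checkpoint depends on which metric is used. The sketch below copies the Epoch, Validation Loss, and Accuracy columns from the table and picks the best epoch under each criterion; the variable names are illustrative.

```python
# (epoch, validation_loss, accuracy) copied from the training results table.
rows = [
    (0, 1.5354, 0.3018),
    (1, 0.9749, 0.6544),
    (2, 1.1013, 0.6930),
    (3, 1.1988, 0.7035),
    (4, 1.7570, 0.6807),
    (5, 2.2097, 0.6930),
    (6, 2.5137, 0.6912),
    (7, 3.2012, 0.7000),
    (8, 3.2132, 0.7018),
    (9, 3.6815, 0.6930),
    (10, 3.7049, 0.6912),
    (11, 3.6993, 0.6877),
    (12, 3.7184, 0.6895),
    (13, 3.7193, 0.6912),
]

# Epoch with the lowest validation loss.
best_by_loss = min(rows, key=lambda row: row[1])
# Epoch with the highest accuracy.
best_by_accuracy = max(rows, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)    # (1, 0.9749, 0.6544)
print("highest accuracy:", best_by_accuracy)      # (3, 1.1988, 0.7035)
```

The headline results at the top of this card match the highest-accuracy row (epoch 3), not the lowest-loss row (epoch 1).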

Framework versions

Transformers 4.51.3
Pytorch 2.6.0+cu124
Datasets 3.5.0
Tokenizers 0.21.1
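
To compare a local environment against the versions listed above, a small check with the standard library's `importlib.metadata` may help. This is only a sketch: the distribution name `torch` for Pytorch is an assumption about how the package is installed, and the helper name is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed pip distribution name.
EXPECTED = {
    "transformers": "4.51.3",
    "torch": "2.6.0+cu124",
    "datasets": "3.5.0",
    "tokenizers": "0.21.1",
}

def compare_versions(expected=EXPECTED):
    """Return {package: (wanted, installed_or_None, matches)} for each package."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report

if __name__ == "__main__":
    for package, (wanted, installed, matches) in compare_versions().items():
        print(f"{package}: wanted {wanted}, installed {installed}, match={matches}")
```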

Safetensors

Model size: 1B params
Tensor type: BF16

Model tree for donoway/ARC-Easy_Llama-3.2-1B-vwg1l84h

Base model: meta-llama/Llama-3.2-1B
Finetuned: this model