# ARC-Challenge_Llama-3.2-1B-gl75gmoi
This model is a fine-tuned version of meta-llama/Llama-3.2-1B, evidently on the ARC-Challenge dataset (the auto-generated card does not record the dataset, but the model name indicates it). It achieves the following results on the evaluation set (a note interpreting these metric names follows the list):
- Loss: 1.2627
- Model Preparation Time: 0.0059
- Mdl: 544.6818
- Accumulated Loss: 377.5446
- Correct Preds: 148.0
- Total Preds: 299.0
- Accuracy: 0.4950
- Correct Gen Preds: 148.0
- Gen Accuracy: 0.4950
- Correct Gen Preds 32: 26.0
- Correct Preds 32: 26.0
- Total Labels 32: 64.0
- Accuracy 32: 0.4062
- Gen Accuracy 32: 0.4062
- Correct Gen Preds 33: 43.0
- Correct Preds 33: 43.0
- Total Labels 33: 73.0
- Accuracy 33: 0.5890
- Gen Accuracy 33: 0.5890
- Correct Gen Preds 34: 48.0
- Correct Preds 34: 48.0
- Total Labels 34: 78.0
- Accuracy 34: 0.6154
- Gen Accuracy 34: 0.6154
- Correct Gen Preds 35: 31.0
- Correct Preds 35: 31.0
- Total Labels 35: 83.0
- Accuracy 35: 0.3735
- Gen Accuracy 35: 0.3735
- Correct Gen Preds 36: 0.0
- Correct Preds 36: 0.0
- Total Labels 36: 1.0
- Accuracy 36: 0.0
- Gen Accuracy 36: 0.0
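The metric names above come from the training script and are terse; the following reading is an interpretation checked against the reported numbers, not documentation from the author. Accuracy is `Correct Preds / Total Preds`; Accumulated Loss matches the mean evaluation loss summed over all 299 predictions (in nats); and Mdl matches that sum converted to bits (nats divided by ln 2), i.e. a minimum-description-length style quantity. The numeric suffixes 32-36 appear to be Llama-3 tokenizer IDs for the answer letters A-E, making each block a per-answer-choice breakdown. A minimal sketch verifying these identities against the reported values:

```python
import math

# Reported evaluation numbers from the card (the epoch-2 checkpoint).
loss = 1.2627            # mean per-example cross-entropy (nats)
total_preds = 299
correct_preds = 148
accumulated_loss = 377.5446
mdl = 544.6818

# Accuracy is a plain ratio of correct to total predictions.
print(correct_preds / total_preds)      # ~0.4950, matches "Accuracy"

# Accumulated loss matches the mean loss summed over the eval set.
print(loss * total_preds)               # ~377.55, matches "Accumulated Loss"

# "Mdl" matches the accumulated loss converted from nats to bits.
print(accumulated_loss / math.log(2))   # ~544.68, matches "Mdl"
```

The same identities hold for every row of the results table below.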
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training; a hedged `TrainingArguments` sketch mirroring them follows the list:
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 112
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 100
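For reproduction, the list above maps onto Hugging Face `TrainingArguments` roughly as follows. This is a sketch of the recorded hyperparameters only: the original training script, dataset loading, and any checkpoint-selection or early-stopping logic are not in the card, so everything not listed above is an assumption.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ARC-Challenge_Llama-3.2-1B-gl75gmoi",  # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=112,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=100,
)
```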
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 32 | Correct Preds 32 | Total Labels 32 | Accuracy 32 | Gen Accuracy 32 | Correct Gen Preds 33 | Correct Preds 33 | Total Labels 33 | Accuracy 33 | Gen Accuracy 33 | Correct Gen Preds 34 | Correct Preds 34 | Total Labels 34 | Accuracy 34 | Gen Accuracy 34 | Correct Gen Preds 35 | Correct Preds 35 | Total Labels 35 | Accuracy 35 | Gen Accuracy 35 | Correct Gen Preds 36 | Correct Preds 36 | Total Labels 36 | Accuracy 36 | Gen Accuracy 36 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.6389 | 0.0059 | 706.9523 | 490.0220 | 66.0 | 299.0 | 0.2207 | 66.0 | 0.2207 | 62.0 | 62.0 | 64.0 | 0.9688 | 0.9688 | 0.0 | 0.0 | 73.0 | 0.0 | 0.0 | 4.0 | 4.0 | 78.0 | 0.0513 | 0.0513 | 0.0 | 0.0 | 83.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 1.3008 | 1.0 | 11 | 1.3153 | 0.0059 | 567.3964 | 393.2892 | 121.0 | 299.0 | 0.4047 | 121.0 | 0.4047 | 30.0 | 30.0 | 64.0 | 0.4688 | 0.4688 | 42.0 | 42.0 | 73.0 | 0.5753 | 0.5753 | 27.0 | 27.0 | 78.0 | 0.3462 | 0.3462 | 22.0 | 22.0 | 83.0 | 0.2651 | 0.2651 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 1.0812 | 2.0 | 22 | 1.2627 | 0.0059 | 544.6818 | 377.5446 | 148.0 | 299.0 | 0.4950 | 148.0 | 0.4950 | 26.0 | 26.0 | 64.0 | 0.4062 | 0.4062 | 43.0 | 43.0 | 73.0 | 0.5890 | 0.5890 | 48.0 | 48.0 | 78.0 | 0.6154 | 0.6154 | 31.0 | 31.0 | 83.0 | 0.3735 | 0.3735 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 0.5144 | 3.0 | 33 | 1.3902 | 0.0059 | 599.6881 | 415.6721 | 145.0 | 299.0 | 0.4849 | 124.0 | 0.4147 | 22.0 | 29.0 | 64.0 | 0.4531 | 0.3438 | 31.0 | 36.0 | 73.0 | 0.4932 | 0.4247 | 32.0 | 33.0 | 78.0 | 0.4231 | 0.4103 | 39.0 | 47.0 | 83.0 | 0.5663 | 0.4699 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 0.1445 | 4.0 | 44 | 2.1184 | 0.0059 | 913.7908 | 633.3915 | 145.0 | 299.0 | 0.4849 | 145.0 | 0.4849 | 32.0 | 32.0 | 64.0 | 0.5 | 0.5 | 45.0 | 45.0 | 73.0 | 0.6164 | 0.6164 | 36.0 | 36.0 | 78.0 | 0.4615 | 0.4615 | 32.0 | 32.0 | 83.0 | 0.3855 | 0.3855 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 0.064 | 5.0 | 55 | 3.0986 | 0.0059 | 1336.6142 | 926.4703 | 131.0 | 299.0 | 0.4381 | 126.0 | 0.4214 | 15.0 | 16.0 | 64.0 | 0.25 | 0.2344 | 42.0 | 43.0 | 73.0 | 0.5890 | 0.5753 | 42.0 | 44.0 | 78.0 | 0.5641 | 0.5385 | 27.0 | 28.0 | 83.0 | 0.3373 | 0.3253 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 0.0002 | 6.0 | 66 | 5.4531 | 0.0059 | 2352.2621 | 1630.4639 | 135.0 | 299.0 | 0.4515 | 135.0 | 0.4515 | 40.0 | 40.0 | 64.0 | 0.625 | 0.625 | 43.0 | 43.0 | 73.0 | 0.5890 | 0.5890 | 29.0 | 29.0 | 78.0 | 0.3718 | 0.3718 | 23.0 | 23.0 | 83.0 | 0.2771 | 0.2771 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 0.0 | 7.0 | 77 | 6.3729 | 0.0059 | 2749.0547 | 1905.4995 | 145.0 | 299.0 | 0.4849 | 143.0 | 0.4783 | 23.0 | 24.0 | 64.0 | 0.375 | 0.3594 | 44.0 | 44.0 | 73.0 | 0.6027 | 0.6027 | 39.0 | 39.0 | 78.0 | 0.5 | 0.5 | 36.0 | 37.0 | 83.0 | 0.4458 | 0.4337 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 8.0 | 88 | 6.2827 | 0.0059 | 2710.1294 | 1878.5186 | 143.0 | 299.0 | 0.4783 | 142.0 | 0.4749 | 22.0 | 22.0 | 64.0 | 0.3438 | 0.3438 | 42.0 | 42.0 | 73.0 | 0.5753 | 0.5753 | 40.0 | 40.0 | 78.0 | 0.5128 | 0.5128 | 37.0 | 38.0 | 83.0 | 0.4578 | 0.4458 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 9.0 | 99 | 6.5480 | 0.0059 | 2824.5661 | 1957.8401 | 143.0 | 299.0 | 0.4783 | 142.0 | 0.4749 | 21.0 | 21.0 | 64.0 | 0.3281 | 0.3281 | 45.0 | 45.0 | 73.0 | 0.6164 | 0.6164 | 39.0 | 39.0 | 78.0 | 0.5 | 0.5 | 36.0 | 37.0 | 83.0 | 0.4458 | 0.4337 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 10.0 | 110 | 6.6104 | 0.0059 | 2851.5202 | 1976.5232 | 143.0 | 299.0 | 0.4783 | 142.0 | 0.4749 | 22.0 | 22.0 | 64.0 | 0.3438 | 0.3438 | 44.0 | 44.0 | 73.0 | 0.6027 | 0.6027 | 40.0 | 40.0 | 78.0 | 0.5128 | 0.5128 | 35.0 | 36.0 | 83.0 | 0.4337 | 0.4217 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 11.0 | 121 | 6.6130 | 0.0059 | 2852.6343 | 1977.2954 | 142.0 | 299.0 | 0.4749 | 141.0 | 0.4716 | 21.0 | 21.0 | 64.0 | 0.3281 | 0.3281 | 45.0 | 45.0 | 73.0 | 0.6164 | 0.6164 | 39.0 | 39.0 | 78.0 | 0.5 | 0.5 | 35.0 | 36.0 | 83.0 | 0.4337 | 0.4217 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 12.0 | 132 | 6.6291 | 0.0059 | 2859.5758 | 1982.1069 | 145.0 | 299.0 | 0.4849 | 143.0 | 0.4783 | 21.0 | 21.0 | 64.0 | 0.3281 | 0.3281 | 46.0 | 46.0 | 73.0 | 0.6301 | 0.6301 | 39.0 | 40.0 | 78.0 | 0.5128 | 0.5 | 36.0 | 37.0 | 83.0 | 0.4458 | 0.4337 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 13.0 | 143 | 6.6132 | 0.0059 | 2852.6983 | 1977.3398 | 143.0 | 299.0 | 0.4783 | 142.0 | 0.4749 | 21.0 | 21.0 | 64.0 | 0.3281 | 0.3281 | 45.0 | 45.0 | 73.0 | 0.6164 | 0.6164 | 39.0 | 39.0 | 78.0 | 0.5 | 0.5 | 36.0 | 37.0 | 83.0 | 0.4458 | 0.4337 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 14.0 | 154 | 6.6296 | 0.0059 | 2859.7722 | 1982.2430 | 144.0 | 299.0 | 0.4816 | 143.0 | 0.4783 | 21.0 | 21.0 | 64.0 | 0.3281 | 0.3281 | 46.0 | 46.0 | 73.0 | 0.6301 | 0.6301 | 39.0 | 39.0 | 78.0 | 0.5 | 0.5 | 36.0 | 37.0 | 83.0 | 0.4458 | 0.4337 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 15.0 | 165 | 6.6354 | 0.0059 | 2862.2671 | 1983.9724 | 143.0 | 299.0 | 0.4783 | 142.0 | 0.4749 | 21.0 | 21.0 | 64.0 | 0.3281 | 0.3281 | 46.0 | 46.0 | 73.0 | 0.6301 | 0.6301 | 39.0 | 39.0 | 78.0 | 0.5 | 0.5 | 35.0 | 36.0 | 83.0 | 0.4337 | 0.4217 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 16.0 | 176 | 6.6301 | 0.0059 | 2859.9865 | 1982.3916 | 143.0 | 299.0 | 0.4783 | 142.0 | 0.4749 | 21.0 | 21.0 | 64.0 | 0.3281 | 0.3281 | 45.0 | 45.0 | 73.0 | 0.6164 | 0.6164 | 39.0 | 39.0 | 78.0 | 0.5 | 0.5 | 36.0 | 37.0 | 83.0 | 0.4458 | 0.4337 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 17.0 | 187 | 6.6555 | 0.0059 | 2870.9372 | 1989.9820 | 142.0 | 299.0 | 0.4749 | 141.0 | 0.4716 | 20.0 | 20.0 | 64.0 | 0.3125 | 0.3125 | 46.0 | 46.0 | 73.0 | 0.6301 | 0.6301 | 38.0 | 38.0 | 78.0 | 0.4872 | 0.4872 | 36.0 | 37.0 | 83.0 | 0.4458 | 0.4337 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0 | 18.0 | 198 | 6.6384 | 0.0059 | 2863.5636 | 1984.8710 | 143.0 | 299.0 | 0.4783 | 142.0 | 0.4749 | 21.0 | 21.0 | 64.0 | 0.3281 | 0.3281 | 45.0 | 45.0 | 73.0 | 0.6164 | 0.6164 | 40.0 | 40.0 | 78.0 | 0.5128 | 0.5128 | 35.0 | 36.0 | 83.0 | 0.4337 | 0.4217 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
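Although `num_epochs` was set to 100, logging stops at epoch 18, and the headline metrics at the top of the card match the epoch-2 row (validation loss 1.2627, accuracy 0.4950): the best checkpoint by validation loss was evidently retained while later epochs overfit (training loss collapses to 0.0 as validation loss climbs above 6). A minimal inference sketch follows; the repo id comes from the card, but the prompt format and the single-letter answer scoring are assumptions, since the training format is not documented.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "donoway/ARC-Challenge_Llama-3.2-1B-gl75gmoi"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
model.eval()

# Hypothetical ARC-style prompt; the actual training format is an assumption.
prompt = (
    "Question: Which gas do plants absorb from the atmosphere?\n"
    "A. oxygen\nB. carbon dioxide\nC. nitrogen\nD. hydrogen\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # next-token logits

# Score only the answer-letter tokens and pick the highest.
choices = ["A", "B", "C", "D"]
choice_ids = [tokenizer.encode(c, add_special_tokens=False)[0] for c in choices]
print(choices[int(torch.argmax(logits[choice_ids]))])
```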
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1