# Llama2-7B-lora-r-32-generic-step-1800-labels_40.0-full-precision-augmented

This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.3169
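This card does not include usage instructions. A minimal loading sketch with PEFT, assuming the adapter is published under this card's repo id and that you have access to the gated base weights:

```python
# Sketch: load the Llama-2-7b base model and apply this LoRA adapter.
# Requires access to the gated meta-llama/Llama-2-7b-hf weights on the Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"
adapter_id = "Siqi-Hu/Llama2-7B-lora-r-32-generic-step-1800-labels_40.0-full-precision-augmented"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attaches the LoRA weights
model.eval()

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Alternatively, `model.merge_and_unload()` can be called after loading to fold the adapter into the base weights for adapter-free inference.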
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- training_steps: 1800
- mixed_precision_training: Native AMP
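The hyperparameters above can be sketched as a `transformers.TrainingArguments` configuration. This is a hypothetical reconstruction: the dataset, LoRA config, and output path are not documented in this card, and the listed Adam betas/epsilon match the transformers defaults.

```python
# Sketch: mapping the listed hyperparameters onto TrainingArguments.
# Adam betas=(0.9, 0.999) and epsilon=1e-8 are the library defaults.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",                 # placeholder; not documented in the card
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=4,    # effective train batch: 16 * 4 = 64
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=1800,
    fp16=True,                        # "Native AMP" mixed precision
)
```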
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 5.6939 | 0.0366 | 20 | 5.7515 |
| 5.5884 | 0.0731 | 40 | 5.5289 |
| 5.1869 | 0.1097 | 60 | 5.1224 |
| 4.658 | 0.1463 | 80 | 4.5947 |
| 4.1608 | 0.1828 | 100 | 4.1368 |
| 3.9157 | 0.2194 | 120 | 3.8696 |
| 3.6916 | 0.2559 | 140 | 3.6987 |
| 3.6126 | 0.2925 | 160 | 3.5645 |
| 3.4665 | 0.3291 | 180 | 3.4493 |
| 3.3712 | 0.3656 | 200 | 3.3593 |
| 3.2643 | 0.4022 | 220 | 3.2805 |
| 3.2325 | 0.4388 | 240 | 3.2083 |
| 3.1153 | 0.4753 | 260 | 3.1452 |
| 3.1052 | 0.5119 | 280 | 3.0897 |
| 3.0375 | 0.5484 | 300 | 3.0389 |
| 2.9925 | 0.5850 | 320 | 2.9858 |
| 2.9545 | 0.6216 | 340 | 2.9415 |
| 2.9091 | 0.6581 | 360 | 2.8987 |
| 2.8606 | 0.6947 | 380 | 2.8595 |
| 2.8293 | 0.7313 | 400 | 2.8257 |
| 2.792 | 0.7678 | 420 | 2.7936 |
| 2.7772 | 0.8044 | 440 | 2.7640 |
| 2.7299 | 0.8410 | 460 | 2.7350 |
| 2.7264 | 0.8775 | 480 | 2.7084 |
| 2.6989 | 0.9141 | 500 | 2.6863 |
| 2.6756 | 0.9506 | 520 | 2.6640 |
| 2.6504 | 0.9872 | 540 | 2.6420 |
| 2.5994 | 1.0238 | 560 | 2.6207 |
| 2.5254 | 1.0603 | 580 | 2.6024 |
| 2.5258 | 1.0969 | 600 | 2.5833 |
| 2.5727 | 1.1335 | 620 | 2.5637 |
| 2.5161 | 1.1700 | 640 | 2.5457 |
| 2.4896 | 1.2066 | 660 | 2.5283 |
| 2.4598 | 1.2431 | 680 | 2.5135 |
| 2.4517 | 1.2797 | 700 | 2.4966 |
| 2.7318 | 1.3163 | 720 | 2.4817 |
| 2.4482 | 1.3528 | 740 | 2.4653 |
| 2.6702 | 1.3894 | 760 | 2.4521 |
| 2.3552 | 1.4260 | 780 | 2.4390 |
| 2.379 | 1.4625 | 800 | 2.4263 |
| 2.4068 | 1.4991 | 820 | 2.4152 |
| 2.3495 | 1.5356 | 840 | 2.4022 |
| 2.359 | 1.5722 | 860 | 2.3902 |
| 2.3686 | 1.6088 | 880 | 2.3782 |
| 2.3941 | 1.6453 | 900 | 2.3686 |
| 2.3493 | 1.6819 | 920 | 2.3586 |
| 2.3237 | 1.7185 | 940 | 2.3479 |
| 2.2996 | 1.7550 | 960 | 2.3389 |
| 2.2836 | 1.7916 | 980 | 2.3300 |
| 2.3509 | 1.8282 | 1000 | 2.3206 |
| 2.2979 | 1.8647 | 1020 | 2.3125 |
| 2.5544 | 1.9013 | 1040 | 2.3047 |
| 2.2489 | 1.9378 | 1060 | 2.2955 |
| 2.2487 | 1.9744 | 1080 | 2.2869 |
| 2.2201 | 2.0110 | 1100 | 2.2797 |
| 2.2101 | 2.0475 | 1120 | 2.2739 |
| 2.1932 | 2.0841 | 1140 | 2.2669 |
| 2.2209 | 2.1207 | 1160 | 2.2602 |
| 2.182 | 2.1572 | 1180 | 2.2552 |
| 2.2438 | 2.1938 | 1200 | 2.2495 |
| 2.1787 | 2.2303 | 1220 | 2.2444 |
| 2.1677 | 2.2669 | 1240 | 2.2402 |
| 2.1961 | 2.3035 | 1260 | 2.2362 |
| 2.3865 | 2.3400 | 1280 | 2.2325 |
| 2.1632 | 2.3766 | 1300 | 2.2291 |
| 2.1337 | 2.4132 | 1320 | 2.2268 |
| 2.1925 | 2.4497 | 1340 | 2.2250 |
| 2.2056 | 2.4863 | 1360 | 2.2241 |
| 2.2987 | 2.5229 | 1380 | 2.2239 |
| 2.1611 | 2.5594 | 1400 | 2.2269 |
| 2.463 | 2.5960 | 1420 | 2.2286 |
| 2.1915 | 2.6325 | 1440 | 2.2409 |
| 2.2345 | 2.6691 | 1460 | 2.2528 |
| 2.2286 | 2.7057 | 1480 | 2.2645 |
| 2.1951 | 2.7422 | 1500 | 2.2778 |
| 2.2318 | 2.7788 | 1520 | 2.2939 |
| 2.2852 | 2.8154 | 1540 | 2.3061 |
| 2.3214 | 2.8519 | 1560 | 2.3138 |
| 2.2682 | 2.8885 | 1580 | 2.3169 |
### Framework versions
- PEFT 0.15.2
- Transformers 4.45.2
- Pytorch 2.5.0+cu121
- Datasets 3.2.0
- Tokenizers 0.20.3