Llama2-7B-lora-r-32-generic-step-1800-labels_40.0-full-precision-augmented

This model is a LoRA adapter (rank 32, per the model name) fine-tuned from meta-llama/Llama-2-7b-hf on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 2.3169
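
Assuming this loss is the mean token-level cross-entropy (the usual metric for causal language modeling), it corresponds to a perplexity of roughly exp(2.3169) ≈ 10.1 on the evaluation set.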

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • training_steps: 1800
  • mixed_precision_training: Native AMP
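
For reference, here is a minimal sketch of a PEFT + Transformers setup consistent with these hyperparameters. Only r=32 (from the model name) and the values listed above come from this card; the dataset, `lora_alpha`, `lora_dropout`, `target_modules`, and the use of the `Trainer` API are assumptions.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

lora_config = LoraConfig(
    r=32,                                  # rank taken from the model name (lora-r-32)
    lora_alpha=64,                         # assumption: not stated on this card
    lora_dropout=0.05,                     # assumption: not stated on this card
    target_modules=["q_proj", "v_proj"],   # assumption: a common choice for Llama-2
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

args = TrainingArguments(
    output_dir="out",
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,   # 16 x 4 = 64 total train batch size
    seed=42,
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999), epsilon=1e-08
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=1800,
    fp16=True,                       # Native AMP mixed-precision training
)
```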

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 5.6939 | 0.0366 | 20 | 5.7515 |
| 5.5884 | 0.0731 | 40 | 5.5289 |
| 5.1869 | 0.1097 | 60 | 5.1224 |
| 4.658 | 0.1463 | 80 | 4.5947 |
| 4.1608 | 0.1828 | 100 | 4.1368 |
| 3.9157 | 0.2194 | 120 | 3.8696 |
| 3.6916 | 0.2559 | 140 | 3.6987 |
| 3.6126 | 0.2925 | 160 | 3.5645 |
| 3.4665 | 0.3291 | 180 | 3.4493 |
| 3.3712 | 0.3656 | 200 | 3.3593 |
| 3.2643 | 0.4022 | 220 | 3.2805 |
| 3.2325 | 0.4388 | 240 | 3.2083 |
| 3.1153 | 0.4753 | 260 | 3.1452 |
| 3.1052 | 0.5119 | 280 | 3.0897 |
| 3.0375 | 0.5484 | 300 | 3.0389 |
| 2.9925 | 0.5850 | 320 | 2.9858 |
| 2.9545 | 0.6216 | 340 | 2.9415 |
| 2.9091 | 0.6581 | 360 | 2.8987 |
| 2.8606 | 0.6947 | 380 | 2.8595 |
| 2.8293 | 0.7313 | 400 | 2.8257 |
| 2.792 | 0.7678 | 420 | 2.7936 |
| 2.7772 | 0.8044 | 440 | 2.7640 |
| 2.7299 | 0.8410 | 460 | 2.7350 |
| 2.7264 | 0.8775 | 480 | 2.7084 |
| 2.6989 | 0.9141 | 500 | 2.6863 |
| 2.6756 | 0.9506 | 520 | 2.6640 |
| 2.6504 | 0.9872 | 540 | 2.6420 |
| 2.5994 | 1.0238 | 560 | 2.6207 |
| 2.5254 | 1.0603 | 580 | 2.6024 |
| 2.5258 | 1.0969 | 600 | 2.5833 |
| 2.5727 | 1.1335 | 620 | 2.5637 |
| 2.5161 | 1.1700 | 640 | 2.5457 |
| 2.4896 | 1.2066 | 660 | 2.5283 |
| 2.4598 | 1.2431 | 680 | 2.5135 |
| 2.4517 | 1.2797 | 700 | 2.4966 |
| 2.7318 | 1.3163 | 720 | 2.4817 |
| 2.4482 | 1.3528 | 740 | 2.4653 |
| 2.6702 | 1.3894 | 760 | 2.4521 |
| 2.3552 | 1.4260 | 780 | 2.4390 |
| 2.379 | 1.4625 | 800 | 2.4263 |
| 2.4068 | 1.4991 | 820 | 2.4152 |
| 2.3495 | 1.5356 | 840 | 2.4022 |
| 2.359 | 1.5722 | 860 | 2.3902 |
| 2.3686 | 1.6088 | 880 | 2.3782 |
| 2.3941 | 1.6453 | 900 | 2.3686 |
| 2.3493 | 1.6819 | 920 | 2.3586 |
| 2.3237 | 1.7185 | 940 | 2.3479 |
| 2.2996 | 1.7550 | 960 | 2.3389 |
| 2.2836 | 1.7916 | 980 | 2.3300 |
| 2.3509 | 1.8282 | 1000 | 2.3206 |
| 2.2979 | 1.8647 | 1020 | 2.3125 |
| 2.5544 | 1.9013 | 1040 | 2.3047 |
| 2.2489 | 1.9378 | 1060 | 2.2955 |
| 2.2487 | 1.9744 | 1080 | 2.2869 |
| 2.2201 | 2.0110 | 1100 | 2.2797 |
| 2.2101 | 2.0475 | 1120 | 2.2739 |
| 2.1932 | 2.0841 | 1140 | 2.2669 |
| 2.2209 | 2.1207 | 1160 | 2.2602 |
| 2.182 | 2.1572 | 1180 | 2.2552 |
| 2.2438 | 2.1938 | 1200 | 2.2495 |
| 2.1787 | 2.2303 | 1220 | 2.2444 |
| 2.1677 | 2.2669 | 1240 | 2.2402 |
| 2.1961 | 2.3035 | 1260 | 2.2362 |
| 2.3865 | 2.3400 | 1280 | 2.2325 |
| 2.1632 | 2.3766 | 1300 | 2.2291 |
| 2.1337 | 2.4132 | 1320 | 2.2268 |
| 2.1925 | 2.4497 | 1340 | 2.2250 |
| 2.2056 | 2.4863 | 1360 | 2.2241 |
| 2.2987 | 2.5229 | 1380 | 2.2239 |
| 2.1611 | 2.5594 | 1400 | 2.2269 |
| 2.463 | 2.5960 | 1420 | 2.2286 |
| 2.1915 | 2.6325 | 1440 | 2.2409 |
| 2.2345 | 2.6691 | 1460 | 2.2528 |
| 2.2286 | 2.7057 | 1480 | 2.2645 |
| 2.1951 | 2.7422 | 1500 | 2.2778 |
| 2.2318 | 2.7788 | 1520 | 2.2939 |
| 2.2852 | 2.8154 | 1540 | 2.3061 |
| 2.3214 | 2.8519 | 1560 | 2.3138 |
| 2.2682 | 2.8885 | 1580 | 2.3169 |
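
Note that validation loss reaches its minimum (2.2239 at step 1380) and then rises steadily to 2.3169 by step 1580, so the final checkpoint reported above is not the best one observed during training; a checkpoint from around step 1380 may generalize better.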

Framework versions

  • PEFT 0.15.2
  • Transformers 4.45.2
  • PyTorch 2.5.0+cu121
  • Datasets 3.2.0
  • Tokenizers 0.20.3
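
How to use

A minimal loading sketch for this adapter. The repository id is taken from this page; the dtype, device placement, and generation settings are assumptions.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the frozen base model, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(
    base,
    "Siqi-Hu/Llama2-7B-lora-r-32-generic-step-1800-labels_40.0-full-precision-augmented",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("Hello, world", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```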