# Llama2-7B-lora-r-32-generic-step-1800-labels_40.0-full-precision-augmented

This model is a fine-tuned version of [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.3169
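This card does not include usage instructions. A minimal loading sketch with PEFT, assuming the adapter is published under this card's repo id and that you have access to the gated base weights:

```python
# Sketch: load the Llama-2-7b base model and apply this LoRA adapter.
# Requires access to the gated meta-llama/Llama-2-7b-hf weights on the Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"
adapter_id = "Siqi-Hu/Llama2-7B-lora-r-32-generic-step-1800-labels_40.0-full-precision-augmented"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attaches the LoRA weights
model.eval()

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Alternatively, `model.merge_and_unload()` can be called after loading to fold the adapter into the base weights for adapter-free inference.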
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- training_steps: 1800
- mixed_precision_training: Native AMP
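The hyperparameters above can be sketched as a `transformers.TrainingArguments` configuration. This is a hypothetical reconstruction: the dataset, LoRA config, and output path are not documented in this card, and the listed Adam betas/epsilon match the transformers defaults.

```python
# Sketch: mapping the listed hyperparameters onto TrainingArguments.
# Adam betas=(0.9, 0.999) and epsilon=1e-8 are the library defaults.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",                 # placeholder; not documented in the card
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=4,    # effective train batch: 16 * 4 = 64
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=1800,
    fp16=True,                        # "Native AMP" mixed precision
)
```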
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 5.6939 | 0.0366 | 20 | 5.7515 |
| 5.5884 | 0.0731 | 40 | 5.5289 |
| 5.1869 | 0.1097 | 60 | 5.1224 |
| 4.658 | 0.1463 | 80 | 4.5947 |
| 4.1608 | 0.1828 | 100 | 4.1368 |
| 3.9157 | 0.2194 | 120 | 3.8696 |
| 3.6916 | 0.2559 | 140 | 3.6987 |
| 3.6126 | 0.2925 | 160 | 3.5645 |
| 3.4665 | 0.3291 | 180 | 3.4493 |
| 3.3712 | 0.3656 | 200 | 3.3593 |
| 3.2643 | 0.4022 | 220 | 3.2805 |
| 3.2325 | 0.4388 | 240 | 3.2083 |
| 3.1153 | 0.4753 | 260 | 3.1452 |
| 3.1052 | 0.5119 | 280 | 3.0897 |
| 3.0375 | 0.5484 | 300 | 3.0389 |
| 2.9925 | 0.5850 | 320 | 2.9858 |
| 2.9545 | 0.6216 | 340 | 2.9415 |
| 2.9091 | 0.6581 | 360 | 2.8987 |
| 2.8606 | 0.6947 | 380 | 2.8595 |
| 2.8293 | 0.7313 | 400 | 2.8257 |
| 2.792 | 0.7678 | 420 | 2.7936 |
| 2.7772 | 0.8044 | 440 | 2.7640 |
| 2.7299 | 0.8410 | 460 | 2.7350 |
| 2.7264 | 0.8775 | 480 | 2.7084 |
| 2.6989 | 0.9141 | 500 | 2.6863 |
| 2.6756 | 0.9506 | 520 | 2.6640 |
| 2.6504 | 0.9872 | 540 | 2.6420 |
| 2.5994 | 1.0238 | 560 | 2.6207 |
| 2.5254 | 1.0603 | 580 | 2.6024 |
| 2.5258 | 1.0969 | 600 | 2.5833 |
| 2.5727 | 1.1335 | 620 | 2.5637 |
| 2.5161 | 1.1700 | 640 | 2.5457 |
| 2.4896 | 1.2066 | 660 | 2.5283 |
| 2.4598 | 1.2431 | 680 | 2.5135 |
| 2.4517 | 1.2797 | 700 | 2.4966 |
| 2.7318 | 1.3163 | 720 | 2.4817 |
| 2.4482 | 1.3528 | 740 | 2.4653 |
| 2.6702 | 1.3894 | 760 | 2.4521 |
| 2.3552 | 1.4260 | 780 | 2.4390 |
| 2.379 | 1.4625 | 800 | 2.4263 |
| 2.4068 | 1.4991 | 820 | 2.4152 |
| 2.3495 | 1.5356 | 840 | 2.4022 |
| 2.359 | 1.5722 | 860 | 2.3902 |
| 2.3686 | 1.6088 | 880 | 2.3782 |
| 2.3941 | 1.6453 | 900 | 2.3686 |
| 2.3493 | 1.6819 | 920 | 2.3586 |
| 2.3237 | 1.7185 | 940 | 2.3479 |
| 2.2996 | 1.7550 | 960 | 2.3389 |
| 2.2836 | 1.7916 | 980 | 2.3300 |
| 2.3509 | 1.8282 | 1000 | 2.3206 |
| 2.2979 | 1.8647 | 1020 | 2.3125 |
| 2.5544 | 1.9013 | 1040 | 2.3047 |
| 2.2489 | 1.9378 | 1060 | 2.2955 |
| 2.2487 | 1.9744 | 1080 | 2.2869 |
| 2.2201 | 2.0110 | 1100 | 2.2797 |
| 2.2101 | 2.0475 | 1120 | 2.2739 |
| 2.1932 | 2.0841 | 1140 | 2.2669 |
| 2.2209 | 2.1207 | 1160 | 2.2602 |
| 2.182 | 2.1572 | 1180 | 2.2552 |
| 2.2438 | 2.1938 | 1200 | 2.2495 |
| 2.1787 | 2.2303 | 1220 | 2.2444 |
| 2.1677 | 2.2669 | 1240 | 2.2402 |
| 2.1961 | 2.3035 | 1260 | 2.2362 |
| 2.3865 | 2.3400 | 1280 | 2.2325 |
| 2.1632 | 2.3766 | 1300 | 2.2291 |
| 2.1337 | 2.4132 | 1320 | 2.2268 |
| 2.1925 | 2.4497 | 1340 | 2.2250 |
| 2.2056 | 2.4863 | 1360 | 2.2241 |
| 2.2987 | 2.5229 | 1380 | 2.2239 |
| 2.1611 | 2.5594 | 1400 | 2.2269 |
| 2.463 | 2.5960 | 1420 | 2.2286 |
| 2.1915 | 2.6325 | 1440 | 2.2409 |
| 2.2345 | 2.6691 | 1460 | 2.2528 |
| 2.2286 | 2.7057 | 1480 | 2.2645 |
| 2.1951 | 2.7422 | 1500 | 2.2778 |
| 2.2318 | 2.7788 | 1520 | 2.2939 |
| 2.2852 | 2.8154 | 1540 | 2.3061 |
| 2.3214 | 2.8519 | 1560 | 2.3138 |
| 2.2682 | 2.8885 | 1580 | 2.3169 |
### Framework versions
- PEFT 0.15.2
- Transformers 4.45.2
- Pytorch 2.5.0+cu121
- Datasets 3.2.0
- Tokenizers 0.20.3