# Llama2-7B-lora-r-32-generic-step-1200-labels_40.0-full-precision-augmented
This model is a LoRA adapter (rank 32, judging by the model name) fine-tuned from meta-llama/Llama-2-7b-hf on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 2.4835
## Model description
More information needed
## Intended uses & limitations
More information needed
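The card provides no usage code, so below is a minimal loading sketch, not the authors' own example. It assumes the adapter repository id shown on the Hub, that the Llama 2 license has been accepted on the Hugging Face Hub, and that `peft` and `accelerate` are installed; the generation settings are illustrative.

```python
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Adapter repository id as shown on the Hub (assumed public and intact).
ADAPTER_ID = "Siqi-Hu/Llama2-7B-lora-r-32-generic-step-1200-labels_40.0-full-precision-augmented"

# AutoPeftModelForCausalLM reads the adapter config, loads the base model
# (meta-llama/Llama-2-7b-hf), and attaches the LoRA weights on top of it.
model = AutoPeftModelForCausalLM.from_pretrained(
    ADAPTER_ID,
    torch_dtype=torch.float16,  # dtype is an illustrative choice, not from the card
    device_map="auto",          # requires accelerate
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("The quick brown fox", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```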
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- training_steps: 1200
- mixed_precision_training: Native AMP
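As a rough guide to reproduction, the list above maps onto a `transformers` `TrainingArguments` configuration like the sketch below. This is a hedged reconstruction, not the authors' training script: the output path is a placeholder, and `fp16=True` is an assumption for "Native AMP" (it could equally have been bf16).

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama2-7b-lora-r32",   # placeholder path
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=4,     # 16 * 4 = 64 effective train batch size
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=1200,
    fp16=True,                         # "Native AMP" mixed precision (assumed fp16)
    optim="adamw_torch",               # betas=(0.9, 0.999), eps=1e-8 are the defaults
    eval_strategy="steps",
    eval_steps=20,                     # matches the 20-step cadence in the table below
    logging_steps=20,
)
```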
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 5.7327 | 0.0366 | 20 | 5.7538 |
| 5.6227 | 0.0731 | 40 | 5.5363 |
| 5.1474 | 0.1097 | 60 | 5.1260 |
| 4.7472 | 0.1463 | 80 | 4.5942 |
| 4.1866 | 0.1828 | 100 | 4.1511 |
| 3.9473 | 0.2194 | 120 | 3.8806 |
| 3.7462 | 0.2559 | 140 | nan |
| 3.5305 | 0.2925 | 160 | 3.5819 |
| 3.444 | 0.3291 | 180 | nan |
| 3.4114 | 0.3656 | 200 | 3.3726 |
| 3.2924 | 0.4022 | 220 | 3.2893 |
| 3.2236 | 0.4388 | 240 | 3.2182 |
| 3.1278 | 0.4753 | 260 | 3.1529 |
| 3.1056 | 0.5119 | 280 | 3.0947 |
| 3.0014 | 0.5484 | 300 | 3.0417 |
| 2.9912 | 0.5850 | 320 | 2.9946 |
| 2.9492 | 0.6216 | 340 | 2.9528 |
| 2.8944 | 0.6581 | 360 | 2.9150 |
| 2.9287 | 0.6947 | 380 | 2.8802 |
| 2.7993 | 0.7313 | 400 | 2.8480 |
| 2.8089 | 0.7678 | 420 | 2.8192 |
| 2.7871 | 0.8044 | 440 | 2.7911 |
| 2.8095 | 0.8410 | 460 | 2.7663 |
| 2.7241 | 0.8775 | 480 | 2.7435 |
| 2.7645 | 0.9141 | 500 | 2.7219 |
| 2.667 | 0.9506 | 520 | 2.7013 |
| 2.6587 | 0.9872 | 540 | 2.6816 |
| 2.6643 | 1.0238 | 560 | 2.6648 |
| 2.5948 | 1.0603 | 580 | 2.6498 |
| 2.6232 | 1.0969 | 600 | 2.6338 |
| 2.534 | 1.1335 | 620 | 2.6205 |
| 2.5503 | 1.1700 | 640 | 2.6080 |
| 2.5235 | 1.2066 | 660 | 2.5954 |
| 2.5505 | 1.2431 | 680 | 2.5845 |
| 2.5494 | 1.2797 | 700 | 2.5739 |
| 2.5663 | 1.3163 | 720 | 2.5647 |
| 2.4893 | 1.3528 | 740 | 2.5553 |
| 2.5379 | 1.3894 | 760 | 2.5461 |
| 2.5001 | 1.4260 | 780 | 2.5385 |
| 2.5298 | 1.4625 | 800 | 2.5313 |
| 2.5129 | 1.4991 | 820 | 2.5247 |
| 2.4801 | 1.5356 | 840 | 2.5184 |
| 2.5417 | 1.5722 | 860 | 2.5135 |
| 2.479 | 1.6088 | 880 | 2.5085 |
| 2.4389 | 1.6453 | 900 | 2.5038 |
| 2.4477 | 1.6819 | 920 | 2.5006 |
| 2.4777 | 1.7185 | 940 | 2.4976 |
| 2.4562 | 1.7550 | 960 | 2.4944 |
| 2.4422 | 1.7916 | 980 | 2.4921 |
| 2.4323 | 1.8282 | 1000 | 2.4902 |
| 2.4461 | 1.8647 | 1020 | 2.4884 |
| 2.4254 | 1.9013 | 1040 | 2.4871 |
| 2.4201 | 1.9378 | 1060 | 2.4858 |
| 2.4952 | 1.9744 | 1080 | 2.4849 |
| 2.4548 | 2.0110 | 1100 | 2.4843 |
| 2.5034 | 2.0475 | 1120 | 2.4839 |
| 2.447 | 2.0841 | 1140 | 2.4837 |
| 2.4601 | 2.1207 | 1160 | 2.4835 |
| 2.3962 | 2.1572 | 1180 | 2.4835 |
| 2.4167 | 2.1938 | 1200 | 2.4835 |
### Framework versions
- PEFT 0.15.2
- Transformers 4.45.2
- PyTorch 2.5.0+cu121
- Datasets 3.2.0
- Tokenizers 0.20.3
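To match this environment, a quick version check like the following can help (a convenience sketch, not part of the original card; package names are the standard PyPI ones):

```python
import importlib.metadata as md

# Pins copied from the "Framework versions" list above.
expected = {
    "peft": "0.15.2",
    "transformers": "4.45.2",
    "torch": "2.5.0",       # card lists 2.5.0+cu121; +cu121 is the CUDA build tag
    "datasets": "3.2.0",
    "tokenizers": "0.20.3",
}

for package, version in expected.items():
    installed = md.version(package)
    status = "OK" if installed.startswith(version) else "MISMATCH"
    print(f"{package}: expected {version}, installed {installed} [{status}]")
```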