# Llama2-7B-lora-r-32-generic-step-1500-lr-1e-5-labels_40.0
This model is a fine-tuned version of meta-llama/Llama-2-7b-hf on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.7705
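
Assuming this loss is the mean per-token cross-entropy in nats (the Trainer's default for causal language modeling), it corresponds to a perplexity of exp(2.7705) ≈ 15.97.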
## Model description
More information needed
## Intended uses & limitations
More information needed
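
No intended uses or limitations are documented. For reference, below is a minimal inference sketch (an assumption, not part of the original card) that loads the gated base model and attaches this LoRA adapter with PEFT:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the gated base model (requires an accepted Llama 2 license on the Hub)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Attach the LoRA adapter weights on top of the frozen base model
model = PeftModel.from_pretrained(
    base, "Siqi-Hu/Llama2-7B-lora-r-32-generic-step-1500-lr-1e-5-labels_40.0"
)
model.eval()

inputs = tokenizer("The quick brown fox", return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

If standalone weights are preferred over a base-plus-adapter setup, the adapter can be folded into the base model with `model.merge_and_unload()` before saving.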
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after the list):
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 20
- training_steps: 1500
- mixed_precision_training: Native AMP
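
For reference, the list above maps onto a `transformers`/`peft` configuration roughly as sketched below. This is a hedged reconstruction, not the author's actual script: the LoRA rank (r=32) is inferred from the model name, while `lora_alpha`, `lora_dropout`, the target modules (left to PEFT's Llama defaults), and the output path are assumptions not documented in this card.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA rank r=32 is inferred from the model name; alpha and dropout are
# placeholder assumptions, not documented in this card.
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,        # assumption
    lora_dropout=0.05,    # assumption
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="llama2-7b-lora-r-32",   # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,      # 16 x 4 = 64 total train batch size
    seed=42,
    lr_scheduler_type="cosine",
    warmup_steps=20,
    max_steps=1500,
    fp16=True,                          # "Native AMP" mixed precision
    eval_strategy="steps",
    eval_steps=20,                      # matches the 20-step eval cadence below
    logging_steps=20,
)
```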
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 5.4782 | 0.3653 | 20 | 5.3585 |
| 4.8311 | 0.7306 | 40 | 4.7012 |
| 4.3375 | 1.0959 | 60 | 4.1985 |
| 4.0234 | 1.4612 | 80 | 3.9271 |
| 3.8401 | 1.8265 | 100 | 3.7400 |
| 3.5928 | 2.1918 | 120 | 3.6001 |
| 3.5018 | 2.5571 | 140 | 3.4879 |
| 3.4263 | 2.9224 | 160 | 3.3941 |
| 3.3805 | 3.2877 | 180 | 3.3124 |
| 3.2756 | 3.6530 | 200 | 3.2443 |
| 3.1954 | 4.0183 | 220 | 3.1837 |
| 3.0846 | 4.3836 | 240 | 3.1316 |
| 3.0681 | 4.7489 | 260 | 3.0894 |
| 3.0526 | 5.1142 | 280 | 3.0510 |
| 3.0034 | 5.4795 | 300 | 3.0170 |
| 2.9394 | 5.8447 | 320 | 2.9865 |
| 2.9195 | 6.2100 | 340 | 2.9551 |
| 2.8637 | 6.5753 | 360 | 2.9321 |
| 2.8611 | 6.9406 | 380 | 2.9084 |
| 2.7788 | 7.3059 | 400 | 2.8912 |
| 2.8259 | 7.6712 | 420 | 2.8770 |
| 2.7499 | 8.0365 | 440 | 2.8581 |
| 2.7405 | 8.4018 | 460 | 2.8496 |
| 2.7046 | 8.7671 | 480 | 2.8354 |
| 2.7295 | 9.1324 | 500 | 2.8268 |
| 2.9773 | 9.4977 | 520 | 2.8175 |
| 2.6659 | 9.8630 | 540 | 2.8097 |
| 2.5387 | 10.2283 | 560 | 2.8067 |
| 2.5545 | 10.5936 | 580 | 2.7971 |
| 2.5904 | 10.9589 | 600 | 2.7906 |
| 2.5524 | 11.3242 | 620 | 2.7895 |
| 2.5515 | 11.6895 | 640 | 2.7849 |
| 2.5145 | 12.0548 | 660 | 2.7770 |
| 2.5058 | 12.4201 | 680 | 2.7793 |
| 2.4992 | 12.7854 | 700 | 2.7737 |
| 2.4222 | 13.1507 | 720 | 2.7724 |
| 2.4456 | 13.5160 | 740 | 2.7719 |
| 2.4771 | 13.8813 | 760 | 2.7665 |
| 2.3770 | 14.2466 | 780 | 2.7711 |
| 2.3874 | 14.6119 | 800 | 2.7682 |
| 2.4323 | 14.9772 | 820 | 2.7650 |
| 2.6768 | 15.3425 | 840 | 2.7688 |
| 2.3755 | 15.7078 | 860 | 2.7677 |
| 2.3763 | 16.0731 | 880 | 2.7710 |
| 2.3402 | 16.4384 | 900 | 2.7730 |
| 2.4115 | 16.8037 | 920 | 2.7694 |
| 2.3664 | 17.1689 | 940 | 2.7708 |
| 2.3469 | 17.5342 | 960 | 2.7699 |
| 2.3707 | 17.8995 | 980 | 2.7678 |
| 2.3397 | 18.2648 | 1000 | 2.7718 |
| 2.3550 | 18.6301 | 1020 | 2.7705 |
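
Validation loss plateaus around 2.77 from roughly step 740 onward while training loss continues to drift down, suggesting the adapter had largely converged by that point. Note also that although `training_steps` was set to 1500, the logged results end at step 1020.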
### Framework versions
- PEFT 0.15.2
- Transformers 4.45.2
- PyTorch 2.5.0+cu121
- Datasets 3.2.0
- Tokenizers 0.20.3