Llama2-7B-lora-r-32-generic-step-1200-lr-1e-5-labels_40.0

This model is a LoRA adapter (rank 32, per the model name) fine-tuned from meta-llama/Llama-2-7b-hf on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7836
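
Treating the reported loss as the usual mean token-level cross-entropy in nats, this corresponds to an evaluation perplexity of roughly exp(2.7836) ≈ 16.2. Below is a minimal sketch of how the adapter can be loaded on top of the base model with PEFT; the repository ids are taken from this card, while the dtype and device placement are illustrative choices.

```python
# Minimal sketch: attach this LoRA adapter to the Llama-2-7B base model.
# Repository ids come from this card; dtype/device choices are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = PeftModel.from_pretrained(
    base,
    "Siqi-Hu/Llama2-7B-lora-r-32-generic-step-1200-lr-1e-5-labels_40.0",
)
model.eval()
```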

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 20
  • training_steps: 1200
  • mixed_precision_training: Native AMP
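
For reference, here is a sketch of how these settings could be expressed as a PEFT LoraConfig plus transformers.TrainingArguments. The rank of 32 is read from the model name; lora_alpha, lora_dropout, and target_modules are assumptions, since the card does not record them.

```python
# Sketch: the hyperparameters above as a PEFT LoraConfig plus
# transformers.TrainingArguments. r=32 is read from the model name;
# lora_alpha, lora_dropout, and target_modules are assumptions the
# card does not document.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=32,                                  # "lora-r-32" in the model name
    lora_alpha=32,                         # assumption, not documented
    lora_dropout=0.05,                     # assumption, not documented
    target_modules=["q_proj", "v_proj"],   # assumption; a common Llama-2 choice
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="llama2-7b-lora-r-32",      # illustrative path
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=4,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=20,
    max_steps=1200,
    fp16=True,                             # Native AMP mixed precision
)
```

With a per-device batch size of 16 and 4 gradient-accumulation steps on a single device, the effective batch size is 16 × 4 = 64, matching the total_train_batch_size above.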

Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 5.5017        | 0.3653  | 20   | 5.4071          |
| 4.8201        | 0.7306  | 40   | 4.6726          |
| 4.2999        | 1.0959  | 60   | 4.1756          |
| 3.9694        | 1.4612  | 80   | 3.9211          |
| 3.7953        | 1.8265  | 100  | 3.7403          |
| 3.5871        | 2.1918  | 120  | 3.6054          |
| 3.5058        | 2.5571  | 140  | 3.4953          |
| 3.4235        | 2.9224  | 160  | 3.4057          |
| 3.3455        | 3.2877  | 180  | 3.3265          |
| 3.339         | 3.6530  | 200  | 3.2594          |
| 3.1913        | 4.0183  | 220  | 3.2011          |
| 3.1476        | 4.3836  | 240  | 3.1500          |
| 3.0893        | 4.7489  | 260  | 3.1085          |
| 3.0121        | 5.1142  | 280  | 3.0687          |
| 2.9902        | 5.4795  | 300  | 3.0374          |
| 2.9737        | 5.8447  | 320  | 3.0058          |
| 2.8389        | 6.2100  | 340  | 2.9831          |
| 2.9038        | 6.5753  | 360  | 2.9586          |
| 2.818         | 6.9406  | 380  | 2.9354          |
| 2.8141        | 7.3059  | 400  | 2.9206          |
| 2.7438        | 7.6712  | 420  | 2.9014          |
| 2.7746        | 8.0365  | 440  | 2.8862          |
| 2.7449        | 8.4018  | 460  | 2.8748          |
| 2.738         | 8.7671  | 480  | 2.8616          |
| 2.6404        | 9.1324  | 500  | 2.8548          |
| 2.6821        | 9.4977  | 520  | 2.8463          |
| 2.6512        | 9.8630  | 540  | 2.8365          |
| 2.6048        | 10.2283 | 560  | 2.8332          |
| 2.6179        | 10.5936 | 580  | 2.8247          |
| 2.6451        | 10.9589 | 600  | 2.8168          |
| 2.5769        | 11.3242 | 620  | 2.8157          |
| 2.6271        | 11.6895 | 640  | 2.8098          |
| 2.6242        | 12.0548 | 660  | 2.8061          |
| 2.5967        | 12.4201 | 680  | 2.8062          |
| 2.5563        | 12.7854 | 700  | 2.7994          |
| 2.4609        | 13.1507 | 720  | 2.7982          |
| 2.5047        | 13.5160 | 740  | 2.7984          |
| 2.5078        | 13.8813 | 760  | 2.7911          |
| 2.5034        | 14.2466 | 780  | 2.7917          |
| 2.5125        | 14.6119 | 800  | 2.7891          |
| 2.4899        | 14.9772 | 820  | 2.7882          |
| 2.4671        | 15.3425 | 840  | 2.7891          |
| 2.4907        | 15.7078 | 860  | 2.7873          |
| 2.5025        | 16.0731 | 880  | 2.7852          |
| 2.4492        | 16.4384 | 900  | 2.7864          |
| 2.761         | 16.8037 | 920  | 2.7858          |
| 2.4556        | 17.1689 | 940  | 2.7847          |
| 2.4823        | 17.5342 | 960  | 2.7845          |
| 2.4832        | 17.8995 | 980  | 2.7833          |
| 2.4917        | 18.2648 | 1000 | 2.7838          |
| 2.4228        | 18.6301 | 1020 | 2.7838          |
| 2.7161        | 18.9954 | 1040 | 2.7829          |
| 2.4172        | 19.3607 | 1060 | 2.7828          |
| 2.7152        | 19.7260 | 1080 | 2.7830          |
| 2.4624        | 20.0913 | 1100 | 2.7832          |
| 2.4348        | 20.4566 | 1120 | 2.7833          |
| 2.4513        | 20.8219 | 1140 | 2.7835          |
| 2.4233        | 21.1872 | 1160 | 2.7835          |
| 2.4601        | 21.5525 | 1180 | 2.7836          |
| 2.4652        | 21.9178 | 1200 | 2.7836          |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.45.2
  • Pytorch 2.5.0+cu121
  • Datasets 3.2.0
  • Tokenizers 0.20.3