LoRA: Low-Rank Adaptation of Large Language Models
Paper • 2106.09685 • Published
This project demonstrates how to fine-tune the Mistral-7B-Instruct-v0.1 model for text summarization using Parameter-Efficient Fine-Tuning (PEFT) with 4-bit quantization via qLoRA.
Check out the fine-tuned Mistral model code on GitHub:
https://github.com/Mayankpratapsingh022/Finetuning-LLMs
- Model: `Mistral-7B-Instruct-v0.1`
- Technique: qLoRA
- Libraries: `transformers`, `peft`, `bitsandbytes`, `trl`

Install the dependencies:

```shell
pip install accelerate peft bitsandbytes git+https://github.com/huggingface/transformers trl py7zr auto-gptq optimum
```
Login to Hugging Face:
```python
from huggingface_hub import notebook_login

notebook_login()
```
Load the tokenizer and the 4-bit quantized base model:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
tokenizer.pad_token = tokenizer.eos_token  # Mistral ships no pad token; reuse EOS for batching

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",  # NF4 is the quantization type introduced by qLoRA
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    quantization_config=quant_config,
    device_map="auto",
)
```
Attach the LoRA adapters:

```python
from peft import LoraConfig, get_peft_model

peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, peft_config)
```
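It helps to sanity-check how few parameters this configuration actually trains. The arithmetic below is a back-of-the-envelope sketch; the shapes are assumptions based on Mistral-7B's published architecture (hidden size 4096, 32 layers, grouped-query attention with 8 KV heads of dimension 128). In practice you can simply call `model.print_trainable_parameters()` after `get_peft_model`.

```python
# Rough count of LoRA trainable parameters for the config above
# (r=16, target_modules=["q_proj", "v_proj"]). Each adapted linear
# layer gains A (r x d_in) and B (d_out x r), i.e. r * (d_in + d_out)
# extra weights. Dimensions are assumed Mistral-7B shapes:
r = 16
n_layers = 32
hidden = 4096       # d_model; q_proj maps 4096 -> 4096
kv_dim = 8 * 128    # v_proj maps 4096 -> 1024 under grouped-query attention

q_proj = r * (hidden + hidden)
v_proj = r * (hidden + kv_dim)
total = n_layers * (q_proj + v_proj)

print(total)                    # 6815744 trainable parameters
print(f"{total / 7.24e9:.3%}")  # well under 0.1% of the ~7.24B base weights
```

That is the point of PEFT: only a few million adapter weights receive gradients, while the 7B quantized base stays frozen.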
Set up the training arguments:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-finetuned-samsum",
    per_device_train_batch_size=8,
    learning_rate=2e-4,
    num_train_epochs=1,
    logging_steps=100,
    fp16=True,
    save_strategy="epoch",
    push_to_hub=True,
)
```
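The trainer needs a text dataset; this project fine-tunes on SAMSum-style dialogue/summary pairs. Below is a minimal sketch of turning one record into Mistral's `[INST] … [/INST]` instruction format. The `format_sample` helper, the prompt wording, and the field names are illustrative assumptions, not part of the original code.

```python
def format_sample(example):
    """Hypothetical helper: map a SAMSum-style record
    ({"dialogue": ..., "summary": ...}) to one training string
    in Mistral's instruction format."""
    return {
        "text": (
            "[INST] Summarize the following conversation:\n"
            f"{example['dialogue']} [/INST] {example['summary']}"
        )
    }

# Illustrative record following the SAMSum schema
sample = {
    "dialogue": "Amanda: I baked cookies. Do you want some?\nJerry: Sure!",
    "summary": "Amanda baked cookies and will bring Jerry some.",
}
print(format_sample(sample)["text"])
```

With `datasets.load_dataset`, you would `.map(format_sample)` over the training split and pass the result to the trainer as `train_dataset` (setting `dataset_text_field="text"` in `trl` versions that require it).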
Create the `SFTTrainer` from `trl` and launch training:

```python
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,
    train_dataset=data,  # `data`: the prepared training split (e.g. SAMSum)
    tokenizer=tokenizer,
    peft_config=peft_config,
    args=training_args,
)
trainer.train()
```
```python
# Save: copy the output directory to Google Drive (Colab)
!cp -r mistral-finetuned-samsum /content/drive/MyDrive/model
```

Reload the adapter together with its base model:

```python
import torch
from peft import AutoPeftModelForCausalLM

# Reload
model = AutoPeftModelForCausalLM.from_pretrained(
    "/content/drive/MyDrive/model/mistral-finetuned-samsum",
    torch_dtype=torch.float16,
    device_map="cuda",
)
```
| Term | Description |
|---|---|
| PEFT | Parameter-Efficient Fine-Tuning: train a small set of added parameters while the base model stays frozen |
| LoRA | Low-Rank Adaptation: inject trainable low-rank matrices into transformer layers |
| Quantization | Reducing weight precision (e.g. to 8-, 4-, or 2-bit) to cut memory use |
| qLoRA | LoRA trained on top of a 4-bit quantized base model |
| SFT | Supervised Fine-Tuning |
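To make the LoRA row concrete: instead of updating a full weight matrix `W`, LoRA learns a low-rank update scaled by `alpha / r`, so the adapted layer computes `W @ x + (alpha / r) * B @ A @ x`. A small NumPy sketch with toy dimensions (not Mistral's):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 16, 16

W = rng.standard_normal((d_out, d_in))     # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))                   # trainable, zero init
x = rng.standard_normal(d_in)

# Because B starts at zero, the adapter is a no-op at initialization:
adapted = W @ x + (alpha / r) * (B @ (A @ x))
assert np.allclose(adapted, W @ x)

# After training, the update can be merged into a single matrix,
# so inference costs exactly the same as the base layer:
B = rng.standard_normal((d_out, r))  # stand-in for trained weights
merged = (W + (alpha / r) * (B @ A)) @ x
factored = W @ x + (alpha / r) * (B @ (A @ x))
assert np.allclose(merged, factored)
```

The factored form trains `r * (d_in + d_out)` parameters instead of `d_in * d_out`, which is where the savings computed earlier come from.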