# Mistral Hospitality QLoRA
A QLoRA fine-tune of Mistral-7B-Instruct-v0.3 for hotel booking dialogs and hospitality FAQ.
A merged (adapter-free) version is also available at `Hadix10/mistral-hospitality-merged`; no PEFT dependency is needed at inference time.
## Model Details
- Base model: mistralai/Mistral-7B-Instruct-v0.3
- Method: QLoRA (4-bit NF4 base + bf16 LoRA adapters)
- Trainable params: ~0.3% of total
- Language: English
- License: MIT
- Author: Hadi Hijazi
## Training
### Data
| Dataset | Task | Examples | Split |
|---|---|---|---|
| SGD Hotels | Multi-turn booking dialog | ~6 000 | 90/10 |
| Bitext Hospitality | Single-turn FAQ / intent | ~25 000 | 90/10 |
- Capped at 2 500 train / 500 validation examples after independent per-dataset splits (seed 42)
- Validation set deduplicated against the train set to prevent data leakage
- Formatted with Mistral's native `[INST]...[/INST]` template
- Loss computed only on response tokens (completion-only, via `DataCollatorForCompletionOnlyLM`)
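Completion-only training masks the prompt tokens so that only response tokens contribute to the loss. The actual run uses TRL's `DataCollatorForCompletionOnlyLM`; the sketch below is a simplified pure-Python illustration of the same masking idea, with hypothetical token IDs:

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips

def mask_prompt_labels(input_ids, response_start):
    """Copy input_ids to labels, masking everything before the response.

    response_start is the index of the first token after [/INST].
    DataCollatorForCompletionOnlyLM locates it automatically by searching
    for a response template; here it is passed in directly for simplicity.
    """
    labels = list(input_ids)
    for i in range(response_start):
        labels[i] = IGNORE_INDEX
    return labels

# Hypothetical tokenization: 5 prompt tokens followed by 3 response tokens
ids = [1, 733, 16289, 28793, 28705, 415, 4557, 2]
labels = mask_prompt_labels(ids, response_start=5)
# labels: [-100, -100, -100, -100, -100, 415, 4557, 2]
```

With this masking, the model is never penalized for failing to reproduce the user's question, only for its own response.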
### LoRA Configuration
| Parameter | Value |
|---|---|
| r | 8 |
| alpha | 16 |
| dropout | 0.05 |
| target_modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| bias | none |
| task_type | CAUSAL_LM |
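The ~0.3% trainable-parameter figure can be sanity-checked by hand: LoRA adds an r×d_in matrix A and a d_out×r matrix B per targeted weight, i.e. r·(d_in + d_out) parameters each. The dimensions below come from the Mistral-7B base model's published config (hidden size 4096, grouped-query k/v output 1024, MLP intermediate 14336, 32 layers), not from this repo:

```python
r = 8
layers = 32
hidden, kv_out, mlp = 4096, 1024, 14336  # Mistral-7B dims (from base config)

# (in_features, out_features) for each targeted projection
shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_out),
    "v_proj": (hidden, kv_out),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, mlp),
    "up_proj": (hidden, mlp),
    "down_proj": (mlp, hidden),
}

# LoRA adds r * (d_in + d_out) trainable parameters per targeted matrix
lora_params = layers * sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
total_params = 7.25e9  # approximate Mistral-7B parameter count
print(lora_params, lora_params / total_params)  # ~21M trainable, ~0.29% of total
```

This lands at roughly 0.29%, consistent with the ~0.3% stated above.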
### Hyperparameters
| Parameter | Value |
|---|---|
| learning_rate | 2e-4 |
| epochs | 3 |
| batch_size (per device) | 2 |
| gradient_accumulation | 4 (effective batch = 8) |
| warmup_ratio | 0.03 |
| lr_scheduler | cosine |
| optimizer | paged_adamw_8bit |
| max_seq_length | 1024 |
| precision | bf16 |
| gradient_checkpointing | yes |
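The effective batch size and warmup length follow directly from the table. A quick worked calculation (step counts assume partial batches are rounded up, so the exact values in the actual run may differ slightly):

```python
import math

per_device_bs, grad_accum, epochs = 2, 4, 3
train_examples, warmup_ratio = 2500, 0.03

effective_bs = per_device_bs * grad_accum                    # 8, as noted above
steps_per_epoch = math.ceil(train_examples / effective_bs)   # 313 optimizer steps
total_steps = steps_per_epoch * epochs                       # 939
warmup_steps = int(warmup_ratio * total_steps)               # ~28 steps of LR warmup
```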
### Hardware
Trained on Google Colab with an A100 40 GB GPU.
## Evaluation
| Metric | Value |
|---|---|
| Eval Loss (completion-only) | 1.38 |
| Perplexity | 3.97 |
| ROUGE-1 | 0.42 |
| ROUGE-2 | 0.21 |
| ROUGE-L | 0.38 |
ROUGE scores computed on 200 held-out samples with greedy decoding (`do_sample=False`, `max_new_tokens=256`; the `temperature=0.1` setting has no effect when sampling is disabled).
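The reported perplexity is simply the exponential of the completion-only eval loss, which is easy to verify:

```python
import math

eval_loss = 1.38
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # 3.97, matching the table
```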
## Demo
Try the model live: Mistral Hospitality Assistant — runs on ZeroGPU (free A10G).
## Before / After Examples
**FAQ — "What are the check-in and check-out times?"**
Base model: Check-in and check-out times vary depending on the hotel. Generally, check-in is around 3 PM and check-out is around 11 AM.
Fine-tuned: Our standard check-in time is 3:00 PM and check-out is at 11:00 AM. Early check-in and late check-out are available upon request, subject to availability. Would you like me to arrange this for your stay?
**Booking dialog**
Base model: I can help you with that. What city are you looking to book in?
Fine-tuned: I found a great option for you — the Marriott Marquis in downtown Rome is available for May 12-14. It's a 4-star hotel with excellent reviews. Shall I go ahead and reserve a room?
## Usage
### Option 1: QLoRA Adapter (this repo)
Requires ~6 GB of VRAM (4-bit base + LoRA adapter):
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# 4-bit NF4 quantization config matching the training setup
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    quantization_config=bnb,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "Hadix10/mistral-hospitality-qlora")
tokenizer = AutoTokenizer.from_pretrained("Hadix10/mistral-hospitality-qlora")

prompt = "[INST] What are the check-in and check-out times? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True, top_p=0.9)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
### Option 2: Merged Model (no PEFT dependency)
A standalone merged model with the adapter baked into the base weights is available at Hadix10/mistral-hospitality-merged. No PEFT dependency needed at inference time — load it like any standard Hugging Face model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Hadix10/mistral-hospitality-merged", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Hadix10/mistral-hospitality-merged")
```
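The merged model still expects Mistral's instruct format. A minimal helper for single-turn prompts (a hypothetical convenience, equivalent to writing the `[INST]` string by hand as in Option 1):

```python
def build_prompt(user_message: str) -> str:
    # Wrap a single user turn in Mistral's [INST] ... [/INST] template
    return f"[INST] {user_message} [/INST]"

prompt = build_prompt("What are the check-in and check-out times?")
```

For multi-turn conversations, `tokenizer.apply_chat_template` produces the same format from a list of chat messages.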
## Intended Use
- Hospitality chatbots and virtual concierges
- Hotel booking assistants
- Customer service automation for the hospitality industry
## Limitations
- Domain-specific — trained on hospitality data only
- English language only
- Not suitable for safety-critical applications without human oversight
- May hallucinate hotel names, dates, or policies
## Repository
The full training pipeline, evaluation scripts, API server, and test suite are available on GitHub:
- Code: GitHub
- Demo: Mistral Hospitality Assistant
- Merged model: `Hadix10/mistral-hospitality-merged`
Features: QLoRA training, adapter merging, ROUGE + perplexity evaluation, LLM-as-judge scoring (Gemini), W&B logging, FastAPI server with SSE streaming, Gradio demo, and a full test suite.