Mistral Hospitality QLoRA

A QLoRA fine-tune of Mistral-7B-Instruct-v0.3 for hotel booking dialogs and hospitality FAQ.

| Metric | Value |
|---|---|
| Eval Loss | 1.38 |
| Perplexity | 3.97 |
| ROUGE-1 | 0.42 |
| ROUGE-2 | 0.21 |
| ROUGE-L | 0.38 |

Try the live demo · GitHub

A merged (adapter-free) version is also available: Hadix10/mistral-hospitality-merged — no PEFT dependency needed at inference time.

Model Details

  • Base model: mistralai/Mistral-7B-Instruct-v0.3
  • Method: QLoRA (4-bit NF4 base + bf16 LoRA adapters)
  • Trainable params: ~0.3% of total
  • Language: English
  • License: MIT
  • Author: Hadi Hijazi

Training

Data

| Dataset | Task | Examples | Split |
|---|---|---|---|
| SGD Hotels | Multi-turn booking dialog | ~6,000 | 90/10 |
| Bitext Hospitality | Single-turn FAQ / intent | ~25,000 | 90/10 |
  • Capped at 2,500 train / 500 val examples after independent per-dataset splits (seed 42)
  • Validation deduplicated against train to prevent data leakage
  • Formatted with Mistral's native [INST]...[/INST] template
  • Loss computed only on response tokens (completion-only via DataCollatorForCompletionOnlyLM)
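Completion-only training means label positions covering the prompt are excluded from the loss. The actual run used trl's `DataCollatorForCompletionOnlyLM`; the sketch below (pure Python, illustrative token IDs) shows the underlying idea of masking prompt tokens with the ignore index:

```python
# Minimal sketch of completion-only loss masking. The training run used
# trl's DataCollatorForCompletionOnlyLM; this just illustrates the idea.
IGNORE_INDEX = -100  # positions with this label are skipped by the loss

def mask_prompt_tokens(input_ids, response_start):
    """Copy input_ids to labels, masking everything before the response."""
    labels = list(input_ids)
    for i in range(response_start):
        labels[i] = IGNORE_INDEX
    return labels

# Toy example: tokens 0-4 are the [INST] prompt, tokens 5-7 the response.
labels = mask_prompt_tokens([11, 12, 13, 14, 15, 21, 22, 23], response_start=5)
print(labels)  # [-100, -100, -100, -100, -100, 21, 22, 23]
```

Only the response tokens contribute to the cross-entropy, so the model is optimized to generate answers rather than to reproduce prompts.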

LoRA Configuration

| Parameter | Value |
|---|---|
| r | 8 |
| alpha | 16 |
| dropout | 0.05 |
| target_modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| bias | none |
| task_type | CAUSAL_LM |
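The table above maps directly onto a PEFT `LoraConfig`. A sketch of the equivalent configuration (values from this card; the exact training script may differ in incidental details):

```python
from peft import LoraConfig

# LoRA configuration matching the table above.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
```

Targeting all attention and MLP projections (rather than only q/v) is what brings the trainable fraction to roughly 0.3% of the 7B parameters.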

Hyperparameters

| Parameter | Value |
|---|---|
| learning_rate | 2e-4 |
| epochs | 3 |
| batch_size (per device) | 2 |
| gradient_accumulation | 4 (effective batch = 8) |
| warmup_ratio | 0.03 |
| lr_scheduler | cosine |
| optimizer | paged_adamw_8bit |
| max_seq_length | 1024 |
| precision | bf16 |
| gradient_checkpointing | yes |
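These hyperparameters translate to a `TrainingArguments` object roughly as follows (a sketch; `output_dir` and logging settings are placeholders, not from the actual script):

```python
from transformers import TrainingArguments

# Sketch of training arguments matching the table above.
args = TrainingArguments(
    output_dir="out",                 # placeholder
    learning_rate=2e-4,
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,    # effective batch = 2 * 4 = 8
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
    optim="paged_adamw_8bit",
    bf16=True,
    gradient_checkpointing=True,
)
```

The paged 8-bit AdamW optimizer and gradient checkpointing are what keep the run within a single A100 40 GB despite the 7B base model.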

Hardware

Trained on Google Colab with an A100 40 GB GPU.

Evaluation

| Metric | Value |
|---|---|
| Eval Loss (completion-only) | 1.38 |
| Perplexity | 3.97 |
| ROUGE-1 | 0.42 |
| ROUGE-2 | 0.21 |
| ROUGE-L | 0.38 |
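The eval loss and perplexity are internally consistent: perplexity is the exponential of the mean cross-entropy loss.

```python
import math

# Perplexity = exp(mean cross-entropy loss), so the two reported
# numbers are two views of the same quantity.
eval_loss = 1.38
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # 3.97
```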

ROUGE scores were computed on 200 samples with greedy decoding (do_sample=False, max_new_tokens=256; the temperature=0.1 setting has no effect when sampling is disabled).

Demo

Try the model live: Mistral Hospitality Assistant — runs on ZeroGPU (free A10G).

Before / After Examples

FAQ — "What are the check-in and check-out times?"

Base model: Check-in and check-out times vary depending on the hotel. Generally, check-in is around 3 PM and check-out is around 11 AM.

Fine-tuned: Our standard check-in time is 3:00 PM and check-out is at 11:00 AM. Early check-in and late check-out are available upon request, subject to availability. Would you like me to arrange this for your stay?

Booking dialog:

Base model: I can help you with that. What city are you looking to book in?

Fine-tuned: I found a great option for you — the Marriott Marquis in downtown Rome is available for May 12-14. It's a 4-star hotel with excellent reviews. Shall I go ahead and reserve a room?

Usage

Option 1: QLoRA Adapter (this repo)

Requires ~6 GB VRAM (4-bit base + LoRA adapter):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization for the base model (matches the training setup)
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    quantization_config=bnb,
    device_map="auto",
)
# Attach the LoRA adapter on top of the quantized base
model = PeftModel.from_pretrained(base, "Hadix10/mistral-hospitality-qlora")
tokenizer = AutoTokenizer.from_pretrained("Hadix10/mistral-hospitality-qlora")

prompt = "[INST] What are the check-in and check-out times? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True, top_p=0.9)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
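The prompt above is hand-built in Mistral's `[INST] ... [/INST]` format; for multi-turn chats, `tokenizer.apply_chat_template` produces the same format automatically. A minimal single-turn helper (a sketch; `build_prompt` is an illustrative name, not part of this repo):

```python
# Mistral's instruct template wraps the user turn in [INST] ... [/INST].
def build_prompt(user_message: str) -> str:
    return f"[INST] {user_message} [/INST]"

print(build_prompt("Do you have rooms available for May 12-14?"))
# [INST] Do you have rooms available for May 12-14? [/INST]
```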

Option 2: Merged Model (no PEFT dependency)

A standalone merged model with the adapter baked into the base weights is available at Hadix10/mistral-hospitality-merged. No PEFT dependency needed at inference time — load it like any standard Hugging Face model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Hadix10/mistral-hospitality-merged", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Hadix10/mistral-hospitality-merged")
```

Intended Use

  • Hospitality chatbots and virtual concierges
  • Hotel booking assistants
  • Customer service automation for the hospitality industry

Limitations

  • Domain-specific — trained on hospitality data only
  • English language only
  • Not suitable for safety-critical applications without human oversight
  • May hallucinate hotel names, dates, or policies

Repository

The full training pipeline, evaluation scripts, API server, and test suite are available on GitHub.

Features: QLoRA training, adapter merging, ROUGE + perplexity evaluation, LLM-as-judge scoring (Gemini), W&B logging, FastAPI server with SSE streaming, Gradio demo, and a full test suite.
