Mistral Hospitality QLoRA

A QLoRA fine-tune of Mistral-7B-Instruct-v0.3 for hotel booking dialogs and hospitality FAQ.

| Metric | Value |
|---|---|
| Eval Loss | 1.38 |
| Perplexity | 3.97 |
| ROUGE-1 | 0.42 |
| ROUGE-2 | 0.21 |
| ROUGE-L | 0.38 |

Try the live demo · GitHub

A merged (adapter-free) version is also available: Hadix10/mistral-hospitality-merged — no PEFT dependency needed at inference time.

Model Details

  • Base model: mistralai/Mistral-7B-Instruct-v0.3
  • Method: QLoRA (4-bit NF4 base + bf16 LoRA adapters)
  • Trainable params: ~0.3% of total
  • Language: English
  • License: MIT
  • Author: Hadi Hijazi

Training

Data

| Dataset | Task | Examples | Split |
|---|---|---|---|
| SGD Hotels | Multi-turn booking dialog | ~6,000 | 90/10 |
| Bitext Hospitality | Single-turn FAQ / intent | ~25,000 | 90/10 |
  • Capped at 2,500 train / 500 val examples after independent per-dataset splits (seed 42)
  • Validation deduplicated against train to prevent data leakage
  • Formatted with Mistral's native [INST]...[/INST] template
  • Loss computed only on response tokens (completion-only via DataCollatorForCompletionOnlyLM)
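Completion-only training means label positions covering the prompt are excluded from the loss. The actual run used trl's `DataCollatorForCompletionOnlyLM`; the sketch below (pure Python, illustrative token IDs) shows the underlying idea of masking prompt tokens with the ignore index:

```python
# Minimal sketch of completion-only loss masking. The training run used
# trl's DataCollatorForCompletionOnlyLM; this just illustrates the idea.
IGNORE_INDEX = -100  # positions with this label are skipped by the loss

def mask_prompt_tokens(input_ids, response_start):
    """Copy input_ids to labels, masking everything before the response."""
    labels = list(input_ids)
    for i in range(response_start):
        labels[i] = IGNORE_INDEX
    return labels

# Toy example: tokens 0-4 are the [INST] prompt, tokens 5-7 the response.
labels = mask_prompt_tokens([11, 12, 13, 14, 15, 21, 22, 23], response_start=5)
print(labels)  # [-100, -100, -100, -100, -100, 21, 22, 23]
```

Only the response tokens contribute to the cross-entropy, so the model is optimized to generate answers rather than to reproduce prompts.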

LoRA Configuration

| Parameter | Value |
|---|---|
| r | 8 |
| alpha | 16 |
| dropout | 0.05 |
| target_modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| bias | none |
| task_type | CAUSAL_LM |
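The table above maps directly onto a PEFT `LoraConfig`. A sketch of the equivalent configuration (values from this card; the exact training script may differ in incidental details):

```python
from peft import LoraConfig

# LoRA configuration matching the table above.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
```

Targeting all attention and MLP projections (rather than only q/v) is what brings the trainable fraction to roughly 0.3% of the 7B parameters.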

Hyperparameters

| Parameter | Value |
|---|---|
| learning_rate | 2e-4 |
| epochs | 3 |
| batch_size (per device) | 2 |
| gradient_accumulation | 4 (effective batch = 8) |
| warmup_ratio | 0.03 |
| lr_scheduler | cosine |
| optimizer | paged_adamw_8bit |
| max_seq_length | 1024 |
| precision | bf16 |
| gradient_checkpointing | yes |
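These hyperparameters translate to a `TrainingArguments` object roughly as follows (a sketch; `output_dir` and logging settings are placeholders, not from the actual script):

```python
from transformers import TrainingArguments

# Sketch of training arguments matching the table above.
args = TrainingArguments(
    output_dir="out",                 # placeholder
    learning_rate=2e-4,
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,    # effective batch = 2 * 4 = 8
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
    optim="paged_adamw_8bit",
    bf16=True,
    gradient_checkpointing=True,
)
```

The paged 8-bit AdamW optimizer and gradient checkpointing are what keep the run within a single A100 40 GB despite the 7B base model.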

Hardware

Trained on Google Colab with an A100 40 GB GPU.

Evaluation

| Metric | Value |
|---|---|
| Eval Loss (completion-only) | 1.38 |
| Perplexity | 3.97 |
| ROUGE-1 | 0.42 |
| ROUGE-2 | 0.21 |
| ROUGE-L | 0.38 |
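The eval loss and perplexity are internally consistent: perplexity is the exponential of the mean cross-entropy loss.

```python
import math

# Perplexity = exp(mean cross-entropy loss), so the two reported
# numbers are two views of the same quantity.
eval_loss = 1.38
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # 3.97
```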

ROUGE scores were computed on 200 samples with greedy decoding (do_sample=False, max_new_tokens=256; the temperature=0.1 setting has no effect when sampling is disabled).

Demo

Try the model live: Mistral Hospitality Assistant — runs on ZeroGPU (free A10G).

Before / After Examples

FAQ — "What are the check-in and check-out times?"

Base model: Check-in and check-out times vary depending on the hotel. Generally, check-in is around 3 PM and check-out is around 11 AM.

Fine-tuned: Our standard check-in time is 3:00 PM and check-out is at 11:00 AM. Early check-in and late check-out are available upon request, subject to availability. Would you like me to arrange this for your stay?

Booking dialog:

Base model: I can help you with that. What city are you looking to book in?

Fine-tuned: I found a great option for you — the Marriott Marquis in downtown Rome is available for May 12-14. It's a 4-star hotel with excellent reviews. Shall I go ahead and reserve a room?

Usage

Option 1: QLoRA Adapter (this repo)

Requires ~6 GB VRAM (4-bit base + LoRA adapter):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization for the base model (matches the training setup)
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    quantization_config=bnb,
    device_map="auto",
)
# Attach the LoRA adapter on top of the quantized base
model = PeftModel.from_pretrained(base, "Hadix10/mistral-hospitality-qlora")
tokenizer = AutoTokenizer.from_pretrained("Hadix10/mistral-hospitality-qlora")

prompt = "[INST] What are the check-in and check-out times? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True, top_p=0.9)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
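The prompt above is hand-built in Mistral's `[INST] ... [/INST]` format; for multi-turn chats, `tokenizer.apply_chat_template` produces the same format automatically. A minimal single-turn helper (a sketch; `build_prompt` is an illustrative name, not part of this repo):

```python
# Mistral's instruct template wraps the user turn in [INST] ... [/INST].
def build_prompt(user_message: str) -> str:
    return f"[INST] {user_message} [/INST]"

print(build_prompt("Do you have rooms available for May 12-14?"))
# [INST] Do you have rooms available for May 12-14? [/INST]
```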

Option 2: Merged Model (no PEFT dependency)

A standalone merged model with the adapter baked into the base weights is available at Hadix10/mistral-hospitality-merged. No PEFT dependency needed at inference time — load it like any standard Hugging Face model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Hadix10/mistral-hospitality-merged", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Hadix10/mistral-hospitality-merged")
```

Intended Use

  • Hospitality chatbots and virtual concierges
  • Hotel booking assistants
  • Customer service automation for the hospitality industry

Limitations

  • Domain-specific — trained on hospitality data only
  • English language only
  • Not suitable for safety-critical applications without human oversight
  • May hallucinate hotel names, dates, or policies

Repository

The full training pipeline, evaluation scripts, API server, and test suite are available on GitHub.

Features: QLoRA training, adapter merging, ROUGE + perplexity evaluation, LLM-as-judge scoring (Gemini), W&B logging, FastAPI server with SSE streaming, Gradio demo, and a full test suite.
