Qwen 2.5 7B Query Rewriter (Final) -- SemEval-2026 Task 8

Fine-tuned Qwen 2.5 7B Instruct model for query rewriting in multi-turn conversational retrieval. This is the final model used in our SemEval-2026 Task 8 (MTRAGEval) submission, achieving nDCG@5 of 0.531 (8th/38 systems).

This model has the LoRA adapter weights fused into the base model for direct inference without adapter loading.

Model Details

  • Base Model: Qwen/Qwen2.5-7B-Instruct
  • Training Method: LoRA (Low-Rank Adaptation), weights fused into base model
  • Training Data: 777 query rewriting examples from MTRAGEval gold rewrites (all training + holdout combined)
  • Training Iterations: 500
  • Framework: MLX (Apple Silicon optimized)
  • Task: Transform multi-turn conversational queries into standalone, search-friendly queries

Training Configuration

Parameter               Value
----------------------  --------------------------------------------
LoRA Rank               16
LoRA Alpha              32
LoRA Dropout            0.15
Target Modules          q/k/v/o_proj, gate/up/down_proj
Layers Adapted          28 (all)
Trainable Params        40.4M (0.53%)
Optimizer               AdamW
Learning Rate           1e-5
Weight Decay            0.01
Batch Size              2 (effective 16 with gradient accumulation 8)
Max Sequence Length     2048
Gradient Checkpointing  Yes
Precision               bf16
Seed                    42

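As a sanity check, the 40.4M trainable-parameter figure in the table can be reproduced from the published Qwen2.5-7B-Instruct architecture dimensions (hidden_size 3584, intermediate_size 18944, 28 layers, 4 KV heads of head_dim 128) and LoRA rank 16 on the seven target modules. The helper below is illustrative, not part of the training code:

```python
# Reproduce the ~40.4M trainable-parameter count from the table above.
# A LoRA adapter on a linear layer adds two low-rank matrices,
# (in_dim x r) and (r x out_dim), so it contributes r * (in_dim + out_dim).

def lora_params(in_dim: int, out_dim: int, rank: int = 16) -> int:
    return rank * (in_dim + out_dim)

hidden, intermediate, kv_dim, layers = 3584, 18944, 4 * 128, 28

per_layer = (
    lora_params(hidden, hidden)          # q_proj
    + lora_params(hidden, kv_dim)        # k_proj
    + lora_params(hidden, kv_dim)        # v_proj
    + lora_params(hidden, hidden)        # o_proj
    + lora_params(hidden, intermediate)  # gate_proj
    + lora_params(hidden, intermediate)  # up_proj
    + lora_params(intermediate, hidden)  # down_proj
)

total = per_layer * layers
print(f"{total / 1e6:.1f}M trainable parameters")  # 40.4M
```

Dividing by the roughly 7.6B base-model parameters gives the 0.53% figure above.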
Usage

With Transformers

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "caraman/Qwen2.5-7B-mtrag-query-rewriter-final",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("caraman/Qwen2.5-7B-mtrag-query-rewriter-final")

system_prompt = """You are a query rewriting assistant for information retrieval. Given a conversation history and a current question, rewrite the question to be completely standalone and self-contained.

Rules:
1. Resolve all pronouns (it, they, this, that) to their explicit referents
2. Include relevant context from the conversation that's needed to understand the query
3. Keep the rewritten query concise and search-friendly
4. Do not add information not present in the conversation
5. If the question is already standalone, return it unchanged"""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": """CONVERSATION HISTORY:
USER: Tell me about the Eiffel Tower
ASSISTANT: The Eiffel Tower is a wrought-iron lattice tower in Paris, France.

CURRENT QUESTION: When was it built?

Rewrite this question to be standalone:"""}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
# Expected: "When was the Eiffel Tower built?"

With MLX (Apple Silicon)

from mlx_lm import load, generate

model, tokenizer = load("caraman/Qwen2.5-7B-mtrag-query-rewriter-final")
# `messages` is the same list as in the Transformers example above
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, temp=0.2)
print(response)

Domain-Specific Temperature

For optimal performance, use domain-specific temperatures:

Domain                   Temperature  Description
-----------------------  -----------  -----------------------------------------------
Cloud (technical docs)   0.0          Deterministic; preserves exact technical terms
ClapNQ (Wikipedia)       0.2          Minimal diversity for well-structured queries
FiQA (financial forums)  0.3          Slight exploration for ambiguous queries
Govt (government docs)   0.3          Slight exploration for policy terminology
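In practice this amounts to a small lookup before calling the model. The helper below is a hypothetical convenience (not shipped with the model), defaulting to 0.2 for domains outside the table:

```python
# Hypothetical helper mapping an MTRAGEval domain to the recommended
# sampling temperature from the table above. Unknown domains fall back
# to the conservative default of 0.2.

DOMAIN_TEMPERATURES = {
    "cloud": 0.0,   # technical docs: deterministic
    "clapnq": 0.2,  # Wikipedia: minimal diversity
    "fiqa": 0.3,    # financial forums: slight exploration
    "govt": 0.3,    # government docs: slight exploration
}

def rewrite_temperature(domain: str) -> float:
    return DOMAIN_TEMPERATURES.get(domain.lower(), 0.2)

print(rewrite_temperature("Cloud"))  # 0.0
```

The returned value can be passed directly as `temperature=` (Transformers) or `temp=` (MLX) in the generation calls shown above.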

Performance

Part of a three-stage pipeline (query rewriting + hybrid BM25/dense retrieval + cross-encoder reranking):

Metric   Development Holdout    Official Test
-------  ---------------------  ----------------
nDCG@5   0.422 (uniform t=0.2)  0.531
Rank     --                     8th / 38 systems

Query rewriting alone provides a 13.7% relative gain over the no-rewriting baseline (nDCG@5: 0.371 -> 0.422).
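For reference, nDCG@k discounts each relevant result by the log of its rank and normalizes by the best achievable ranking. The sketch below is a minimal, generic implementation; the official MTRAGEval evaluation script may differ in details such as gain weighting:

```python
import math

def dcg_at_k(relevances, k=5):
    """Discounted cumulative gain over the top-k ranked results."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, all_relevances, k=5):
    """DCG of the system ranking, normalized by the ideal (sorted) ranking."""
    ideal = dcg_at_k(sorted(all_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0

# Example: one of two relevant docs retrieved, at rank 2.
print(ndcg_at_k([0, 1, 0, 0, 0], [1, 1], k=5))
```

A perfect top-5 ranking scores 1.0; missing all relevant documents scores 0.0.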

License

Apache 2.0
