Qwen 2.5 7B Query Rewriter - LoRA Adapter

Fine-tuned LoRA adapter for query rewriting in multi-turn conversational retrieval (MTRAGEval Task A).

Model Details

  • Base Model: Qwen/Qwen2.5-7B-Instruct
  • Training Method: LoRA (Low-Rank Adaptation)
  • Training Data: MTRAG Benchmark human evaluations (551 train, 62 validation)
  • Best Checkpoint: Iteration 150
  • Framework: MLX (Apple Silicon optimized)
  • Task: Transform multi-turn conversational queries into standalone, search-friendly queries

Performance

The model resolves pronouns and includes necessary context from conversation history:

| Original Query | Context | Rewritten Query |
|---|---|---|
| "What about Germany?" | after asking about France's capital | "What is the capital of Germany?" |
| "How much does it cost?" | after discussing the iPhone 15 Pro | "What is the price of the iPhone 15 Pro?" |

Usage

With MLX (Apple Silicon)

from mlx_lm import load, generate

# Load base model with adapter
model, tokenizer = load(
    "Qwen/Qwen2.5-7B-Instruct",
    adapter_path="caraman/Qwen2.5-7B-query-rewriter-lora"
)

# Prepare prompt
system_prompt = """You are a query rewriting assistant for information retrieval. Given a conversation history and a current question, rewrite the question to be completely standalone and self-contained."""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": """CONVERSATION HISTORY:
USER: Tell me about the Eiffel Tower
ASSISTANT: The Eiffel Tower is in Paris, France.

CURRENT QUESTION: When was it built?

Rewrite this question to be standalone:"""}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)  # "When was the Eiffel Tower built?"

Training Configuration

  • LoRA Rank: 16
  • LoRA Alpha (scale): 32.0
  • Dropout: 0.15
  • Learning Rate: 1e-05
  • Batch Size: 4 (effective 16 with gradient accumulation)
  • Layers: 28
  • Training Iterations: 200 (best at 150)
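The overhead implied by this configuration is small and easy to estimate. The arithmetic below assumes LoRA is applied to the attention query and value projections of Qwen2.5-7B (hidden size 3584, grouped-query KV projection width 512); the released adapter's actual target modules are not stated here, so treat the total as an illustrative order of magnitude:

```python
rank, alpha = 16, 32.0
scale = alpha / rank  # LoRA scaling factor applied to the low-rank update

hidden, kv_dim, layers = 3584, 512, 28  # assumed Qwen2.5-7B attention shapes

# Each adapted projection of shape (d_in, d_out) adds rank * (d_in + d_out)
# trainable parameters (the A and B low-rank factors).
per_layer = rank * (hidden + hidden)   # q_proj: 3584 -> 3584
per_layer += rank * (hidden + kv_dim)  # v_proj: 3584 -> 512
total = per_layer * layers

print(scale)  # 2.0
print(total)  # ~5M trainable parameters, a small fraction of the 7B base
```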

Training Data

Extracted from MTRAG Benchmark human query rewrites across 4 domains:

  • ClapNQ (question answering)
  • FiQA (finance)
  • Government documents
  • IBM Cloud documentation

Evaluation

Evaluated on 164 held-out conversational queries, measuring retrieval performance with nDCG@10.
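For reference, nDCG@10 compares the discounted gain of the actual ranking against the ideal ordering of the same relevance labels. A minimal sketch (not the evaluation harness used here, which is MTRAGEval's own):

```python
import math

def dcg_at_k(relevances, k=10):
    # Discounted cumulative gain over the top-k ranked results.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    # Normalize by the DCG of the ideal (descending-relevance) ordering.
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0

print(ndcg_at_k([1, 0, 0]))           # 1.0: relevant document ranked first
print(round(ndcg_at_k([0, 1, 0]), 3)) # 0.631: same document ranked second
```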

Limitations

  • Optimized for English only
  • Best for technical/informational queries
  • May not handle highly creative or open-ended questions well

Citation

@inproceedings{mtrageval2026,
  title={MTRAGEval: Multi-Turn Retrieval-Augmented Generation Evaluation},
  booktitle={SemEval 2026 Task 8},
  year={2026}
}

License

Apache 2.0
