## Introduction
This repository contains a LoRA adapter fine-tuned using Direct Preference Optimization (DPO) over a Retrieval-Augmented Generation (RAG) evaluation pipeline built on the Natural Questions validation set.
## Training Pipeline

- Base model: Qwen/Qwen2.5-3B-Instruct
- RAG responses generated over the NQ validation split
- Responses scored using custom reward signals:
  - Faithfulness
  - Citation usage
  - Hallucination detection
  - Refusal detection
- Preference pairs constructed using margin filtering
- LoRA fine-tuning with DPO
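The margin-filtering step above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the data layout and the `margin_threshold` value are assumptions.

```python
def build_preference_pairs(scored_responses, margin_threshold=0.5):
    """Turn scored candidate responses into DPO preference pairs.

    scored_responses: dict mapping a prompt to a list of
    (response_text, total_reward) tuples.
    Pairs whose chosen-vs-rejected reward margin falls below the
    threshold are dropped, keeping only clearly separated pairs.
    """
    pairs = []
    for prompt, candidates in scored_responses.items():
        # Rank candidates by total reward, best first.
        ranked = sorted(candidates, key=lambda rc: rc[1], reverse=True)
        chosen_text, chosen_reward = ranked[0]
        rejected_text, rejected_reward = ranked[-1]
        margin = chosen_reward - rejected_reward
        if margin >= margin_threshold:
            pairs.append({
                "prompt": prompt,
                "chosen": chosen_text,
                "rejected": rejected_text,
                "margin": margin,
            })
    return pairs
```

A strict threshold trades dataset size for pair quality, which is consistent with the small but high-margin training set reported below.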
## Dataset Lineage

This model is trained and evaluated using the dataset repository AnjanSB/NQ-RAG-DPO-Evaluation, with the following configurations:

- `rag_responses` (base + trained generations)
- `responses_scores` (reward signals)
- `dpo_train_data` (preference dataset)
- `comparison_metrics` (evaluation results)
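A minimal sketch of pulling these configurations with the Hugging Face `datasets` library. The config names come from the list above; the lazy-loading wrapper and the field layout inside each split are assumptions, not the repository's documented API.

```python
# Each configuration name maps to the artifact it holds
# (names taken from the dataset card above).
CONFIGS = {
    "rag_responses": "base + trained generations",
    "responses_scores": "reward signals",
    "dpo_train_data": "preference dataset",
    "comparison_metrics": "evaluation results",
}

def load_config(name, repo="AnjanSB/NQ-RAG-DPO-Evaluation"):
    """Load one named configuration; import is deferred so nothing
    is downloaded until a config is actually requested."""
    if name not in CONFIGS:
        raise KeyError(f"unknown configuration: {name!r}")
    from datasets import load_dataset  # pip install datasets
    return load_dataset(repo, name)
```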
## Evaluation Summary
| Metric | Base Model | Trained Model | Train Data |
|---|---|---|---|
| Mean Responses Margin | 0.4246 | 0.3612 | 2.1835 |
| Total Prompts | 1500 | 1500 | 228 |
| Mean Total Reward | 0.8378 | 0.8356 | 0.8582 |
| Faithfulness | 0.4300 | 0.4370 | 0.5307 |
| Citation Score | 0.8353 | 0.8683 | 0.8035 |
| Hallucination | 0.1974 | 0.1842 | 0.1095 |
The DPO training set is small (228 records) but high quality: the mean response margin between the chosen and rejected responses for the same prompt is 2.18, a substantial gap.
Compared to the base model, the trained model shows:

- A lower mean response margin, suggesting more consistent responses
- Improved citation usage
- Reduced hallucination
- Slightly improved faithfulness
- Stable refusal behavior
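The per-metric deltas behind these observations can be read straight off the evaluation table; all values below are copied from it.

```python
# Metric values from the evaluation table (base vs. trained model).
base = {
    "mean_responses_margin": 0.4246,
    "mean_total_reward": 0.8378,
    "faithfulness": 0.4300,
    "citation_score": 0.8353,
    "hallucination": 0.1974,
}
trained = {
    "mean_responses_margin": 0.3612,
    "mean_total_reward": 0.8356,
    "faithfulness": 0.4370,
    "citation_score": 0.8683,
    "hallucination": 0.1842,
}

# Positive delta = trained model scores higher than base.
deltas = {k: round(trained[k] - base[k], 4) for k in base}
```

Note that for hallucination and response margin, a negative delta is the desirable direction.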
## Author

AnjanSB

- Experiment repo: https://dagshub.com/AnjanSB/RAG-DPO-PEFT-LLMOPS
- Profile: https://www.linkedin.com/in/anjansb/