Introduction

This repository contains a LoRA adapter for Qwen2.5-3B-Instruct, fine-tuned with Direct Preference Optimization (DPO) on preference data produced by a Retrieval-Augmented Generation (RAG) evaluation pipeline over the Natural Questions (NQ) validation set.

Training Pipeline

  1. Base Model: Qwen/Qwen2.5-3B-Instruct

  2. RAG responses generated over NQ validation split

  3. Responses scored using custom reward signals:

  • Faithfulness

  • Citation usage

  • Hallucination detection

  • Refusal detection

  4. Preference pairs constructed using margin filtering

  5. LoRA fine-tuning using DPO
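Step 4 above can be sketched in a few lines. The field names and the margin threshold below are illustrative assumptions, not taken from the actual pipeline:

```python
# Build DPO preference pairs by margin filtering: for each prompt, pair the
# highest- and lowest-scoring responses, and keep the pair only if the reward
# gap (margin) clears a threshold. Field names and threshold are hypothetical.

def build_preference_pairs(scored, margin_threshold=1.0):
    """scored: list of dicts with 'prompt', 'response', and 'reward' keys."""
    by_prompt = {}
    for row in scored:
        by_prompt.setdefault(row["prompt"], []).append(row)

    pairs = []
    for prompt, rows in by_prompt.items():
        rows.sort(key=lambda r: r["reward"], reverse=True)
        chosen, rejected = rows[0], rows[-1]
        margin = chosen["reward"] - rejected["reward"]
        if margin >= margin_threshold:  # drop low-signal pairs
            pairs.append({
                "prompt": prompt,
                "chosen": chosen["response"],
                "rejected": rejected["response"],
                "margin": margin,
            })
    return pairs
```

Filtering on the margin trades dataset size for signal quality, which is consistent with the small-but-clean training set described below.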

Dataset Lineage

This model was trained and evaluated using:

Dataset repository:

AnjanSB/NQ-RAG-DPO-Evaluation

Configurations used:

  • rag_responses (base + trained generations)

  • responses_scores (reward signals)

  • dpo_train_data (preference dataset)

  • comparison_metrics (evaluation results)
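The lineage between these configurations can be illustrated with a minimal sketch: generations from rag_responses are joined with their reward signals from responses_scores before preference pairs are built. The per-record fields here are assumptions for illustration, not the dataset's real schema:

```python
# Hypothetical lineage step: attach reward signals to generations by joining
# the two configurations on a shared prompt id. All field names below are
# illustrative, not the actual schema of the dataset repository.

def attach_scores(responses, scores):
    """Join RAG generations with their reward signals on a shared id."""
    score_by_id = {s["id"]: s for s in scores}
    merged = []
    for r in responses:
        s = score_by_id.get(r["id"])
        if s is None:
            continue  # unscored responses are dropped
        merged.append({**r, "total_reward": s["total_reward"]})
    return merged
```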

Evaluation Summary

| Metric                | Base Model | Trained Model | Train Data |
|-----------------------|------------|---------------|------------|
| Mean Responses Margin | 0.4246     | 0.3612        | 2.1835     |
| Total Prompts         | 1500       | 1500          | 228        |
| Mean Total Reward     | 0.8378     | 0.8356        | 0.8582     |
| Faithfulness          | 0.4300     | 0.4370        | 0.5307     |
| Citation Score        | 0.8353     | 0.8683        | 0.8035     |
| Hallucination         | 0.1974     | 0.1842        | 0.1095     |

The DPO training data is small (228 records) but high quality: the mean response margin of 2.18 between chosen and rejected responses for the same prompt indicates a clear preference signal.
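The base-versus-trained deltas behind these observations can be recomputed directly from the Evaluation Summary table:

```python
# Recompute base-vs-trained deltas from the Evaluation Summary table above.
base    = {"margin": 0.4246, "reward": 0.8378, "faith": 0.4300,
           "citation": 0.8353, "halluc": 0.1974}
trained = {"margin": 0.3612, "reward": 0.8356, "faith": 0.4370,
           "citation": 0.8683, "halluc": 0.1842}

# Positive delta = higher after training; negative = lower after training.
deltas = {k: round(trained[k] - base[k], 4) for k in base}
```

Hallucination and the response margin go down after training, while citation score and faithfulness go up; total reward is essentially unchanged.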

The trained model shows:

  • A lower mean response margin (0.42 → 0.36), indicating more stable, consistent responses

  • Improved citation usage

  • Reduced hallucination

  • Slightly improved faithfulness

  • Stable refusal behavior
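For reference, the per-pair DPO objective used in step 5 of the pipeline can be sketched in a few lines; the beta value is an illustrative assumption, not the training configuration:

```python
import math

def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Arguments are summed log-probabilities of the chosen/rejected responses
    under the policy and the frozen reference model; beta (hypothetical here)
    controls how strongly the policy may deviate from the reference.
    """
    logits = beta * ((pol_chosen - pol_rejected) - (ref_chosen - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-logits)))  # -log sigmoid(logits)

# When the policy still matches the reference (no learned preference), the
# loss is -log(0.5) = ln 2 ≈ 0.6931, close to the training loss of 0.694
# reported in the evaluation results below.
```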

πŸ‘€ Author

AnjanSB

Experiment repo: https://dagshub.com/AnjanSB/RAG-DPO-PEFT-LLMOPS

Profile: https://www.linkedin.com/in/anjansb/


Evaluation results (self-reported, on the NQ-RAG-DPO-Evaluation dataset)

  • Mean Responses Margin (Metrics & Inference): 0.361

  • Mean Total Reward (Metrics & Inference): 0.836

  • Faithfulness (Metrics & Inference): 0.437

  • Citation Score (Metrics & Inference): 0.868

  • Hallucination Score (Metrics & Inference): 0.184

  • Training Loss (Training Subset): 0.694