Qwen3_VL_8B_Instruct_DPE_v1

Qwen3_VL_8B_Instruct_DPE_v1 is the first-iteration model evolved from Qwen3-VL-8B-Instruct using the DPE (Diagnostic-driven Progressive Evolution) framework.

🌟 Model Overview

DPE is a self-evolving training framework for Large Multimodal Models (LMMs). It prioritizes the diagnosis of capability gaps to steer targeted data generation and mixture optimization. This version represents the first full cycle of the DPE pipeline.

v1 Key Features:

Initial Evolution: The first successful cycle of diagnostic-driven targeted refinement.
Immediate Gains: Demonstrated a significant performance jump in STEM reasoning, particularly on MMStar (+10.13%).

📊 Evaluation Results

Category	Benchmark	Base Model	DPE_v1 (Ours)	Improvement
STEM	MMMU	65.44	68.11	+2.67
	MMVet	67.29	70.92	+3.63
	MMStar	61.27	71.40	+10.13
Visual Math	MathVerse	53.22	55.99	+2.77
	MathVision	51.97	52.04	+0.07
Overall	Average	65.64	67.48	+1.84

📑 Citation

@misc{jia2026blindspotsgainsdiagnosticdriven,
      title={From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models}, 
      author={Hongrui Jia and Chaoya Jiang and Shikun Zhang and Wei Ye},
      year={2026},
      eprint={2602.22859},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2602.22859}, 
}

Downloads last month: 2

Safetensors

Model size

9B params

Tensor type

BF16

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Paper for hongruijia/Qwen3_VL_8B_Instruct_DPE_v1

From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models

Paper • 2602.22859 • Published Feb 26 • 151