Qwen3_VL_8B_Instruct_DPE_v2

Qwen3_VL_8B_Instruct_DPE_v2 is the second-iteration model evolved through the DPE framework. It builds upon the improvements of v1 to further refine multimodal reasoning.

🌟 Model Overview

DPE borrows from educational psychology: it diagnoses the model's "blind spots" and then applies targeted corrections. This version is the result of the second full cycle of that evolutionary pipeline.

v2 Key Features:

Progressive Refinement: Second cycle of diagnosis and targeted reinforcement.
Enhanced Logic: Further refined mathematical reasoning and visual understanding, with MathVision improving by 3.06 points (51.97 to 55.03).
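The diagnose-then-reinforce idea can be illustrated with a toy sketch. This is not the DPE implementation: the category names, the evaluation log, and the helpers `diagnose` and `allocate_budget` are all hypothetical, and the proportional budget rule is an assumption chosen for simplicity. It only shows how per-category error rates might steer where the next round of training data goes.

```python
from collections import defaultdict


def diagnose(results):
    """Return the error rate per category from (category, is_correct) pairs."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for category, is_correct in results:
        totals[category] += 1
        if not is_correct:
            errors[category] += 1
    return {c: errors[c] / totals[c] for c in totals}


def allocate_budget(error_rates, budget):
    """Split a sample budget across categories in proportion to error rate.

    Hypothetical rule: weaker categories ("blind spots") get more samples.
    """
    total_error = sum(error_rates.values())
    if total_error == 0:
        return {c: 0 for c in error_rates}
    return {c: round(budget * e / total_error) for c, e in error_rates.items()}


# Hypothetical evaluation log: (category, whether the model answered correctly)
results = [
    ("geometry", False), ("geometry", False), ("geometry", True), ("geometry", False),
    ("charts", True), ("charts", True), ("charts", False), ("charts", True),
    ("ocr", True), ("ocr", True), ("ocr", True), ("ocr", True),
]

error_rates = diagnose(results)                   # diagnosis step
plan = allocate_budget(error_rates, budget=1000)  # targeted reinforcement plan
print(error_rates)
print(plan)
```

In an iterative setup, one cycle would run this diagnosis, train on the resulting plan, and re-evaluate; v2 corresponds to the output of the second such cycle.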

📊 Evaluation Results

Category	Benchmark	Base Model	DPE_v2 (Ours)	Improvement (points)
STEM	MMMU	65.44	69.11	+3.67
STEM	MMStar	61.27	71.67	+10.40
Visual Math	MathVision	51.97	55.03	+3.06
Visual Math	MathVista	76.20	78.00	+1.80
Overall	Average	65.64	67.72	+2.08
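The Improvement column is an absolute difference in score points, not a relative percentage. The short sketch below recomputes both from the table values; the `gains` helper and variable names are illustrative only.

```python
# Scores copied from the table above: benchmark -> (base model, DPE_v2)
scores = {
    "MMMU": (65.44, 69.11),
    "MMStar": (61.27, 71.67),
    "MathVision": (51.97, 55.03),
    "MathVista": (76.20, 78.00),
}


def gains(base, tuned):
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = round(tuned - base, 2)
    relative = round(100 * (tuned - base) / base, 2)
    return absolute, relative


table = {name: gains(base, tuned) for name, (base, tuned) in scores.items()}

# Mean over only the four benchmarks listed here
mean_base = round(sum(b for b, _ in scores.values()) / len(scores), 2)
mean_tuned = round(sum(t for _, t in scores.values()) / len(scores), 2)

for name, (absolute, relative) in table.items():
    print(f"{name}: +{absolute:.2f} points ({relative:.2f}% relative)")
print(f"Mean of listed benchmarks: {mean_base} -> {mean_tuned}")
```

For MathVision, for example, the 3.06-point gain corresponds to roughly a 5.9% relative improvement over the base score. The mean of the four listed benchmarks (63.72 to 68.45) differs from the reported Average row, which presumably covers a larger benchmark set than the one shown here; that reading is an assumption.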

📑 Citation

@misc{jia2026blindspotsgainsdiagnosticdriven,
      title={From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models}, 
      author={Hongrui Jia and Chaoya Jiang and Shikun Zhang and Wei Ye},
      year={2026},
      eprint={2602.22859},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2602.22859}, 
}

📜 License

This model follows the Qwen Research License.

Model size: 9B params. Tensor type: BF16. Format: Safetensors.
