Enlarging Modification Space Boosts Image Refinement in Unified Multimodal Models

Jiayi Guo, Linqing Wang, Jiangshan Wang, Yang Yue, Zeyu Liu, Zhiyuan Zhao, Qinglin Lu,
Gao Huang, Chunyu Wang ✉️

Tsinghua University · Tencent Hunyuan (HY)

We present Refinement via Regeneration (RvR), a novel framework that reformulates image refinement in unified multimodal models from an editing-based paradigm to a regeneration-based one. Instead of relying on intermediate editing instructions and enforcing pixel-level consistency, our method directly regenerates images conditioned on the target prompt and semantic representations of the initial image, thereby enlarging the effective modification space. This design enables more complete semantic alignment and avoids error accumulation from coarse instructions, leading to more flexible and accurate refinement.

Downloads last month: 71

Safetensors

Model size

15B params

Tensor type

F32

BF16

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Paper for JiayiGuo821/RvR-7B-MoT

Refinement via Regeneration: Enlarging Modification Space Boosts Image Refinement in Unified Multimodal Models

Paper • 2604.25636 • Published 10 days ago • 24