---
license: apache-2.0
language:
- en
- zh
library_name: diffusers
base_model:
- Qwen/Qwen-Image-Edit
pipeline_tag: image-to-image
tags:
- image-editing
- consistency
- aesthetics
- DiT
- Qwen-Image
- ValiantCat
---

<p align="center">
    <img src="https://ai.static.ad2.cc/banner.png" width="1000"/>
</p>

---

# 🌈 Qwen-Image-Edit-MeiTu

**Qwen-Image-Edit-MeiTu** is a fine-tuned variant of [Qwen/Qwen-Image-Edit](https://huggingface.co/Qwen/Qwen-Image-Edit). It fine-tunes the DiT (Diffusion Transformer) backbone to improve **visual consistency**, **aesthetic quality**, and **structural alignment** in complex edits.

Developed by **Valiant Cat AI Lab**, this version aims to narrow the gap between high-fidelity semantic editing and coherent artistic rendering, so that outputs look more natural and professional across a wide range of prompts and subjects.

---

## ✨ Key Improvements

* **Enhanced Consistency**:  
  Utilizes DiT (Diffusion Transformer) fine-tuning to ensure **structural stability** between input and edited regions, maintaining global spatial coherence.

* **Aesthetic Optimization**:  
  Trained with aesthetic discriminators and curated aesthetic score datasets, producing more **pleasing colors, contrast, and light balance**.

* **Better Detail Preservation**:  
  Improved low-level reconstruction for fine details such as **textures, faces, and typography**.

* **Broader Scene Adaptability**:  
  Performs well on **portraits, environments, product photos, and illustrations**, supporting both **semantic** and **appearance-based** editing.

---

## 🖼️ Showcase

Below are examples of **consistency and aesthetic improvement** in complex editing scenarios:

| Input & Output |
|----------------|
| <img src="preview/result1.png" width="800"/> |
| <img src="preview/result2.png" width="800"/> |
| <img src="preview/result3.png" width="800"/> |
| <img src="preview/result4.png" width="800"/> |
| <img src="preview/result5.png" width="800"/> |

---

## 💬 Recommended Prompts

Try these prompts to explore the model’s strengths:

* “make the lighting soft and cinematic with better balance”  
* “enhance the photo’s composition and maintain realism”  
* “refine skin tone and texture consistency”  
* “improve the global color tone and aesthetic harmony”  
* “increase photo realism and clarity without changing content”
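
The prompts above can be tried from Python with `diffusers`. The following is a minimal sketch, not an official usage example: it assumes that this repository loads directly through `QwenImageEditPipeline.from_pretrained` (as the base model does), that a CUDA GPU is available, and that the sampling values (`num_inference_steps=50`, `true_cfg_scale=4.0`) are reasonable starting points. The helper names `build_edit_kwargs` and `run_edit` and the file paths are hypothetical. If the repository only ships transformer weights, load the base pipeline instead and swap in the fine-tuned weights.

```python
from typing import Any, Dict, List

# Prompts copied from the list above.
RECOMMENDED_PROMPTS: List[str] = [
    "make the lighting soft and cinematic with better balance",
    "enhance the photo's composition and maintain realism",
    "refine skin tone and texture consistency",
    "improve the global color tone and aesthetic harmony",
    "increase photo realism and clarity without changing content",
]


def build_edit_kwargs(image: Any, prompt: str, steps: int = 50,
                      true_cfg_scale: float = 4.0) -> Dict[str, Any]:
    """Assemble pipeline call arguments (default values are assumptions)."""
    return {
        "image": image,
        "prompt": prompt,
        "negative_prompt": " ",
        "num_inference_steps": steps,
        "true_cfg_scale": true_cfg_scale,
    }


def run_edit(input_path: str, prompt: str, output_path: str,
             model_id: str = "valiantcat/Qwen-Image-Edit-MeiTu",
             seed: int = 0) -> None:
    """Run one edit; heavy imports are deferred until this is called."""
    import torch
    from PIL import Image
    from diffusers import QwenImageEditPipeline

    # Assumption: the repo is loadable as a full diffusers pipeline.
    pipe = QwenImageEditPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    pipe.to("cuda")

    image = Image.open(input_path).convert("RGB")
    kwargs = build_edit_kwargs(image, prompt)
    kwargs["generator"] = torch.manual_seed(seed)  # fixed seed for repeatability

    with torch.inference_mode():
        result = pipe(**kwargs)
    result.images[0].save(output_path)
```

For example, `run_edit("input.png", RECOMMENDED_PROMPTS[0], "output.png")` would apply the first prompt to a local image.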

---

## 🧩 Integration with ComfyUI

This model works seamlessly with a modified [ComfyUI Qwen-Image-Edit workflow](https://huggingface.co/valiantcat/Qwen-Image-Edit-MeiTu/blob/main/Qwen-Edit-MeiTu.json).  
Load this model in the workflow's **Unet node** to edit images.
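
To check which node in a downloaded workflow file expects the model, the JSON can be inspected with a short script. This is a sketch under assumptions: it handles the two common ComfyUI export layouts (a UI export with a `nodes` list whose entries carry a `type` field, and an API export mapping node ids to entries with a `class_type` field), and it matches any node whose type name contains "unet". The function name `find_unet_nodes` is hypothetical, and the actual node names in this workflow have not been verified here.

```python
import json
from typing import Any, Dict, List


def find_unet_nodes(workflow: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return nodes whose type name mentions 'unet' (case-insensitive)."""
    if isinstance(workflow.get("nodes"), list):
        # UI export: nodes are stored in a list and carry a "type" field.
        candidates = [(str(n.get("type") or ""), n) for n in workflow["nodes"]]
    else:
        # API export: mapping of node id -> {"class_type": ..., "inputs": ...}.
        candidates = [
            (str(n.get("class_type") or ""), n)
            for n in workflow.values()
            if isinstance(n, dict)
        ]
    return [node for name, node in candidates if "unet" in name.lower()]


def load_workflow(path: str) -> Dict[str, Any]:
    """Read a ComfyUI workflow JSON file from disk."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)
```

Calling `find_unet_nodes(load_workflow("Qwen-Edit-MeiTu.json"))` on a local copy of the workflow lists the candidate nodes where the model file should be selected.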

---

## 📥 Download Model

Weights are available in **Safetensors** format:

👉 [Download Qwen-Image-Edit-MeiTu](https://huggingface.co/valiantcat/Qwen-Image-Edit-MeiTu)
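
The repository can also be fetched programmatically. The sketch below assumes the `huggingface_hub` package is installed; the repository id and the workflow file name are taken from the links in this card, while the function names `fetch_repo` and `fetch_workflow` are hypothetical.

```python
from typing import Optional

REPO_ID = "valiantcat/Qwen-Image-Edit-MeiTu"
WORKFLOW_FILE = "Qwen-Edit-MeiTu.json"  # file name from the ComfyUI workflow link


def fetch_repo(local_dir: Optional[str] = None) -> str:
    """Download the whole repository and return the local folder path."""
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)


def fetch_workflow() -> str:
    """Download only the ComfyUI workflow JSON and return its cached path."""
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=WORKFLOW_FILE)
```

`fetch_workflow()` is the lighter option when only the ComfyUI workflow is needed; `fetch_repo()` pulls every file, including the weights.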

---

## 🧠 Training

This model was trained and optimized by the  
**AI Laboratory of Chongqing Valiant Cat Technology Co., LTD.**  
Visit [https://vvicat.com/](https://vvicat.com/) for business collaborations or research partnerships.

---

## 📄 Related Paper

This model is part of the **Qwen-Edit+** research line and is associated with the following preprint:

**Fan Tang, Siyuan Li**  
*Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation*.  
Research Square, Version 1, 08 April 2026.  
DOI: [10.21203/rs.3.rs-9352857/v1](https://doi.org/10.21203/rs.3.rs-9352857/v1)

---

## 📚 Citation

If you use this model, please cite:

```bibtex
@article{tang2026qweneditplus,
  author  = {Fan Tang and Siyuan Li},
  title   = {Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation},
  journal = {Research Square},
  year    = {2026},
  doi     = {10.21203/rs.3.rs-9352857/v1},
  url     = {https://doi.org/10.21203/rs.3.rs-9352857/v1}
}
```

---

## 📜 License

Licensed under **Apache 2.0**.

---

## 💼 Join Us

We are hiring research engineers and creative ML practitioners at  
**Chongqing Valiant Cat Technology Co., LTD** — reach out via  
📧 **tommy@vvicat.com**