---
license: apache-2.0
language:
- en
- zh
library_name: diffusers
base_model:
- Qwen/Qwen-Image-Edit
pipeline_tag: image-to-image
tags:
- image-editing
- consistency
- aesthetics
- DiT
- Qwen-Image
- ValiantCat
---

<p align="center">
    <img src="https://ai.static.ad2.cc/banner.png" width="1000"/>
</p>

---

# 🌈 Qwen-Image-Edit-MeiTu

**Qwen-Image-Edit-MeiTu** is an improved variant of [Qwen/Qwen-Image-Edit](https://huggingface.co/Qwen/Qwen-Image-Edit). Its DiT (Diffusion Transformer) backbone has been fine-tuned to enhance **visual consistency**, **aesthetic quality**, and **structural alignment** in complex edits.

Developed by **Valiant Cat AI Lab**, this version aims to narrow the gap between high-fidelity semantic editing and coherent artistic rendering, producing more natural, professional-looking output across a wide range of prompts and subjects.

---

## ✨ Key Improvements

* **Enhanced Consistency**:  
  Utilizes DiT (Diffusion Transformer) fine-tuning to ensure **structural stability** between input and edited regions, maintaining global spatial coherence.

* **Aesthetic Optimization**:  
  Trained with aesthetic discriminators and curated aesthetic score datasets, producing more **pleasing colors, contrast, and light balance**.

* **Better Detail Preservation**:  
  Improved low-level reconstruction for fine details such as **textures, faces, and typography**.

* **Broader Scene Adaptability**:  
  Performs well on **portraits, environments, product photos, and illustrations**, supporting both **semantic** and **appearance-based** editing.

---

## 🖼️ Showcase

Below are examples of **consistency and aesthetic improvement** in complex editing scenarios:

| Input & Output |
|----------------|
| <img src="preview/result1.png" width="800"/> |
| <img src="preview/result2.png" width="800"/> |
| <img src="preview/result3.png" width="800"/> |
| <img src="preview/result4.png" width="800"/> |
| <img src="preview/result5.png" width="800"/> |

---

## 💬 Recommended Prompts

Try these prompts to explore the model’s strengths:

* “make the lighting soft and cinematic with better balance”  
* “enhance the photo’s composition and maintain realism”  
* “refine skin tone and texture consistency”  
* “improve the global color tone and aesthetic harmony”  
* “increase photo realism and clarity without changing content”
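
The prompts above can be tried from Python as well as from ComfyUI. The following is a minimal sketch, not an official usage recipe. It assumes that this repository can be loaded directly with the `QwenImageEditPipeline` class from `diffusers` (the class used by the base model); if the weights are packaged only for ComfyUI, load the base `Qwen/Qwen-Image-Edit` pipeline instead. The helper names `build_inputs` and `edit_image`, the file paths, and the default values for steps and guidance are illustrative choices borrowed from the base model's defaults, not values published for this fine-tune.

```python
# Assumption: the repo id below is loadable in diffusers layout.
MODEL_ID = "valiantcat/Qwen-Image-Edit-MeiTu"


def build_inputs(image, prompt, steps=50, true_cfg_scale=4.0):
    """Collect the keyword arguments passed to the pipeline call.

    `steps` and `true_cfg_scale` mirror the base model's defaults;
    they are not tuned values for this fine-tune.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "image": image,
        "prompt": prompt,
        "negative_prompt": " ",
        "true_cfg_scale": true_cfg_scale,
        "num_inference_steps": steps,
    }


def edit_image(input_path, output_path, prompt, seed=0):
    """Run one edit on a CUDA device and save the result."""
    # Heavy dependencies are imported lazily so the helper above
    # can be used and tested without a GPU environment.
    import torch
    from PIL import Image
    from diffusers import QwenImageEditPipeline

    pipe = QwenImageEditPipeline.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )
    pipe.to("cuda")

    image = Image.open(input_path).convert("RGB")
    inputs = build_inputs(image, prompt)
    inputs["generator"] = torch.manual_seed(seed)  # reproducible sampling

    with torch.inference_mode():
        result = pipe(**inputs)
    result.images[0].save(output_path)
```

A call such as `edit_image("input.png", "output.png", "make the lighting soft and cinematic with better balance")` would then write the edited image to `output.png`, with both paths being placeholders.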

---

## 🧩 Integration with ComfyUI

This model works with a modified [ComfyUI Qwen-Image-Edit workflow](https://huggingface.co/valiantcat/Qwen-Image-Edit-MeiTu/blob/main/Qwen-Edit-MeiTu.json).  
Load the model in the workflow's **Unet node** to edit images.

---

## 📥 Download Model

Weights available in **Safetensors** format:

👉 [Download Qwen-Image-Edit-MeiTu](https://huggingface.co/valiantcat/Qwen-Image-Edit-MeiTu)
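
As an alternative to downloading through the browser, the files could be fetched with `snapshot_download` from the `huggingface_hub` package. This is a sketch under assumptions: the helper names and the file patterns (`*.safetensors` for weights, `*.json` for the workflow and configs) are illustrative, and the exact file names in the repository are not assumed here.

```python
REPO_ID = "valiantcat/Qwen-Image-Edit-MeiTu"


def download_kwargs(local_dir=None):
    """Build the arguments for snapshot_download.

    The patterns are an assumption about which files are useful;
    adjust them after inspecting the repository's file list.
    """
    kwargs = {
        "repo_id": REPO_ID,
        "allow_patterns": ["*.safetensors", "*.json"],
    }
    if local_dir is not None:
        kwargs["local_dir"] = local_dir
    return kwargs


def download_weights(local_dir=None):
    """Download matching files and return the local folder path."""
    # Imported lazily so the module loads without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(local_dir))
```

Passing a `local_dir` such as a ComfyUI models folder places the files there instead of in the default Hugging Face cache.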

---

## 🧠 Training

This model was trained and optimized by the  
**AI Laboratory of Chongqing Valiant Cat Technology Co., LTD.**  
Visit [https://vvicat.com/](https://vvicat.com/) for business collaborations or research partnerships.

---

## 📄 Related Paper

This model is part of the **Qwen-Edit+** research line and is associated with the following preprint:

**Fan Tang, Siyuan Li**  
*Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation*.  
Research Square, Version 1, 08 April 2026.  
DOI: [10.21203/rs.3.rs-9352857/v1](https://doi.org/10.21203/rs.3.rs-9352857/v1)

---

## 📚 Citation

If you use this model, please cite:

```bibtex
@article{tang2026qweneditplus,
  author  = {Fan Tang and Siyuan Li},
  title   = {Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation},
  journal = {Research Square},
  year    = {2026},
  doi     = {10.21203/rs.3.rs-9352857/v1},
  url     = {https://doi.org/10.21203/rs.3.rs-9352857/v1}
}
```

---

## 📜 License

Licensed under **Apache 2.0**.

---

## 💼 Join Us

We are hiring research engineers and creative ML practitioners at  
**Chongqing Valiant Cat Technology Co., LTD**. Reach out via  
📧 **tommy@vvicat.com**