---
license: apache-2.0
language:
- en
- zh
library_name: diffusers
base_model:
- Qwen/Qwen-Image-Edit
pipeline_tag: image-to-image
tags:
- image-editing
- consistency
- aesthetics
- DiT
- Qwen-Image
- ValiantCat
---
<p align="center">
<img src="https://ai.static.ad2.cc/banner.png" width="1000"/>
</p>
---
# 🌈 Qwen-Image-Edit-MeiTu
**Qwen-Image-Edit-MeiTu** is an improved variant of [Qwen/Qwen-Image-Edit](https://huggingface.co/Qwen/Qwen-Image-Edit), produced by **fine-tuning the DiT-based architecture** to enhance **visual consistency**, **aesthetic quality**, and **structural alignment** in complex edits.
Developed by **Valiant Cat AI Lab**, this version aims to further close the gap between high-fidelity semantic editing and coherent artistic rendering, achieving a more natural and professional output across a wide range of prompts and subjects.
---
## ✨ Key Improvements
* **Enhanced Consistency**:
Utilizes DiT (Diffusion Transformer) fine-tuning to ensure **structural stability** between input and edited regions, maintaining global spatial coherence.
* **Aesthetic Optimization**:
Trained with aesthetic discriminators and curated aesthetic score datasets, producing more **pleasing colors, contrast, and light balance**.
* **Better Detail Preservation**:
Improved low-level reconstruction for fine details such as **textures, faces, and typography**.
* **Broader Scene Adaptability**:
Performs well on **portraits, environments, product photos, and illustrations**, supporting both **semantic** and **appearance-based** editing.
---
## 🖼️ Showcase
Below are examples of **consistency and aesthetic improvement** in complex editing scenarios:
| Input & Output |
|----------------|
| <img src="preview/result1.png" width="800"/> |
| <img src="preview/result2.png" width="800"/> |
| <img src="preview/result3.png" width="800"/> |
| <img src="preview/result4.png" width="800"/> |
| <img src="preview/result5.png" width="800"/> |
## 💬 Recommended Prompts
Try these prompts to explore the model’s strengths:
* “make the lighting soft and cinematic with better balance”
* “enhance the photo’s composition and maintain realism”
* “refine skin tone and texture consistency”
* “improve the global color tone and aesthetic harmony”
* “increase photo realism and clarity without changing content”
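
For readers who prefer scripting over a node graph, the prompts above can be looped over a single input image. The following is a minimal sketch, not the authors' pipeline: it assumes a recent `diffusers` release that ships `QwenImageEditPipeline`, a CUDA GPU, and it loads the base checkpoint `Qwen/Qwen-Image-Edit` named in this card's metadata, because this card does not state how to load the MeiTu weights in `diffusers`. The helper names (`output_path`, `run_prompts`), the output file naming, and the sampler settings are illustrative assumptions.

```python
from pathlib import Path

# The prompts recommended in this card, copied verbatim.
RECOMMENDED_PROMPTS = [
    "make the lighting soft and cinematic with better balance",
    "enhance the photo’s composition and maintain realism",
    "refine skin tone and texture consistency",
    "improve the global color tone and aesthetic harmony",
    "increase photo realism and clarity without changing content",
]


def output_path(out_dir, index):
    """Hypothetical naming scheme for results: edit_00.png, edit_01.png, ..."""
    return Path(out_dir) / f"edit_{index:02d}.png"


def run_prompts(image_path, out_dir="edits", repo_id="Qwen/Qwen-Image-Edit"):
    """Apply every recommended prompt to one image and save each result.

    Assumption: repo_id points at a checkpoint in diffusers layout. The
    default is the base model, not the MeiTu fine-tune.
    """
    # Heavy imports are deferred so the prompt list can be reused without a GPU stack.
    import torch
    from diffusers import QwenImageEditPipeline
    from PIL import Image

    pipe = QwenImageEditPipeline.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
    pipe.to("cuda")

    image = Image.open(image_path).convert("RGB")
    Path(out_dir).mkdir(parents=True, exist_ok=True)

    for i, prompt in enumerate(RECOMMENDED_PROMPTS):
        with torch.inference_mode():
            result = pipe(
                image=image,
                prompt=prompt,
                negative_prompt=" ",
                true_cfg_scale=4.0,          # illustrative setting, not tuned for this model
                num_inference_steps=50,      # illustrative setting
                generator=torch.manual_seed(0),  # fixed seed so prompts are comparable
            )
        result.images[0].save(output_path(out_dir, i))
```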
---
## 🧩 Integration with ComfyUI
This model works seamlessly with a modified [ComfyUI Qwen-Image-Edit workflow](https://huggingface.co/valiantcat/Qwen-Image-Edit-MeiTu/blob/main/Qwen-Edit-MeiTu.json).
Load this model in the workflow's **Unet node** to edit images.
---
## 📥 Download Model
Weights available in **Safetensors** format:
👉 [Download Qwen-Image-Edit-MeiTu](https://huggingface.co/valiantcat/Qwen-Image-Edit-MeiTu)
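
The weights can also be fetched programmatically. This is a minimal sketch that assumes the `huggingface_hub` package is installed; the local directory and the file patterns are illustrative choices, and the exact file names in the repository are not listed in this card.

```python
REPO_ID = "valiantcat/Qwen-Image-Edit-MeiTu"


def download_kwargs(local_dir="models/Qwen-Image-Edit-MeiTu"):
    """Build the arguments for snapshot_download.

    The patterns are an assumption: they keep Safetensors weights plus JSON
    files (such as the ComfyUI workflow) and skip everything else.
    """
    return {
        "repo_id": REPO_ID,
        "allow_patterns": ["*.safetensors", "*.json"],
        "local_dir": local_dir,
    }


def download_weights(local_dir="models/Qwen-Image-Edit-MeiTu"):
    """Download the matching files and return the local folder path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(local_dir))
```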
---
## 🧠 Training
This model was trained and optimized by the
**AI Laboratory of Chongqing Valiant Cat Technology Co., LTD.**
Visit [https://vvicat.com/](https://vvicat.com/) for business collaborations or research partnerships.
---
## 📄 Related Paper
This model is part of the **Qwen-Edit+** research line and is associated with the following preprint:
**Fan Tang, Siyuan Li**
*Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation*.
Research Square, Version 1, 08 April 2026.
DOI: [10.21203/rs.3.rs-9352857/v1](https://doi.org/10.21203/rs.3.rs-9352857/v1)
---
## 📚 Citation
If you use this model, please cite:
```bibtex
@article{tang2026qweneditplus,
author = {Fan Tang and Siyuan Li},
title = {Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation},
journal = {Research Square},
year = {2026},
doi = {10.21203/rs.3.rs-9352857/v1},
url = {https://doi.org/10.21203/rs.3.rs-9352857/v1}
}
```
---
## 📜 License
Licensed under **Apache 2.0**.
---
## 💼 Join Us
We are hiring research engineers and creative ML practitioners at
**Chongqing Valiant Cat Technology Co., LTD** — reach out via
📧 **tommy@vvicat.com**