---
license: apache-2.0
language:
- en
- zh
library_name: diffusers
base_model:
- Qwen/Qwen-Image-Edit
pipeline_tag: image-to-image
tags:
- image-editing
- consistency
- aesthetics
- DiT
- Qwen-Image
- ValiantCat
---

<p align="center">
<img src="https://ai.static.ad2.cc/banner.png" width="1000"/>
</p>

---

# 🌈 Qwen-Image-Edit-MeiTu

**Qwen-Image-Edit-MeiTu** is an improved variant of [Qwen/Qwen-Image-Edit](https://huggingface.co/Qwen/Qwen-Image-Edit), built by fine-tuning its **DiT-based architecture** to enhance **visual consistency**, **aesthetic quality**, and **structural alignment** in complex edits.

Developed by **Valiant Cat AI Lab**, this version aims to narrow the gap between high-fidelity semantic editing and coherent artistic rendering, producing more natural, professional-looking output across a wide range of prompts and subjects.

---

## ✨ Key Improvements

* **Enhanced Consistency**:
  Utilizes DiT (Diffusion Transformer) fine-tuning to ensure **structural stability** between input and edited regions, maintaining global spatial coherence.

* **Aesthetic Optimization**:
  Trained with aesthetic discriminators and curated aesthetic score datasets, producing more **pleasing colors, contrast, and light balance**.

* **Better Detail Preservation**:
  Improved low-level reconstruction for fine details such as **textures, faces, and typography**.

* **Broader Scene Adaptability**:
  Performs well on **portraits, environments, product photos, and illustrations**, supporting both **semantic** and **appearance-based** editing.

---

## 🖼️ Showcase

Below are examples of **consistency and aesthetic improvement** in complex editing scenarios:

| Input & Output |
|----------------|
| <img src="preview/result1.png" width="800"/> |
| <img src="preview/result2.png" width="800"/> |
| <img src="preview/result3.png" width="800"/> |
| <img src="preview/result4.png" width="800"/> |
| <img src="preview/result5.png" width="800"/> |

## 💬 Recommended Prompts

Try these prompts to explore the model’s strengths:

* “make the lighting soft and cinematic with better balance”
* “enhance the photo’s composition and maintain realism”
* “refine skin tone and texture consistency”
* “improve the global color tone and aesthetic harmony”
* “increase photo realism and clarity without changing content”
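
The prompts above can be run in a batch against one input image. The following is a minimal sketch, not an official usage recipe: it relies on the `QwenImageEditPipeline` class that diffusers provides for the base model, and it assumes the model id passed to `run_edits` loads as a full diffusers pipeline, which this card does not confirm for `valiantcat/Qwen-Image-Edit-MeiTu`. The helper names, the seed, the step count, and the guidance value are illustrative choices, and `input.png` is a placeholder path.

```python
from pathlib import Path

# Prompts copied from the list above.
RECOMMENDED_PROMPTS = [
    "make the lighting soft and cinematic with better balance",
    "enhance the photo's composition and maintain realism",
    "refine skin tone and texture consistency",
    "improve the global color tone and aesthetic harmony",
    "increase photo realism and clarity without changing content",
]


def build_edit_kwargs(prompt, seed=0, steps=50, true_cfg_scale=4.0):
    """Collect the generation arguments for one edit (values are illustrative)."""
    return {
        "prompt": prompt,
        "negative_prompt": " ",
        "num_inference_steps": steps,
        "true_cfg_scale": true_cfg_scale,
        "seed": seed,
    }


def run_edits(model_id, image_path, out_dir="outputs", prompts=RECOMMENDED_PROMPTS):
    """Apply each prompt to the same input image and save one PNG per prompt."""
    # Heavy imports stay local so the helpers above work without a GPU stack.
    import torch
    from diffusers import QwenImageEditPipeline
    from PIL import Image

    pipe = QwenImageEditPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    pipe.to("cuda")
    image = Image.open(image_path).convert("RGB")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    saved = []
    for index, prompt in enumerate(prompts):
        kwargs = build_edit_kwargs(prompt)
        seed = kwargs.pop("seed")
        result = pipe(image=image, generator=torch.manual_seed(seed), **kwargs)
        path = out / f"edit_{index}.png"
        result.images[0].save(path)
        saved.append(path)
    return saved


# Example call (assumption: this repo id loads as a complete diffusers pipeline):
# run_edits("valiantcat/Qwen-Image-Edit-MeiTu", "input.png")
```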


---

## 🧩 Integration with ComfyUI

This model works with a modified [ComfyUI Qwen-Image-Edit workflow](https://huggingface.co/valiantcat/Qwen-Image-Edit-MeiTu/blob/main/Qwen-Edit-MeiTu.json).
To edit images, load this model in the workflow's **Unet node**.
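
If you prefer to set the Unet node from a script rather than in the ComfyUI interface, a sketch along the following lines may help. It assumes the workflow file uses ComfyUI's UI-export layout (a top-level `nodes` list whose entries have `type` and `widgets_values`) and that the Unet node has the type `UNETLoader` with the model filename as its first widget value; neither is confirmed by this card, so inspect the JSON first. The function names and the filenames in the example call are hypothetical.

```python
import json


def set_unet_model(workflow, model_filename, loader_type="UNETLoader"):
    """Point every matching loader node at model_filename; return the count.

    Assumes the UI-export layout: a "nodes" list whose entries carry "type"
    and "widgets_values", with the model filename as the first widget value.
    """
    updated = 0
    for node in workflow.get("nodes", []):
        if node.get("type") == loader_type and node.get("widgets_values"):
            node["widgets_values"][0] = model_filename
            updated += 1
    return updated


def patch_workflow_file(src_path, dst_path, model_filename):
    """Load a workflow JSON, update its loader nodes, and write a new copy."""
    with open(src_path, encoding="utf-8") as handle:
        workflow = json.load(handle)
    count = set_unet_model(workflow, model_filename)
    if count == 0:
        raise ValueError("no matching loader node found in workflow")
    with open(dst_path, "w", encoding="utf-8") as handle:
        json.dump(workflow, handle, indent=2)
    return count


# Example call (the weight filename is a placeholder, not taken from the repo):
# patch_workflow_file("Qwen-Edit-MeiTu.json", "patched.json", "my_model.safetensors")
```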


---

## 📥 Download Model

Weights are available in **Safetensors** format:

👉 [Download Qwen-Image-Edit-MeiTu](https://huggingface.co/valiantcat/Qwen-Image-Edit-MeiTu)
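
The weights can also be fetched programmatically. This is a minimal sketch using `list_repo_files` and `hf_hub_download` from the `huggingface_hub` package; it assumes nothing about the file names in the repository beyond the `.safetensors` extension mentioned above. The helper names and the `weights` target directory are illustrative.

```python
def pick_safetensors(filenames):
    """Keep only .safetensors files, sorted for stable output."""
    return sorted(name for name in filenames if name.endswith(".safetensors"))


def download_weights(repo_id="valiantcat/Qwen-Image-Edit-MeiTu", local_dir="weights"):
    """Download every Safetensors file in the repo and return the local paths."""
    # Imported here so pick_safetensors stays usable without the package installed.
    from huggingface_hub import hf_hub_download, list_repo_files

    paths = []
    for name in pick_safetensors(list_repo_files(repo_id)):
        paths.append(hf_hub_download(repo_id=repo_id, filename=name, local_dir=local_dir))
    return paths


# Example call (downloads into ./weights; files may be large):
# print(download_weights())
```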


---

## 🧠 Training

This model was trained and optimized by the
**AI Laboratory of Chongqing Valiant Cat Technology Co., LTD.**
Visit [https://vvicat.com/](https://vvicat.com/) for business collaborations or research partnerships.

---

## 📄 Related Paper

This model is part of the **Qwen-Edit+** research line and is associated with the following preprint:

**Fan Tang, Siyuan Li**
*Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation*.
Research Square, Version 1, 08 April 2026.
DOI: [10.21203/rs.3.rs-9352857/v1](https://doi.org/10.21203/rs.3.rs-9352857/v1)

---

## 📚 Citation

If you use this model, please cite:

```bibtex
@article{tang2026qweneditplus,
  author = {Fan Tang and Siyuan Li},
  title = {Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation},
  journal = {Research Square},
  year = {2026},
  doi = {10.21203/rs.3.rs-9352857/v1},
  url = {https://doi.org/10.21203/rs.3.rs-9352857/v1}
}
```

---

## 📜 License

Licensed under **Apache 2.0**.

---

## 💼 Join Us

We are hiring research engineers and creative ML practitioners at
**Chongqing Valiant Cat Technology Co., LTD**. Reach out via
📧 **tommy@vvicat.com**