	---
	license: apache-2.0
	language:
	- en
	- zh
	library_name: diffusers
	base_model:
	- Qwen/Qwen-Image-Edit
	pipeline_tag: image-to-image
	tags:
	- image-editing
	- consistency
	- aesthetics
	- DiT
	- Qwen-Image
	- ValiantCat
	---

	<p align="center">
	<img src="https://ai.static.ad2.cc/banner.png" width="1000"/>
	</p>

	---

	# 🌈 Qwen-Image-Edit-MeiTu

	Qwen-Image-Edit-MeiTu is an improved variant of [Qwen/Qwen-Image-Edit](https://huggingface.co/Qwen/Qwen-Image-Edit). It fine-tunes the base model's DiT (Diffusion Transformer) architecture to improve visual consistency, aesthetic quality, and structural alignment in complex edits.

	Developed by Valiant Cat AI Lab, this version aims to narrow the gap between high-fidelity semantic editing and coherent artistic rendering, producing more natural, professional-looking output across a wide range of prompts and subjects.

	---

	## ✨ Key Improvements

	* Enhanced Consistency:
	Utilizes DiT (Diffusion Transformer) fine-tuning to ensure structural stability between input and edited regions, maintaining global spatial coherence.

	* Aesthetic Optimization:
	Trained with aesthetic discriminators and curated aesthetic score datasets, producing more pleasing colors, contrast, and light balance.

	* Better Detail Preservation:
	Improved low-level reconstruction for fine details such as textures, faces, and typography.

	* Broader Scene Adaptability:
	Performs well on portraits, environments, product photos, and illustrations, supporting both semantic and appearance-based editing.

	---

	## 🖼️ Showcase

	Below are examples of consistency and aesthetic improvement in complex editing scenarios:

	| Input & Output |
	|----------------|
	| <img src="preview/result1.png" width="800"/> |
	| <img src="preview/result2.png" width="800"/> |
	| <img src="preview/result3.png" width="800"/> |
	| <img src="preview/result4.png" width="800"/> |
	| <img src="preview/result5.png" width="800"/> |



	## 💬 Recommended Prompts

	Try these prompts to explore the model’s strengths:

	* “make the lighting soft and cinematic with better balance”
	* “enhance the photo’s composition and maintain realism”
	* “refine skin tone and texture consistency”
	* “improve the global color tone and aesthetic harmony”
	* “increase photo realism and clarity without changing content”
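
A minimal sketch of running one of these prompts through diffusers, assuming the repository loads with `QwenImageEditPipeline` the same way the base model `Qwen/Qwen-Image-Edit` does. The card's metadata lists `diffusers` as the library, but the repository layout is not described here, so the model id, the sampling values (`true_cfg_scale=4.0`, 50 steps), and the helper names `build_inputs` and `run_edit` are illustrative assumptions rather than documented settings.

```python
from typing import Any

# Prompts copied from the list above.
RECOMMENDED_PROMPTS = [
    "make the lighting soft and cinematic with better balance",
    "enhance the photo's composition and maintain realism",
    "refine skin tone and texture consistency",
    "improve the global color tone and aesthetic harmony",
    "increase photo realism and clarity without changing content",
]


def build_inputs(image: Any, prompt: str, steps: int = 50,
                 true_cfg_scale: float = 4.0) -> dict[str, Any]:
    """Collect keyword arguments for a QwenImageEditPipeline call.

    The default values are assumptions, not settings published for this model.
    """
    return {
        "image": image,
        "prompt": prompt,
        "negative_prompt": " ",
        "true_cfg_scale": true_cfg_scale,
        "num_inference_steps": steps,
    }


def run_edit(image_path: str, prompt: str,
             model_id: str = "valiantcat/Qwen-Image-Edit-MeiTu"):
    """Edit one image and return the result as a PIL image.

    Assumes `model_id` points at a diffusers-format repository and that a
    CUDA GPU with enough memory is available.
    """
    # Heavy dependencies are imported lazily so the helpers above stay
    # importable on machines without torch or diffusers installed.
    import torch
    from PIL import Image
    from diffusers import QwenImageEditPipeline

    pipe = QwenImageEditPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    pipe.to("cuda")

    image = Image.open(image_path).convert("RGB")
    with torch.inference_mode():
        result = pipe(**build_inputs(image, prompt), generator=torch.manual_seed(0))
    return result.images[0]


# Example call (requires a GPU and downloaded weights):
# run_edit("input.png", RECOMMENDED_PROMPTS[0]).save("output.png")
```

If the repository turns out to ship only a single checkpoint file for ComfyUI, this loader will not work as written. In that case the ComfyUI workflow is the documented route.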

	---

	## 🧩 Integration with ComfyUI

	This model works seamlessly with a modified [ComfyUI Qwen-Image-Edit workflow](https://huggingface.co/valiantcat/Qwen-Image-Edit-MeiTu/blob/main/Qwen-Edit-MeiTu.json).
	To edit images, load this model in the workflow's UNet node.
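
The node swap described above could also be scripted. The sketch below assumes the workflow has been exported from ComfyUI in API format (a JSON object mapping node ids to `class_type` and `inputs`) and that the UNet node is ComfyUI's `UNETLoader`. The linked workflow file may be in the UI format instead, and the weights file name used here is hypothetical.

```python
import copy
import json

# Hypothetical file name: use the name of the weights file as it appears
# in your ComfyUI models folder.
MEITU_WEIGHTS = "Qwen-Image-Edit-MeiTu.safetensors"


def set_unet_weights(workflow: dict, unet_name: str) -> dict:
    """Return a copy of an API-format workflow whose UNETLoader nodes
    all point at `unet_name`. The input workflow is left unchanged."""
    patched = copy.deepcopy(workflow)
    for node in patched.values():
        if isinstance(node, dict) and node.get("class_type") == "UNETLoader":
            node.setdefault("inputs", {})["unet_name"] = unet_name
    return patched


# Tiny stand-in for an exported workflow, for illustration only.
example = {
    "1": {
        "class_type": "UNETLoader",
        "inputs": {"unet_name": "qwen_image_edit.safetensors", "weight_dtype": "default"},
    },
    "2": {
        "class_type": "CLIPLoader",
        "inputs": {"clip_name": "text_encoder.safetensors"},
    },
}

patched = set_unet_weights(example, MEITU_WEIGHTS)
print(json.dumps(patched["1"], indent=2))
```

Only nodes whose `class_type` is `UNETLoader` are touched, so text encoder and VAE loaders in the workflow keep their original files.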

	---

	## 📥 Download Model

	The weights are available in Safetensors format:

	👉 [Download Qwen-Image-Edit-MeiTu](https://huggingface.co/valiantcat/Qwen-Image-Edit-MeiTu)
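
As an alternative to downloading through the browser, the files could be fetched with the `huggingface_hub` library. This is a sketch under assumptions: the `*.safetensors` filter and the `local_dir` default are illustrative choices, and the exact file names in the repository are not listed here.

```python
# Repository id taken from the download link above.
REPO_ID = "valiantcat/Qwen-Image-Edit-MeiTu"


def download_weights(local_dir: str = "models/Qwen-Image-Edit-MeiTu") -> str:
    """Download the repository's Safetensors files and return the local path.

    `local_dir` is an illustrative default; for ComfyUI, point it at the
    folder your UNet loader reads from.
    """
    # Imported lazily so this module can be imported without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=["*.safetensors"],  # skip previews and other files
        local_dir=local_dir,
    )


# Example call (requires network access and `pip install huggingface_hub`):
# print(download_weights())
```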

	---

	## 🧠 Training

	This model was trained and optimized by the
	AI Laboratory of Chongqing Valiant Cat Technology Co., LTD.
	Visit [https://vvicat.com/](https://vvicat.com/) for business collaborations or research partnerships.

	---

	## 📄 Related Paper

	This model is part of the Qwen-Edit+ research line and is associated with the following preprint:

	Fan Tang, Siyuan Li
	Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation.
	Research Square, Version 1, 08 April 2026.
	DOI: [10.21203/rs.3.rs-9352857/v1](https://doi.org/10.21203/rs.3.rs-9352857/v1)

	---

	## 📚 Citation

	If you use this model, please cite:

	```bibtex
	@article{tang2026qweneditplus,
	author = {Fan Tang and Siyuan Li},
	title = {Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation},
	journal = {Research Square},
	year = {2026},
	doi = {10.21203/rs.3.rs-9352857/v1},
	url = {https://doi.org/10.21203/rs.3.rs-9352857/v1}
	}
	```

	---

	## 📜 License

	Licensed under Apache 2.0.

	---

	## 💼 Join Us

	We are hiring research engineers and creative ML practitioners at
	Chongqing Valiant Cat Technology Co., LTD. Reach out via
	📧 tommy@vvicat.com