	---
	base_model:
	- ByteDance-Seed/BAGEL-7B-MoT
	datasets:
	- Uni-Edit/Train-Data
	library_name: transformers
	pipeline_tag: any-to-any
	license: apache-2.0
	---

	<p align="left">
	<img src="https://github.com/zhengdian1/Uni-Edit/blob/main/assets/logo.jpg?raw=true" alt="Uni-Edit" width="480"/>
	</p>


	# 🥯 Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuning


	[Project Page](https://zhengdian1.github.io/Uni-Edit-proj/) \| [GitHub Repository](https://github.com/zhengdian1/Uni-Edit) \| [Paper](https://arxiv.org/pdf/2605.21487)

	## 👀 Intro

	<div align="center">
	<img src="https://github.com/zhengdian1/Uni-Edit/blob/main/assets/teaser.webp?raw=true" alt="Uni-Edit Teaser" width="80%">
	</div>

	We introduce Uni-Edit, an intelligent image editing task that serves as the first general task for Unified Multimodal Model (UMM) tuning. Conventional mixed multi-task training suffers from inherent task conflicts and requires complex multi-stage pipelines. Uni-Edit avoids both: it improves image understanding, generation, and editing simultaneously using only one task, one training stage, and one dataset, so the three capabilities reinforce one another.

	To overcome the limitations of simplistic existing editing data, we propose the first automated and scalable data synthesis pipeline for intelligent editing. By transforming diverse VQA data into complex instructions with embedded questions and nested logic, we build Uni-Edit-148k, a dedicated dataset pairing reasoning-intensive instructions with high-quality edited images.

	Extensive experiments on BAGEL and Janus-Pro demonstrate that tuning solely on Uni-Edit achieves comprehensive enhancements across all three multimodal capabilities without requiring any massive data mixing, balancing tricks, or auxiliary operations.

	## 🎥 Demo

	See the demos on our [🌐 Project Page](https://zhengdian1.github.io/Uni-Edit-proj/).

	## 🚀 Training and Inference

	For detailed instructions on setup, training, inference, evaluation, and data construction, please refer to the [official GitHub repository](https://github.com/zhengdian1/Uni-Edit).

	⚠️ **IMPORTANT: Custom Architecture**

	Because this model uses a custom architecture, you CANNOT load it directly via `AutoModel.from_pretrained()`. To run the provided inference code, you MUST first merge the weight shards in this repository into a single `ema.safetensors` file on your local machine.

	Run the Python merge script ([code](https://github.com/zhengdian1/Uni-Edit/merge.py)) in the directory where you downloaded the repository.
	Note: the merge needs at least 54 GB of free system RAM.
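
For orientation, here is a minimal sketch of what merging shards into one `ema.safetensors` file can look like. It is not the repository's `merge.py`. The function names, the `*.safetensors` glob pattern, and the assumption that every shard in the directory belongs to the model are illustrative only. It uses `load_file` and `save_file` from the `safetensors` library, and the official script should be preferred for actual use.

```python
from pathlib import Path


def find_shards(repo_dir, pattern="*.safetensors", output_name="ema.safetensors"):
    """Return shard files in sorted order, skipping an already merged output file."""
    # Assumption: every file matching the pattern is a shard of the same model.
    return sorted(p for p in Path(repo_dir).glob(pattern) if p.name != output_name)


def total_size_gb(paths):
    """Sum on-disk sizes in GB; a rough lower bound on the RAM the merge needs."""
    return sum(p.stat().st_size for p in paths) / 1e9


def merge_shards(repo_dir, output_name="ema.safetensors"):
    """Load every shard into one dict and write it back as a single file."""
    # Imported here so the helpers above work without safetensors/torch installed.
    from safetensors.torch import load_file, save_file

    shards = find_shards(repo_dir, output_name=output_name)
    if not shards:
        raise FileNotFoundError(f"no .safetensors shards found in {repo_dir}")

    merged = {}
    for shard in shards:
        tensors = load_file(str(shard))  # tensors are loaded on CPU by default
        duplicate = merged.keys() & tensors.keys()
        if duplicate:
            raise ValueError(f"tensor names repeated across shards: {sorted(duplicate)[:3]}")
        merged.update(tensors)

    out_path = Path(repo_dir) / output_name
    save_file(merged, str(out_path))
    return out_path
```

Because all tensors are held in memory before the single file is written, peak RAM usage is at least the combined shard size, which is why the merge has a large free-RAM requirement. Calling `total_size_gb(find_shards("."))` before merging gives a quick estimate of that size.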

	## 📐 Citation

	If you find our work helpful for your research, please consider citing it:

	```bibtex
	@article{zheng2026uniedit,
	title = {Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuning},
	author = {Zheng, Dian and Zhang, Manyuan and Li, Hongyu and Liu, Hongbo and Zou, Kai and Feng, Kaituo and Li, Hongsheng},
	journal = {arXiv preprint arXiv:2605.21487},
	year = {2026}
	}
	```