---
base_model:
- ByteDance-Seed/BAGEL-7B-MoT
datasets:
- Uni-Edit/Train-Data
library_name: transformers
pipeline_tag: any-to-any
license: apache-2.0
---


# 🥯 Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuning

[**Project Page**](https://zhengdian1.github.io/Uni-Edit-proj/) | [**GitHub Repository**](https://github.com/zhengdian1/Uni-Edit) | [**Paper**](https://arxiv.org/pdf/2605.21487)

# 👀 Intro
We introduce **Uni-Edit**, an intelligent image editing task that serves as the **first general task for Unified Multimodal Model (UMM) tuning**. Conventional mixed multi-task training suffers from inherent task conflicts and requires complex multi-stage pipelines. Uni-Edit breaks this paradigm: it achieves mutual reinforcement by **improving image understanding, generation, and editing capabilities simultaneously using only one task, one training stage, and one dataset.**

To overcome the limitations of simplistic existing editing data, we propose the **first automated and scalable data synthesis pipeline** for intelligent editing. By transforming diverse VQA data into complex instructions with embedded questions and nested logic, we build **Uni-Edit-148k**, a dedicated dataset pairing reasoning-intensive instructions with high-quality edited images.

Extensive experiments on BAGEL and Janus-Pro demonstrate that tuning solely on Uni-Edit achieves **comprehensive enhancements across all three multimodal capabilities** without requiring any massive data mixing, balancing tricks, or auxiliary operations.

## 🎥 Demo

Refer to our website: [[🌐Project Page]](https://zhengdian1.github.io/Uni-Edit-proj/)

## 🚀 Training and Inference

For detailed instructions on setup, training, inference, evaluation, and data construction, please refer to the [official GitHub repository](https://github.com/zhengdian1/Uni-Edit).

**⚠️ IMPORTANT: Custom Architecture**

Because this is a custom architecture, you **CANNOT** load it directly via `AutoModel.from_pretrained()`. To run the provided inference code, you **MUST** first merge the weight shards in this repository into a single `ema.safetensors` file on your local machine. To do so, run the Python [merge script](https://github.com/zhengdian1/Uni-Edit/merge.py) in the directory where you downloaded the repository.
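
For readers who want to see what such a merge involves, the following is a minimal sketch and not the repository's `merge.py`. It assumes the shards are ordinary `*.safetensors` files sitting in one directory and that tensor names are unique across shards; the function names and the glob pattern are hypothetical. It uses `load_file` and `save_file` from `safetensors.torch`.

```python
from pathlib import Path


def find_shards(checkpoint_dir, output_name="ema.safetensors"):
    """Return shard files in sorted order, skipping an already merged output.

    Assumption: every other *.safetensors file in the directory is a shard.
    """
    return sorted(
        p for p in Path(checkpoint_dir).glob("*.safetensors") if p.name != output_name
    )


def merge_state_dicts(shard_dicts):
    """Combine per-shard {tensor_name: tensor} mappings into a single mapping.

    Raises ValueError if a tensor name appears in more than one shard, since
    silently overwriting a weight would corrupt the merged checkpoint.
    """
    merged = {}
    for shard in shard_dicts:
        duplicates = merged.keys() & shard.keys()
        if duplicates:
            raise ValueError(f"tensor names repeated across shards: {sorted(duplicates)}")
        merged.update(shard)
    return merged


def merge_shards(checkpoint_dir, output_name="ema.safetensors"):
    """Load every shard on CPU and write one merged safetensors file."""
    # Imported lazily so the helpers above work without safetensors/torch installed.
    from safetensors.torch import load_file, save_file

    shards = find_shards(checkpoint_dir, output_name)
    if not shards:
        raise FileNotFoundError(f"no .safetensors shards found in {checkpoint_dir}")
    # In this sketch all shards are held in memory at once, so peak RAM usage
    # is roughly the size of the full checkpoint.
    merged = merge_state_dicts(load_file(str(p), device="cpu") for p in shards)
    output_path = Path(checkpoint_dir) / output_name
    save_file(merged, str(output_path))
    return output_path
```

Under these assumptions, calling `merge_shards("path/to/downloaded/checkpoint")` (a placeholder path) would write `ema.safetensors` next to the shards. The official script may differ, so prefer it when running the provided inference code.
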
*(Note: You need at least 54 GB of free system RAM to perform this merge.)*

## 📝 Citation

If you find our work helpful for your research, please consider citing it:

```bibtex
@article{zheng2026uniedit,
  title   = {Uni-Edit: Intelligent Editing Is A General Task For Unified Model Tuning},
  author  = {Zheng, Dian and Zhang, Manyuan and Li, Hongyu and Liu, Hongbo and Zou, Kai and Feng, Kaituo and Li, Hongsheng},
  journal = {arXiv preprint arXiv:2605.21487},
  year    = {2026}
}
```