<h1 align="center">
Continuous-Time Distribution Matching for Few-Step Diffusion Distillation
</h1>
<div align="center">
<a href="https://byliutao.github.io/cdm_page/">
<img src="https://img.shields.io/badge/Project_Page-0055b3?logo=githubpages&logoColor=white" alt="Project Page">
</a>
<a href="https://huggingface.co/byliutao/stable-diffusion-3-medium-turbo">
<img src="https://img.shields.io/badge/%F0%9F%A4%97%20Model-SD3_Medium-ffc107" alt="SD3-Medium Model">
</a>
<a href="https://huggingface.co/byliutao/Longcat-Image-Turbo">
<img src="https://img.shields.io/badge/%F0%9F%A4%97%20Model-LongCat-ffc107" alt="LongCat Model">
</a>
<a href="https://github.com/byliutao/cdm">
<img src="https://img.shields.io/badge/GitHub-byliutao%2Fcdm-black?logo=github&logoColor=white" alt="GitHub">
</a>
<a href="http://arxiv.org/abs/2605.06376">
<img src="https://img.shields.io/badge/Paper-2605.06376-b31b1b?logo=arxiv&logoColor=white" alt="arXiv Paper">
</a>
</div>
<p align="center">
<a href="#algorithm-overview">Algorithm Overview</a> •
<a href="#4-nfe-generation-results">Results</a> •
<a href="#inference">Inference</a> •
<a href="#training">Training</a> •
<a href="#evaluation">Evaluation</a> •
<a href="#citation">Citation</a>
</p>
<p align="center">
<img src="assets/teaser.png" width="95%" alt="Teaser: High-quality images generated with only 4 NFE">
</p>
## Algorithm Overview
<p align="center">
<img src="assets/pipe.png" width="90%" alt="Pipeline overview of Continuous-Time Distribution Matching">
</p>
**Overview of Continuous-Time Distribution Matching (CDM).** **Top:** Our approach employs a dynamic continuous-time schedule during backward simulation, sampling intermediate anchors uniformly from (0, 1]. **Bottom Left:** CFG augmentation (CA) and distribution matching (DM) operate on this dynamic schedule to align text-image conditions and data distributions at on-trajectory anchors. **Bottom Right:** To address inter-anchor inconsistency, the proposed CDM objective explicitly extrapolates off-trajectory latents using the predicted velocity.
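The two ingredients described above, a randomly sampled time schedule and velocity-based extrapolation, can be illustrated with a small toy sketch. This is not the repository's implementation: the function names `sample_anchor_schedule` and `extrapolate_latent` are hypothetical, the extrapolation is assumed to be first-order (linear in time), and the toy check assumes a straight-line path `x_t = (1 - t) * x0 + t * noise` with velocity `noise - x0`.

```python
import random


def sample_anchor_schedule(num_steps, rng):
    """Return a descending time schedule from 1.0 (noise) to 0.0 (data).

    The num_steps - 1 intermediate anchors are drawn uniformly from (0, 1].
    """
    # rng.random() lies in [0, 1), so 1 - u lies in the half-open interval (0, 1].
    intermediate = sorted(
        (1.0 - rng.random() for _ in range(num_steps - 1)), reverse=True
    )
    return [1.0] + intermediate + [0.0]


def extrapolate_latent(x_t, velocity, t, t_target):
    """Assumed first-order extrapolation from time t to t_target."""
    return [x + (t_target - t) * v for x, v in zip(x_t, velocity)]


rng = random.Random(0)
schedule = sample_anchor_schedule(num_steps=4, rng=rng)

# Toy check on a known straight-line path (an assumed convention, see above).
x0 = [rng.gauss(0.0, 1.0) for _ in range(8)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]
velocity = [n - x for n, x in zip(noise, x0)]

t = schedule[1]          # an on-trajectory anchor
t_target = 0.5 * t       # an arbitrary off-trajectory time
x_t = [(1.0 - t) * x + t * n for x, n in zip(x0, noise)]
x_off = extrapolate_latent(x_t, velocity, t, t_target)
expected = [(1.0 - t_target) * x + t_target * n for x, n in zip(x0, noise)]
```

With an exact velocity on a straight path, the extrapolated latent lands on the path at `t_target`. With a learned velocity it generally would not, which is presumably the gap between anchors that the CDM objective targets.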
## 4-NFE Generation Results
### SD3-Medium
<p align="center">
<img src="assets/sd3.png" width="90%" alt="SD3-Medium 4-NFE generation samples">
</p>
### LongCat
<p align="center">
<img src="assets/longcat.png" width="90%" alt="LongCat 4-NFE generation samples">
</p>
---
## Inference
```bash
# Clone this repository
git clone https://github.com/byliutao/cdm.git
cd cdm
# [Optional] Use a HuggingFace mirror if huggingface.co is not accessible
export HF_ENDPOINT="https://hf-mirror.com"
# Set your HuggingFace access token (replace the hf_xxx placeholder)
export HF_TOKEN="hf_xxx"
# Create and activate the inference environment
conda create -n cdm_infer python=3.10
conda activate cdm_infer
pip install -r config/requirements_infer.txt
# Run inference
python scripts/infer/sd3_m.py # SD3-Medium
python scripts/infer/longcat.py # LongCat
```
## Training
```bash
# Create and activate the training environment
conda create -n cdm_train python=3.10
conda activate cdm_train
pip install -r config/requirements_train.txt
pip install flash-attn==2.7.4.post1 --no-build-isolation # May take 1-2 hours
# Launch training with FSDP2
accelerate launch --config_file config/accelerate_fsdp2.yaml \
--num_processes 8 -m scripts.train \
--config config/config.py:sd3 # SD3-Medium
accelerate launch --config_file config/accelerate_fsdp2.yaml \
--num_processes 8 -m scripts.train \
--config config/config.py:longcat # LongCat
```
## Evaluation
Evaluation runs in two phases, **image generation** and **metric computation**, after a training checkpoint has been exported to a pipeline.
### Step 1 — Export a checkpoint to a pipeline
```bash
conda activate cdm_train
python -m scripts.save \
--experiment_dir "logs/experiments/sd3/test" \
--output_dir "logs/pipelines/test" \
--checkpoint_steps "2000"
```
### Step 2 — Generate images
```bash
accelerate launch --num_processes 8 -m scripts.eval \
--phase generate \
--model_path "logs/pipelines/test/checkpoint-2000" \
--eval_metrics imagereward clipscore pickscore hpsv2 hpsv3 aesthetic ocr dpgbench \
--output_dir "logs/evaluations/test" \
--base_model sd3 \
--save_images
```
### Step 3 — Compute metrics
```bash
# Create a separate environment for evaluation dependencies
conda create -n cdm_eval python=3.10
conda activate cdm_eval
pip install -r config/requirements_eval.txt
pip install image-reward --no-deps
pip install fairseq --no-deps
# NOTE: For multi-GPU runs, download the required checkpoints with a single-GPU run first.
# For FID evaluation, place COCO 2014 val images under: dataset/coco2014val_10k/images
accelerate launch --num_processes 8 -m scripts.eval \
--phase evaluate \
--eval_metrics imagereward clipscore pickscore hpsv2 hpsv3 aesthetic ocr dpgbench \
--output_dir "logs/evaluations/test"
```
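The three steps share paths: the export `--output_dir` plus `checkpoint-<steps>` becomes the generation `--model_path`, and both `scripts.eval` phases use the same `--output_dir`. The sketch below keeps those paths consistent by building the commands from one set of arguments. The helper `build_eval_commands` is hypothetical, it only prints the commands rather than running them, and the `checkpoint-<steps>` naming is an assumption read off the example paths above.

```python
import shlex

METRICS = [
    "imagereward", "clipscore", "pickscore", "hpsv2",
    "hpsv3", "aesthetic", "ocr", "dpgbench",
]


def build_eval_commands(experiment_dir, pipeline_dir, eval_dir, step,
                        base_model="sd3", num_processes=8):
    """Return the export, generate, and evaluate commands as argument lists."""
    # Assumption: scripts.save writes to <pipeline_dir>/checkpoint-<step>.
    model_path = f"{pipeline_dir}/checkpoint-{step}"
    launch = ["accelerate", "launch", "--num_processes", str(num_processes),
              "-m", "scripts.eval"]
    export = ["python", "-m", "scripts.save",
              "--experiment_dir", experiment_dir,
              "--output_dir", pipeline_dir,
              "--checkpoint_steps", str(step)]
    generate = launch + ["--phase", "generate",
                         "--model_path", model_path,
                         "--eval_metrics", *METRICS,
                         "--output_dir", eval_dir,
                         "--base_model", base_model,
                         "--save_images"]
    evaluate = launch + ["--phase", "evaluate",
                         "--eval_metrics", *METRICS,
                         "--output_dir", eval_dir]
    return [export, generate, evaluate]


commands = build_eval_commands(
    experiment_dir="logs/experiments/sd3/test",
    pipeline_dir="logs/pipelines/test",
    eval_dir="logs/evaluations/test",
    step=2000,
)
for cmd in commands:
    print(shlex.join(cmd))
```

Note that the first two commands are meant for the `cdm_train` environment and the last one for `cdm_eval`, so the printed lines still need to be run in the matching environments.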
## License
This project is licensed under the MIT License — see the [LICENSE](LICENSE) file for details.
## Citation
If our work assists your research, please consider giving us a star ⭐ or citing us:
```bibtex
@misc{liu2026continuoustimedistributionmatchingfewstep,
title={Continuous-Time Distribution Matching for Few-Step Diffusion Distillation},
author={Tao Liu and Hao Yan and Mengting Chen and Taihang Hu and Zhengrong Yue and Zihao Pan and Jinsong Lan and Xiaoyong Zhu and Ming-Ming Cheng and Bo Zheng and Yaxing Wang},
year={2026},
eprint={2605.06376},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2605.06376},
}
```