<h1 align="center">
Continuous-Time Distribution Matching for Few-Step Diffusion Distillation
</h1>
|
|
<div align="center">

<a href="https://byliutao.github.io/cdm_page/">
<img src="https://img.shields.io/badge/Project_Page-0055b3?logo=githubpages&logoColor=white" alt="Project Page">
</a>
<a href="https://huggingface.co/byliutao/stable-diffusion-3-medium-turbo">
<img src="https://img.shields.io/badge/%F0%9F%A4%97%20Model-SD3_Medium-ffc107" alt="SD3-Medium Model">
</a>
<a href="https://huggingface.co/byliutao/Longcat-Image-Turbo">
<img src="https://img.shields.io/badge/%F0%9F%A4%97%20Model-LongCat-ffc107" alt="LongCat Model">
</a>
<a href="https://github.com/byliutao/cdm">
<img src="https://img.shields.io/badge/GitHub-byliutao%2Fcdm-black?logo=github&logoColor=white" alt="GitHub">
</a>
<a href="http://arxiv.org/abs/2605.06376">
<img src="https://img.shields.io/badge/Paper-2605.06376-b31b1b?logo=arxiv&logoColor=white" alt="arXiv Paper">
</a>

</div>
|
|
<p align="center">
<a href="#algorithm-overview">Algorithm Overview</a> •
<a href="#4-nfe-generation-results">Results</a> •
<a href="#inference">Inference</a> •
<a href="#training">Training</a> •
<a href="#evaluation">Evaluation</a> •
<a href="#citation">Citation</a>
</p>
|
|
<p align="center">
<img src="assets/teaser.png" width="95%" alt="Teaser: High-quality images generated with only 4 NFE">
</p>
|
|
## Algorithm Overview

<p align="center">
<img src="assets/pipe.png" width="90%" alt="Pipeline overview of Continuous-Time Distribution Matching">
</p>

**Overview of Continuous-Time Distribution Matching (CDM).** **Top:** Our approach employs a dynamic continuous-time schedule during backward simulation, sampling intermediate anchors uniformly from (0, 1]. **Bottom Left:** CFG augmentation (CA) and distribution matching (DM) operate on this dynamic schedule to align text-image conditions and data distributions at on-trajectory anchors. **Bottom Right:** To address inter-anchor inconsistency, the proposed CDM objective explicitly extrapolates off-trajectory latents using the predicted velocity.
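The off-trajectory extrapolation in the bottom-right panel can be sketched with a plain flow-matching Euler step. This is an illustrative toy, not the repository's implementation: it assumes the linear interpolation path `x_t = x0 + t * (x1 - x0)` with velocity `v = x1 - x0`, so moving a latent from anchor time `t` to time `s` is `x_s = x_t + (s - t) * v`.

```python
import numpy as np

def extrapolate_latent(x_t, v_pred, t, s):
    """Euler step along the predicted velocity: move a latent from
    anchor time t to time s via x_s = x_t + (s - t) * v."""
    return x_t + (s - t) * v_pred

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 8))   # data endpoint (t = 0)
x1 = rng.standard_normal((4, 8))   # noise endpoint (t = 1)
v = x1 - x0                        # exact velocity of the linear path

t = 1.0 - rng.uniform(0.0, 1.0)    # on-trajectory anchor in (0, 1]
x_t = x0 + t * v                   # latent on the linear path at time t

s = 0.5 * t                        # an off-trajectory target time
x_s = extrapolate_latent(x_t, v, t, s)
assert np.allclose(x_s, x0 + s * v)  # the Euler step is exact for a linear path
```

For a straight path the extrapolation is exact; with a learned velocity it is only an approximation, which is why CDM supervises these off-trajectory latents explicitly.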
|
|
## 4-NFE Generation Results

### SD3-Medium

<p align="center">
<img src="assets/sd3.png" width="90%" alt="SD3-Medium 4-NFE generation samples">
</p>
|
|
### LongCat

<p align="center">
<img src="assets/longcat.png" width="90%" alt="LongCat 4-NFE generation samples">
</p>
|
|
---
|
|
## Inference

```bash
# Clone this repository
git clone https://github.com/byliutao/cdm.git
cd cdm

# [Optional] Use the Hugging Face mirror if huggingface.co is not accessible
export HF_ENDPOINT="https://hf-mirror.com"
export HF_TOKEN="hf_xxx"

# Create and activate the inference environment
conda create -n cdm_infer python=3.10
conda activate cdm_infer
pip install -r config/requirements_infer.txt

# Run inference
python scripts/infer/sd3_m.py    # SD3-Medium
python scripts/infer/longcat.py  # LongCat
```
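For intuition about what a 4-NFE run discretizes, the sigma schedule of SD3-style flow-matching samplers can be sketched as uniformly spaced sigmas passed through the time shift `sigma' = shift * sigma / (1 + (shift - 1) * sigma)`. This is a hedged illustration: the shift value of 3.0 matches the public SD3-Medium scheduler config, and the released inference scripts may configure the schedule differently.

```python
def shifted_sigmas(num_steps, shift=3.0):
    """Uniformly spaced flow-matching sigmas in (0, 1], passed through the
    SD3-style time shift sigma' = shift * sigma / (1 + (shift - 1) * sigma)."""
    base = [(num_steps - i) / num_steps for i in range(num_steps)]  # 1.0 down to 1/n
    return [shift * s / (1 + (shift - 1) * s) for s in base]

# With 4 NFE and a shift of 3.0 the schedule is biased toward high noise levels:
print(shifted_sigmas(4))  # → [1.0, 0.9, 0.75, 0.5]
```

The shift keeps more of the (few) sampling steps near high noise, where a distilled few-step model has to do most of its work.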
|
|
## Training

```bash
# Create and activate the training environment
conda create -n cdm_train python=3.10
conda activate cdm_train
pip install -r config/requirements_train.txt
pip install flash-attn==2.7.4.post1 --no-build-isolation  # May take 1-2 hours to build

# Launch training with FSDP2
accelerate launch --config_file config/accelerate_fsdp2.yaml \
    --num_processes 8 -m scripts.train \
    --config config/config.py:sd3      # SD3-Medium

accelerate launch --config_file config/accelerate_fsdp2.yaml \
    --num_processes 8 -m scripts.train \
    --config config/config.py:longcat  # LongCat
```
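To see the shape of a distribution-matching update, here is a one-dimensional, DMD-style toy (not the repository's loss): the generator output is nudged by the difference between the fake and real score functions, which is available in closed form for Gaussians.

```python
import numpy as np

def gaussian_score(x, mu, var):
    """Score (gradient of the log density) of N(mu, var) at x."""
    return -(x - mu) / var

# Toy setup: real data ~ N(2, 1); the generator currently outputs ~ N(theta, 1).
rng = np.random.default_rng(0)
theta = 0.0   # generator "mean" parameter
lr = 0.5
for _ in range(50):
    x = theta + rng.standard_normal(256)               # generator samples
    # Distribution-matching direction: fake score minus real score,
    # applied as a descent gradient on the generator output (DMD-style).
    grad = gaussian_score(x, theta, 1.0) - gaussian_score(x, 2.0, 1.0)
    theta -= lr * grad.mean()
print(round(theta, 2))  # → 2.0 (the generator mean converges to the data mean)
```

In CDM both score functions are approximated by diffusion networks and the matching is applied at anchors along the continuous-time schedule; the toy only shows why the score difference vanishes exactly when the two distributions coincide.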
|
|
## Evaluation

Evaluation proceeds in three steps: checkpoint export, **image generation**, and **metric computation**.

### Step 1 — Export a checkpoint to a pipeline

```bash
conda activate cdm_train

python -m scripts.save \
    --experiment_dir "logs/experiments/sd3/test" \
    --output_dir "logs/pipelines/test" \
    --checkpoint_steps "2000"
```
|
|
### Step 2 — Generate images

```bash
accelerate launch --num_processes 8 -m scripts.eval \
    --phase generate \
    --model_path "logs/pipelines/test/checkpoint-2000" \
    --eval_metrics imagereward clipscore pickscore hpsv2 hpsv3 aesthetic ocr dpgbench \
    --output_dir "logs/evaluations/test" \
    --base_model sd3 \
    --save_images
```
|
|
### Step 3 — Compute metrics

```bash
# Create a separate environment for evaluation dependencies
conda create -n cdm_eval python=3.10
conda activate cdm_eval
pip install -r config/requirements_eval.txt
pip install image-reward --no-deps
pip install fairseq --no-deps

# NOTE: When running on multiple GPUs, first download the metric checkpoints
# on a single process to avoid concurrent-download conflicts.
# For FID evaluation, place the COCO 2014 validation images under:
#   dataset/coco2014val_10k/images

accelerate launch --num_processes 8 -m scripts.eval \
    --phase evaluate \
    --eval_metrics imagereward clipscore pickscore hpsv2 hpsv3 aesthetic ocr dpgbench \
    --output_dir "logs/evaluations/test"
```
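As a pointer to what one of the metrics above measures, CLIPScore is essentially a rescaled, floored cosine similarity between CLIP image and text embeddings. The sketch below uses toy 2-D embeddings and the w = 2.5 rescaling from the original CLIPScore paper; the evaluation scripts load real CLIP encoders and may rescale differently.

```python
import numpy as np

def clip_score(img_emb, txt_emb, w=2.5):
    """CLIPScore-style metric: w * max(cos(img, txt), 0) on
    L2-normalized embeddings (w = 2.5 follows the CLIPScore paper)."""
    img = img_emb / np.linalg.norm(img_emb)
    txt = txt_emb / np.linalg.norm(txt_emb)
    return w * max(float(img @ txt), 0.0)

a = np.array([1.0, 0.0])           # toy "image" embedding
b = np.array([1.0, 1.0])           # toy "text" embedding
print(round(clip_score(a, b), 4))  # → 1.7678 (2.5 * cos 45° = 2.5 / sqrt(2))
```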
|
|
## License

This project is licensed under the MIT License — see the [LICENSE](LICENSE) file for details.
|
|
## Citation

If you find this work useful, please consider giving the repository a star ⭐ or citing:

```bibtex
@misc{liu2026continuoustimedistributionmatchingfewstep,
      title={Continuous-Time Distribution Matching for Few-Step Diffusion Distillation},
      author={Tao Liu and Hao Yan and Mengting Chen and Taihang Hu and Zhengrong Yue and Zihao Pan and Jinsong Lan and Xiaoyong Zhu and Ming-Ming Cheng and Bo Zheng and Yaxing Wang},
      year={2026},
      eprint={2605.06376},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2605.06376},
}
```
|
|