# Continuous-Time Distribution Matching for Few-Step Diffusion Distillation

Teaser: High-quality images generated with only 4 NFE

## Algorithm Overview

Pipeline overview of Continuous-Time Distribution Matching

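The anchor sampling and velocity-based extrapolation summarized in this overview can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's implementation: it assumes a rectified-flow style linear path `x_t = (1 - t) * x0 + t * noise` with velocity `noise - x0`, and the function names `sample_anchor_times` and `extrapolate_latent` are hypothetical. With the exact velocity, the extrapolated latent falls back on the path; with a model's predicted velocity it generally would not, which is the off-trajectory case the overview refers to.

```python
import numpy as np


def sample_anchor_times(rng, num_anchors):
    """Draw anchor times uniformly from the half-open interval (0, 1]."""
    # rng.random() returns values in [0, 1); 1 - u maps them onto (0, 1].
    return 1.0 - rng.random(num_anchors)


def extrapolate_latent(x_t, v_pred, t, s):
    """First-order (Euler) step from time t to time s along a velocity."""
    return x_t + (s - t) * v_pred


rng = np.random.default_rng(0)
x0 = rng.standard_normal((2, 4))     # stand-in for a clean latent
noise = rng.standard_normal((2, 4))  # stand-in for Gaussian noise

# One anchor time, and the latent on the assumed linear path at that time.
t = float(sample_anchor_times(rng, 1)[0])
x_t = (1.0 - t) * x0 + t * noise

# Assumed velocity convention for the linear path above.
v_exact = noise - x0

# Extrapolate to another time s; a distilled model would supply v_pred instead.
s = 0.5 * t
x_s = extrapolate_latent(x_t, v_exact, t, s)
```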
**Overview of Continuous-Time Distribution Matching (CDM).** **Top:** Our approach employs a dynamic continuous time schedule during backward simulation, sampling intermediate anchors uniformly from (0, 1]. **Bottom Left:** CFG augmentation (CA) and distribution matching (DM) operate on this dynamic schedule to align text-image conditions and data distributions at on-trajectory anchors. **Bottom Right:** To address inter-anchor inconsistency, the proposed CDM objective explicitly extrapolates off-trajectory latents using the predicted velocity.

## 4-NFE Generation Results

### SD3-Medium

SD3-Medium 4-NFE generation samples

### LongCat

LongCat 4-NFE generation samples

---

## Inference

```bash
# Clone this repository
git clone https://github.com/byliutao/cdm.git
cd cdm

# [Optional] Use HuggingFace mirror if huggingface.co is not accessible
export HF_ENDPOINT="https://hf-mirror.com"
export HF_TOKEN="hf_xxx"

# Create and activate the inference environment
conda create -n cdm_infer python=3.10
conda activate cdm_infer
pip install -r config/requirements_infer.txt

# Run inference
python scripts/infer/sd3_m.py    # SD3-Medium
python scripts/infer/longcat.py  # LongCat
```

## Training

```bash
# Create and activate the training environment
conda create -n cdm_train python=3.10
conda activate cdm_train
pip install -r config/requirements_train.txt
pip install flash-attn==2.7.4.post1 --no-build-isolation  # May take 1-2 hours

# Launch training with FSDP2
accelerate launch --config_file config/accelerate_fsdp2.yaml \
    --num_processes 8 -m scripts.train \
    --config config/config.py:sd3      # SD3-Medium

accelerate launch --config_file config/accelerate_fsdp2.yaml \
    --num_processes 8 -m scripts.train \
    --config config/config.py:longcat  # LongCat
```

## Evaluation

Evaluation is split into two phases: **image generation** and **metric computation**.

### Step 1: Export a checkpoint to a pipeline

```bash
conda activate cdm_train
python -m scripts.save \
    --experiment_dir "logs/experiments/sd3/test" \
    --output_dir "logs/pipelines/test" \
    --checkpoint_steps "2000"
```

### Step 2: Generate images

```bash
accelerate launch --num_processes 8 -m scripts.eval \
    --phase generate \
    --model_path "logs/pipelines/test/checkpoint-2000" \
    --eval_metrics imagereward clipscore pickscore hpsv2 hpsv3 aesthetic ocr dpgbench \
    --output_dir "logs/evaluations/test" \
    --base_model sd3 \
    --save_images
```

### Step 3: Compute metrics

```bash
# Create a separate environment for evaluation dependencies
conda create -n cdm_eval python=3.10
conda activate cdm_eval
pip install -r config/requirements_eval.txt
pip install image-reward --no-deps
pip install fairseq --no-deps

# NOTE: If running on multiple GPUs, download checkpoints on 1 GPU first.
# For FID evaluation, place COCO 2014 val images under: dataset/coco2014val_10k/images
accelerate launch --num_processes 8 -m scripts.eval \
    --phase evaluate \
    --eval_metrics imagereward clipscore pickscore hpsv2 hpsv3 aesthetic ocr dpgbench \
    --output_dir "logs/evaluations/test"
```

## License

This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.

## Citation

If our work assists your research, please consider giving us a star ⭐ or citing us:

```bibtex
@misc{liu2026continuoustimedistributionmatchingfewstep,
  title={Continuous-Time Distribution Matching for Few-Step Diffusion Distillation},
  author={Tao Liu and Hao Yan and Mengting Chen and Taihang Hu and Zhengrong Yue and Zihao Pan and Jinsong Lan and Xiaoyong Zhu and Ming-Ming Cheng and Bo Zheng and Yaxing Wang},
  year={2026},
  eprint={2605.06376},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2605.06376},
}
```