Add metadata and link to paper
Hi there! This PR improves the model card for Longcat-Image-Turbo by adding relevant YAML metadata for better discoverability.
Specifically, I have:
- Added `pipeline_tag: text-to-image`.
- Added `library_name: diffusers` (based on the model structure).
- Added `license: mit`.
- Included a direct link to the paper [Continuous-Time Distribution Matching for Few-Step Diffusion Distillation](https://huggingface.co/papers/2605.06376) and its GitHub repository.
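
For anyone who wants to sanity-check the metadata locally before merging, the following is a minimal sketch, not part of this PR. It assumes the front matter is a flat list of `key: value` pairs between two `---` lines, as in the block added here; it is not a full YAML parser, and the helper names (`read_front_matter`, `missing_keys`) are hypothetical.

```python
# Minimal sketch: check that a model card's front matter carries the three
# keys this PR adds. Handles only flat `key: value` pairs (an assumption),
# not general YAML. Helper names are hypothetical, for illustration only.

REQUIRED_KEYS = ("license", "library_name", "pipeline_tag")


def read_front_matter(card_text: str) -> dict:
    """Return the flat key/value pairs found in a leading `---` block."""
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # the card does not start with front matter
    metadata = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return metadata  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as no front matter


def missing_keys(card_text: str) -> list:
    """List the required keys that are absent from the front matter."""
    metadata = read_front_matter(card_text)
    return [key for key in REQUIRED_KEYS if key not in metadata]


# The front matter this PR adds, followed by the start of the existing card.
EXAMPLE_CARD = """---
license: mit
library_name: diffusers
pipeline_tag: text-to-image
---

<h1 align="center">
"""

if __name__ == "__main__":
    print(read_front_matter(EXAMPLE_CARD))
    print("missing:", missing_keys(EXAMPLE_CARD))
```

Running `missing_keys` on the README text before and after this change should show the three keys going from absent to present.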
README.md
CHANGED
|
@@ -1,3 +1,9 @@
|
|
| 1 |
<h1 align="center">
|
| 2 |
Continuous-Time Distribution Matching for Few-Step Diffusion Distillation
|
| 3 |
</h1>
|
|
@@ -16,12 +22,16 @@
|
|
| 16 |
<a href="https://github.com/byliutao/cdm">
|
| 17 |
<img src="https://img.shields.io/badge/GitHub-byliutao%2Fcdm-black?logo=github&logoColor=white" alt="GitHub">
|
| 18 |
</a>
|
| 19 |
-
<a href="
|
| 20 |
-
<img src="https://img.shields.io/badge/Paper-2605.06376-b31b1b?logo=arxiv&logoColor=white" alt="
|
| 21 |
</a>
|
| 22 |
|
| 23 |
</div>
|
| 24 |
|
| 25 |
<p align="center">
|
| 26 |
<a href="#algorithm-overview">Algorithm Overview</a> •
|
| 27 |
<a href="#4-nfe-generation-results">Results</a> •
|
|
@@ -61,22 +71,19 @@
|
|
| 61 |
|
| 62 |
## Inference
|
| 63 |
|
|
| 64 |
```bash
|
| 65 |
# Clone this repository
|
| 66 |
git clone https://github.com/byliutao/cdm.git
|
| 67 |
cd cdm
|
| 68 |
|
| 69 |
-
# [Optional] Use HuggingFace mirror if huggingface.co is not accessible
|
| 70 |
-
export HF_ENDPOINT="https://hf-mirror.com"
|
| 71 |
-
export HF_TOKEN="hf_xxx"
|
| 72 |
-
|
| 73 |
# Create and activate the inference environment
|
| 74 |
conda create -n cdm_infer python=3.10
|
| 75 |
conda activate cdm_infer
|
| 76 |
pip install -r config/requirements_infer.txt
|
| 77 |
|
| 78 |
# Run inference
|
| 79 |
-
python scripts/infer/sd3_m.py # SD3-Medium
|
| 80 |
python scripts/infer/longcat.py # LongCat
|
| 81 |
```
|
| 82 |
|
|
@@ -87,64 +94,13 @@ python scripts/infer/longcat.py # LongCat
|
|
| 87 |
conda create -n cdm_train python=3.10
|
| 88 |
conda activate cdm_train
|
| 89 |
pip install -r config/requirements_train.txt
|
| 90 |
-
pip install flash-attn==2.7.4.post1 --no-build-isolation # May take 1-2 hours
|
| 91 |
|
| 92 |
# Launch training with FSDP2
|
| 93 |
-
accelerate launch --config_file config/accelerate_fsdp2.yaml \
|
| 94 |
-
--num_processes 8 -m scripts.train \
|
| 95 |
-
--config config/config.py:sd3 # SD3-Medium
|
| 96 |
-
|
| 97 |
accelerate launch --config_file config/accelerate_fsdp2.yaml \
|
| 98 |
--num_processes 8 -m scripts.train \
|
| 99 |
--config config/config.py:longcat # LongCat
|
| 100 |
```
|
| 101 |
|
| 102 |
-
## Evaluation
|
| 103 |
-
|
| 104 |
-
Evaluation is split into two phases: **image generation** and **metric computation**.
|
| 105 |
-
|
| 106 |
-
### Step 1 — Export a checkpoint to a pipeline
|
| 107 |
-
|
| 108 |
-
```bash
|
| 109 |
-
conda activate cdm_train
|
| 110 |
-
|
| 111 |
-
python -m scripts.save \
|
| 112 |
-
--experiment_dir "logs/experiments/sd3/test" \
|
| 113 |
-
--output_dir "logs/pipelines/test" \
|
| 114 |
-
--checkpoint_steps "2000"
|
| 115 |
-
```
|
| 116 |
-
|
| 117 |
-
### Step 2 — Generate images
|
| 118 |
-
|
| 119 |
-
```bash
|
| 120 |
-
accelerate launch --num_processes 8 -m scripts.eval \
|
| 121 |
-
--phase generate \
|
| 122 |
-
--model_path "logs/pipelines/test/checkpoint-2000" \
|
| 123 |
-
--eval_metrics imagereward clipscore pickscore hpsv2 hpsv3 aesthetic ocr dpgbench \
|
| 124 |
-
--output_dir "logs/evaluations/test" \
|
| 125 |
-
--base_model sd3 \
|
| 126 |
-
--save_images
|
| 127 |
-
```
|
| 128 |
-
|
| 129 |
-
### Step 3 — Compute metrics
|
| 130 |
-
|
| 131 |
-
```bash
|
| 132 |
-
# Create a separate environment for evaluation dependencies
|
| 133 |
-
conda create -n cdm_eval python=3.10
|
| 134 |
-
conda activate cdm_eval
|
| 135 |
-
pip install -r config/requirements_eval.txt
|
| 136 |
-
pip install image-reward --no-deps
|
| 137 |
-
pip install fairseq --no-deps
|
| 138 |
-
|
| 139 |
-
# NOTE: If running on multiple GPUs, download checkpoints on 1 GPU first.
|
| 140 |
-
# For FID evaluation, place COCO 2014 val images under: dataset/coco2014val_10k/images
|
| 141 |
-
|
| 142 |
-
accelerate launch --num_processes 8 -m scripts.eval \
|
| 143 |
-
--phase evaluate \
|
| 144 |
-
--eval_metrics imagereward clipscore pickscore hpsv2 hpsv3 aesthetic ocr dpgbench \
|
| 145 |
-
--output_dir "logs/evaluations/test"
|
| 146 |
-
```
|
| 147 |
-
|
| 148 |
## License
|
| 149 |
|
| 150 |
This project is licensed under the MIT License — see the [LICENSE](LICENSE) file for details.
|
|
@@ -163,4 +119,4 @@ If our work assists your research, please consider giving us a star ⭐ or citin
|
|
| 163 |
primaryClass={cs.CV},
|
| 164 |
url={https://arxiv.org/abs/2605.06376},
|
| 165 |
}
|
| 166 |
-
```
|
|
|
|
| 1 |
+
---
|
| 2 |
+
license: mit
|
| 3 |
+
library_name: diffusers
|
| 4 |
+
pipeline_tag: text-to-image
|
| 5 |
+
---
|
| 6 |
+
|
| 7 |
<h1 align="center">
|
| 8 |
Continuous-Time Distribution Matching for Few-Step Diffusion Distillation
|
| 9 |
</h1>
|
|
|
|
| 22 |
<a href="https://github.com/byliutao/cdm">
|
| 23 |
<img src="https://img.shields.io/badge/GitHub-byliutao%2Fcdm-black?logo=github&logoColor=white" alt="GitHub">
|
| 24 |
</a>
|
| 25 |
+
<a href="https://huggingface.co/papers/2605.06376">
|
| 26 |
+
<img src="https://img.shields.io/badge/Paper-2605.06376-b31b1b?logo=arxiv&logoColor=white" alt="Paper">
|
| 27 |
</a>
|
| 28 |
|
| 29 |
</div>
|
| 30 |
|
| 31 |
+
This repository contains the weights for Longcat-Image-Turbo, a few-step distilled version of Longcat-Image using the **Continuous-Time Distribution Matching (CDM)** method presented in [Continuous-Time Distribution Matching for Few-Step Diffusion Distillation](https://huggingface.co/papers/2605.06376).
|
| 32 |
+
|
| 33 |
+
CDM migrates the Distribution Matching Distillation (DMD) framework from discrete anchoring to continuous optimization, allowing for high-quality image generation with very few steps (e.g., 4 NFE).
|
| 34 |
+
|
| 35 |
<p align="center">
|
| 36 |
<a href="#algorithm-overview">Algorithm Overview</a> •
|
| 37 |
<a href="#4-nfe-generation-results">Results</a> •
|
|
|
|
| 71 |
|
| 72 |
## Inference
|
| 73 |
|
| 74 |
+
To use this model, please refer to the [GitHub repository](https://github.com/byliutao/cdm).
|
| 75 |
+
|
| 76 |
```bash
|
| 77 |
# Clone this repository
|
| 78 |
git clone https://github.com/byliutao/cdm.git
|
| 79 |
cd cdm
|
| 80 |
|
|
| 81 |
# Create and activate the inference environment
|
| 82 |
conda create -n cdm_infer python=3.10
|
| 83 |
conda activate cdm_infer
|
| 84 |
pip install -r config/requirements_infer.txt
|
| 85 |
|
| 86 |
# Run inference
|
|
|
|
| 87 |
python scripts/infer/longcat.py # LongCat
|
| 88 |
```
|
| 89 |
|
|
|
|
| 94 |
conda create -n cdm_train python=3.10
|
| 95 |
conda activate cdm_train
|
| 96 |
pip install -r config/requirements_train.txt
|
|
|
|
| 97 |
|
| 98 |
# Launch training with FSDP2
|
| 99 |
accelerate launch --config_file config/accelerate_fsdp2.yaml \
|
| 100 |
--num_processes 8 -m scripts.train \
|
| 101 |
--config config/config.py:longcat # LongCat
|
| 102 |
```
|
| 103 |
|
| 104 |
## License
|
| 105 |
|
| 106 |
This project is licensed under the MIT License — see the [LICENSE](LICENSE) file for details.
|
|
|
|
| 119 |
primaryClass={cs.CV},
|
| 120 |
url={https://arxiv.org/abs/2605.06376},
|
| 121 |
}
|
| 122 |
+
```
|