| --- |
| license: mit |
| library_name: pytorch |
| tags: |
| - image-generation |
| - diffusion |
| - imagenet |
| - frechet-distance |
| - fd-loss |
| datasets: |
| - imagenet-1k |
| --- |
| |
| # Representation Fréchet Loss for Visual Generation |
|
|
|
|
| This repository hosts the released checkpoints and reference data for |
| **Representation Fréchet Loss for Visual Generation**. |
|
|
| Paper: [Representation Fréchet Loss for Visual Generation](https://arxiv.org/abs/2604.28190). |
|
|
| Code, training scripts, and evaluation utilities are available at: |
| [github.com/Jiawei-Yang/FD-Loss](https://github.com/Jiawei-Yang/FD-Loss). |
|
|
| FD-Loss post-trains visual generators by matching generated-image feature |
| distributions to real-image feature distributions in frozen representation spaces. |
| This release contains base and FD-loss post-trained checkpoints for pMF, iMF, and |
| JiT models, together with the reference statistics used by the paper. |
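As a rough illustration of what "matching feature distributions" means here, the sketch below computes the standard Fréchet distance between Gaussian fits of two feature sets, the same quantity that underlies FID. It is an assumption-laden sketch and not the repository's implementation: the function name `frechet_distance` is hypothetical, the features are random stand-ins for the outputs of a frozen encoder, and the differentiable training loss in the paper may be formulated differently.

```python
import numpy as np


def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Fréchet distance between Gaussian fits of two (N, D) feature arrays."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((cov_a cov_b)^{1/2}) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # PSD inputs; clip tiny negative values caused by round-off.
    eigvals = np.linalg.eigvals(cov_a @ cov_b).real
    trace_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()

    mean_term = float(np.sum((mu_a - mu_b) ** 2))
    cov_term = float(np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)
    return mean_term + cov_term


# Hypothetical stand-ins for encoder features (not real ImageNet features).
rng = np.random.default_rng(0)
real = rng.normal(size=(2000, 8))
fake = real + 0.5  # same covariance, mean shifted by 0.5 in each of 8 dims

fd_same = frechet_distance(real, real)
fd_shift = frechet_distance(real, fake)
```

With identical inputs the distance is numerically zero. For the shifted copy only the mean term contributes, so the result should be close to 8 × 0.5² = 2.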
|
|
| ## Files |
|
|
| ```text |
| checkpoints/ |
| base/ |
| iMF-B.pth |
| iMF-L.pth |
| iMF-XL.pth |
| JiT-B.pth |
| JiT-L.pth |
| JiT-H.pth |
| pMF-B_256.pth |
| pMF-L_256.pth |
| pMF-H_256.pth |
| pMF-B_512.pth |
| pMF-L_512.pth |
| pMF-H_512.pth |
| post-trained/ |
| iMF-B_FD-Inception.pth |
| iMF-B_FD-SIM.pth |
| iMF-L_FD-Inception.pth |
| iMF-L_FD-SIM.pth |
| iMF-XL_FD-Inception.pth |
| iMF-XL_FD-SIM.pth |
| JiT-B_FD-Inception.pth |
| JiT-B_FD-SIM.pth |
| JiT-L_FD-Inception.pth |
| JiT-L_FD-SIM.pth |
| JiT-H_FD-Inception.pth |
| JiT-H_FD-SIM.pth |
| pMF-B_FD-Inception.pth |
| pMF-B_FD-SIM.pth |
| pMF-L_FD-Inception.pth |
| pMF-L_FD-SIM.pth |
| pMF-H_FD-Inception.pth |
| pMF-H_FD-SIM.pth |
| pMF-B_512_FD-SIM.pth |
| pMF-L_512_FD-SIM.pth |
| pMF-H_512_FD-SIM.pth |
| data/ |
| fid_stats/ |
| paper_ref_stats.pkl |
| train.txt |
| val.txt |
| val_labeled.txt |
| ``` |
|
|
| ## Download |
|
|
| Install the Hugging Face CLI: |
|
|
| ```bash |
| pip install -U huggingface_hub |
| ``` |
|
|
| Download all checkpoints and data files into a clone of the code repository: |
|
|
| ```bash |
| hf download jjiaweiyang/FD-Loss \ |
| --local-dir . \ |
| --include "checkpoints/**/*.pth" \ |
| --include "data/**" |
| ``` |
|
|
| To download only what is needed to evaluate the released post-trained checkpoints: |
|
|
| ```bash |
| hf download jjiaweiyang/FD-Loss \ |
| --local-dir . \ |
| --include "checkpoints/post-trained/*.pth" \ |
| --include "data/**" |
| ``` |
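The same selective downloads can likely be scripted from Python with `huggingface_hub.snapshot_download`, which accepts glob-style `allow_patterns`. This is a sketch under assumptions: the helper names `release_patterns` and `download_release` are hypothetical, and it assumes this repository is hosted as a model repo (the default `repo_type`).

```python
def release_patterns(post_trained_only: bool = False) -> list[str]:
    """Glob patterns mirroring the --include flags of the CLI commands above."""
    if post_trained_only:
        ckpts = "checkpoints/post-trained/*.pth"
    else:
        ckpts = "checkpoints/**/*.pth"
    return [ckpts, "data/**"]


def download_release(local_dir: str = ".", post_trained_only: bool = False) -> str:
    """Download the selected files and return the local directory path."""
    # Imported lazily so release_patterns() works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="jjiaweiyang/FD-Loss",
        local_dir=local_dir,
        allow_patterns=release_patterns(post_trained_only),
    )
```

Calling `download_release(post_trained_only=True)` from the root of the code repository would fetch only the post-trained checkpoints and the data files.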
|
|
| Then unpack the bundled reference statistics: |
|
|
| ```bash |
| python scripts/extract_paper_ref_stats.py |
| ``` |
|
|
| ## Evaluation |
|
|
| Run from the root of the GitHub repository: |
|
|
| ```bash |
| PRESET=pMF_H_256 \ |
| CKPT_PATH=checkpoints/post-trained/pMF-H_FD-SIM.pth \ |
| GPUS_PER_NODE=8 \ |
| bash scripts/evaluate_released_ckpt.sh |
| |
| PRESET=JiT_H \ |
| CKPT_PATH=checkpoints/post-trained/JiT-H_FD-SIM.pth \ |
| GPUS_PER_NODE=8 \ |
| bash scripts/evaluate_released_ckpt.sh |
| |
| PRESET=iMF_XL \ |
| CKPT_PATH=checkpoints/post-trained/iMF-XL_FD-SIM.pth \ |
| GPUS_PER_NODE=8 \ |
| bash scripts/evaluate_released_ckpt.sh |
| ``` |
|
|
| Additional presets, smoke-test settings, and the scripts for the Table 1 ablation, |
| Table 2 repurposing, and Table 3 scalability experiments are documented in the |
| GitHub repository. |
|
|
| The evaluator reports both raw FD and the paper-normalized metrics. `FDr` is the raw |
| FD divided by the validation-set raw FD for the corresponding representation |
| space, and `FDr-6` is the arithmetic mean of `FDr` over the Inception, ConvNeXt, DINOv2, |
| MAE, SigLIP, and CLIP spaces. The released code uses these validation-set raw FD values: |
|
|
| | Representation | Inception | ConvNeXt | DINOv2 | MAE | SigLIP | CLIP | |
| |---|---:|---:|---:|---:|---:|---:| |
| | valFD | 1.68 | 56.87 | 14.19 | 0.04 | 0.60 | 5.60 | |
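The normalization described above can be sketched in a few lines of Python. The `VAL_FD` values are copied from the table; the function name `normalized_fd` and the input dictionary format are assumptions made for illustration, not the evaluator's actual interface.

```python
# Validation-set raw FD per representation space (from the table above).
VAL_FD = {
    "Inception": 1.68,
    "ConvNeXt": 56.87,
    "DINOv2": 14.19,
    "MAE": 0.04,
    "SigLIP": 0.60,
    "CLIP": 5.60,
}


def normalized_fd(raw_fd: dict[str, float]) -> tuple[dict[str, float], float]:
    """Return per-space FDr values and their arithmetic mean (FDr-6)."""
    missing = sorted(set(VAL_FD) - set(raw_fd))
    if missing:
        raise KeyError(f"missing raw FD for: {missing}")
    # FDr: raw FD divided by the validation-set raw FD of the same space.
    fdr = {name: raw_fd[name] / VAL_FD[name] for name in VAL_FD}
    # FDr-6: arithmetic mean of FDr over the six spaces.
    fdr6 = sum(fdr.values()) / len(fdr)
    return fdr, fdr6


# Sanity check: a generator whose raw FD equals valFD everywhere has FDr = 1.
fdr_at_val, fdr6_at_val = normalized_fd(dict(VAL_FD))
```

Because each space is divided by its own normalizer before averaging, spaces with very different raw scales (for example MAE at 0.04 versus ConvNeXt at 56.87) contribute comparably to `FDr-6`.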
|
|
| To reproduce these normalizers from ImageNet validation images: |
|
|
| ```bash |
| DATA_ROOT=/path/to/imagenet \ |
| torchrun --nproc_per_node=8 scripts/compute_valfd.py \ |
| --data_root "$DATA_ROOT" |
| ``` |
|
|
| ## Citation |
|
|
| ```bibtex |
| @article{yang2026fdloss, |
| title={Representation Fréchet Loss for Visual Generation}, |
| author={Yang, Jiawei and Geng, Zhengyang and Ju, Xuan and Tian, Yonglong and Wang, Yue}, |
| journal={arXiv:2604.28190}, |
| url={https://arxiv.org/abs/2604.28190}, |
| year={2026} |
| } |
| ``` |
|
|
| If you have any questions, feel free to reach out by email |
| ([yangjiaw@usc.edu](mailto:yangjiaw@usc.edu)). |
|
|