---
license: mit
library_name: pytorch
tags:
- image-generation
- diffusion
- imagenet
- frechet-distance
- fd-loss
datasets:
- imagenet-1k
---
# Representation Fréchet Loss for Visual Generation
[![arXiv](https://img.shields.io/badge/arXiv-2604.28190-b31b1b.svg)](https://arxiv.org/abs/2604.28190)
[![GitHub](https://img.shields.io/badge/GitHub-code-181717.svg)](https://github.com/Jiawei-Yang/FD-Loss)
This repository hosts the released checkpoints and reference data for
**Representation Fréchet Loss for Visual Generation**.
Paper: [Representation Fréchet Loss for Visual Generation](https://arxiv.org/abs/2604.28190).
Code, training scripts, and evaluation utilities are available at:
[github.com/Jiawei-Yang/FD-Loss](https://github.com/Jiawei-Yang/FD-Loss).
FD-Loss post-trains visual generators by matching generated-image feature
distributions to real-image feature distributions in frozen representation spaces.
This release contains base and FD-loss post-trained checkpoints for pMF, iMF, and
JiT models, together with the reference statistics used by the paper.
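The Fréchet distance underlying FD-Loss compares two feature distributions through their Gaussian statistics (mean and covariance). As a minimal sketch of that metric, not code from the released repository:

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between Gaussians N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrtm(sigma1 @ sigma2))."""
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):  # discard tiny imaginary parts from numerics
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# Toy check: identical statistics give (numerically) zero distance.
rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 8))
mu, sigma = feats.mean(axis=0), np.cov(feats, rowvar=False)
print(frechet_distance(mu, sigma, mu, sigma))
```

In practice the statistics are computed over frozen-encoder features of generated and real images; this sketch only shows the closed-form distance itself.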
## Files
```text
checkpoints/
  base/
    iMF-B.pth
    iMF-L.pth
    iMF-XL.pth
    JiT-B.pth
    JiT-L.pth
    JiT-H.pth
    pMF-B_256.pth
    pMF-L_256.pth
    pMF-H_256.pth
    pMF-B_512.pth
    pMF-L_512.pth
    pMF-H_512.pth
  post-trained/
    iMF-B_FD-Inception.pth
    iMF-B_FD-SIM.pth
    iMF-L_FD-Inception.pth
    iMF-L_FD-SIM.pth
    iMF-XL_FD-Inception.pth
    iMF-XL_FD-SIM.pth
    JiT-B_FD-Inception.pth
    JiT-B_FD-SIM.pth
    JiT-L_FD-Inception.pth
    JiT-L_FD-SIM.pth
    JiT-H_FD-Inception.pth
    JiT-H_FD-SIM.pth
    pMF-B_FD-Inception.pth
    pMF-B_FD-SIM.pth
    pMF-L_FD-Inception.pth
    pMF-L_FD-SIM.pth
    pMF-H_FD-Inception.pth
    pMF-H_FD-SIM.pth
    pMF-B_512_FD-SIM.pth
    pMF-L_512_FD-SIM.pth
    pMF-H_512_FD-SIM.pth
data/
  fid_stats/
    paper_ref_stats.pkl
  train.txt
  val.txt
  val_labeled.txt
```
## Download
Install the Hugging Face CLI:
```bash
pip install -U huggingface_hub
```
Download all checkpoints and data files into a clone of the code repository:
```bash
hf download jjiaweiyang/FD-Loss \
  --local-dir . \
  --include "checkpoints/**/*.pth" \
  --include "data/**"
```
To evaluate only the released post-trained checkpoints:
```bash
hf download jjiaweiyang/FD-Loss \
  --local-dir . \
  --include "checkpoints/post-trained/*.pth" \
  --include "data/**"
```
Then unpack the bundled reference statistics:
```bash
python scripts/extract_paper_ref_stats.py
```
## Evaluation
Run from the root of the GitHub repository:
```bash
PRESET=pMF_H_256 \
CKPT_PATH=checkpoints/post-trained/pMF-H_FD-SIM.pth \
GPUS_PER_NODE=8 \
bash scripts/evaluate_released_ckpt.sh

PRESET=JiT_H \
CKPT_PATH=checkpoints/post-trained/JiT-H_FD-SIM.pth \
GPUS_PER_NODE=8 \
bash scripts/evaluate_released_ckpt.sh

PRESET=iMF_XL \
CKPT_PATH=checkpoints/post-trained/iMF-XL_FD-SIM.pth \
GPUS_PER_NODE=8 \
bash scripts/evaluate_released_ckpt.sh
```
Additional presets, smoke-test settings, and the Table 1 ablation, Table 2
repurposing, and Table 3 scalability scripts are documented in the GitHub
repository.
The evaluator reports both raw FD and the paper-normalized metrics. `FDr` is raw
FD divided by the validation-set raw FD for the corresponding representation
space, and `FDr-6` is the arithmetic mean over Inception, ConvNeXt, DINOv2, MAE,
SigLIP, and CLIP. The released code uses these validation-set raw FD values:
| Representation | Inception | ConvNeXt | DINOv2 | MAE | SigLIP | CLIP |
|---|---:|---:|---:|---:|---:|---:|
| valFD | 1.68 | 56.87 | 14.19 | 0.04 | 0.60 | 5.60 |
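The normalization described above can be sketched as follows. This is an illustration, not code from the released repository; the helper names are hypothetical, and the valFD values are copied from the table above.

```python
# Validation-set raw FD per representation space (from the table above).
VAL_FD = {
    "Inception": 1.68, "ConvNeXt": 56.87, "DINOv2": 14.19,
    "MAE": 0.04, "SigLIP": 0.60, "CLIP": 5.60,
}

def fdr(raw_fd, space):
    """FDr: raw FD divided by the validation-set raw FD of that space."""
    return raw_fd / VAL_FD[space]

def fdr6(raw_fds):
    """FDr-6: arithmetic mean of FDr over the six representation spaces."""
    assert set(raw_fds) == set(VAL_FD), "all six spaces are required"
    return sum(fdr(v, k) for k, v in raw_fds.items()) / len(raw_fds)
```

For example, a model whose raw FD equals the validation-set raw FD in every space would score FDr-6 = 1.0.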
To reproduce these normalizers from ImageNet validation images:
```bash
DATA_ROOT=/path/to/imagenet \
torchrun --nproc_per_node=8 scripts/compute_valfd.py \
--data_root "$DATA_ROOT"
```
## Citation
```bibtex
@article{yang2026fdloss,
  title={Representation Fr\'echet Loss for Visual Generation},
  author={Yang, Jiawei and Geng, Zhengyang and Ju, Xuan and Tian, Yonglong and Wang, Yue},
  journal={arXiv:2604.28190},
  url={https://arxiv.org/abs/2604.28190},
  year={2026}
}
```
If you have any questions, feel free to reach out via email
([yangjiaw@usc.edu](mailto:yangjiaw@usc.edu)).