---
license: mit
library_name: pytorch
tags:
- image-generation
- diffusion
- imagenet
- frechet-distance
- fd-loss
datasets:
- imagenet-1k
---
# Representation Fréchet Loss for Visual Generation

This repository hosts the released checkpoints and reference data for the paper [Representation Fréchet Loss for Visual Generation](https://arxiv.org/abs/2604.28190).

Code, training scripts, and evaluation utilities are available at https://github.com/Jiawei-Yang/FD-Loss.

FD-Loss post-trains visual generators by matching the feature distributions of generated images to those of real images in frozen representation spaces. This release contains base and FD-Loss post-trained checkpoints for pMF, iMF, and JiT models, together with the reference statistics used in the paper.
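For intuition, the Fréchet distance underlying FID-style metrics measures the gap between two Gaussian fits of feature statistics. Below is an illustrative numpy-only sketch (not the repository's implementation, which lives in the GitHub code) of the squared Fréchet distance between N(mu1, S1) and N(mu2, S2):

```python
import numpy as np

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Squared Frechet distance between two Gaussians.

    FD^2 = ||mu1 - mu2||^2 + tr(S1 + S2 - 2 (S1 S2)^{1/2}).
    The trace term is computed via the symmetric form
    S1^{1/2} S2 S1^{1/2}, which has real non-negative eigenvalues.
    """
    diff = mu1 - mu2
    vals1, vecs1 = np.linalg.eigh(sigma1)
    sqrt_s1 = (vecs1 * np.sqrt(np.clip(vals1, 0, None))) @ vecs1.T
    inner = sqrt_s1 @ sigma2 @ sqrt_s1
    tr_covmean = np.sqrt(np.clip(np.linalg.eigvalsh(inner), 0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2)
                 - 2.0 * tr_covmean)
```

Identical statistics give a distance of zero; for 1-D Gaussians with equal variance the distance reduces to the squared mean gap.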
## Files

```
checkpoints/
  base/
    iMF-B.pth
    iMF-L.pth
    iMF-XL.pth
    JiT-B.pth
    JiT-L.pth
    JiT-H.pth
    pMF-B_256.pth
    pMF-L_256.pth
    pMF-H_256.pth
    pMF-B_512.pth
    pMF-L_512.pth
    pMF-H_512.pth
  post-trained/
    iMF-B_FD-Inception.pth
    iMF-B_FD-SIM.pth
    iMF-L_FD-Inception.pth
    iMF-L_FD-SIM.pth
    iMF-XL_FD-Inception.pth
    iMF-XL_FD-SIM.pth
    JiT-B_FD-Inception.pth
    JiT-B_FD-SIM.pth
    JiT-L_FD-Inception.pth
    JiT-L_FD-SIM.pth
    JiT-H_FD-Inception.pth
    JiT-H_FD-SIM.pth
    pMF-B_FD-Inception.pth
    pMF-B_FD-SIM.pth
    pMF-L_FD-Inception.pth
    pMF-L_FD-SIM.pth
    pMF-H_FD-Inception.pth
    pMF-H_FD-SIM.pth
    pMF-B_512_FD-SIM.pth
    pMF-L_512_FD-SIM.pth
    pMF-H_512_FD-SIM.pth
data/
  fid_stats/
    paper_ref_stats.pkl
  train.txt
  val.txt
  val_labeled.txt
```
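A minimal sketch (not part of the release) for inspecting a downloaded checkpoint with plain PyTorch. The contents of the `.pth` files are an assumption here; consult the GitHub repository's loading code for the exact state-dict layout each model expects.

```python
import torch

def inspect_checkpoint(path: str):
    """Load a checkpoint on CPU and list its top-level keys."""
    ckpt = torch.load(path, map_location="cpu")
    if isinstance(ckpt, dict):
        return sorted(ckpt.keys())
    return type(ckpt).__name__
```

Loading on CPU first avoids device mismatches when the checkpoint was saved on a different GPU layout.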
## Download

Install the Hugging Face CLI:

```bash
pip install -U huggingface_hub
```

Download all checkpoints and data files into a clone of the code repository:

```bash
hf download jjiaweiyang/FD-Loss \
  --local-dir . \
  --include "checkpoints/**/*.pth" \
  --include "data/**"
```

To evaluate the released checkpoints only:

```bash
hf download jjiaweiyang/FD-Loss \
  --local-dir . \
  --include "checkpoints/post-trained/*.pth" \
  --include "data/**"
```

Then unpack the bundled reference statistics:

```bash
python scripts/extract_paper_ref_stats.py
```
## Evaluation

Run from the root of the GitHub repository:

```bash
PRESET=pMF_H_256 \
CKPT_PATH=checkpoints/post-trained/pMF-H_FD-SIM.pth \
GPUS_PER_NODE=8 \
bash scripts/evaluate_released_ckpt.sh
```

```bash
PRESET=JiT_H \
CKPT_PATH=checkpoints/post-trained/JiT-H_FD-SIM.pth \
GPUS_PER_NODE=8 \
bash scripts/evaluate_released_ckpt.sh
```

```bash
PRESET=iMF_XL \
CKPT_PATH=checkpoints/post-trained/iMF-XL_FD-SIM.pth \
GPUS_PER_NODE=8 \
bash scripts/evaluate_released_ckpt.sh
```

Additional presets, smoke-test settings, and the scripts for the Table 1 ablation, Table 2 repurposing, and Table 3 scalability experiments are documented in the GitHub repository.
The evaluator reports both raw FD and the paper's normalized metrics: FDr is raw FD divided by the validation-set raw FD for the corresponding representation space, and FDr-6 is the arithmetic mean of FDr over the Inception, ConvNeXt, DINOv2, MAE, SigLIP, and CLIP spaces. The released code uses these validation-set raw FD values:

| Representation | Inception | ConvNeXt | DINOv2 | MAE | SigLIP | CLIP |
|---|---|---|---|---|---|---|
| valFD | 1.68 | 56.87 | 14.19 | 0.04 | 0.60 | 5.60 |
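The normalization above is a simple rescaling; a hedged sketch (the released evaluator is the authoritative implementation) using the valFD values from the table:

```python
# Validation-set raw FD per representation space, from the table above.
VAL_FD = {
    "Inception": 1.68, "ConvNeXt": 56.87, "DINOv2": 14.19,
    "MAE": 0.04, "SigLIP": 0.60, "CLIP": 5.60,
}

def fdr(raw_fd: dict) -> dict:
    """FDr: raw FD divided by the validation-set raw FD, per space."""
    return {k: raw_fd[k] / VAL_FD[k] for k in VAL_FD}

def fdr6(raw_fd: dict) -> float:
    """FDr-6: arithmetic mean of FDr over the six representation spaces."""
    scores = fdr(raw_fd)
    return sum(scores.values()) / len(scores)
```

By construction, a model whose raw FDs equal the validation-set values scores FDr-6 = 1.0.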
To reproduce these normalizers from ImageNet validation images:

```bash
DATA_ROOT=/path/to/imagenet \
torchrun --nproc_per_node=8 scripts/compute_valfd.py \
  --data_root "$DATA_ROOT"
```
## Citation

```bibtex
@article{yang2026fdloss,
  title={Representation Fréchet Loss for Visual Generation},
  author={Yang, Jiawei and Geng, Zhengyang and Ju, Xuan and Tian, Yonglong and Wang, Yue},
  journal={arXiv:2604.28190},
  url={https://arxiv.org/abs/2604.28190},
  year={2026}
}
```
If you have any questions, feel free to contact me by email (yangjiaw@usc.edu).