---
license: mit
library_name: pytorch
tags:
- image-generation
- diffusion
- imagenet
- frechet-distance
- fd-loss
datasets:
- imagenet-1k
---
# Representation Fréchet Loss for Visual Generation

This repository hosts the released checkpoints and reference data for the paper [Representation Fréchet Loss for Visual Generation](https://arxiv.org/abs/2604.28190).

Code, training scripts, and evaluation utilities are available at https://github.com/Jiawei-Yang/FD-Loss.

FD-Loss post-trains visual generators by matching the feature distributions of generated images to those of real images in frozen representation spaces. This release contains base and FD-Loss post-trained checkpoints for pMF, iMF, and JiT models, together with the reference statistics used in the paper.
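For intuition, the Fréchet distance underlying FID-style metrics measures the gap between two Gaussian fits of feature statistics. Below is an illustrative numpy-only sketch (not the repository's implementation, which lives in the GitHub code) of the squared Fréchet distance between N(mu1, S1) and N(mu2, S2):

```python
import numpy as np

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Squared Frechet distance between two Gaussians.

    FD^2 = ||mu1 - mu2||^2 + tr(S1 + S2 - 2 (S1 S2)^{1/2}).
    The trace term is computed via the symmetric form
    S1^{1/2} S2 S1^{1/2}, which has real non-negative eigenvalues.
    """
    diff = mu1 - mu2
    vals1, vecs1 = np.linalg.eigh(sigma1)
    sqrt_s1 = (vecs1 * np.sqrt(np.clip(vals1, 0, None))) @ vecs1.T
    inner = sqrt_s1 @ sigma2 @ sqrt_s1
    tr_covmean = np.sqrt(np.clip(np.linalg.eigvalsh(inner), 0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2)
                 - 2.0 * tr_covmean)
```

Identical statistics give a distance of zero; for 1-D Gaussians with equal variance the distance reduces to the squared mean gap.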
## Files

```
checkpoints/
  base/
    iMF-B.pth
    iMF-L.pth
    iMF-XL.pth
    JiT-B.pth
    JiT-L.pth
    JiT-H.pth
    pMF-B_256.pth
    pMF-L_256.pth
    pMF-H_256.pth
    pMF-B_512.pth
    pMF-L_512.pth
    pMF-H_512.pth
  post-trained/
    iMF-B_FD-Inception.pth
    iMF-B_FD-SIM.pth
    iMF-L_FD-Inception.pth
    iMF-L_FD-SIM.pth
    iMF-XL_FD-Inception.pth
    iMF-XL_FD-SIM.pth
    JiT-B_FD-Inception.pth
    JiT-B_FD-SIM.pth
    JiT-L_FD-Inception.pth
    JiT-L_FD-SIM.pth
    JiT-H_FD-Inception.pth
    JiT-H_FD-SIM.pth
    pMF-B_FD-Inception.pth
    pMF-B_FD-SIM.pth
    pMF-L_FD-Inception.pth
    pMF-L_FD-SIM.pth
    pMF-H_FD-Inception.pth
    pMF-H_FD-SIM.pth
    pMF-B_512_FD-SIM.pth
    pMF-L_512_FD-SIM.pth
    pMF-H_512_FD-SIM.pth
data/
  fid_stats/
    paper_ref_stats.pkl
  train.txt
  val.txt
  val_labeled.txt
```
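A minimal sketch (not part of the release) for inspecting a downloaded checkpoint with plain PyTorch. The contents of the `.pth` files are an assumption here; consult the GitHub repository's loading code for the exact state-dict layout each model expects.

```python
import torch

def inspect_checkpoint(path: str):
    """Load a checkpoint on CPU and list its top-level keys."""
    ckpt = torch.load(path, map_location="cpu")
    if isinstance(ckpt, dict):
        return sorted(ckpt.keys())
    return type(ckpt).__name__
```

Loading on CPU first avoids device mismatches when the checkpoint was saved on a different GPU layout.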
## Download

Install the Hugging Face CLI:

```bash
pip install -U huggingface_hub
```

Download all checkpoints and data files into a clone of the code repository:

```bash
hf download jjiaweiyang/FD-Loss \
  --local-dir . \
  --include "checkpoints/**/*.pth" \
  --include "data/**"
```

To evaluate the released checkpoints only:

```bash
hf download jjiaweiyang/FD-Loss \
  --local-dir . \
  --include "checkpoints/post-trained/*.pth" \
  --include "data/**"
```

Then unpack the bundled reference statistics:

```bash
python scripts/extract_paper_ref_stats.py
```
## Evaluation

Run from the root of the GitHub repository:

```bash
PRESET=pMF_H_256 \
CKPT_PATH=checkpoints/post-trained/pMF-H_FD-SIM.pth \
GPUS_PER_NODE=8 \
bash scripts/evaluate_released_ckpt.sh
```

```bash
PRESET=JiT_H \
CKPT_PATH=checkpoints/post-trained/JiT-H_FD-SIM.pth \
GPUS_PER_NODE=8 \
bash scripts/evaluate_released_ckpt.sh
```

```bash
PRESET=iMF_XL \
CKPT_PATH=checkpoints/post-trained/iMF-XL_FD-SIM.pth \
GPUS_PER_NODE=8 \
bash scripts/evaluate_released_ckpt.sh
```

Additional presets, smoke-test settings, and the scripts for the Table 1 ablation, Table 2 repurposing, and Table 3 scalability experiments are documented in the GitHub repository.
The evaluator reports both raw FD and the paper's normalized metrics: FDr is raw FD divided by the validation-set raw FD for the corresponding representation space, and FDr-6 is the arithmetic mean of FDr over the Inception, ConvNeXt, DINOv2, MAE, SigLIP, and CLIP spaces. The released code uses these validation-set raw FD values:

| Representation | Inception | ConvNeXt | DINOv2 | MAE | SigLIP | CLIP |
|---|---|---|---|---|---|---|
| valFD | 1.68 | 56.87 | 14.19 | 0.04 | 0.60 | 5.60 |
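The normalization above is a simple rescaling; a hedged sketch (the released evaluator is the authoritative implementation) using the valFD values from the table:

```python
# Validation-set raw FD per representation space, from the table above.
VAL_FD = {
    "Inception": 1.68, "ConvNeXt": 56.87, "DINOv2": 14.19,
    "MAE": 0.04, "SigLIP": 0.60, "CLIP": 5.60,
}

def fdr(raw_fd: dict) -> dict:
    """FDr: raw FD divided by the validation-set raw FD, per space."""
    return {k: raw_fd[k] / VAL_FD[k] for k in VAL_FD}

def fdr6(raw_fd: dict) -> float:
    """FDr-6: arithmetic mean of FDr over the six representation spaces."""
    scores = fdr(raw_fd)
    return sum(scores.values()) / len(scores)
```

By construction, a model whose raw FDs equal the validation-set values scores FDr-6 = 1.0.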
To reproduce these normalizers from ImageNet validation images:

```bash
DATA_ROOT=/path/to/imagenet \
torchrun --nproc_per_node=8 scripts/compute_valfd.py \
  --data_root "$DATA_ROOT"
```
## Citation

```bibtex
@article{yang2026fdloss,
  title={Representation Fréchet Loss for Visual Generation},
  author={Yang, Jiawei and Geng, Zhengyang and Ju, Xuan and Tian, Yonglong and Wang, Yue},
  journal={arXiv:2604.28190},
  url={https://arxiv.org/abs/2604.28190},
  year={2026}
}
```
If you have any questions, feel free to contact me by email (yangjiaw@usc.edu).