---
license: mit
library_name: pytorch
tags:
- image-generation
- diffusion
- imagenet
- frechet-distance
- fd-loss
datasets:
- imagenet-1k
---
# Representation Fréchet Loss for Visual Generation
[arXiv](https://arxiv.org/abs/2604.28190) | [Code](https://github.com/Jiawei-Yang/FD-Loss)
This repository hosts the released checkpoints and reference data for
**Representation Fréchet Loss for Visual Generation**.
Paper: [Representation Fréchet Loss for Visual Generation](https://arxiv.org/abs/2604.28190).
Code, training scripts, and evaluation utilities are available at:
[github.com/Jiawei-Yang/FD-Loss](https://github.com/Jiawei-Yang/FD-Loss).
FD-Loss post-trains visual generators by matching generated-image feature
distributions to real-image feature distributions in frozen representation spaces.
This release contains base and FD-loss post-trained checkpoints for pMF, iMF, and
JiT models, together with the reference statistics used in the paper.
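For intuition, the quantity being matched is the Fréchet distance between Gaussian fits of the two feature distributions. Below is a minimal, illustrative PyTorch sketch, assuming precomputed `(N, D)` feature batches from a frozen encoder; the function name is ours for illustration, not the repository's loss implementation:

```python
import torch

def frechet_distance(feats_gen: torch.Tensor, feats_real: torch.Tensor) -> torch.Tensor:
    """Fréchet distance between Gaussian fits of two (N, D) feature batches.

    Illustrative sketch only; see the GitHub repository for the released loss.
    """
    mu_g, mu_r = feats_gen.mean(0), feats_real.mean(0)
    cov_g = torch.cov(feats_gen.T)   # (D, D) covariance of generated features
    cov_r = torch.cov(feats_real.T)  # (D, D) covariance of real features
    mean_term = (mu_g - mu_r).square().sum()
    # tr((cov_g @ cov_r)^{1/2}) equals the sum of square roots of the
    # eigenvalues of cov_g @ cov_r; these are real and non-negative for
    # PSD factors, up to numerical noise, which the clamp absorbs.
    eigvals = torch.linalg.eigvals(cov_g @ cov_r)
    trace_sqrt = eigvals.sqrt().real.clamp(min=0).sum()
    return mean_term + cov_g.trace() + cov_r.trace() - 2.0 * trace_sqrt
```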
## Files
```text
checkpoints/
  base/
    iMF-B.pth
    iMF-L.pth
    iMF-XL.pth
    JiT-B.pth
    JiT-L.pth
    JiT-H.pth
    pMF-B_256.pth
    pMF-L_256.pth
    pMF-H_256.pth
    pMF-B_512.pth
    pMF-L_512.pth
    pMF-H_512.pth
  post-trained/
    iMF-B_FD-Inception.pth
    iMF-B_FD-SIM.pth
    iMF-L_FD-Inception.pth
    iMF-L_FD-SIM.pth
    iMF-XL_FD-Inception.pth
    iMF-XL_FD-SIM.pth
    JiT-B_FD-Inception.pth
    JiT-B_FD-SIM.pth
    JiT-L_FD-Inception.pth
    JiT-L_FD-SIM.pth
    JiT-H_FD-Inception.pth
    JiT-H_FD-SIM.pth
    pMF-B_FD-Inception.pth
    pMF-B_FD-SIM.pth
    pMF-L_FD-Inception.pth
    pMF-L_FD-SIM.pth
    pMF-H_FD-Inception.pth
    pMF-H_FD-SIM.pth
    pMF-B_512_FD-SIM.pth
    pMF-L_512_FD-SIM.pth
    pMF-H_512_FD-SIM.pth
data/
  fid_stats/
    paper_ref_stats.pkl
  train.txt
  val.txt
  val_labeled.txt
```
## Download
Install the Hugging Face CLI:
```bash
pip install -U huggingface_hub
```
Download all checkpoints and data files into a clone of the code repository:
```bash
hf download jjiaweiyang/FD-Loss \
  --local-dir . \
  --include "checkpoints/**/*.pth" "data/**"
```
To evaluate only the released post-trained checkpoints:
```bash
hf download jjiaweiyang/FD-Loss \
  --local-dir . \
  --include "checkpoints/post-trained/*.pth" "data/**"
```
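The same filters can also be applied from Python via `huggingface_hub.snapshot_download`, as an alternative to the CLI:

```python
from huggingface_hub import snapshot_download

# Fetch only the post-trained checkpoints and the reference data,
# mirroring the second CLI command above.
snapshot_download(
    repo_id="jjiaweiyang/FD-Loss",
    local_dir=".",
    allow_patterns=["checkpoints/post-trained/*.pth", "data/**"],
)
```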
Then unpack the bundled reference statistics:
```bash
python scripts/extract_paper_ref_stats.py
```
## Evaluation
Run from the root of the GitHub repository:
```bash
# pMF-H (256x256), post-trained with FD-SIM
PRESET=pMF_H_256 \
CKPT_PATH=checkpoints/post-trained/pMF-H_FD-SIM.pth \
GPUS_PER_NODE=8 \
bash scripts/evaluate_released_ckpt.sh

# JiT-H, post-trained with FD-SIM
PRESET=JiT_H \
CKPT_PATH=checkpoints/post-trained/JiT-H_FD-SIM.pth \
GPUS_PER_NODE=8 \
bash scripts/evaluate_released_ckpt.sh

# iMF-XL, post-trained with FD-SIM
PRESET=iMF_XL \
CKPT_PATH=checkpoints/post-trained/iMF-XL_FD-SIM.pth \
GPUS_PER_NODE=8 \
bash scripts/evaluate_released_ckpt.sh
```
Additional presets, smoke-test settings, and the Table 1 ablation, Table 2
repurposing, and Table 3 scalability scripts are documented in the GitHub
repository.
The evaluator reports both raw FD and the normalized metrics used in the paper:
`FDr` is raw FD divided by the validation-set raw FD of the corresponding
representation space, and `FDr-6` is the arithmetic mean of `FDr` over Inception,
ConvNeXt, DINOv2, MAE, SigLIP, and CLIP. The released code uses these
validation-set raw FD values:
| Representation | Inception | ConvNeXt | DINOv2 | MAE | SigLIP | CLIP |
|---|---:|---:|---:|---:|---:|---:|
| valFD | 1.68 | 56.87 | 14.19 | 0.04 | 0.60 | 5.60 |
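For concreteness, here is a short sketch of this normalization in Python; the helper name and dictionary layout are illustrative rather than the evaluator's API, and the values come from the table above:

```python
# Validation-set raw FD per representation space (table above).
VAL_FD = {"Inception": 1.68, "ConvNeXt": 56.87, "DINOv2": 14.19,
          "MAE": 0.04, "SigLIP": 0.60, "CLIP": 5.60}

def normalize_fd(raw_fd: dict) -> dict:
    """Map raw FD per representation space to FDr, plus their mean FDr-6."""
    fdr = {name: raw_fd[name] / VAL_FD[name] for name in VAL_FD}
    fdr["FDr-6"] = sum(fdr[name] for name in VAL_FD) / len(VAL_FD)
    return fdr
```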
To reproduce these normalizers from ImageNet validation images:
```bash
DATA_ROOT=/path/to/imagenet \
torchrun --nproc_per_node=8 scripts/compute_valfd.py \
  --data_root "$DATA_ROOT"
```
## Citation
```bibtex
@article{yang2026fdloss,
title={Representation Fréchet Loss for Visual Generation},
author={Yang, Jiawei and Geng, Zhengyang and Ju, Xuan and Tian, Yonglong and Wang, Yue},
journal={arXiv preprint arXiv:2604.28190},
url={https://arxiv.org/abs/2604.28190},
year={2026}
}
```
If you have any questions, feel free to reach out by email
([yangjiaw@usc.edu](mailto:yangjiaw@usc.edu)).