Image Translation Checkpoint Collections
We have not fully validated the checkpoint conversion. If you encounter a pipeline loading failure or unexpected output, please contact bili_sakura@zju.edu.cn.
Packaged BBDM/LBBDM checkpoints converted from original BBDM-format weights to:

- `unet/config.json`
- `unet/diffusion_pytorch_model.safetensors`
- `scheduler/scheduler_config.json`

These checkpoints use OpenAI-style BBDM UNet keys (`input_blocks.*`, `middle_block.*`, `output_blocks.*`). Use the community BBDM loader from pytorch-image-translation-models.
| Model variant | Domain |
|---|---|
| edges2handbags-f4 | Edges -> Handbags |
| edges2shoes-f4 | Edges -> Shoes |
| faces2comics-f4 | Faces -> Comics |
| CelebAMaskHQ-f4 | CelebAMaskHQ |
| CelebAMaskHQ-f8 | CelebAMaskHQ |
| CelebAMaskHQ-f16 | CelebAMaskHQ |
```
BBDM-ckpt/
  edges2shoes-f4/
    unet/
      config.json
      diffusion_pytorch_model.safetensors
    scheduler/
      scheduler_config.json
    conversion_status.json
  ...
```
```python
import torch
from diffusers import VQModel
from examples.community.bbdm import load_bbdm_community_pipeline

device = "cuda"
ckpt_root = "/root/worksapce/models/BiliSakura/BBDM-ckpt"

# Example: edges2shoes-f4 <-> vqgan_f4 pairing
pipe = load_bbdm_community_pipeline(f"{ckpt_root}/edges2shoes-f4", device=device)
vqvae = VQModel.from_pretrained(f"{ckpt_root}/vqgan_f4/vqvae").to(device).eval()

# Input image should be normalized to [-1, 1], shape [B, 3, 256, 256].
x = torch.rand(1, 3, 256, 256, device=device) * 2 - 1

with torch.no_grad():
    x_latent = vqvae.encode(x).latents  # [B, 3, 64, 64]
    y_latent = pipe(source_image=x_latent, num_inference_steps=200, output_type="pt").images
    y = vqvae.decode(y_latent).sample   # [B, 3, 256, 256]

print(x_latent.shape, y_latent.shape, y.shape)
```
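For real inputs, the `[-1, 1]` normalization noted in the snippet above has to be applied to the loaded image. A minimal sketch of that preprocessing in NumPy (image loading is omitted; the helper name `to_model_input` is illustrative, not part of the repo):

```python
import numpy as np

def to_model_input(img_uint8: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 uint8 image to a 1x3xHxW float32 array in [-1, 1]."""
    x = img_uint8.astype(np.float32) / 255.0   # scale to [0, 1]
    x = x * 2.0 - 1.0                          # shift to [-1, 1]
    x = np.transpose(x, (2, 0, 1))[None, ...]  # HWC -> NCHW with batch dim
    return x

# Dummy half-black / half-white 256x256 image instead of a file load.
img = np.zeros((256, 256, 3), dtype=np.uint8)
img[:, 128:, :] = 255
x = to_model_input(img)
print(x.shape, x.min(), x.max())  # (1, 3, 256, 256) -1.0 1.0
```

The resulting array can be converted with `torch.from_numpy(x)` before being passed to the VQGAN encoder.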
Note: `num_inference_steps` should be >= 3 for linear skip sampling.
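For intuition about that constraint, a linear skip schedule picks evenly spaced timesteps between the last training step and 0. The sketch below is illustrative only (the actual schedule lives inside the community pipeline) and assumes 1000 training steps:

```python
import numpy as np

def linear_skip_timesteps(num_train_steps: int, num_inference_steps: int) -> np.ndarray:
    """Evenly spaced (linearly skipped) timesteps from t = T-1 down to t = 0."""
    if num_inference_steps < 3:
        raise ValueError("num_inference_steps should be >= 3 for linear skip sampling")
    return np.linspace(num_train_steps - 1, 0, num_inference_steps).round().astype(int)

print(linear_skip_timesteps(1000, 5).tolist())  # [999, 749, 500, 250, 0]
```

With fewer than 3 steps there is no intermediate timestep between the endpoints, which is why the loader enforces the lower bound.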
Use VQGAN variants that match the UNet latent tensor shape (`in_channels`, `image_size`):

| BBDM checkpoint | UNet latent shape | Recommended VQGAN |
|---|---|---|
| edges2handbags-f4 | [B, 3, 64, 64] | vqgan_f4/vqvae |
| edges2shoes-f4 | [B, 3, 64, 64] | vqgan_f4/vqvae |
| faces2comics-f4 | [B, 3, 64, 64] | vqgan_f4/vqvae |
| CelebAMaskHQ-f4 | [B, 3, 64, 64] | vqgan_f4/vqvae |
| CelebAMaskHQ-f8 | [B, 4, 32, 32] | vqgan_f8/vqvae |
| CelebAMaskHQ-f16 | [B, 8, 16, 16] | vqgan_f16/vqvae |
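The pairing rule above follows from the VQGAN downsampling factor: an `fN` model maps a 256x256 image to a `256/N x 256/N` latent grid. A minimal sketch of that bookkeeping (channel counts taken from the table; the helper is illustrative, not repo code):

```python
# Latent channels per VQGAN variant, as listed in the table above.
LATENT_CHANNELS = {4: 3, 8: 4, 16: 8}

def latent_shape(factor: int, image_size: int = 256, batch: int = 1) -> tuple:
    """Expected latent tensor shape for a vqgan_f{factor} encoder."""
    spatial = image_size // factor
    return (batch, LATENT_CHANNELS[factor], spatial, spatial)

for f in (4, 8, 16):
    print(f"vqgan_f{f}:", latent_shape(f))
# vqgan_f4: (1, 3, 64, 64)
# vqgan_f8: (1, 4, 32, 32)
# vqgan_f16: (1, 8, 16, 16)
```

Comparing this against the UNet's `config.json` (`in_channels`, `sample_size`) before sampling is a quick way to catch a mismatched pairing.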
These converted checkpoints are generally not compatible with `src.BBDMPipeline.from_pretrained(...)`, because the native src BBDM wrapper expects diffusers `UNet2DModel` weight keys. Use `examples.community.bbdm.load_bbdm_community_pipeline(...)` instead.

To convert the raw checkpoints yourself:

```
python -m examples.community.bbdm.convert_ckpt_to_unet \
    --raw-root "/root/worksapce/models/raw/BBDM Checkpoints" \
    --output-root "/root/worksapce/models/BiliSakura/BBDM-ckpt"
```
```bibtex
@inproceedings{li2023bbdm,
  title={BBDM: Image-to-Image Translation with Brownian Bridge Diffusion Models},
  author={Li, Bo and Xue, Kang and Liu, Bin and Lai, Yu-Kun},
  booktitle={CVPR},
  year={2023}
}
```