Dataset paper: "Aligning Latent and Image Spaces to Connect the Unconnectable" (arXiv:2104.06954), which introduced the Landscapes HQ (LHQ) dataset.
A 256x256 unconditional DDPM that generates natural landscape images. Full fine-tune of google/ddpm-church-256 on the Landscapes HQ (LHQ) dataset.

```python
import torch
from diffusers import DiffusionPipeline

# switch to "mps" for Apple devices
pipe = DiffusionPipeline.from_pretrained(
    "crab27/ddpm-landscape", dtype=torch.bfloat16, device_map="cuda"
)

# unconditional model: sample from pure noise (no text prompt)
image = pipe().images[0]
```
```python
# !pip install diffusers
from diffusers import DDPMPipeline

model_id = "crab27/ddpm-landscape"

# load model and scheduler; swap in DDIMPipeline or PNDMPipeline for faster inference
ddpm = DDPMPipeline.from_pretrained(model_id)

# run the pipeline (sample random noise and denoise)
image = ddpm().images[0]

# save the image
image.save("ddpm_generated_image.png")
```
Base model: google/ddpm-church-256, the original 256x256 DDPM by Ho et al. All credit for the base architecture and pretrained weights goes to the original authors.

```bibtex
@article{ALIS,
  title   = {Aligning Latent and Image Spaces to Connect the Unconnectable},
  author  = {Skorokhodov, Ivan and Sotnikov, Grigorii and Elhoseiny, Mohamed},
  journal = {arXiv preprint arXiv:2104.06954},
  year    = {2021}
}
```
| Hyperparameter | Value |
|---|---|
| Base model | google/ddpm-church-256 |
| Dataset | LHQ (256x256) |
| Epochs | 50 |
| Batch size | 32 |
| Optimizer | AdamW |
| Learning rate | 1e-5 (cosine schedule, 500 warmup steps) |
| Loss | MSE on predicted noise |
| Augmentation | Random horizontal flip |
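The training setup in the table can be sketched as a single optimization step: noise a clean image according to the DDPM forward process, predict the noise, take an MSE loss, and step AdamW under a warmup-then-cosine learning-rate schedule. This is a minimal, self-contained illustration, not the actual training script: the tiny conv net standing in for the UNet, the total step count, and the linear beta schedule bounds are assumptions for the sake of a runnable example.

```python
import math
import torch
import torch.nn.functional as F

# Hypothetical stand-in for the UNet so the sketch runs anywhere
# (the real model is a UNet that also conditions on the timestep t).
model = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

# 500 warmup steps as in the table; total step count is illustrative.
warmup, total = 500, 10_000

def lr_lambda(step):
    # linear warmup, then cosine decay to zero
    if step < warmup:
        return step / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_lambda)

# Standard DDPM noise schedule (linear betas, as in Ho et al.; bounds assumed).
betas = torch.linspace(1e-4, 0.02, 1000)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

x0 = torch.randn(4, 3, 32, 32)      # a batch of (normalized) images
t = torch.randint(0, 1000, (4,))    # random timesteps per sample
eps = torch.randn_like(x0)
a = alpha_bar[t].view(-1, 1, 1, 1)
x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps  # forward noising

pred = model(x_t)                   # real UNet: model(x_t, t).sample
loss = F.mse_loss(pred, eps)        # MSE on the predicted noise
loss.backward()
opt.step()
sched.step()
opt.zero_grad()
```

One step per batch of this loop, repeated for 50 epochs over LHQ with random horizontal flips, reproduces the recipe in the table.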