If you encounter a pipeline loading failure or unexpected output, please contact bili_sakura@zju.edu.cn.
# DiffusionSat Custom Pipelines

Custom community pipelines for loading DiffusionSat checkpoints directly with `diffusers.DiffusionPipeline.from_pretrained()`.
## Model Index

`model_index.json` is set to the default text-to-image pipeline (`DiffusionSatPipeline`), so `DiffusionPipeline.from_pretrained()` works out of the box. The ControlNet variant is loaded via `custom_pipeline` plus the `controlnet` subfolder, as shown below.
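To illustrate how this routing works, here is a minimal sketch of the `_class_name` lookup that `from_pretrained()` performs. The dictionary below is a hypothetical, abbreviated `model_index.json` for illustration only, not the checkpoint's actual file:

```python
import json

# Hypothetical, abbreviated model_index.json contents (illustration only);
# the real file in the checkpoint folder lists every component subfolder.
model_index_text = """
{
  "_class_name": "DiffusionSatPipeline",
  "unet": ["diffusers", "UNet2DConditionModel"],
  "vae": ["diffusers", "AutoencoderKL"],
  "scheduler": ["diffusers", "DDIMScheduler"]
}
"""

# DiffusionPipeline.from_pretrained() reads "_class_name" to decide which
# pipeline class to instantiate; passing custom_pipeline= overrides it.
model_index = json.loads(model_index_text)
print(model_index["_class_name"])
```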
## Available Pipelines

This directory contains two custom pipelines:

- `pipeline_diffusionsat.py`: standard text-to-image pipeline with DiffusionSat metadata support.
- `pipeline_diffusionsat_controlnet.py`: ControlNet pipeline with DiffusionSat metadata and conditional metadata support.
## Setup

The checkpoint folder (`ckpt/diffusionsat/`) should contain the standard diffusers components (`unet`, `vae`, `scheduler`, etc.). You can reference these pipeline files directly from this directory or copy them into your checkpoint folder.
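As a quick sanity check before loading, you can verify that the expected subfolders exist. The component list below is the usual Stable-Diffusion-style diffusers layout and is an assumption; match it to your actual checkpoint:

```python
from pathlib import Path

# Hypothetical local checkpoint path; adjust to your layout
ckpt = Path("ckpt/diffusionsat")

# Typical diffusers component subfolders (assumed; check your checkpoint)
expected = ["unet", "vae", "scheduler", "text_encoder", "tokenizer"]
missing = [name for name in expected if not (ckpt / name).is_dir()]
if missing:
    print("missing components:", missing)
else:
    print("all components present")
```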
For this model card, `demo_images/readme_controlnet.jpeg` was generated from `demo_images/texas_condition_input.png`, using the same sample pair from the Texas validation mini-set.
## Usage

### 1. Text-to-Image Pipeline

Use `pipeline_diffusionsat.py` for standard text-to-image generation.
```python
import torch
from diffusers import DiffusionPipeline

# Load pipeline
pipe = DiffusionPipeline.from_pretrained(
    "path/to/ckpt/diffusionsat",
    custom_pipeline="./pipeline_diffusionsat.py",  # Path to this file
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
pipe = pipe.to("cuda")

# Optional: metadata (normalized lat, lon, timestamp, GSD, etc.)
# metadata = [0.5, -0.3, 0.7, 0.2, 0.1, 0.0, 0.5]

# Generate
image = pipe(
    "satellite image of farmland",
    metadata=None,  # Optional
    height=256,
    width=256,
    num_inference_steps=30,
).images[0]
```
### 2. ControlNet Pipeline

Use `pipeline_diffusionsat_controlnet.py` for ControlNet-conditioned generation.
```python
import torch
import numpy as np
from PIL import Image
from torchvision import transforms
from diffusers import DiffusionPipeline
from controlnet.controlnet_3d import ControlNetModel3D

# 1. Load the Texas 3D ControlNet
controlnet = ControlNetModel3D.from_pretrained(
    "path/to/ckpt/diffusionsat/controlnet",
    torch_dtype=torch.float16,
)

# 2. Load pipeline with ControlNet
pipe = DiffusionPipeline.from_pretrained(
    "path/to/ckpt/diffusionsat",
    controlnet=controlnet,
    custom_pipeline="./pipeline_diffusionsat_controlnet.py",  # Path to this file
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
pipe = pipe.to("cuda")

# 3. Prepare notebook-matched conditioning image
# Sample pair: housing-13176, source=tif.rgb-2016.npy, target=tif.rgb-2018.npy
prep = transforms.Compose([
    transforms.Resize(256, interpolation=transforms.InterpolationMode.BICUBIC, antialias=True),
    transforms.CenterCrop(256),
])
control_image = prep(Image.open("./demo_images/texas_condition_input.png").convert("RGB"))
control_tensor = torch.from_numpy(np.array(control_image)).permute(2, 0, 1).unsqueeze(0).float() / 255.0
control_tensor = control_tensor.to(device="cuda", dtype=torch.float16)

# Optional temporal conditioning metadata (num_metadata x num_frames)
cond_metadata = [[0.0] for _ in range(7)]

# 4. Generate (notebook-aligned settings)
image = pipe(
    prompt="a satlas satellite image of houses built in 2014 covering 0.1929 acres",
    image=control_tensor,
    metadata=None,
    cond_metadata=cond_metadata,
    is_temporal=True,
    height=256,
    width=256,
    num_inference_steps=50,
    guidance_scale=1.0,
).images[0]
```
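The conditioning preprocessing above converts an HxWxC uint8 image into a normalized NCHW float tensor. The same transformation can be checked with NumPy alone on a dummy array (a sketch for verification, not part of the pipeline):

```python
import numpy as np

# Dummy 256x256 RGB uint8 array standing in for the conditioning PNG
arr = np.full((256, 256, 3), 255, dtype=np.uint8)

# NumPy equivalent of the torch .permute(2, 0, 1).unsqueeze(0) / 255.0 step
tensor = arr.transpose(2, 0, 1)[None].astype(np.float32) / 255.0
print(tensor.shape, float(tensor.max()))  # → (1, 3, 256, 256) 1.0
```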