Learning Illumination Control in Diffusion Models — Model weights (HF)

Weights for Learning Illumination Control in Diffusion Models (ReALM-GEN @ ICLR 2026).

  • Architecture: StableDiffusionInstructPix2PixPipeline (SD 1.5)
  • Inputs: RGB image (512×512 domain) + instruction string
  • Output: RGB relit image

Download weights (CLI)

pip install -U "huggingface_hub[cli]"

huggingface-cli download nishitanand/sd-image-relighting-model \
  --repo-type model \
  --local-dir ./sd-image-relighting-model

Or in Python / Diffusers, load directly by repo id (no local dir required).
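
If you prefer to stay in Python for the download itself, huggingface_hub's snapshot_download fetches the same snapshot (a minimal sketch; the local_dir value is illustrative):

from huggingface_hub import snapshot_download

# Mirrors the CLI command above and returns the local snapshot path
local_dir = snapshot_download(
    repo_id="nishitanand/sd-image-relighting-model",
    local_dir="./sd-image-relighting-model",
)
print(local_dir)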


Inference (Python)

import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

model_id = "nishitanand/sd-image-relighting-model"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
).to("cuda")
pipe.safety_checker = None

image = Image.open("degraded_input.png").convert("RGB")
instruction = (
    "Soft window light from the left with warm highlights and gentle shadows "
    "across the face."
)

out = pipe(
    instruction,
    image=image,
    num_inference_steps=50,
    image_guidance_scale=1.5,  # fidelity to the input image
    guidance_scale=7.5,        # adherence to the text instruction
).images[0]

out.save("relit.png")
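
For repeatable outputs, pass a seeded generator; this is the standard Diffusers pattern, and the seed value below is arbitrary:

generator = torch.Generator(device="cuda").manual_seed(42)

out = pipe(
    instruction,
    image=image,
    num_inference_steps=50,
    image_guidance_scale=1.5,
    guidance_scale=7.5,
    generator=generator,
).images[0]

Raising image_guidance_scale pulls the result toward the input image, while raising guidance_scale pulls it toward the text instruction, so the two can be traded off against each other.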

Tips

  • Use prompts in the same long, descriptive style as training (see the dataset / paper).
  • Input should resemble the synthetic degraded domain the model was trained on for best identity preservation; a preprocessing sketch for matching the 512×512 input domain follows.
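
A minimal preprocessing sketch for the resizing mentioned above; the prepare_input helper and its center-crop strategy are illustrative, not part of the released code:

from PIL import Image

def prepare_input(path: str, size: int = 512) -> Image.Image:
    # Hypothetical helper: center-crop to a square, then resize to 512x512
    img = Image.open(path).convert("RGB")
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side))
    return img.resize((size, size), Image.LANCZOS)

image = prepare_input("degraded_input.png")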

Inference (project script)

After cloning the code repository:

cd inference
pip install -r requirements.txt

python inference.py \
  --model_path nishitanand/sd-image-relighting-model \
  --input_image /path/to/degraded.png \
  --instruction "Your full-sentence lighting description here." \
  --output_path ./relit.png

For full training dependencies, you can instead use pip install -r training/sd1_5/requirements.txt.

Repository layout (this HF model repo)

At the root of the snapshot you should see a standard Diffusers tree, for example:

  • model_index.json
  • unet/, vae/, text_encoder/, tokenizer/
  • scheduler/, feature_extractor/, safety_checker/

Optional: checkpoint-* (e.g. checkpoint-13000/) is a raw Accelerate training snapshot. from_pretrained, inference.py, and evaluate_models.py load the repository root where model_index.json lives; they do not automatically use checkpoint-*. Point --model_path / --trained_model_path at ./hf_model (or the Hub id), not at checkpoint-*, unless you have re-exported that step as a standalone Diffusers layout.
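
A quick sanity check that a local path is a loadable Diffusers root rather than a raw checkpoint-* directory (this sketch only assumes the layout listed above):

import json
from pathlib import Path

root = Path("./sd-image-relighting-model")  # or ./hf_model
index_path = root / "model_index.json"
assert index_path.exists(), f"{root} has no model_index.json; not a Diffusers root"

index = json.loads(index_path.read_text())
print(index["_class_name"])  # expect: StableDiffusionInstructPix2PixPipeline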


Limitations

  • Best on face imagery similar to training; strong out-of-domain poses or resolutions may fail.
  • Not a physically-based renderer; extreme prompts can cause drift.

Citation (BibTeX)

@article{anand2026learning,
  title={Learning Illumination Control in Diffusion Models},
  author={Anand, Nishit and Suri, Manan and Metzler, Christopher and Manocha, Dinesh and Duraiswami, Ramani},
  journal={arXiv preprint arXiv:2604.24877},
  year={2026},
  note={ReALM-GEN @ ICLR 2026}
}

License

This model repository is released under CC BY-NC-SA 4.0 (see LICENSE at the repo root when mirrored from GitHub). Base Stable Diffusion components remain subject to their original licenses.

FFHQ. Weights were trained on data derived from Flickr-Faces-HQ (FFHQ). The FFHQ dataset package is distributed by NVIDIA under CC BY-NC-SA 4.0; individual source images carry their own Flickr licenses. Comply with the NVlabs/ffhq-dataset terms and cite "A Style-Based Generator Architecture for Generative Adversarial Networks" (Karras, Laine, Aila) and FFHQ/NVIDIA as required.

When you publish results using these weights, cite arXiv:2604.24877 and link the code, dataset, and this model repository.
