---
license: cc-by-nc-sa-4.0
language:
- en
tags:
- diffusion
- diffusers
- stable-diffusion
- instructpix2pix
- image-to-image
- image-editing
library_name: diffusers
pipeline_tag: image-to-image
base_model: runwayml/stable-diffusion-v1-5
datasets:
- nishitanand/image-relighting-diffusion-data
---

# Learning Illumination Control in Diffusion Models: Model weights (HF)

Weights for [*Learning Illumination Control in Diffusion Models*](https://arxiv.org/abs/2604.24877) (ReALM-GEN @ ICLR 2026).

| | |
|--|--|
| **Code** | [github.com/nishitanand/image-relighting-diffusion](https://github.com/nishitanand/image-relighting-diffusion) |
| **Dataset** | [huggingface.co/datasets/nishitanand/image-relighting-diffusion-data](https://huggingface.co/datasets/nishitanand/image-relighting-diffusion-data) |
| **Paper** | [arxiv.org/abs/2604.24877](https://arxiv.org/abs/2604.24877) |
| **Project site** | [nishitanand.github.io/relighting-diffusion-website](https://nishitanand.github.io/relighting-diffusion-website) |

- **Architecture:** `StableDiffusionInstructPix2PixPipeline` (SD 1.5)
- **Inputs:** RGB image (512×512 domain) + instruction string
- **Output:** relit RGB image

---

## Download weights (CLI)

```bash
pip install -U "huggingface_hub[cli]"

huggingface-cli download nishitanand/sd-image-relighting-model \
  --repo-type model \
  --local-dir ./sd-image-relighting-model
```

Alternatively, load directly by repo id from Python / Diffusers; no local download is required. For a programmatic download, see the sketch below.
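
As a sketch, the programmatic equivalent of the CLI download uses `huggingface_hub.snapshot_download`; the return value is the local snapshot path:

```python
from huggingface_hub import snapshot_download

# Downloads the full Diffusers snapshot and returns its local path.
local_dir = snapshot_download(
    repo_id="nishitanand/sd-image-relighting-model",
    repo_type="model",
)
print(local_dir)
```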

---

## Inference (Python)

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

model_id = "nishitanand/sd-image-relighting-model"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
).to("cuda")
pipe.safety_checker = None  # optional: disable the NSFW filter

# The model was trained at 512x512; inputs near that size work best.
image = Image.open("degraded_input.png").convert("RGB")
instruction = (
    "Soft window light from the left with warm highlights and gentle shadows "
    "across the face."
)

out = pipe(
    instruction,
    image=image,
    num_inference_steps=50,
    image_guidance_scale=1.5,  # fidelity to the input image
    guidance_scale=7.5,        # adherence to the text instruction
).images[0]

out.save("relit.png")
```
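
If you hit GPU memory limits, the generic Diffusers memory helpers also work with this pipeline. A sketch (the offload call assumes `accelerate` is installed):

```python
# Compute attention in slices instead of one large batch (lower peak VRAM).
pipe.enable_attention_slicing()

# Or stream submodules between CPU and GPU on demand; use this *instead of*
# the .to("cuda") call above.
pipe.enable_model_cpu_offload()
```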

**Tips**

- Use prompts in the same **long descriptive** style as the training captions (see the dataset / paper); an illustrative sketch follows this list.
- For best identity preservation, the input should resemble the **synthetic degraded** domain the model was trained on.
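
As a sketch of the intended register, here is a hypothetical instruction (not taken from the training set); the long, full-sentence form is the style the model was tuned for:

```python
# Long, descriptive instruction in the expected style (hypothetical example).
instruction = (
    "Golden-hour sunlight from camera right, with warm highlights on the cheek "
    "and a soft falloff into shadow on the far side of the face."
)

# Terse tags such as "sunset lighting" underconstrain the model.
```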

---

## Inference (project script)

After cloning the code repository:

```bash
cd inference
pip install -r requirements.txt

python inference.py \
  --model_path nishitanand/sd-image-relighting-model \
  --input_image /path/to/degraded.png \
  --instruction "Your full-sentence lighting description here." \
  --output_path ./relit.png
```

If you also need the full training dependencies, install them from the repository root with `pip install -r training/sd1_5/requirements.txt`.

---

## Repository layout (this HF model repo)

At the **root** of the snapshot you should see a standard Diffusers tree, for example:

- `model_index.json`
- `unet/`, `vae/`, `text_encoder/`, `tokenizer/`
- `scheduler/`, `feature_extractor/`, `safety_checker/`

**Optional:** `checkpoint-*` directories (e.g. `checkpoint-13000/`) are raw **Accelerate** training snapshots. `from_pretrained`, the inference scripts, and `evaluate_models.py` load the **repository root**, where `model_index.json` lives; they do **not** automatically use `checkpoint-*`. Point `--model_path` / `--trained_model_path` at the snapshot root (e.g. `./sd-image-relighting-model` from the download step above, or the Hub id), not at a `checkpoint-*` directory, unless you have re-exported that step as a standalone Diffusers layout. A quick sanity check is sketched below.
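
A minimal sketch of that sanity check, assuming the local directory from the download step above:

```python
from pathlib import Path

root = Path("./sd-image-relighting-model")

# from_pretrained needs the directory containing model_index.json,
# not an Accelerate checkpoint-* subdirectory.
if not (root / "model_index.json").is_file():
    raise FileNotFoundError(f"{root} is not a Diffusers snapshot root")
```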

---

## Limitations

- Works best on **face** imagery similar to the training data; strongly out-of-domain poses or resolutions may fail.
- Not a physically based renderer; extreme prompts can cause the output to drift from the input.

---

## Citation (BibTeX)

```bibtex
@article{anand2026learning,
  title={Learning Illumination Control in Diffusion Models},
  author={Anand, Nishit and Suri, Manan and Metzler, Christopher and Manocha, Dinesh and Duraiswami, Ramani},
  journal={arXiv preprint arXiv:2604.24877},
  year={2026},
  note={ReALM-GEN @ ICLR 2026}
}
```

---

## License

This model repository is released under [**CC BY-NC-SA 4.0**](https://creativecommons.org/licenses/by-nc-sa/4.0/) (see [`LICENSE`](LICENSE) at the repo root when mirrored from GitHub). Base **Stable Diffusion** components remain subject to their original licenses.

**FFHQ.** The weights were trained on data derived from **Flickr-Faces-HQ (FFHQ)**. The FFHQ dataset package is distributed by NVIDIA under **CC BY-NC-SA 4.0**; individual source images carry their own Flickr licenses. Comply with [NVlabs/ffhq-dataset](https://github.com/NVlabs/ffhq-dataset) and cite *A Style-Based Generator Architecture for Generative Adversarial Networks* (Karras, Laine, Aila) and FFHQ/NVIDIA as required.

When you publish results using these weights, cite [**arXiv:2604.24877**](https://arxiv.org/abs/2604.24877) and link the **code**, **dataset**, and **this model** repository.