---
license: apache-2.0
pipeline_tag: image-to-video
tags:
- video-generation
- video-relighting
- diffusion
---
# Relit-LiVE: Relight Video by Jointly Learning Environment Video

Relit-LiVE is a novel video relighting framework that produces physically consistent, temporally stable results without requiring prior knowledge of camera pose. It explicitly introduces raw reference images into the rendering process, enabling the model to recover critical scene cues. The framework simultaneously generates relit videos and per-frame environment maps aligned with each camera viewpoint in a single diffusion process.
## Links
- Paper: Relit-LiVE: Relight Video by Jointly Learning Environment Video
- Code: GitHub Repository
- Project Page: Relit-LiVE Project
## Installation

To set up the environment, follow these steps:

```shell
conda create -n diffsynth python=3.10
conda activate diffsynth
pip install -e .
pip install lightning pandas websockets pyexr natsort gradio
pip install -U deepspeed
pip install transformers==4.50.0
```
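As a quick sanity check that the editable install succeeded, you can try importing the package. This assumes the repository exposes a `diffsynth` Python package (suggested by the conda environment name above); adjust the module name if your checkout differs.

```shell
# Sanity check: the import should succeed without errors if `pip install -e .`
# registered the package. The module name `diffsynth` is an assumption.
python -c "import diffsynth; print('install OK')"
```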
## Usage

You can use the provided `relit_inference.py` script to perform video relighting. Below is an example of basic 25-frame relighting:

```shell
python relit_inference.py \
  --dataset_path datasets/demos \
  --ckpt_path checkpoints/model_frame25_480_832.ckpt \
  --output_dir inference_output \
  --cfg_scale 1.0 \
  --height 480 \
  --width 832 \
  --num_frames 25 \
  --padding_resolution \
  --use_ref_image \
  --env_map_path datasets/envs/Pink_Sunrise \
  --frame_interval 1 \
  --num_inference_steps 50 \
  --quality 10
```
For high-resolution single-frame relighting:

```shell
python relit_inference.py \
  --dataset_path datasets/demos \
  --ckpt_path checkpoints/model_frame1_1024_1472.ckpt \
  --output_dir inference_output \
  --cfg_scale 1.0 \
  --height 1024 \
  --width 1472 \
  --num_frames 1 \
  --padding_resolution \
  --use_ref_image \
  --env_map_path datasets/envs/Pink_Sunrise \
  --frame_interval 1 \
  --num_inference_steps 50 \
  --quality 10
```
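To relight the same clips under several lighting conditions, the commands above can be wrapped in a simple loop. This is only a sketch: the flags mirror the 25-frame example, and it assumes each subdirectory of `datasets/envs/` holds one environment map (only `Pink_Sunrise` is shown in this repository's examples).

```shell
# Sketch: run 25-frame relighting once per environment map directory.
# Directory layout under datasets/envs/ is an assumption.
for env in datasets/envs/*/; do
  python relit_inference.py \
    --dataset_path datasets/demos \
    --ckpt_path checkpoints/model_frame25_480_832.ckpt \
    --output_dir "inference_output/$(basename "$env")" \
    --cfg_scale 1.0 --height 480 --width 832 --num_frames 25 \
    --padding_resolution --use_ref_image \
    --env_map_path "${env%/}" \
    --frame_interval 1 --num_inference_steps 50 --quality 10
done
```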
## Checkpoints

The following checkpoints are available in this repository:

| Checkpoint | Resolution | Frames |
|---|---|---|
| `model_frame25_480_832.ckpt` | 480 × 832 | 25 |
| `model_frame57_480_832.ckpt` | 480 × 832 | 57 |
| `model_frame1_1024_1472.ckpt` | 1024 × 1472 | 1 (image) |
Note: Inference also requires the Wan2.1 base model weights to be placed under `models/Wan-AI/Wan2.1-T2V-1.3B/`.
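One way to fetch the base weights is via the Hugging Face CLI. This assumes the model is hosted on the Hub under the same id as the local directory name (`Wan-AI/Wan2.1-T2V-1.3B`):

```shell
# Download the Wan2.1 base weights into the expected local directory.
# The Hub repo id is inferred from the path above and is an assumption.
huggingface-cli download Wan-AI/Wan2.1-T2V-1.3B --local-dir models/Wan-AI/Wan2.1-T2V-1.3B
```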
## Citation

If you find this work helpful, please consider citing the paper:

```bibtex
@article{xiao2026relitlive,
  title={Relit-LiVE: Relight Video by Jointly Learning Environment Video},
  author={Xiao, Weiqing and Li, Hong and Yang, Xiuyu and Chen, Houyuan and Li, Wenyi and Liu, Tianqi and Xu, Shaocong and Ye, Chongjie and Zhao, Hao and Wang, Beibei},
  journal={arXiv preprint arXiv:2605.06658},
  year={2026}
}
```