---
license: apache-2.0
pipeline_tag: image-to-video
tags:
- video-generation
- video-relighting
- diffusion
---
# Relit-LiVE: Relight Video by Jointly Learning Environment Video

Relit-LiVE is a novel video relighting framework that produces physically consistent, temporally stable results without requiring prior knowledge of camera pose. It explicitly introduces raw reference images into the rendering process, enabling the model to recover critical scene cues. The framework simultaneously generates relit videos and per-frame environment maps aligned with each camera viewpoint in a single diffusion process.
## Links
- Paper: Relit-LiVE: Relight Video by Jointly Learning Environment Video
- Code: GitHub Repository
- Project Page: Relit-LiVE Project
## Installation

To set up the environment, follow these steps:

```shell
conda create -n diffsynth python=3.10
conda activate diffsynth
pip install -e .
pip install lightning pandas websockets pyexr natsort gradio
pip install -U deepspeed
pip install transformers==4.50.0
```
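As a quick sanity check that the editable install succeeded, you can try importing the package. This assumes the repository exposes a `diffsynth` Python package (suggested by the conda environment name above); adjust the module name if your checkout differs.

```shell
# Sanity check: the import should succeed without errors if `pip install -e .`
# registered the package. The module name `diffsynth` is an assumption.
python -c "import diffsynth; print('install OK')"
```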
## Usage

You can use the provided `relit_inference.py` script to perform video relighting. Below is an example of basic 25-frame relighting:

```shell
python relit_inference.py \
  --dataset_path datasets/demos \
  --ckpt_path checkpoints/model_frame25_480_832.ckpt \
  --output_dir inference_output \
  --cfg_scale 1.0 \
  --height 480 \
  --width 832 \
  --num_frames 25 \
  --padding_resolution \
  --use_ref_image \
  --env_map_path datasets/envs/Pink_Sunrise \
  --frame_interval 1 \
  --num_inference_steps 50 \
  --quality 10
```
For high-resolution single-frame relighting:

```shell
python relit_inference.py \
  --dataset_path datasets/demos \
  --ckpt_path checkpoints/model_frame1_1024_1472.ckpt \
  --output_dir inference_output \
  --cfg_scale 1.0 \
  --height 1024 \
  --width 1472 \
  --num_frames 1 \
  --padding_resolution \
  --use_ref_image \
  --env_map_path datasets/envs/Pink_Sunrise \
  --frame_interval 1 \
  --num_inference_steps 50 \
  --quality 10
```
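To relight the same clips under several lighting conditions, the commands above can be wrapped in a simple loop. This is only a sketch: the flags mirror the 25-frame example, and it assumes each subdirectory of `datasets/envs/` holds one environment map (only `Pink_Sunrise` is shown in this repository's examples).

```shell
# Sketch: run 25-frame relighting once per environment map directory.
# Directory layout under datasets/envs/ is an assumption.
for env in datasets/envs/*/; do
  python relit_inference.py \
    --dataset_path datasets/demos \
    --ckpt_path checkpoints/model_frame25_480_832.ckpt \
    --output_dir "inference_output/$(basename "$env")" \
    --cfg_scale 1.0 --height 480 --width 832 --num_frames 25 \
    --padding_resolution --use_ref_image \
    --env_map_path "${env%/}" \
    --frame_interval 1 --num_inference_steps 50 --quality 10
done
```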
## Checkpoints

The following checkpoints are available in this repository:

| Checkpoint | Resolution | Frames |
|---|---|---|
| `model_frame25_480_832.ckpt` | 480 × 832 | 25 |
| `model_frame57_480_832.ckpt` | 480 × 832 | 57 |
| `model_frame1_1024_1472.ckpt` | 1024 × 1472 | 1 (image) |
Note: Inference also requires the Wan2.1 base model weights to be placed under `models/Wan-AI/Wan2.1-T2V-1.3B/`.
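One way to fetch the base weights is via the Hugging Face CLI. This assumes the model is hosted on the Hub under the same id as the local directory name (`Wan-AI/Wan2.1-T2V-1.3B`):

```shell
# Download the Wan2.1 base weights into the expected local directory.
# The Hub repo id is inferred from the path above and is an assumption.
huggingface-cli download Wan-AI/Wan2.1-T2V-1.3B --local-dir models/Wan-AI/Wan2.1-T2V-1.3B
```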
## Citation

If you find this work helpful, please consider citing the paper:

```bibtex
@article{xiao2026relitlive,
  title={Relit-LiVE: Relight Video by Jointly Learning Environment Video},
  author={Xiao, Weiqing and Li, Hong and Yang, Xiuyu and Chen, Houyuan and Li, Wenyi and Liu, Tianqi and Xu, Shaocong and Ye, Chongjie and Zhao, Hao and Wang, Beibei},
  journal={arXiv preprint arXiv:2605.06658},
  year={2026}
}
```