+---
+license: apache-2.0
+pipeline_tag: text-to-video
+tags:
+  - text-to-video
+  - video-generation
+  - diffusion
+  - long-video
+  - longlive2
+  - wan2.2
+---
+# LongLive2.0 5B Checkpoints
+This repository hosts temporary LongLive2.0 5B checkpoints for inference with
+the LongLive2.0 release code:
+https://github.com/wileewang/LongLive2.0
+The checkpoint package contains two parts:
+- **Base generator checkpoint**: the AR-trained Wan2.2-TI2V-5B generator.
+- **LoRA checkpoint**: the DMD-distilled few-step LoRA adapter.
+LongLive2.0 inference proceeds in three steps: it loads the base generator,
+applies the LoRA modules to it, and then loads the LoRA weights into those
+modules.
+## Installation
+```bash
+git clone https://github.com/wileewang/LongLive2.0.git
+cd LongLive2.0
+conda create -n longlive2 python=3.10 -y
+conda activate longlive2
+pip install torch==2.8.0 torchvision==0.23.0 --index-url https://download.pytorch.org/whl/cu128
+pip install -r requirements.txt
+pip install flash-attn --no-build-isolation
+```
+The released LongLive2.0 checkpoint is sufficient for standard inference. You
+only need to download the original Wan2.2-TI2V-5B components if you want to run
+training, initialize from the original Wan weights, or use code paths that
+explicitly load the base Wan model files:
+```bash
+huggingface-cli download Wan-AI/Wan2.2-TI2V-5B \
+  --local-dir wan_models/Wan2.2-TI2V-5B
+```
+Download this checkpoint repository:
+```bash
+huggingface-cli download Perflow-Shuai/longlive_2.0_5B_tmp_20260507 \
+  --local-dir checkpoints/longlive2_5b
+```
+## Configure Inference
+Edit `configs/inference.yaml`:
+```yaml
+checkpoints:
+  generator_ckpt: checkpoints/longlive2_5b/path/to/base_generator.pt
+  lora_ckpt: checkpoints/longlive2_5b/path/to/dmd_lora.pt
+adapter:
+  type: lora
+  rank: 128
+  alpha: 128
+  dropout: 0.0
+  verbose: true
+data:
+  data_path: /path/to/inference_prompts
+output_folder: videos/longlive2
+num_samples: 1
+inference:
+  sampling_steps: 4
+  sink_size: 8
+  guidance_scale: 1.0
+  multi_shot_sink: true
+  multi_shot_rope_offset: 8
+```
+The `path/to/...` entries above are placeholders. Replace them with the actual
+checkpoint file paths in this repository. If the LoRA checkpoint is not used,
+remove the `adapter` section and leave `lora_ckpt` unset.
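To find the real paths to put in `generator_ckpt` and `lora_ckpt`, it can help to list the weight files in the downloaded directory. The following is a minimal sketch, not part of the LongLive2.0 code: the helper name `list_checkpoint_files` is hypothetical, and the set of file suffixes is an assumption (the placeholders above use `.pt`).

```python
from pathlib import Path


def list_checkpoint_files(root, suffixes=(".pt", ".pth", ".safetensors", ".ckpt")):
    """Return (relative_path, size_in_bytes) pairs for weight-like files under root.

    The suffix list is an assumption; adjust it to match the files you see.
    """
    root = Path(root)
    if not root.is_dir():
        # Nothing downloaded yet (or wrong path): report no files.
        return []
    found = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            found.append((path.relative_to(root).as_posix(), path.stat().st_size))
    return found


if __name__ == "__main__":
    # Directory used by the download command in this README.
    for rel_path, size in list_checkpoint_files("checkpoints/longlive2_5b"):
        print(f"{size / 1e9:8.2f} GB  {rel_path}")
```

Prefix each printed relative path with `checkpoints/longlive2_5b/` when copying it into `configs/inference.yaml`.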
+## Prompt Folder
+`data.data_path` can be either:
+- a `.txt` file, where each line is one single-shot prompt; or
+- a directory of multi-shot prompt folders.
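Before launching a run, it can be useful to check which of the two forms a given path takes. The sketch below is only a preview helper under stated assumptions, not the loader used by `inference.py`: the function name `summarize_prompt_source` is hypothetical, and skipping blank lines in the `.txt` file is an assumption.

```python
from pathlib import Path


def summarize_prompt_source(data_path):
    """Return ("single-shot", prompts) for a .txt file,
    or ("multi-shot", folder_names) for a directory of prompt folders."""
    path = Path(data_path)
    if path.is_file() and path.suffix == ".txt":
        lines = path.read_text(encoding="utf-8").splitlines()
        # Assumption: blank lines are not prompts and are skipped.
        return "single-shot", [line.strip() for line in lines if line.strip()]
    if path.is_dir():
        # Each subdirectory is treated as one multi-shot prompt folder.
        return "multi-shot", sorted(p.name for p in path.iterdir() if p.is_dir())
    raise ValueError(f"expected a .txt file or a directory: {data_path}")
```

For example, `summarize_prompt_source("/path/to/inference_prompts")` returns the kind of prompt source together with either the prompt lines or the names of the prompt folders.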
+Example multi-shot prompt folder:
+```text
+inference_prompts/
+  robot_lab_demo/
+    0.json
+    1.json
+    2.json
+    shot_durations.txt
+```
+Each JSON file contains:
+```json
+{
+  "caption": "A compact silver robot with one blue optic explores a clean robotics lab."
+}
+```
+`shot_durations.txt` is optional. If provided, it holds one number per caption;
+each number is the number of temporal chunks assigned to the corresponding
+caption. For the three captions above, for example:
+```text
+2 2 4
+```
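The folder layout described above can be generated with a short script. This is a minimal sketch using only the standard library: the helper name `write_multishot_prompt` is hypothetical, the single space-separated line in `shot_durations.txt` follows the example above, and the check that there is exactly one duration per caption is an assumption about what the inference code expects.

```python
import json
from pathlib import Path


def write_multishot_prompt(root, name, captions, durations=None):
    """Write captions as 0.json, 1.json, ... under root/name.

    If durations is given, also write shot_durations.txt with one
    chunk count per caption on a single space-separated line.
    """
    if durations is not None and len(durations) != len(captions):
        # Assumption: the number of durations must match the number of captions.
        raise ValueError("need exactly one duration per caption")
    folder = Path(root) / name
    folder.mkdir(parents=True, exist_ok=True)
    for index, caption in enumerate(captions):
        text = json.dumps({"caption": caption}, indent=2, ensure_ascii=False)
        (folder / f"{index}.json").write_text(text + "\n", encoding="utf-8")
    if durations is not None:
        line = " ".join(str(d) for d in durations)
        (folder / "shot_durations.txt").write_text(line + "\n", encoding="utf-8")
    return folder
```

Calling `write_multishot_prompt("inference_prompts", "robot_lab_demo", captions, [2, 2, 4])` with a list of three caption strings produces the `robot_lab_demo/` layout shown in the example.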
+## Run
+Single node, 8 GPUs:
+```bash
+torchrun --standalone --nnodes=1 --nproc_per_node=8 inference.py \
+  --config_path configs/inference.yaml
+```
+Single GPU:
+```bash
+python inference.py --config_path configs/inference.yaml
+```
+Outputs are written to `output_folder`.
+## Notes
+- To run the few-step DMD model, load the base checkpoint and the LoRA
+  checkpoint together.
+- `inference.sampling_steps` controls the number of denoising steps.
+- `inference.multi_shot_sink` enables the multi-shot attention sink.
+- `inference.multi_shot_rope_offset` controls the multi-shot RoPE offset.
+- For NVFP4 inference, use the separate NVFP4 config and setup instructions in
+  the LongLive2.0 documentation.