Yukang committed on
Commit e388217 · verified · 1 Parent(s): 5bf5caa

Create README.md

Files changed (1)
  1. README.md +187 -0
README.md ADDED
@@ -0,0 +1,187 @@
+ ---
+ license: other
+ license_name: nvidia-open-model-license
+ license_link: >-
+   https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/
+ pipeline_tag: text-to-video
+ tags:
+ - text-to-video
+ - multi-shot
+ - video-generation
+ - diffusion
+ - long-video
+ - longlive2
+ - wan2.2
+ ---
+
+ <p align="center">
+   <img src="https://github.com/wileewang/LongLive2.0/blob/release-clean-merge/assets/longlive2/logo.png?raw=true" alt="LongLive2.0 logo" width="100%">
+ </p>
+
+ # LongLive2.0 5B Checkpoints
+
+ This repository hosts temporary LongLive2.0 5B checkpoints for inference with
+ the LongLive2.0 release code:
+
+ https://github.com/wileewang/LongLive2.0
+
+ The checkpoint package contains two parts:
+
+ - **Base generator checkpoint**: the AR-trained Wan2.2-TI2V-5B generator.
+ - **LoRA checkpoint**: the DMD-distilled few-step LoRA adapter.
+
+ LongLive2.0 inference loads the base generator first, applies the LoRA modules,
+ and then loads the LoRA weights.
+
+ ## Installation
+
+ ```bash
+ git clone https://github.com/wileewang/LongLive2.0.git
+ cd LongLive2.0
+
+ conda create -n longlive2 python=3.10 -y
+ conda activate longlive2
+ pip install torch==2.8.0 torchvision==0.23.0 --index-url https://download.pytorch.org/whl/cu128
+ pip install -r requirements.txt
+ pip install flash-attn --no-build-isolation
+ ```
+
+ The released LongLive2.0 checkpoint is sufficient for standard inference. You
+ only need to download the original Wan2.2-TI2V-5B components if you want to run
+ training, initialize from the original Wan weights, or use code paths that
+ explicitly load the base Wan model files:
+
+ ```bash
+ huggingface-cli download Wan-AI/Wan2.2-TI2V-5B \
+   --local-dir wan_models/Wan2.2-TI2V-5B
+ ```
+
+ Download this checkpoint repository:
+
+ ```bash
+ huggingface-cli download Perflow-Shuai/longlive_2.0_5B_tmp_20260507 \
+   --local-dir checkpoints/longlive2_5b
+ ```
+
+ ## Configure Inference
+
+ Edit `configs/inference.yaml`:
+
+ ```yaml
+ checkpoints:
+   generator_ckpt: checkpoints/longlive2_5b/path/to/base_generator.pt
+   lora_ckpt: checkpoints/longlive2_5b/path/to/dmd_lora.pt
+
+ adapter:
+   type: lora
+   rank: 128
+   alpha: 128
+   dropout: 0.0
+   verbose: true
+
+ data:
+   data_path: /path/to/inference_prompts
+
+ output_folder: videos/longlive2
+ num_samples: 1
+
+ inference:
+   sampling_steps: 4
+   sink_size: 8
+   guidance_scale: 1.0
+   multi_shot_sink: true
+   multi_shot_rope_offset: 8
+ ```
+
+ Replace the checkpoint filenames above with the actual files in this repository.
+ If the LoRA checkpoint is not used, remove the `adapter` section and leave
+ `lora_ckpt` unset.
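The rule in the paragraph above (the `adapter` section and `lora_ckpt` go together, and the template paths must be replaced) can be checked before launching a run. The following is a minimal sketch, not part of the LongLive2.0 code: `check_inference_config` is a hypothetical helper, and it assumes the YAML file has already been loaded into a plain mapping with `checkpoints` and `adapter` as top-level keys, as in the template above.

```python
from typing import Any, Mapping


def check_inference_config(cfg: Mapping[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the config looks consistent.

    Hypothetical helper: the key names mirror the README template and are
    not taken from the LongLive2.0 source code.
    """
    problems: list[str] = []
    checkpoints = cfg.get("checkpoints") or {}
    adapter = cfg.get("adapter")

    if not checkpoints.get("generator_ckpt"):
        problems.append("checkpoints.generator_ckpt is not set")

    # The adapter section and the LoRA checkpoint are used together or not at all.
    lora_ckpt = checkpoints.get("lora_ckpt")
    if lora_ckpt and not adapter:
        problems.append("lora_ckpt is set but the adapter section is missing")
    if adapter and not lora_ckpt:
        problems.append("adapter section is present but lora_ckpt is unset")

    # The template paths contain a 'path/to' placeholder that must be replaced.
    for key in ("generator_ckpt", "lora_ckpt"):
        if "path/to" in str(checkpoints.get(key) or ""):
            problems.append(f"checkpoints.{key} still contains the 'path/to' placeholder")
    return problems
```

Calling it on the unmodified template would report the two placeholder paths; a base-only config with no `adapter` section and no `lora_ckpt` passes.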
+
+ ## Prompt Folder
+
+ `data.data_path` is passed to `MultiTextConcatDataset` in `inference.py`. It can
+ be either:
+
+ - a `.txt` file, where each line is one single-shot prompt; or
+ - a directory of multi-shot prompt folders.
+
+ For a directory input, the code supports both of the following layouts. The
+ direct caption-root layout is the simplest:
+
+ ```text
+ inference_prompts/
+   robot_lab_demo/
+     0.json
+     1.json
+     2.json
+     shot_durations.txt
+ ```
+
+ It also supports a dataset root with an outer `caption/` folder:
+
+ ```text
+ inference_prompts/
+   caption/
+     robot_lab_demo/
+       0.json
+       1.json
+       2.json
+       shot_durations.txt
+ ```
+
+ Each JSON file contains:
+
+ ```json
+ {
+   "caption": "A compact silver robot with one blue optic explores a clean robotics lab."
+ }
+ ```
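A prompt folder in the direct caption-root layout can be generated with a few lines of Python. This is only a sketch under the layout described above: `write_prompt_folder` is a hypothetical helper, not a function from the LongLive2.0 repository, and it writes just the numbered caption files.

```python
import json
from pathlib import Path


def write_prompt_folder(root: Path, name: str, captions: list[str]) -> Path:
    """Create root/name/ with one N.json caption file per shot, numbered from 0.

    Hypothetical helper that mirrors the README layout; it is not part of
    the release code.
    """
    folder = root / name
    folder.mkdir(parents=True, exist_ok=True)
    for index, caption in enumerate(captions):
        # Each file holds a single object with a "caption" key.
        (folder / f"{index}.json").write_text(
            json.dumps({"caption": caption}, indent=2) + "\n", encoding="utf-8"
        )
    return folder
```

For example, `write_prompt_folder(Path("inference_prompts"), "robot_lab_demo", [...])` with three captions would produce `0.json`, `1.json`, and `2.json`, and `data.data_path` would then point at `inference_prompts`.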
+
+ `shot_durations.txt` is optional. If provided, each number is the number of
+ temporal chunks assigned to the corresponding caption, for example:
+
+ ```text
+ 2 2 4
+ ```
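To make the meaning of the durations concrete, the sketch below expands a list of captions into a per-chunk schedule. It assumes whitespace-separated integers, as in the example above; `expand_shot_durations` is a hypothetical name, and the actual parsing inside `MultiTextConcatDataset` may differ.

```python
def expand_shot_durations(captions: list[str], durations_text: str) -> list[str]:
    """Repeat each caption once per temporal chunk assigned to it.

    Hypothetical helper: assumes one whitespace-separated positive integer
    per caption, in caption order.
    """
    durations = [int(token) for token in durations_text.split()]
    if len(durations) != len(captions):
        raise ValueError(
            f"expected {len(captions)} durations, got {len(durations)}"
        )
    if any(count <= 0 for count in durations):
        raise ValueError("every duration must be a positive number of chunks")

    schedule: list[str] = []
    for caption, count in zip(captions, durations):
        schedule.extend([caption] * count)
    return schedule
```

With three captions and the durations `2 2 4`, the schedule has eight chunks: the first two use caption 0, the next two use caption 1, and the last four use caption 2.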
+
+ ## Run
+
+ Single node, 8 GPUs:
+
+ ```bash
+ torchrun --standalone --nnodes=1 --nproc_per_node=8 inference.py \
+   --config_path configs/inference.yaml
+ ```
+
+ Single GPU:
+
+ ```bash
+ python inference.py --config_path configs/inference.yaml
+ ```
+
+ Outputs are written to `output_folder`.
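When the run is launched from a script, the two commands above can be produced by one small function. This is a sketch rather than project tooling: `build_launch_command` is a hypothetical helper, and using `torchrun` for GPU counts other than 8 is an assumption, since the README only shows the 8-GPU and single-GPU cases.

```python
def build_launch_command(
    num_gpus: int, config_path: str = "configs/inference.yaml"
) -> list[str]:
    """Return the argument list for a single-node inference run.

    Hypothetical helper: the flags are copied from the README commands.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    script_args = ["inference.py", "--config_path", config_path]
    if num_gpus == 1:
        # Single GPU: plain Python, no distributed launcher.
        return ["python", *script_args]
    return [
        "torchrun",
        "--standalone",
        "--nnodes=1",
        f"--nproc_per_node={num_gpus}",
        *script_args,
    ]
```

The returned list can be passed directly to `subprocess.run(..., check=True)` from the repository root.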
+
+ ## Notes
+
+ - The base checkpoint and LoRA checkpoint should be loaded together for the
+   few-step DMD model.
+ - `inference.sampling_steps` controls the number of denoising steps.
+ - `inference.multi_shot_sink` enables the multi-shot attention sink.
+ - `inference.multi_shot_rope_offset` controls the multi-shot RoPE offset.
+ - For NVFP4 inference, use the separate NVFP4 config and setup instructions in
+   the LongLive2.0 documentation.
+
+ ## License/Terms of Use
+
+ GOVERNING TERMS: This trial service is governed by the [NVIDIA API Trial Terms of Service](https://assets.ngc.nvidia.com/products/api-catalog/legal/NVIDIA%20API%20Trial%20Terms%20of%20Service.pdf). Use of this model is governed by the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/).
+
+ ## Citation
+
+ ```bibtex
+ @article{longlive_2,
+   title={LongLive2.0: An NVFP4 Parallel Infrastructure for Long Video Generation},
+   author={Chen, Yukang and Wang, Luozhou and Huang, Wei and Yang, Shuai and Zhang, Bohan and Xiao, Yicheng and Chu, Ruihang and Mao, Weian and Hu, Qixin and Liu, Shaoteng and Zhao, Yuyang and Mao, Huizi and Chen, Ying-Cong and Xie, Enze and Qi, Xiaojuan and Han, Song},
+   journal={arXiv preprint arXiv},
+   year={2026}
+ }
+ ```