wileewang committed on
Commit bd2ce2c · verified · 1 Parent(s): edb63a2

Create README.md

  1. README.md +150 -0
README.md ADDED
@@ -0,0 +1,150 @@
+ ---
+ license: apache-2.0
+ pipeline_tag: text-to-video
+ tags:
+ - text-to-video
+ - video-generation
+ - diffusion
+ - long-video
+ - longlive2
+ - wan2.2
+ ---
+
+ # LongLive2.0 5B Checkpoints
+
+ This repository hosts temporary LongLive2.0 5B checkpoints for inference with
+ the LongLive2.0 release code:
+
+ https://github.com/wileewang/LongLive2.0
+
+ The checkpoint package contains two parts:
+
+ - **Base generator checkpoint**: the AR-trained Wan2.2-TI2V-5B generator.
+ - **LoRA checkpoint**: the DMD-distilled few-step LoRA adapter.
+
+ LongLive2.0 inference loads the base generator first, applies the LoRA modules,
+ and then loads the LoRA weights.
+
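The loading order described above can be illustrated with a toy sketch. It does not use the LongLive2.0 code or PyTorch: the "model" is a plain dict, and every function and parameter name here is hypothetical. It only shows why the LoRA modules are applied before the LoRA weights are loaded: until the modules exist, the LoRA checkpoint's keys have nothing to load into.

```python
# Toy stand-in: a "model" is a flat dict of parameter name -> value.
# Every name below is hypothetical and only illustrates the ordering.


def load_weights(model, ckpt):
    """Copy checkpoint values into the model; reject keys the model lacks."""
    unexpected = sorted(k for k in ckpt if k not in model)
    if unexpected:
        raise KeyError(f"unexpected keys: {unexpected}")
    model.update(ckpt)


def apply_lora(model, targets, rank):
    """Create zero-filled LoRA slots (A and B factors) for each target layer."""
    for name in targets:
        model[f"{name}.lora_A"] = [0.0] * rank
        model[f"{name}.lora_B"] = [0.0] * rank


def build_generator(base_ckpt, lora_ckpt, targets, rank):
    model = {name: 0.0 for name in base_ckpt}  # freshly constructed generator
    load_weights(model, base_ckpt)             # 1. base generator weights
    apply_lora(model, targets, rank)           # 2. add LoRA modules to target layers
    load_weights(model, lora_ckpt)             # 3. LoRA keys now have a destination
    return model


base_ckpt = {"blocks.0.attn.q.weight": 1.5}
lora_ckpt = {
    "blocks.0.attn.q.lora_A": [0.1, 0.2],
    "blocks.0.attn.q.lora_B": [0.3, 0.4],
}
model = build_generator(base_ckpt, lora_ckpt, ["blocks.0.attn.q"], rank=2)
```

Calling `load_weights` with `lora_ckpt` on a model that has not gone through `apply_lora` raises `KeyError` in this toy version, which is the failure the three-step order avoids.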
+ ## Installation
+
+ ```bash
+ git clone https://github.com/wileewang/LongLive2.0.git
+ cd LongLive2.0
+
+ conda create -n longlive2 python=3.10 -y
+ conda activate longlive2
+ pip install torch==2.8.0 torchvision==0.23.0 --index-url https://download.pytorch.org/whl/cu128
+ pip install -r requirements.txt
+ pip install flash-attn --no-build-isolation
+ ```
+
+ The released LongLive2.0 checkpoint is sufficient for standard inference. You
+ only need to download the original Wan2.2-TI2V-5B components if you want to run
+ training, initialize from the original Wan weights, or use code paths that
+ explicitly load the base Wan model files:
+
+ ```bash
+ huggingface-cli download Wan-AI/Wan2.2-TI2V-5B \
+   --local-dir wan_models/Wan2.2-TI2V-5B
+ ```
+
+ Download this checkpoint repository:
+
+ ```bash
+ huggingface-cli download Perflow-Shuai/longlive_2.0_5B_tmp_20260507 \
+   --local-dir checkpoints/longlive2_5b
+ ```
+
+ ## Configure Inference
+
+ Edit `configs/inference.yaml`:
+
+ ```yaml
+ checkpoints:
+   generator_ckpt: checkpoints/longlive2_5b/path/to/base_generator.pt
+   lora_ckpt: checkpoints/longlive2_5b/path/to/dmd_lora.pt
+
+ adapter:
+   type: lora
+   rank: 128
+   alpha: 128
+   dropout: 0.0
+   verbose: true
+
+ data:
+   data_path: /path/to/inference_prompts
+
+ output_folder: videos/longlive2
+ num_samples: 1
+
+ inference:
+   sampling_steps: 4
+   sink_size: 8
+   guidance_scale: 1.0
+   multi_shot_sink: true
+   multi_shot_rope_offset: 8
+ ```
+
+ Replace the checkpoint filenames above with the actual files in this repository.
+ If the LoRA checkpoint is not used, remove the `adapter` section and leave
+ `lora_ckpt` unset.
+
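The rule above, that the `adapter` section and `lora_ckpt` go together, can be checked before launching a run. The following is a minimal sketch, not part of the LongLive2.0 code: `check_inference_config` is a hypothetical helper, and it assumes the YAML has already been parsed into a plain dict with the nesting shown above. The second check (a LoRA checkpoint without an `adapter` section) is an inference from the loading order, not something the README states.

```python
def check_inference_config(cfg):
    """Return a list of problems; an empty list means the config is consistent."""
    problems = []
    ckpts = cfg.get("checkpoints") or {}
    if not ckpts.get("generator_ckpt"):
        problems.append("checkpoints.generator_ckpt is required")

    has_lora_ckpt = bool(ckpts.get("lora_ckpt"))
    has_adapter = "adapter" in cfg
    if has_adapter and not has_lora_ckpt:
        problems.append("adapter section present but checkpoints.lora_ckpt is unset")
    if has_lora_ckpt and not has_adapter:
        # Assumption: LoRA weights cannot be loaded without the adapter settings.
        problems.append("checkpoints.lora_ckpt is set but the adapter section is missing")
    return problems


# Hypothetical parsed configs mirroring the YAML layout above.
with_lora = {
    "checkpoints": {"generator_ckpt": "base_generator.pt", "lora_ckpt": "dmd_lora.pt"},
    "adapter": {"type": "lora", "rank": 128, "alpha": 128},
}
base_only = {"checkpoints": {"generator_ckpt": "base_generator.pt"}}
mismatched = {
    "checkpoints": {"generator_ckpt": "base_generator.pt"},
    "adapter": {"type": "lora", "rank": 128, "alpha": 128},
}

for name, cfg in [("with_lora", with_lora), ("base_only", base_only), ("mismatched", mismatched)]:
    print(name, check_inference_config(cfg))
```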
+ ## Prompt Folder
+
+ `data.data_path` can be either:
+
+ - a `.txt` file, where each line is one single-shot prompt; or
+ - a directory of multi-shot prompt folders.
+
+ Example multi-shot prompt folder:
+
+ ```text
+ inference_prompts/
+   robot_lab_demo/
+     0.json
+     1.json
+     2.json
+     shot_durations.txt
+ ```
+
+ Each JSON file contains:
+
+ ```json
+ {
+   "caption": "A compact silver robot with one blue optic explores a clean robotics lab."
+ }
+ ```
+
+ `shot_durations.txt` is optional. If provided, each number is the number of
+ temporal chunks assigned to the corresponding caption, for example:
+
+ ```text
+ 2 2 4
+ ```
+
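A multi-shot prompt folder in the layout above can be generated with a few lines of standard-library Python. This is a sketch under assumptions: `write_multi_shot_prompt` is a hypothetical helper, the second and third captions are made up for illustration, and `shot_durations.txt` is assumed to live inside the shot folder with its numbers on one space-separated line in caption order.

```python
import json
import tempfile
from pathlib import Path


def write_multi_shot_prompt(root, name, captions, durations=None):
    """Write <root>/<name>/0.json, 1.json, ... and optionally shot_durations.txt."""
    if durations is not None and len(durations) != len(captions):
        raise ValueError("need exactly one duration per caption")
    folder = Path(root) / name
    folder.mkdir(parents=True, exist_ok=True)
    for index, caption in enumerate(captions):
        text = json.dumps({"caption": caption}, indent=2) + "\n"
        (folder / f"{index}.json").write_text(text, encoding="utf-8")
    if durations is not None:
        # One number per caption: temporal chunks assigned to that shot.
        line = " ".join(str(d) for d in durations)
        (folder / "shot_durations.txt").write_text(line + "\n", encoding="utf-8")
    return folder


captions = [
    "A compact silver robot with one blue optic explores a clean robotics lab.",
    "The robot stops at a workbench and studies a row of tools.",  # made-up caption
    "The robot picks up a small wrench and turns toward the door.",  # made-up caption
]

# Written to a temporary directory here; point root at data.data_path in practice.
with tempfile.TemporaryDirectory() as root:
    folder = write_multi_shot_prompt(root, "robot_lab_demo", captions, [2, 2, 4])
    created = sorted(p.name for p in folder.iterdir())
    durations_text = (folder / "shot_durations.txt").read_text(encoding="utf-8")
    first_shot = json.loads((folder / "0.json").read_text(encoding="utf-8"))
```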
+ ## Run
+
+ Single node, 8 GPUs:
+
+ ```bash
+ torchrun --standalone --nnodes=1 --nproc_per_node=8 inference.py \
+   --config_path configs/inference.yaml
+ ```
+
+ Single GPU:
+
+ ```bash
+ python inference.py --config_path configs/inference.yaml
+ ```
+
+ Outputs are written to `output_folder`.
+
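When launching from a script, the two commands above can be assembled programmatically. The sketch below assumes a hypothetical helper, `build_launch_command`, that only reproduces the two documented invocations. The README documents 1 and 8 GPUs; whether other GPU counts work is an assumption to verify against the release code.

```python
import shlex


def build_launch_command(num_gpus, config_path="configs/inference.yaml"):
    """Return the argv list for single-GPU or single-node multi-GPU inference."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_gpus == 1:
        return ["python", "inference.py", "--config_path", config_path]
    return [
        "torchrun",
        "--standalone",
        "--nnodes=1",
        f"--nproc_per_node={num_gpus}",
        "inference.py",
        "--config_path",
        config_path,
    ]


# Print the command; pass the list to subprocess.run(...) to actually launch it.
print(shlex.join(build_launch_command(8)))
```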
+ ## Notes
+
+ - The base checkpoint and LoRA checkpoint should be loaded together for the
+   few-step DMD model.
+ - `inference.sampling_steps` controls the number of denoising steps.
+ - `inference.multi_shot_sink` enables the multi-shot attention sink.
+ - `inference.multi_shot_rope_offset` controls the multi-shot RoPE offset.
+ - For NVFP4 inference, use the separate NVFP4 config and setup instructions in
+   the LongLive2.0 documentation.