@@ -25,12 +25,14 @@ Project page: https://yanzuo.lu/raven
 RAVEN is a causal autoregressive text-to-video generation model built on Wan2.1-T2V-1.3B. It is designed for real-time streaming video generation by extrapolating future video chunks from previously generated content.
-The release contains two checkpoints:
 | File | Description |
 | --- | --- |
-| `raven_model.pt` | Main RAVEN checkpoint for causal autoregressive text-to-video generation. |
-| `cmgrpo_raven_full.pt` | Unmerged CM-GRPO LoRA checkpoint. In the codebase this is loaded through the LoRA path with rank 256 and alpha 256 on top of the RAVEN/Wan backbone. |
 RAVEN trains a causal video generator using a training-time test framework that repacks each self rollout into an interleaved sequence of clean historical endpoints and noisy denoising states. This aligns the model's training attention pattern with inference-time autoregressive extrapolation and allows downstream chunk losses to supervise the historical representations used for future predictions.
@@ -74,10 +76,51 @@ source venv/bin/activate
 Download this model repository:
 ```bash
-hf download oliveryanzuolu/RAVEN --local-dir /path/to/RAVEN-weights
 ```
-Then point the relevant config files to the downloaded checkpoints (`raven_model.pt` for RAVEN, `cmgrpo_raven_full.pt` for CM-GRPO).
 Reference configs:

 RAVEN is a causal autoregressive text-to-video generation model built on Wan2.1-T2V-1.3B. It is designed for real-time streaming video generation by extrapolating future video chunks from previously generated content.
+The release contains the RAVEN checkpoint plus three interchangeable CM-GRPO variants:
 | File | Description |
 | --- | --- |
+| `raven_model.pt` | Full RAVEN backbone for causal autoregressive text-to-video generation. |
+| `cmgrpo_raven_lora.safetensors` | CM-GRPO LoRA adapter only. Load `raven_model.pt` as the base weight and this file through the LoRA path. |
+| `cmgrpo_raven_full.pt` | RAVEN base and CM-GRPO LoRA adapter packed into one PEFT-wrapped state dict. Load this file through the LoRA path without a separate base weight. |
+| `cmgrpo_raven_merge.pt` | Full CM-GRPO backbone with the adapter already merged into RAVEN. Load this file as the base weight, with no LoRA block. |
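The relationship between the adapter and merged variants can be sketched with the usual LoRA update rule. This is an assumption about how the merge is computed, not something the release states: $W_{\text{base}}$ stands for a RAVEN weight matrix, $A$ and $B$ are the hypothetical low-rank adapter factors, and the rank and alpha values of 256 come from the earlier description of `cmgrpo_raven_full.pt`.

```latex
W_{\text{merged}} = W_{\text{base}} + \frac{\alpha}{r}\, B A,
\qquad
\alpha = r = 256 \;\Rightarrow\; W_{\text{merged}} = W_{\text{base}} + B A
```

Under that assumption the scaling factor is 1, so loading the base with the adapter and loading the merged backbone should yield the same effective weights up to floating-point rounding.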
 RAVEN trains a causal video generator using a training-time test framework that repacks each self rollout into an interleaved sequence of clean historical endpoints and noisy denoising states. This aligns the model's training attention pattern with inference-time autoregressive extrapolation and allows downstream chunk losses to supervise the historical representations used for future predictions.
 Download this model repository:
 ```bash
+hf download mvp-lab/RAVEN --local-dir /path/to/RAVEN-weights
 ```
+Then point the relevant config files to the downloaded checkpoints. RAVEN itself (`raven_model.pt`) is a single full backbone:
+```jsonc
+"backbone": {
+    "weight": "/path/to/RAVEN-weights/raven_model.pt"
+}
+```
+CM-GRPO can be loaded in any of three equivalent forms:
+Adapter only (`cmgrpo_raven_lora.safetensors`):
+```jsonc
+"backbone": {
+    "weight": "/path/to/RAVEN-weights/raven_model.pt",
+    "lora": {
+        "enabled": true,
+        "weight": "/path/to/RAVEN-weights/cmgrpo_raven_lora.safetensors"
+    }
+}
+```
+Base + LoRA bundle (`cmgrpo_raven_full.pt`):
+```jsonc
+"backbone": {
+    "lora": {
+        "enabled": true,
+        "weight": "/path/to/RAVEN-weights/cmgrpo_raven_full.pt"
+    }
+}
+```
+Merged backbone (`cmgrpo_raven_merge.pt`):
+```jsonc
+"backbone": {
+    "weight": "/path/to/RAVEN-weights/cmgrpo_raven_merge.pt"
+}
+```
+The released CM-GRPO configs use the base + LoRA bundle form (`cmgrpo_raven_full.pt`) by default.
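The three forms above differ only in which keys appear in the `backbone` block. As a minimal sketch, the helper below builds that block for a chosen form. The function name `backbone_config` and the form labels `adapter`, `bundle`, and `merged` are hypothetical and not part of the RAVEN codebase; only the file names and key layout are taken from the snippets above.

```python
import json
from pathlib import PurePosixPath

# Hypothetical helper, not part of the RAVEN codebase. It only mirrors the
# three "backbone" config shapes shown above.
CMGRPO_FILES = {
    "adapter": "cmgrpo_raven_lora.safetensors",  # LoRA adapter only
    "bundle": "cmgrpo_raven_full.pt",            # base + LoRA in one state dict
    "merged": "cmgrpo_raven_merge.pt",           # adapter already merged
}


def backbone_config(weights_dir, form="bundle"):
    """Return the "backbone" block for one CM-GRPO loading form."""
    if form not in CMGRPO_FILES:
        raise ValueError(
            f"unknown form {form!r}; expected one of {sorted(CMGRPO_FILES)}"
        )
    root = PurePosixPath(weights_dir)
    checkpoint = str(root / CMGRPO_FILES[form])

    if form == "merged":
        # Merged backbone: loaded as the base weight, with no LoRA block.
        return {"weight": checkpoint}

    lora = {"enabled": True, "weight": checkpoint}
    if form == "adapter":
        # Adapter only: the RAVEN base weight must be supplied separately.
        return {"weight": str(root / "raven_model.pt"), "lora": lora}

    # Bundle: everything goes through the LoRA path, no separate base weight.
    return {"lora": lora}


if __name__ == "__main__":
    block = {"backbone": backbone_config("/path/to/RAVEN-weights", "adapter")}
    print(json.dumps(block, indent=4))
```

Defaulting to `bundle` follows the note that the released configs use the base + LoRA bundle form. The returned dictionary would still need to be merged into whatever config file the codebase actually reads.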
 Reference configs: