How to use DarthZhu/VideoRLVR-Wan2.2-Base with Diffusers:

```shell
pip install -U diffusers transformers accelerate
```

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image, export_to_video

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("DarthZhu/VideoRLVR-Wan2.2-Base", dtype=torch.bfloat16, device_map="cuda")
pipe.to("cuda")

prompt = "A man with short gray hair plays a red electric guitar."
image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png"
)

output = pipe(image=image, prompt=prompt).frames[0]
export_to_video(output, "output.mp4")
```
Improve model card and metadata
Hi! I'm Niels from the Hugging Face community science team.
I've opened this pull request to enhance your model card with metadata and links to your research. Adding this information helps make your model more discoverable and provides users with the necessary context regarding its training and architecture.
Specifically, I have:
- Added the `image-to-video` pipeline tag.
- Added `library_name: diffusers` as the configuration files indicate compatibility.
- Added the `apache-2.0` license.
- Provided links to the [paper](https://huggingface.co/papers/2605.15458), project page, and GitHub repository.
- Included the citation information.
Feel free to merge this if it looks good to you!
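For anyone reviewing the change, the metadata listed above can be sanity-checked by parsing the README front matter and comparing it against the expected values. The following is a minimal, standard-library-only sketch: `parse_front_matter` is a hypothetical helper written for this illustration (it is not part of any Hugging Face library), and it only handles the flat `key: value` and `key:` plus `- item` layout used in this card, not general YAML.

```python
def parse_front_matter(readme_text):
    """Parse a flat, YAML-like front matter block delimited by '---' lines.

    Hypothetical helper for illustration only: it supports `key: value`
    pairs and `key:` followed by `- item` lines, nothing more.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter present
    metadata = {}
    current_key = None  # key whose list items are being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the block
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            metadata[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if value:
            metadata[key] = value
            current_key = None
        else:
            metadata[key] = []
            current_key = key
    return metadata


# Front matter as proposed in this pull request.
README = """---
license: apache-2.0
library_name: diffusers
pipeline_tag: image-to-video
base_model:
- Wan-AI/Wan2.2-TI2V-5B
datasets:
- DarthZhu/VideoRLVR-Data
---

# VideoRLVR
"""

metadata = parse_front_matter(README)
for field in ("license", "library_name", "pipeline_tag", "base_model", "datasets"):
    print(field, "->", metadata.get(field))
```

For real use, a full YAML parser would be a safer choice than this hand-rolled one, since model cards can contain nested or quoted values.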
README.md
CHANGED
@@ -1,6 +1,36 @@
 ---
-
-
+license: apache-2.0
+library_name: diffusers
+pipeline_tag: image-to-video
 base_model:
 - Wan-AI/Wan2.2-TI2V-5B
-
+datasets:
+- DarthZhu/VideoRLVR-Data
+---
+
+# VideoRLVR
+
+VideoRLVR is a reinforcement learning (RL) recipe for training video reasoning models with verifiable rewards, introduced in the paper [Video Models Can Reason with Verifiable Rewards](https://huggingface.co/papers/2605.15458).
+
+This checkpoint is an RL-optimized version of [Wan2.2-TI2V-5B](https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B) trained on procedurally generated reasoning tasks including Maze, FlowFree, and Sokoban.
+
+- **Paper:** [Video Models Can Reason with Verifiable Rewards](https://huggingface.co/papers/2605.15458)
+- **Project Page:** [https://darthzhu.github.io/VideoRLVR-page/](https://darthzhu.github.io/VideoRLVR-page/)
+- **Repository:** [https://github.com/luka-group/VideoRLVR](https://github.com/luka-group/VideoRLVR)
+
+## Overview
+
+VideoRLVR formulates video reasoning as the generation of verifiable visual trajectories. It utilizes an SDE-GRPO optimization backbone, dense decomposed rewards, and an Early-Step Focus strategy for efficient training. This approach enables video diffusion models to satisfy explicit spatial, temporal, or logical constraints, moving beyond perceptual imitation toward reliable rule-consistent visual reasoning.
+
+Across tasks like Maze, FlowFree, and Sokoban, VideoRLVR consistently improves over supervised fine-tuning baselines, demonstrating that verifiable RL can effectively optimize models for objective success criteria.
+
+## Citation
+
+```bibtex
+@article{zhu2026video,
+  title={Video Models Can Reason with Verifiable Rewards},
+  author={Tinghui Zhu and Sheng Zhang and James Y. Huang and Selena Song and Xiaofei Wen and Yuankai Li and Hoifung Poon and Muhao Chen},
+  journal={arXiv preprint arXiv:2605.15458},
+  year={2026}
+}
+```
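As a reading aid for the Overview in the proposed README, the sketch below illustrates what a "verifiable reward" and a group-relative advantage could look like in the simplest case. It is not the VideoRLVR implementation. It assumes a maze path has already been extracted from the generated video as a list of grid cells, uses a binary success check rather than the dense decomposed rewards the card describes, and applies the generic GRPO-style normalization (reward minus group mean, divided by group standard deviation). The names `maze_reward` and `group_relative_advantages` are hypothetical.

```python
import statistics


def maze_reward(grid, path, start, goal):
    """Return 1.0 if `path` is a valid wall-free walk from start to goal.

    Assumptions for this sketch: `grid` is a list of equal-length strings
    where '#' marks a wall, and `path` is a list of (row, col) cells.
    """
    if not path or path[0] != start or path[-1] != goal:
        return 0.0
    for r, c in path:
        inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
        if not inside or grid[r][c] == "#":
            return 0.0  # left the board or stepped into a wall
    for (r0, c0), (r1, c1) in zip(path, path[1:]):
        if abs(r0 - r1) + abs(c0 - c1) != 1:
            return 0.0  # consecutive cells must be 4-neighbours
    return 1.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of samples for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Toy 2x3 maze and a group of four candidate paths for the same task.
grid = ["..#",
        "#.."]
start, goal = (0, 0), (1, 2)
candidates = [
    [(0, 0), (0, 1), (1, 1), (1, 2)],  # valid
    [(0, 0), (1, 0), (1, 1), (1, 2)],  # passes through a wall
    [(0, 0), (1, 1), (1, 2)],          # diagonal jump
    [(0, 0), (0, 1), (1, 1), (1, 2)],  # valid
]
rewards = [maze_reward(grid, p, start, goal) for p in candidates]
advantages = group_relative_advantages(rewards)
print(rewards, advantages)
```

Because the reward is computed by a rule check rather than a learned scorer, successful samples in a group receive positive advantages and failed ones negative, which is the signal a GRPO-style update would use.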