Commit a8ba480
0 Parent(s)
Duplicate from vita-video-gen/svi-model
Co-authored-by: svi <vita-video-gen@users.noreply.huggingface.co>
- .gitattributes +35 -0
- README.md +68 -0
- version-1.0/svi-dance.safetensors +3 -0
- version-1.0/svi-film-opt-10212025.safetensors +3 -0
- version-1.0/svi-film-transitions.safetensors +3 -0
- version-1.0/svi-film.safetensors +3 -0
- version-1.0/svi-shot.safetensors +3 -0
- version-1.0/svi-talk.safetensors +3 -0
- version-1.0/svi-tom.safetensors +3 -0
- version-2.0/SVI_Wan2.1-I2V-14B_lora_v2.0.safetensors +3 -0
- version-2.0/SVI_Wan2.2-I2V-A14B_high_noise_lora_v2.0.safetensors +3 -0
- version-2.0/SVI_Wan2.2-I2V-A14B_high_noise_lora_v2.0_pro.safetensors +3 -0
- version-2.0/SVI_Wan2.2-I2V-A14B_low_noise_lora_v2.0.safetensors +3 -0
- version-2.0/SVI_Wan2.2-I2V-A14B_low_noise_lora_v2.0_pro.safetensors +3 -0
.gitattributes
ADDED
@@ -0,0 +1,35 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
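The `.gitattributes` rules route matching files through Git LFS, which is why the weight files in this commit appear as three-line pointers rather than binary content. As a rough illustration, the sketch below checks a path against a subset of those patterns. It is an approximation, not Git's own matcher: it uses Python's `fnmatch` on the file's basename only, so path-based rules such as `saved_model/**/*` are deliberately left out. The names `LFS_PATTERNS` and `is_lfs_tracked` are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# A subset of the basename patterns listed in the .gitattributes hunk.
# Path-based rules (e.g. saved_model/**/*) are omitted from this sketch.
LFS_PATTERNS = ["*.safetensors", "*.bin", "*.ckpt", "*.pt", "*.pth", "*.zip", "*tfevents*"]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: match the file's basename against each pattern."""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for candidate in ("version-1.0/svi-dance.safetensors", "README.md"):
        print(candidate, is_lfs_tracked(candidate))
```

For an authoritative answer inside a checkout, `git check-attr filter -- <path>` reports the filter Git actually applies.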
README.md
ADDED
@@ -0,0 +1,68 @@
+---
+datasets:
+- vita-video-gen/svi-benchmark
+language:
+- en
+tags:
+- video generation
+pipeline_tag: image-to-video
+library_name: diffusers
+license: mit
+project_page: https://stable-video-infinity.github.io/homepage/
+papers:
+- title: 'Stable Video Infinity: Infinite-Length Video Generation with Error Recycling'
+  authors:
+  - Wuyang Li
+  - Wentao Pan
+  - Po-Chien Luan
+  - Yang Gao
+  - Alexandre Alahi
+  url: https://huggingface.co/papers/2510.09212
+  conference: arXiv preprint, 2025
+---
+
+<div align="center">
+
+<h1>Stable Video Infinity: Infinite-Length Video Generation with Error Recycling</h1>
+
+<p align="center">
+<a href="https://huggingface.co/papers/2510.09212"> <img src="https://img.shields.io/badge/Paper-HuggingFace-red?logo=huggingface&logoColor=yellow" alt="Paper on Hugging Face"/> </a>
+<a href="https://stable-video-infinity.github.io/homepage/"> <img src="https://img.shields.io/badge/Project-Page-green" alt="Project Page"/> </a>
+<a href="https://github.com/vita-epfl/Stable-Video-Infinity"> <img src="https://img.shields.io/badge/SVI-GitHub-black?logo=github&logoColor=white" alt="SVI on GitHub"/> </a>
+<a href="https://huggingface.co/datasets/vita-video-gen/svi-benchmark"> <img src="https://img.shields.io/badge/SVI_Dataset-Hugging%20Face-orange?logo=huggingface&logoColor=yellow" alt="SVI Dataset"/> </a>
+<a href="https://huggingface.co/vita-video-gen/svi-model"> <img src="https://img.shields.io/badge/SVI_models-Hugging%20Face-FFCC00?logo=huggingface&logoColor=yellow" alt="SVI Models"/> </a> </p> </div>
+
+## 🎯 About This Repository
+**Stable-Video-Infinity (SVI)** generates videos of ANY length with high temporal consistency, plausible scene transitions, and controllable streaming storylines in ANY domain.
+This repository contains the model weights of the SVI family.
+
+## 🌟 Key Highlights
+- **OpenSVI**: Everything is open-sourced: training & evaluation scripts, datasets, and more.
+- **Infinite Length**: No inherent limit on video duration; generate arbitrarily long stories (see the 10‑minute “Tom and Jerry” demo).
+- **Versatile**: Supports diverse in-the-wild generation tasks: multi-scene short films, single‑scene animations, skeleton-/audio-conditioned generation, cartoons, and more.
+- **Efficient**: Only LoRA adapters are tuned, requiring very little training data: anyone can make their own SVI easily.
+## 📦 Resources
+| **Model** | **Task** | **Input** | **Output** | **Hugging Face Link** | **Comments** |
+|-------|------|-------|--------|-------------------|------------------|
+| **ALL** | Infinite possibility | Image + X | X video | [🤗 Folder](https://huggingface.co/vita-video-gen/svi-model/tree/main/version-1.0) | Family bucket! I want to play with them all! |
+| **SVI-Shot** | Single-scene generation | Image + Text prompt | Long video | [🤗 Model](https://huggingface.co/vita-video-gen/svi-model/resolve/main/version-1.0/svi-shot.safetensors?download=true) | Generate a consistent long video from one text prompt. (This will never drift) |
+| **SVI-Film** | Multi-scene generation | Image + Text prompt stream | Film-style video | [🤗 Model](https://huggingface.co/vita-video-gen/svi-model/resolve/main/version-1.0/svi-film.safetensors?download=true) | Generate a creative long video from one text prompt stream (5 seconds per prompt). |
+| **SVI-Film (Transition)** | Multi-scene generation | Image + Text prompt stream | Film-style video | [🤗 Model](https://huggingface.co/vita-video-gen/svi-model/resolve/main/version-1.0/svi-film-transitions.safetensors?download=true) | Generate a creative long video from one text prompt stream. (More scene transitions due to the training data) |
+| **SVI-Tom&Jerry** | Cartoon animation | Image | Cartoon video | [🤗 Model](https://huggingface.co/vita-video-gen/svi-model/resolve/main/version-1.0/svi-tom.safetensors?download=true) | Generate creative long cartoon videos from one text prompt stream. (No drift observed in our 20-minute test) |
+| **SVI-Talk** | Talking head | Image + Audio | Talking video | [🤗 Model](https://huggingface.co/vita-video-gen/svi-model/resolve/main/version-1.0/svi-talk.safetensors?download=true) | Generate long videos of audio-conditioned human speech |
+| **SVI-Dance** | Dancing animation | Image + Skeleton | Dance video | [🤗 Model](https://huggingface.co/vita-video-gen/svi-model/resolve/main/version-1.0/svi-dance.safetensors?download=true) | Generate long videos of skeleton-conditioned human dancing |
+
+Note: If you want to play with T2V, you can directly use SVI with an image generated by any T2I model!
+
+## 📝 Citation
+If you find our work helpful for your research, please consider citing our paper. Thank you so much!
+
+```bibtex
+@article{li2025stable,
+title={Stable Video Infinity: Infinite-Length Video Generation with Error Recycling},
+author={Wuyang Li and Wentao Pan and Po-Chien Luan and Yang Gao and Alexandre Alahi},
+journal={arXiv preprint arXiv:2510.09212},
+year={2025},
+url={https://huggingface.co/papers/2510.09212},
+}
+```
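The download links in the README's Resources table all follow the same pattern: repository id, then `resolve`, then a revision, then the file path. A minimal sketch of building such a link is shown here, assuming that pattern holds for the other files in this commit as well. The helper name `resolve_url` and its parameters are hypothetical, chosen for illustration.

```python
from urllib.parse import quote, urlencode

HUB = "https://huggingface.co"


def resolve_url(repo_id: str, filename: str, revision: str = "main", download: bool = True) -> str:
    """Build a direct-download link in the same shape as the README's table links."""
    # Keep "/" in the file path unescaped; escape everything else that needs it.
    path = f"{HUB}/{repo_id}/resolve/{quote(revision, safe='')}/{quote(filename)}"
    return f"{path}?{urlencode({'download': 'true'})}" if download else path


if __name__ == "__main__":
    print(resolve_url("vita-video-gen/svi-model", "version-1.0/svi-shot.safetensors"))
```

In practice, the `huggingface_hub` package's `hf_hub_download(repo_id=..., filename=...)` is the more common route, since it also handles caching; the URL form is mainly useful for plain HTTP clients.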
version-1.0/svi-dance.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e762aa261a493332496d0e0b98c104908b09dd16ddf4fbcec678fdfea54830cb
+size 2455245856
version-1.0/svi-film-opt-10212025.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:21ccac80d313bec2aa5cf7242c8a89ba58c62bf5db88f149efd01fed4a0c909d
+size 2453769560
version-1.0/svi-film-transitions.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d1f31151d84f32077e029fe0d92ff9c8cef7060e3787d6db98e282d8902d98a1
+size 2453769560
version-1.0/svi-film.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:216336375018faa6980ce8dcdacb5e0517fc49ee957c4b1915dc8d13f79d49cc
+size 2453769560
version-1.0/svi-shot.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fa100103b04b034bb7968b8d61a608b9177a2618277d8e8dd5c40abf8dafb9cb
+size 2453769560
version-1.0/svi-talk.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a906399d9cbe14b1cdbbb9b9d14b300b5e2bd30a21998f97d1e5e03a849aed8c
+size 2453769560
version-1.0/svi-tom.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4b57722b23ce6819cae35734665ac41253a9be3bea466660ef6139f05eccda68
+size 2453769560
version-2.0/SVI_Wan2.1-I2V-14B_lora_v2.0.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5730a21842772595dea8db838ad9b8ffc16e50a4fcebfe7e0ec224b788aceb29
+size 2453769560
version-2.0/SVI_Wan2.2-I2V-A14B_high_noise_lora_v2.0.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8e8d8ed323dca40439a5ff3829432008320de13ab6896b83c36d0eaa00783aef
+size 1226928552
version-2.0/SVI_Wan2.2-I2V-A14B_high_noise_lora_v2.0_pro.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:299b33006863194d077a43bc0abf16fc52963457657d867763f2b61fd6a9bd52
+size 1226928552
version-2.0/SVI_Wan2.2-I2V-A14B_low_noise_lora_v2.0.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:febf40f1e1a6696e8551165eb72808f37e8f3ec59031280afce5740a580aca8b
+size 1226928552
version-2.0/SVI_Wan2.2-I2V-A14B_low_noise_lora_v2.0_pro.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e8fcce153df0f5a2b49a17c2f82bd795002f0e3b35f25d6922da9cfe072b9c0b
+size 1226928552
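Each `.safetensors` entry in this commit is a Git LFS pointer: a `version` line, an `oid` carrying a SHA-256 digest, and a `size` in bytes. One way to use those fields is to confirm that a downloaded weight file matches its pointer. The sketch below assumes the three-field layout shown in the hunks; `LfsPointer`, `parse_lfs_pointer`, and `matches_pointer` are hypothetical names, not part of any Git or Hugging Face tooling.

```python
import hashlib
from typing import NamedTuple


class LfsPointer(NamedTuple):
    version: str
    algo: str
    digest: str
    size: int


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return LfsPointer(fields["version"], algo, digest, int(fields["size"]))


def matches_pointer(path: str, pointer: LfsPointer, chunk_size: int = 1 << 20) -> bool:
    """Stream a local file and compare its byte count and digest to the pointer."""
    hasher = hashlib.new(pointer.algo)
    total = 0
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer.size and hasher.hexdigest() == pointer.digest


if __name__ == "__main__":
    pointer = parse_lfs_pointer(
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:e762aa261a493332496d0e0b98c104908b09dd16ddf4fbcec678fdfea54830cb\n"
        "size 2455245856\n"
    )
    print(pointer.algo, pointer.size)
```

Reading in chunks keeps memory use flat, which matters for files of the sizes recorded in the pointers.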