Update model card metadata, links, and sample usage

#1
by nielsr (HF Staff) - opened

Files changed (1): README.md (+40 −11)
README.md CHANGED
---
license: apache-2.0
pipeline_tag: any-to-any
tags:
- video-generation
- multimodal
- diffusion
---

# UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors

UniVidX is a unified multimodal video diffusion framework for versatile video generation and perception. It supports omni-directional conditional generation across multiple modalities by training a single model to handle different input-output mappings rather than one fixed task.

[**Project Page**](https://houyuanchen111.github.io/UniVidX.github.io/) | [**Paper**](https://huggingface.co/papers/2605.00658) | [**Code**](https://github.com/houyuanchen111/UniVidX)

This repository hosts the released UniVidX checkpoints:

- `univid_intrinsic.safetensors`: checkpoint for **UniVid-Intrinsic**, covering the RGB, albedo, irradiance, and normal video modalities.
- `univid_alpha.safetensors`: checkpoint for **UniVid-Alpha**, covering the blended RGB video, alpha matte, foreground, and background modalities.

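For scripting, the checkpoint-to-modality mapping above can be kept in a small lookup table. This is an illustrative sketch, not part of the UniVidX codebase: the file names and modality lists come from this card, while the helper function and modality keys are hypothetical.

```python
# Modality coverage of each released checkpoint, as listed in this card.
# The short modality keys are illustrative, not official identifiers.
CHECKPOINT_MODALITIES = {
    "univid_intrinsic.safetensors": ["rgb", "albedo", "irradiance", "normal"],
    "univid_alpha.safetensors": ["blended_rgb", "alpha_matte", "foreground", "background"],
}

def checkpoint_for(modality: str) -> str:
    """Return the released checkpoint file that covers the given modality."""
    for ckpt, modalities in CHECKPOINT_MODALITIES.items():
        if modality in modalities:
            return ckpt
    raise KeyError(f"No released checkpoint covers modality {modality!r}")
```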
## Sample Usage

To use these models, clone the official repository and set up the environment.

### 1. Installation

```bash
# Clone the repository
git clone https://github.com/houyuanchen111/UniVidX.git
cd UniVidX

# Create environment
conda create -n unividx python=3.10
conda activate unividx

# Install dependencies
pip install -r requirements.txt
```

### 2. Inference

The framework uses YAML configuration files to manage tasks. After downloading the backbone weights and the UniVidX checkpoints as described in the [GitHub README](https://github.com/houyuanchen111/UniVidX), you can run inference using:

```bash
# UniVid-Alpha inference
python scripts/inference_univid_alpha.py --config configs/univid_alpha_inference.yaml

# UniVid-Intrinsic inference
python scripts/inference_univid_intrinsic.py --config configs/univid_intrinsic_inference.yaml
```

The framework supports 15 different task modes (e.g., `t2RAIN`, `R2PFB`) for various conditional generation scenarios.
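Since each variant pairs one inference script with one YAML config, the command lines above can be assembled by a small dispatch helper. This is a sketch: the script and config paths are the ones shown in this card, but the helper itself is hypothetical.

```python
# Script/config pairs for the two released variants, as used in the
# inference commands shown in this card.
VARIANTS = {
    "alpha": ("scripts/inference_univid_alpha.py", "configs/univid_alpha_inference.yaml"),
    "intrinsic": ("scripts/inference_univid_intrinsic.py", "configs/univid_intrinsic_inference.yaml"),
}

def inference_command(variant: str) -> str:
    """Build the shell command that runs inference for a UniVidX variant."""
    script, config = VARIANTS[variant]
    return f"python {script} --config {config}"
```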

## Citation

If you find this work useful, please cite:

```
...
  doi = {10.1145/3811304},
  url = {https://doi.org/10.1145/3811304}
}
```