Brian9999 committed on
Commit
8625d6a
·
0 Parent(s):

Super-squash branch 'main' using huggingface_hub

Files changed (3)
  1. .gitattributes +35 -0
  2. README.md +79 -0
  3. model.safetensors +3 -0
.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,79 @@
+ ---
+ license: apache-2.0
+ tags:
+ - video-generation
+ - game-rendering
+ - game-editing
+ - diffusion
+ - g-buffer
+ - relighting
+ - text-to-video
+ - wan2.1
+ pipeline_tag: text-to-video
+ base_model: Wan-AI/Wan2.1-T2V-1.3B
+ datasets:
+ - custom
+ library_name: diffusers
+ ---
+
+ # Game Editing
+
+ **Game Editing** is a fine-tuned video diffusion model for controllable game video synthesis. It enables users to manipulate lighting and environmental effects in game footage via text prompts, conditioned on G-buffer inputs.
+
+ ## Model Details
+
+ | Attribute | Detail |
+ |-----------|--------|
+ | **Base Model** | [Wan 2.1-T2V-1.3B](https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B) |
+ | **Parameters** | 1.42B (BF16) |
+ | **Resolution** | 832 × 480 (480p) |
+ | **Frame Rate** | 16 FPS |
+ | **Clip Length** | 81 frames |
+ | **Format** | SafeTensors |
+
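As a quick sanity check on the table above, the checkpoint size follows directly from the parameter count: a BF16 weight occupies 2 bytes, so 1.42B parameters should come to roughly 2.84 GB, which matches the `model.safetensors` file shipped in this repository (2,839,060,368 bytes). A back-of-envelope sketch:

```python
# BF16 stores 2 bytes per parameter, so the checkpoint size is
# approximately (parameter count) x 2 bytes.
params = 1.42e9          # parameter count from the table above
bytes_per_param = 2      # bfloat16 = 16 bits = 2 bytes
approx_bytes = params * bytes_per_param

print(f"{approx_bytes / 1e9:.2f} GB")  # 2.84 GB
```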
+ ## Inputs
+
+ The model takes the following inputs:
+
+ - **G-buffers** as conditional inputs, providing dense geometric and material priors:
+   - **Basecolor** (albedo)
+   - **Normal** (surface normals)
+   - **Depth**
+   - **Roughness**
+   - **Metallic**
+ - **Text prompt** describing the desired lighting and environmental effects
+
+ The G-buffers encode the scene's geometry and materials, while the text prompt controls lighting conditions, atmospheric effects, and overall visual style. This decoupled design allows users to edit the visual appearance of game footage without altering the underlying scene structure.
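The exact conditioning interface depends on the released inference code; purely as an illustration of the channel layout, the five G-buffers for a single 480p frame can be stacked into one conditioning array. The channel counts here are assumptions (RGB basecolor and normals, scalar depth/roughness/metallic), not the model's actual API:

```python
import numpy as np

# Hypothetical G-buffer layout for one 832x480 frame.
# Assumed channels: basecolor (3) + normal (3) + depth (1)
# + roughness (1) + metallic (1) = 9 channels total.
H, W = 480, 832
basecolor = np.zeros((H, W, 3), dtype=np.float32)  # albedo RGB
normal    = np.zeros((H, W, 3), dtype=np.float32)  # surface normals
depth     = np.zeros((H, W, 1), dtype=np.float32)
roughness = np.zeros((H, W, 1), dtype=np.float32)
metallic  = np.zeros((H, W, 1), dtype=np.float32)

# Concatenate along the channel axis into one conditioning tensor.
gbuffer = np.concatenate(
    [basecolor, normal, depth, roughness, metallic], axis=-1
)
print(gbuffer.shape)  # (480, 832, 9)
```

For an 81-frame clip, the same layout would simply gain a leading time axis.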
+
+ ## Training
+
+ ### Architecture
+
+ We adapt [Wan 2.1-T2V-1.3B](https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B) by incorporating G-buffers (dense geometric and material priors) as conditional inputs. The model is fully fine-tuned following the original training configuration of the base model.
+
+ ### Data
+
+ The model is trained on video clips from the [**Black Myth: Wukong** dataset](https://github.com/ShandaAI/AlayaRenderer). Descriptive captions for each clip are generated using [Qwen3-VL-235B-A22B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-235B-A22B-Instruct). Since G-buffers already provide dense geometric and material information, the captions focus exclusively on **lighting and environmental effects**, enabling fine-grained text-based control over these attributes during inference.
+
+ ### Procedure
+
+ - **Full fine-tuning** on the [Black Myth: Wukong dataset](https://github.com/ShandaAI/AlayaRenderer)
+ - Spatial resolution: **832 × 480** (480p)
+ - Frame rate: **16 FPS**
+ - Clip length: **81 frames**
+
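The clip settings above imply a duration of just over five seconds per training clip. As an aside, 81 frames also fits the 4k+1 frame-count pattern that Wan 2.1's temporal VAE compression expects (an assumption based on the base model's released configurations, not stated in this card):

```python
# Duration of one training clip from the procedure above.
frames, fps = 81, 16
duration_s = frames / fps
print(duration_s)  # 5.0625 seconds

# 81 = 4 * 20 + 1, i.e. the 4k+1 form assumed above.
assert frames % 4 == 1
```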
+ ## Evaluation & Generalization
+
+ In the absence of directly comparable methods, we establish a baseline by adapting DiffusionRenderer's forward renderer with DiffusionLight-extracted environment maps as lighting conditions.
+
+ - A held-out subset of Black Myth: Wukong is used for testing.
+ - **Cross-dataset evaluation** on **Cyberpunk 2077** demonstrates strong generalization to unseen game environments, maintaining high-fidelity, controllable video synthesis.
+
+ ## Intended Use
+
+ - **Game video editing**: Manipulate lighting and environmental effects in game footage through text descriptions.
+ - **Controllable video synthesis**: Generate stylized game video conditioned on G-buffers and text prompts.
+
+ ## Citation
+
+ If you find this model useful, please consider citing our work.
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6ad3f92058110417e68e56c53fc11ce8f077a4dea2689c3a0fef43251d8da853
+ size 2839060368
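The file above is a Git LFS pointer: the actual weights are fetched separately, and the pointer records the blob's SHA-256 and byte size. After downloading, a local copy can be checked against the pointer; the sketch below assumes you pass the path of the downloaded `model.safetensors`:

```python
import hashlib
import os

# Values copied from the LFS pointer above.
EXPECTED_SHA256 = "6ad3f92058110417e68e56c53fc11ce8f077a4dea2689c3a0fef43251d8da853"
EXPECTED_SIZE = 2839060368  # bytes

def verify(path: str) -> bool:
    """Return True if the file at `path` matches the LFS pointer."""
    # Cheap size check first; avoids hashing an obviously wrong file.
    if os.path.getsize(path) != EXPECTED_SIZE:
        return False
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks to keep memory use flat.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == EXPECTED_SHA256
```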