---
license: apache-2.0
language:
- en
base_model:
- Wan-AI/Wan2.1-I2V-14B-720P
- Wan-AI/Wan2.1-I2V-14B-720P-Diffusers
pipeline_tag: image-to-video
tags:
- text-to-image
- lora
- diffusers
- template:diffusion-lora
widget:
- text: >-
    gangtiexia,背景保持不变,这个人开始变身白色机甲,变身过程中出现机甲面罩遮住脸部,变身完成之后这个人向前走
  output:
    url: result/output1.mp4
- text: >-
    gangtiexia,背景保持不变,这个人开始变身粉色机甲,变身过程中出现机甲面罩遮住脸部,变身完成之后这个人向前走
  output:
    url: result/output2.mp4
- text: >-
    gangtiexia,背景保持不变,这个人开始变身金色机甲,变身过程中出现机甲面罩遮住脸部,变身完成之后这个人向前走
  output:
    url: result/output3.mp4
- text: >-
    gangtiexia,背景保持不变,这个人开始变身金色机甲,变身过程中出现机甲面罩遮住脸部,变身完成之后这个人向前走
  output:
    url: result/output4.mp4
- text: >-
    gangtiexia,背景保持不变,这个人开始变身白色机甲,变身过程中出现机甲面罩遮住脸部,变身完成之后这个人向前走
  output:
    url: result/output5.mp4
- text: >-
    gangtiexia,背景保持不变,这个人开始变身红色机甲,变身过程中出现机甲面罩遮住脸部,变身完成之后这个人向前走
  output:
    url: result/output6.mp4

---
<div style="background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin-bottom: 20px;">
  <h1 style="color: #24292e; margin-top: 0;">valiantcat LoRA for Wan2.1 14B I2V 720p</h1>
  
  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Overview</h2>
    <p>This LoRA is trained on the Wan2.1 14B I2V 720p model.</p>
  </div>

  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Features</h2>
    <ul style="margin-bottom: 0;">
      <li>Transform any image of a person into a mecha-transformation video</li>
      <li>Trained on the Wan2.1 14B 720p I2V base model</li>
      <li>Consistent results across different object types</li>
      <li>Simple prompt structure that's easy to adapt</li>
    </ul>
  </div>
</div>

<Gallery />

# Model File and Inference Workflow

## 📥 Download Links:

- [wan2.1-Mecha.safetensors](./wan2.1-Mecha.safetensors) - LoRA Model File
- [wan2.1-exmple.json](./result/wan2.1-exmple.json) - Wan2.1 I2V with LoRA workflow for ComfyUI

## Using with Diffusers
```shell
pip install git+https://github.com/huggingface/diffusers.git
```

```py
import numpy as np
import torch
from diffusers import AutoencoderKLWan, WanImageToVideoPipeline
from diffusers.utils import export_to_video, load_image
from transformers import CLIPVisionModel

model_id = "Wan-AI/Wan2.1-I2V-14B-720P-Diffusers"
image_encoder = CLIPVisionModel.from_pretrained(model_id, subfolder="image_encoder", torch_dtype=torch.float32)
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanImageToVideoPipeline.from_pretrained(model_id, vae=vae, image_encoder=image_encoder, torch_dtype=torch.bfloat16)
pipe.to("cuda")

pipe.load_lora_weights("valiantcat/Wan2.1-Mecha-LoRA")

# For low-VRAM environments, use CPU offload instead of pipe.to("cuda"):
# pipe.enable_model_cpu_offload()

# English gloss of the prompt: "gangtiexia, keep the background unchanged,
# this person starts transforming into red mecha armor, a mecha face mask
# covers the face during the transformation, and the person walks forward
# once the transformation is complete."
prompt = "gangtiexia,背景保持不变,这个人开始变身红色机甲,变身过程中出现机甲面罩遮住脸部,变身完成之后这个人向前走."

# Note: use /resolve/ (not /blob/) so load_image fetches the raw file.
image = load_image("https://huggingface.co/valiantcat/Wan2.1-Mecha-LoRA/resolve/main/result/test.jpg")

# Snap the output resolution to the model's spatial granularity while
# keeping the input aspect ratio and staying within max_area pixels.
max_area = 512 * 768
aspect_ratio = image.height / image.width
mod_value = pipe.vae_scale_factor_spatial * pipe.transformer.config.patch_size[1]
height = round(np.sqrt(max_area * aspect_ratio)) // mod_value * mod_value
width = round(np.sqrt(max_area / aspect_ratio)) // mod_value * mod_value
image = image.resize((width, height))

output = pipe(
    image=image,
    prompt=prompt,
    height=height,
    width=width,
    num_frames=81,
    guidance_scale=5.0,
    num_inference_steps=25,
).frames[0]
export_to_video(output, "output.mp4", fps=16)
```
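The resolution snapping above can be checked in isolation. For this model the granularity works out to 16 pixels (VAE spatial scale factor 8 × transformer patch size 2); that value is hard-coded below as an assumption rather than read from the pipeline:

```python
import math

def target_size(img_width, img_height, max_area=512 * 768, mod_value=16):
    """Largest mod_value-aligned (width, height) that keeps the input
    aspect ratio while staying at or under max_area pixels."""
    aspect_ratio = img_height / img_width
    height = round(math.sqrt(max_area * aspect_ratio)) // mod_value * mod_value
    width = round(math.sqrt(max_area / aspect_ratio)) // mod_value * mod_value
    return width, height

# A 768x1024 portrait input is resized to 528x720 (380,160 px <= 393,216 px).
print(target_size(768, 1024))  # -> (528, 720)
```

At 81 frames exported with `fps=16`, the resulting clip runs just over five seconds (81 / 16 ≈ 5.06 s).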

---
<div style="background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin-bottom: 20px;">
  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Recommended Settings</h2>
    <ul style="margin-bottom: 0;">
      <li><b>LoRA Strength:</b> 1.0</li>
      <li><b>Embedded Guidance Scale:</b> 6.0</li>
      <li><b>Flow Shift:</b> 5.0</li>
    </ul>
  </div>

  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Trigger Words</h2>
    <p>The key trigger phrase is: <code style="background-color: #f0f0f0; padding: 3px 6px; border-radius: 4px;">gangtiexia</code></p>
  </div>

  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Prompt Template</h2>
    <p>For best results, use this prompt structure:</p>
    <div style="background-color: #f0f0f0; padding: 12px; border-radius: 6px; margin: 10px 0;">
      <i>gangtiexia,背景保持不变,这个人开始变身[color]色机甲,变身过程中出现机甲面罩遮住脸部,变身完成之后这个人向前走</i>
    </div>
    <p>In English: "gangtiexia, keep the background unchanged, this person starts transforming into a [color] mecha, a mecha face mask covers the face during the transformation, and the person walks forward once the transformation is complete."</p>
    <p>Simply replace <code style="background-color: #f0f0f0; padding: 3px 6px; border-radius: 4px;">[color]</code> with the Chinese word for the desired mecha color, e.g. 红 (red), 白 (white), or 金 (gold).</p>
  </div>
</div>
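The template can be wrapped in a small helper so that only the color changes between generations (the function name `build_prompt` is illustrative, not part of any API):

```python
def build_prompt(color: str) -> str:
    """Fill the mecha-color slot in the trigger-word prompt template.

    `color` is the Chinese word for the armor color, e.g. "红" (red),
    "白" (white), or "金" (gold).
    """
    return (
        "gangtiexia,背景保持不变,"
        f"这个人开始变身{color}色机甲,"
        "变身过程中出现机甲面罩遮住脸部,"
        "变身完成之后这个人向前走"
    )

print(build_prompt("红"))
```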