---
language: en
license: apache-2.0
tags:
- text-encoder
- vae
- clip-vision
- wan-2.2
- comfyui
---

# Wan 2.2 shared encoders: VAE + UMT5 + CLIP-vision

**Organization:** [WindstormLabs](https://huggingface.co/WindstormLabs)
**Used by:** [SceneMachine](https://huggingface.co/SceneMachine)

SceneMachine is a sub-project of Windstorm Labs. This repo hosts shared AI/ML infrastructure used by SceneMachine and reusable by future Windstorm Labs sub-projects.

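## What this is

Shared encoder weights used across the Wan 2.2 T2V / I2V / Animate stacks:

- Wan 2.1 VAE (`wan_2.1_vae.safetensors`), which Wan 2.2 still uses.
- UMT5-xxl text encoder (`umt5_xxl_bf16_from_pth.safetensors`).
- SigLIP vision encoder, patch14-384 (`sigclip_vision_patch14_384.safetensors`), used for I2V.
- CLIP-ViT-H vision encoder (`clip_vision_h.safetensors`), **required** by Animate's `face_adapter`. Loading SigLIP there instead triggers a LayerNorm shape mismatch.
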
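The pairing of stacks and encoders described above can be written down as a small lookup. This is only a sketch: `ENCODERS`, `STACK_ROLES`, and `encoders_for` are hypothetical names, not part of any Wan or ComfyUI API. It also assumes that every stack loads the VAE and text encoder and that T2V needs no vision encoder, which this README does not state explicitly.

```python
# Encoder files in this repo, keyed by role (filenames as listed under "Files").
ENCODERS = {
    "vae": "wan_2.1_vae.safetensors",
    "text": "umt5_xxl_bf16_from_pth.safetensors",
    "siglip": "sigclip_vision_patch14_384.safetensors",
    "clip_h": "clip_vision_h.safetensors",
}

# Assumption: all stacks share the VAE and text encoder; T2V takes no vision encoder.
STACK_ROLES = {
    "t2v": ("vae", "text"),
    "i2v": ("vae", "text", "siglip"),
    "animate": ("vae", "text", "clip_h"),  # CLIP-ViT-H, not SigLIP
}


def encoders_for(stack: str) -> list[str]:
    """Return the encoder filenames a stack needs (hypothetical helper)."""
    key = stack.lower()
    if key not in STACK_ROLES:
        raise ValueError(f"unknown stack {stack!r}; expected one of {sorted(STACK_ROLES)}")
    return [ENCODERS[role] for role in STACK_ROLES[key]]


if __name__ == "__main__":
    print(encoders_for("animate"))
```

Keeping the Animate entry pinned to `clip_h` makes the LayerNorm mismatch harder to hit by accident, since the SigLIP file can never be returned for that stack.
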
## Upstream source

Primary distribution: https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged and, for CLIP-ViT-H, https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged

This repo is a **mirror**. The upstream license terms apply unchanged.

**Primary license owner(s):** Alibaba / Google T5 / Google SigLIP / OpenCLIP-laion

## Files

| Filename | Size |
|---|---|
| `umt5_xxl_bf16_from_pth.safetensors` | 11.36 GB |
| `clip_vision_h.safetensors` | 1.26 GB |
| `sigclip_vision_patch14_384.safetensors` | 0.86 GB |
| `wan_2.1_vae.safetensors` | 0.25 GB |

**Total: 13.74 GB**

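One way to pull these files into a ComfyUI install is `hf_hub_download` from the `huggingface_hub` package. The sketch below rests on several assumptions: that this repo's id is `WindstormLabs/wan22-encoders`, that the files sit at the repo root, and that your ComfyUI uses the `models/text_encoders`, `models/clip_vision`, and `models/vae` folders. `COMFY_SUBDIR`, `target_dir`, and `fetch` are hypothetical names used only for illustration.

```python
from pathlib import Path

REPO_ID = "WindstormLabs/wan22-encoders"  # assumption: the id of this repo

# Assumed ComfyUI layout: which models/ subfolder each file belongs in.
COMFY_SUBDIR = {
    "umt5_xxl_bf16_from_pth.safetensors": "text_encoders",
    "clip_vision_h.safetensors": "clip_vision",
    "sigclip_vision_patch14_384.safetensors": "clip_vision",
    "wan_2.1_vae.safetensors": "vae",
}


def target_dir(comfy_root, filename):
    """Folder under the ComfyUI root where `filename` should be placed."""
    return Path(comfy_root) / "models" / COMFY_SUBDIR[filename]


def fetch(comfy_root, filename):
    """Download one file from the repo into its ComfyUI folder; returns the local path."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub

    dest = target_dir(comfy_root, filename)
    dest.mkdir(parents=True, exist_ok=True)
    return hf_hub_download(repo_id=REPO_ID, filename=filename, local_dir=dest)
```

For example, `fetch("ComfyUI", "wan_2.1_vae.safetensors")` would fetch only the 0.25 GB VAE, which is a cheap way to confirm the setup before starting the 11.36 GB text encoder download.
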
## Related repos

- The full SceneMachine model stack: search the [SceneMachine HF collection](https://huggingface.co/SceneMachine).
- Shared encoders and Wan VAE: [`WindstormLabs/wan22-encoders`](https://huggingface.co/WindstormLabs/wan22-encoders).
- Speed LoRAs: [`WindstormLabs/wan22-loras`](https://huggingface.co/WindstormLabs/wan22-loras).

## Provenance

Mirror created 2026-05-13 from a local working copy on the SceneMachine development rig. File hashes are preserved by Hugging Face's content-addressed storage.

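Because the hashes are preserved, a downloaded copy can be checked locally. The following is a minimal sketch using only the standard library. It assumes you copy the expected SHA-256 digest yourself from the file's page on the Hub; `sha256_of` and `matches` are hypothetical helper names.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 in 1 MiB chunks and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA-256 equals the expected digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Reading in chunks keeps memory use flat, which matters for the 11.36 GB text encoder file.
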
🤖 Repo and README generated by [Claude Code](https://claude.com/claude-code) during a SceneMachine CTO session.