Efficient-Large-Model/SANA-WM_bidirectional

@@ -6,7 +6,6 @@ tags:
   - camera-control
   - world-model
   - diffusion
-library_name: NVlabs-Sana
 ---
 # SANA-WM (Bidirectional)
@@ -33,9 +32,9 @@ Four core designs drive the architecture:
 Paper: <https://arxiv.org/abs/2605.15178>
 ```bibtex
-@article{zhu2026sanawm,
+@article{sanawm2026,
   title   = {{SANA-WM}: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer},
-  author  = {Zhu, Haoyi and Liu, Haozhe and Zhao, Yuyang and Ye, Tian and Chen, Junsong and Yu, Jincheng and He, Tong and Han, Song and Xie, Enze},
+  author  = {Anonymous},
   journal = {arXiv preprint arXiv:2605.15178},
   year    = {2026},
 }
@@ -52,13 +51,10 @@ Paper: <https://arxiv.org/abs/2605.15178>
 | Inference config                   | `config.yaml`                             |     — |
 The Sana text encoder (`gemma-2-2b-it`) is **not** bundled here — it is
-fetched on demand from `Efficient-Large-Model/gemma-2-2b-it`.
+fetched on demand from the public Hugging Face mirror.
 ## Usage
-Install the inference repo (see [environment_setup_sana_wm.sh](https://github.com/NVlabs/Sana/blob/main/environment_setup_sana_wm.sh))
-and run:
 ```bash
 python inference_video_scripts/inference_sana_wm.py \
   --image      asset/sana_wm/demo_0.png \
@@ -91,5 +87,4 @@ aspect-preserving resized + center-cropped to that resolution.
 ## License
 Released under the Apache 2.0 license. The bundled LTX-2 refiner and VAE
-inherit the LTX-2 upstream license; see the parent NVlabs-Sana
-repository for details.
+inherit the LTX-2 upstream license.