Dramabox

tedi-resemble committed 12 days ago

Commit 2081283 (verified)

1 Parent(s): d7f5f1a

Drop xformers: FA3 Hopper kernel crashes on ZeroGPU H200

xformers==0.0.32.post2 raised 'CUDA error: invalid argument' from the bundled flash-attention 3 Hopper launch template at runtime on ZeroGPU's H200. ltx_core/model/transformer/attention.py has a try/except import on xformers.ops and falls back to PytorchAttention (torch.nn.functional.scaled_dot_product_attention), which on torch 2.8 + H200 uses cuDNN/mem-efficient SDPA backends. No app code change needed.
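
The fallback described above can be sketched as follows. This is a minimal illustration, not the actual contents of ltx_core/model/transformer/attention.py: the function name `pick_attention_backend`, the returned strings, and the injectable `import_module` parameter are all hypothetical. It shows why uninstalling the package is enough: the reported error surfaced at runtime rather than at import, so an import guard of this kind only takes the fallback branch once xformers is absent.

```python
import importlib


def pick_attention_backend(import_module=importlib.import_module):
    """Return which attention backend an import guard would select.

    Hypothetical helper: prefers xformers when `xformers.ops` imports,
    otherwise falls back to PyTorch scaled_dot_product_attention.
    """
    try:
        import_module("xformers.ops")
    except ImportError:
        # Package not installed (ModuleNotFoundError is an ImportError):
        # use the PyTorch SDPA path instead.
        return "pytorch_sdpa"
    return "xformers"


if __name__ == "__main__":
    print(pick_attention_backend())
```

The importer is passed in as a parameter only so that both branches can be exercised without installing or uninstalling anything.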

Files changed (1)

requirements.txt +5 -1

requirements.txt CHANGED Viewed

@@ -8,7 +8,11 @@ pydantic==2.10.6
 safetensors>=0.4.0
 accelerate>=0.25.0
 peft>=0.7.0
-xformers==0.0.32.post2
+# xformers removed: 0.0.32.post2's bundled FA3 Hopper kernel raises
+# "CUDA error: invalid argument" on ZeroGPU's H200 runtime. attention.py
+# falls back to torch SDPA cleanly when xformers is absent (try/except
+# import at module level), and SDPA on torch 2.8 has its own efficient
+# H200 backends.
 av>=12.0.0
 einops>=0.7.0
 PyYAML>=6.0
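
Because the new comment still mentions xformers by name, a plain text search of requirements.txt would still match it. One way to confirm that the package is no longer declared is to ignore comments when scanning the file. The sketch below is an assumption-laden illustration: `declared_packages` is a hypothetical helper, the sample text is abridged from the hunk above, and the parsing is deliberately naive (it does not handle `-r` includes or URL requirements).

```python
import re


def declared_packages(requirements_text):
    """Return lowercase package names declared in requirements text.

    Comment lines and trailing comments are stripped first, so a package
    that is only mentioned inside a comment is not reported.
    """
    names = set()
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(match.group(0).lower())
    return names


# Abridged copy of the post-commit hunk, used here as sample input.
NEW_HUNK = """\
safetensors>=0.4.0
accelerate>=0.25.0
peft>=0.7.0
# xformers removed: 0.0.32.post2's bundled FA3 Hopper kernel raises
# "CUDA error: invalid argument" on ZeroGPU's H200 runtime.
av>=12.0.0
einops>=0.7.0
PyYAML>=6.0
"""

if __name__ == "__main__":
    print("xformers" in declared_packages(NEW_HUNK))
```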