tedi-resemble committed on
Commit 2081283 · verified · 1 Parent(s): d7f5f1a

Drop xformers: FA3 Hopper kernel crashes on ZeroGPU H200

xformers==0.0.32.post2 raised 'CUDA error: invalid argument' from the bundled flash-attention 3 Hopper launch template at runtime on ZeroGPU's H200. ltx_core/model/transformer/attention.py has a try/except import on xformers.ops and falls back to PytorchAttention (torch.nn.functional.scaled_dot_product_attention), which on torch 2.8 + H200 uses cuDNN/mem-efficient SDPA backends. No app code change needed.
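
The optional-import fallback described above can be sketched as follows. This is a minimal illustration, not the actual contents of ltx_core/model/transformer/attention.py: the function name `pick_attention_backend` and the returned labels are hypothetical, and the real module presumably selects an attention class rather than a string.

```python
import importlib


def pick_attention_backend(module_name: str = "xformers.ops") -> str:
    """Return "xformers" if the optional module imports, else "pytorch_sdpa".

    Hypothetical helper: it mirrors a module-level try/except import, where a
    missing package silently selects the PyTorch SDPA path instead.
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Package not installed (e.g. removed from requirements.txt):
        # fall back to torch.nn.functional.scaled_dot_product_attention.
        return "pytorch_sdpa"
    return "xformers"


if __name__ == "__main__":
    print(pick_attention_backend())
```

Note that a fallback of this kind only covers import-time failures. The crash in this commit happened at kernel launch, after a successful import, which is why the package has to be removed rather than left installed.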

Files changed (1)
  1. requirements.txt +5 -1
requirements.txt CHANGED
@@ -8,7 +8,11 @@ pydantic==2.10.6
 safetensors>=0.4.0
 accelerate>=0.25.0
 peft>=0.7.0
-xformers==0.0.32.post2
+# xformers removed: 0.0.32.post2's bundled FA3 Hopper kernel raises
+# "CUDA error: invalid argument" on ZeroGPU's H200 runtime. attention.py
+# falls back to torch SDPA cleanly when xformers is absent (try/except
+# import at module level), and SDPA on torch 2.8 has its own efficient
+# H200 backends.
 av>=12.0.0
 einops>=0.7.0
 PyYAML>=6.0