Drop xformers: FA3 Hopper kernel crashes on ZeroGPU H200
xformers==0.0.32.post2 raised 'CUDA error: invalid argument' at runtime on ZeroGPU's H200, from the launch template of its bundled flash-attention 3 Hopper kernel. ltx_core/model/transformer/attention.py wraps its xformers.ops import in a try/except and falls back to PytorchAttention (torch.nn.functional.scaled_dot_product_attention), which on torch 2.8 + H200 uses the cuDNN/mem-efficient SDPA backends. Because the failure happens at kernel launch rather than at import, the fallback only takes effect once xformers is not installed, so this commit removes it from requirements.txt. No app code change needed.
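For readers unfamiliar with the pattern, the import-time fallback described above can be sketched roughly as follows. This is an illustration, not the actual contents of attention.py: the function name select_attention_backend, the injectable import_module parameter, and the returned strings are hypothetical, and the real module may catch a different exception or structure the choice differently.

```python
import importlib
from typing import Callable


def select_attention_backend(
    import_module: Callable[[str], object] = importlib.import_module,
) -> str:
    """Return "xformers" if xformers.ops can be imported, else "sdpa".

    Hypothetical helper: the importer is injectable only so the behaviour
    can be exercised without xformers being installed.
    """
    try:
        # Only an import failure is caught here. A CUDA error raised later,
        # when a kernel is actually launched, would not reach this handler.
        import_module("xformers.ops")
    except ImportError:
        # Stand-in for PytorchAttention, i.e. the path that calls
        # torch.nn.functional.scaled_dot_product_attention.
        return "sdpa"
    return "xformers"


if __name__ == "__main__":
    print(select_attention_backend())
```

With xformers absent from the environment, the default call takes the except branch, which is the situation this commit creates by dropping the requirement.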
requirements.txt (+5 -1)

--- a/requirements.txt
+++ b/requirements.txt
@@ -8,7 +8,11 @@ pydantic==2.10.6
 safetensors>=0.4.0
 accelerate>=0.25.0
 peft>=0.7.0
-xformers
+# xformers removed: 0.0.32.post2's bundled FA3 Hopper kernel raises
+# "CUDA error: invalid argument" on ZeroGPU's H200 runtime. attention.py
+# falls back to torch SDPA cleanly when xformers is absent (try/except
+# import at module level), and SDPA on torch 2.8 has its own efficient
+# H200 backends.
 av>=12.0.0
 einops>=0.7.0
 PyYAML>=6.0