fix(spaces): trim preload_from_hub to 10 entries (HF cap) + future-work doc
HF caps preload_from_hub at 10 entries; we had 16 → Configuration error.
Dropped the 6 lightest entries (4 Dolly camera LoRAs at ~0.30 GB each,
Motion-Track IC-LoRA at 0.30 GB, Pose-Control IC-LoRA at 0.61 GB).
Those still lazy-download on first use of their respective modes.
Captured deeper optimizations (drop unused 84 GB Lightricks-side
transformer preload, drop GGUF, auto-sync README<->registry, etc.) in
docs/future_improvements.md for follow-up rather than bundling here.
- README.md +0 -6
- docs/future_improvements.md +98 -0
README.md
CHANGED
@@ -13,16 +13,10 @@ preload_from_hub:
  - Comfy-Org/ltx-2 split_files/text_encoders/gemma_3_12B_it.safetensors
  - Kijai/LTX2.3_comfy diffusion_models/ltx-2.3-22b-dev_transformer_only_bf16.safetensors,loras/ltx-2.3-22b-distilled-lora-dynamic_fro09_avg_rank_105_bf16.safetensors,text_encoders/ltx-2.3_text_projection_bf16.safetensors,vae/LTX23_audio_vae_bf16.safetensors,vae/LTX23_video_vae_bf16.safetensors,vae/taeltx2_3.safetensors
  - Lightricks/LTX-2-19b-IC-LoRA-Detailer ltx-2-19b-ic-lora-detailer.safetensors
- - Lightricks/LTX-2-19b-IC-LoRA-Pose-Control ltx-2-19b-ic-lora-pose-control.safetensors
- - Lightricks/LTX-2-19b-LoRA-Camera-Control-Dolly-In ltx-2-19b-lora-camera-control-dolly-in.safetensors
- - Lightricks/LTX-2-19b-LoRA-Camera-Control-Dolly-Left ltx-2-19b-lora-camera-control-dolly-left.safetensors
- - Lightricks/LTX-2-19b-LoRA-Camera-Control-Dolly-Out ltx-2-19b-lora-camera-control-dolly-out.safetensors
- - Lightricks/LTX-2-19b-LoRA-Camera-Control-Dolly-Right ltx-2-19b-lora-camera-control-dolly-right.safetensors
  - Lightricks/LTX-2-19b-LoRA-Camera-Control-Jib-Down ltx-2-19b-lora-camera-control-jib-down.safetensors
  - Lightricks/LTX-2-19b-LoRA-Camera-Control-Jib-Up ltx-2-19b-lora-camera-control-jib-up.safetensors
  - Lightricks/LTX-2-19b-LoRA-Camera-Control-Static ltx-2-19b-lora-camera-control-static.safetensors
  - Lightricks/LTX-2.3 ltx-2.3-22b-dev.safetensors,ltx-2.3-22b-distilled-lora-384.safetensors,ltx-2.3-22b-distilled.safetensors,ltx-2.3-spatial-upscaler-x2-1.0.safetensors
- - Lightricks/LTX-2.3-22b-IC-LoRA-Motion-Track-Control ltx-2.3-22b-ic-lora-motion-track-control-ref0.5.safetensors
  - Lightricks/LTX-2.3-22b-IC-LoRA-Union-Control ltx-2.3-22b-ic-lora-union-control-ref0.5.safetensors
  - google/gemma-3-12b-it-qat-q4_0-unquantized gemma-3-12b-it/model-00001-of-00005.safetensors,gemma-3-12b-it/model-00002-of-00005.safetensors,gemma-3-12b-it/model-00003-of-00005.safetensors,gemma-3-12b-it/model-00004-of-00005.safetensors,gemma-3-12b-it/model-00005-of-00005.safetensors,gemma-3-12b-it/model.safetensors.index.json,gemma-3-12b-it/preprocessor_config.json,gemma-3-12b-it/tokenizer.model
  - unsloth/LTX-2.3-GGUF ltx-2.3-22b-dev-BF16.gguf
docs/future_improvements.md
ADDED
@@ -0,0 +1,98 @@
# Future improvements

A backlog of optimizations that aren't blocking but would tighten the deploy.
None of these are required for current functionality. Order is rough priority,
not commitment.

## Spaces / preload

### 1. Stop preloading models that aren't referenced by any workflow

Audit on 2026-05-02 (`tools/audit-models` style script) showed two `Lightricks/LTX-2.3`
files in `preload_from_hub` that aren't referenced by any workflow JSON
we ship:

- `ltx-2.3-22b-dev.safetensors` (~42 GB)
- `ltx-2.3-22b-distilled.safetensors` (~42 GB)

The active path uses `Kijai/LTX2.3_comfy diffusion_models/ltx-2.3-22b-dev_transformer_only_bf16.safetensors`.
Removing both saves ~84 GB of preload bandwidth/storage. Risk: if a future
workflow update reintroduces the Lightricks-side filenames, lazy download
takes over (slow first inference), which is an acceptable tradeoff.

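A minimal sketch of the kind of audit described above, assuming the workflow JSONs mention model files as plain string values ending in `.safetensors` or `.gguf`. The function names and the directory layout are hypothetical; the actual `tools/audit-models` style script is not shown in this commit and may work differently.

```python
import json
from pathlib import Path

MODEL_EXTS = (".safetensors", ".gguf")


def referenced_filenames(workflow_dir):
    """Collect the basenames of all model files mentioned in the workflow JSONs."""
    found = set()

    def walk(node):
        # Workflow JSON is arbitrarily nested, so visit dicts and lists recursively.
        if isinstance(node, dict):
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for value in node:
                walk(value)
        elif isinstance(node, str) and node.endswith(MODEL_EXTS):
            # Compare on basename so "loras/foo.safetensors" matches "foo.safetensors".
            found.add(node.replace("\\", "/").rsplit("/", 1)[-1])

    for path in sorted(Path(workflow_dir).glob("*.json")):
        walk(json.loads(path.read_text(encoding="utf-8")))
    return found


def unreferenced(preload_files, workflow_dir):
    """Return the preload files whose basename no workflow mentions."""
    used = referenced_filenames(workflow_dir)
    return sorted(f for f in preload_files if f.rsplit("/", 1)[-1] not in used)
```

Calling `unreferenced(...)` with the file paths listed under `preload_from_hub` and the directory holding the shipped workflows would return the candidates for removal. Matching on basename is a simplification; two repos shipping the same filename would not be told apart.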
### 2. Drop `unsloth/LTX-2.3-GGUF` from preload (~39 GB)

The GGUF transformer is the low-VRAM alternative. ZeroGPU exposes 70 GB of
H200 VRAM, so the BF16 transformer always fits. Lazy-load the GGUF once a
future "Low VRAM" preset actually wires up the GGUF path.

### 3. Drop the `Lightricks/LTX-2-19b-LoRA-Camera-Control-Static/Jib-Up/Jib-Down` preload

Each is ~2 GB. The Power Lora Loader lists them all but defaults each to
`on: false`, so they load only when the user picks one. Lazy-loading is
appropriate. They are currently kept in preload only because they fit under
the 10-entry cap and it was "easier to keep what we had".

### 4. Auto-generate `preload_from_hub` from `MODEL_REGISTRY`

Today the README list and `MODEL_REGISTRY` in `models.py` can drift. Build a
small `tools/sync_preload.py` that:

1. Reads `MODEL_REGISTRY`
2. Walks the workflow JSONs to find which entries are actually referenced
3. Sorts referenced entries by size, largest first (using `huggingface_hub` `repo_info`)
4. Picks the largest N entries that fit under the 10-entry cap
5. Writes them back into the README YAML

Run as a pre-commit or CI step.

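Steps 2 to 5 above could be sketched roughly as follows. This assumes `MODEL_REGISTRY` can be reduced to a `{repo_id: [filenames]}` mapping, which is a guess about its shape, and it takes file sizes as a plain dict rather than querying the Hub. In a real script the sizes could come from `HfApi().model_info(repo_id, files_metadata=True)`, whose `siblings` carry a `size` field. The function names are hypothetical.

```python
HF_PRELOAD_CAP = 10  # entry cap reported in the commit message


def pick_preload_entries(registry, referenced, sizes, cap=HF_PRELOAD_CAP):
    """Choose which repos to preload.

    registry:   {repo_id: [filename, ...]}  (assumed shape of MODEL_REGISTRY)
    referenced: set of filenames that the workflow JSONs actually use
    sizes:      {filename: size_in_bytes}
    """
    candidates = []
    for repo_id, files in registry.items():
        used = [f for f in files if f in referenced]
        if used:  # skip repos that no workflow references
            total = sum(sizes.get(f, 0) for f in used)
            candidates.append((total, repo_id, used))
    # Largest repos first; ties broken by repo id so the output is stable.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [(repo_id, used) for _, repo_id, used in candidates[:cap]]


def to_yaml_lines(entries):
    """Render entries in the `- repo file1,file2` form used in the README."""
    return [f"- {repo_id} {','.join(files)}" for repo_id, files in entries]
```

Preloading the largest entries first is one reading of "sorts by size"; it keeps the slowest downloads out of the first inference and leaves the small LoRAs to lazy download, which matches what this commit did by hand.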
### 5. Bake custom-node clones into the build via `requirements.txt` git installs

We currently `git clone` 10 custom-node repos in `_bootstrap()` at runtime.
That's ~30 s of cold start. Some custom nodes are pip-installable; for
the others, we could write a small `tools/install_custom_nodes.py` that
runs at build time (via `pip install --no-deps` against git URLs) so the
repos land in the image instead of being fetched at boot.

Tradeoff: Spaces' build pipeline runs the Gradio SDK Dockerfile, which we
don't control directly. The custom-node clone has to happen at runtime
unless we can move it into the standard `requirements.txt` build step.

### 6. Persistent storage add-on as the "$25/mo button"

If iteration speed becomes the binding constraint, the persistent storage
add-on (Spaces > Settings) at $25/mo for 150 GB makes everything just work:
`/data` is writable, models live there forever, no preload dance.
Sketched approach: `HF_HOME=/data/hf-cache` env var + `_bootstrap()` mkdir
fallback. One-line code change.

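A rough sketch of what the `HF_HOME` + mkdir fallback might look like inside `_bootstrap()`. The function name, the fallback path, and the injectable `env` parameter are illustrative assumptions, not the project's actual code.

```python
import os
from pathlib import Path


def configure_hf_home(persistent_root="/data",
                      fallback="~/.cache/huggingface",
                      env=os.environ):
    """Point HF_HOME at persistent storage when it is mounted and writable.

    Falls back to a local cache directory when `persistent_root` is missing,
    e.g. on a Space without the storage add-on or on a developer machine.
    """
    root = Path(persistent_root)
    if root.is_dir() and os.access(root, os.W_OK):
        cache = root / "hf-cache"
    else:
        cache = Path(fallback).expanduser()
    cache.mkdir(parents=True, exist_ok=True)
    env["HF_HOME"] = str(cache)
    return cache
```

The call would need to run before `huggingface_hub` is imported, since the library resolves its cache location from the environment at import time.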
## Workflow / runtime

### 7. Move ComfyUI custom-node `requirements.txt` install to build time

Bootstrap currently `pip install`s each custom node's requirements at
runtime. Most are no-ops (deps already in our top-level `requirements.txt`),
but the `pip install --quiet` calls still take a few seconds each. We could
audit them and merge any missing deps into the top-level `requirements.txt`.

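The audit mentioned above could start from something like the sketch below, which only compares package names and ignores version specifiers, extras, and environment markers. Both function names are hypothetical, and a real audit would also need to check that version pins are compatible.

```python
import re


def requirement_names(text):
    """Extract normalized package names from the text of a requirements file."""
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # blank lines and pip options such as -r / -e
        if re.match(r"[A-Za-z][A-Za-z0-9+.-]*://", line):
            continue  # bare URL requirement; no package name to compare
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            # PEP 503 normalization: runs of -, _ and . are equivalent.
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


def missing_from_top_level(top_level_text, custom_node_texts):
    """Map each custom node to the packages its requirements add on top of ours."""
    have = requirement_names(top_level_text)
    report = {}
    for node, text in custom_node_texts.items():
        extra = sorted(requirement_names(text) - have)
        if extra:
            report[node] = extra
    return report
```

An empty report would suggest the runtime `pip install` calls are all no-ops and can be dropped; anything listed is a candidate to merge into the top-level file.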
### 8. Clean up `nodes_replacements.py` warning

ComfyUI core at our pinned commit (`eb0686bb`) emits
`'function' object has no attribute 'register'` because the node-replacement
API surface is incomplete at that SHA. Bumping `COMFYUI_COMMIT` to a newer
tag should silence it. Purely cosmetic, no functional impact.

### 9. Auto-close drawer when user navigates away from header

Currently relies on a document-level click listener. It works but has a
microsecond race when the click target is between elements. Could use
`pointerleave` on the drawer instead.

## Cost-of-running

### 10. Trim ZeroGPU duration cap

Currently `@spaces.GPU(duration=300)` reserves 5 min per call. For the Fast preset
(distilled, 8 steps) actual usage is ~30 s. Could shorten to 120 s, which improves
queue priority for the user (per HF docs). Alternatively, use a dynamic duration
based on the preset.
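
A sketch of the per-preset duration idea. As far as the ZeroGPU documentation describes it, `spaces.GPU` accepts a callable for `duration` that receives the same arguments as the decorated function and returns a number of seconds. The `generate(prompt, preset)` signature and the budget table are assumptions for illustration; only the 120 s figure for the Fast preset and the 300 s default come from the text above.

```python
# Hypothetical per-preset GPU budgets in seconds. Presets not listed here
# fall back to the current 300 s reservation.
PRESET_DURATION_S = {"Fast": 120}
DEFAULT_DURATION_S = 300


def get_duration(prompt, preset="Fast", *args, **kwargs):
    """Return the GPU reservation, in seconds, for a given preset."""
    return PRESET_DURATION_S.get(preset, DEFAULT_DURATION_S)


# On the Space, the callable would replace the fixed duration:
#
#   import spaces
#
#   @spaces.GPU(duration=get_duration)
#   def generate(prompt, preset="Fast"):
#       ...
```

Keeping the budgets in one table makes it easy to tighten them as real timings are measured for each preset.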