docs: revamp README + add AGENTS.md + refresh SKILLS.md
- README.md: open-source-style intro with badges, modes table, architecture
  diagram, project layout, tech stack, design, license + credits (Danielle
  Falco / FutuTek for the original All-In-One ComfyUI workflow).
- AGENTS.md: new tool-agnostic agent rulebook: seven rules, project shape,
  locked-architecture-decisions table, commit / verification / testing rules,
  out-of-scope list.
- SKILLS.md: cross-refs AGENTS.md from intro; push section now flags the
  master:main refspec (orphan-branch trap); repo-structure listing includes
  AGENTS.md.
AGENTS.md
ADDED
@@ -0,0 +1,136 @@
+# AGENTS.md
+
+Tool-agnostic agent guidance for the `ltx2.3-AIO-generator` repo. If you're driving Claude Code, Cursor, Aider, Codex, or anything else with file-edit + shell access, **start here**.
+
+This file is the authoritative project rulebook.
+
+- `CLAUDE.md`: Claude-specific extensions and the full gotchas catalogue (what & why).
+- `SKILLS.md`: process / how-to (debugging, deployment, when to commit, useful one-liners).
+- `README.md`: public-facing project intro (different audience).
+
+---
+
+## TL;DR: the seven rules
+
+1. **Mayank Gupta is sole author on every commit.** No agent co-author trailers. No "generated with…" footers. No `--author=` flag. Strip any tool-suggested attribution.
+2. **Backend is ComfyUI in library mode.** We call `comfy.execution.PromptExecutor` directly with our parameterized workflow JSONs. We do NOT subprocess ComfyUI and we do NOT swap to a different inference engine.
+3. **Six mode workflows live in `workflows/` as user-exported API-format JSON.** Do not hand-edit. The user re-exports from the ComfyUI editor when the master changes; `python tools/extract_modes.py --master ... --out workflows` regenerates the six mode files.
+4. **Models live in the HF cache.** Local: `~/.cache/huggingface/hub` symlinked into `comfyui/models/<comfy_type>/`. Spaces: build-time `preload_from_hub` + runtime mirror to `~/hf-cache-rw/`. Never commit `*.safetensors`, `*.gguf`, `*.bin`, `*.pt`.
+5. **Don't pin `spaces` in `requirements.txt`.** HF Spaces' ZeroGPU build injects its own version; pinning causes pip-resolve failure.
+6. **HF Space deploys from `main`. Local default branch is `master`.** Push with `git push space master:main`; a bare `git push space master` creates an orphan remote branch that does NOT trigger a deploy.
+7. **VRAM is ComfyUI's job.** The only `empty_cache()` calls live in `backend.py`'s try/finally. Don't sprinkle them elsewhere.
+
+If you can't satisfy any of these without changing architectural shape, **ask the user before proceeding.**
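
Rule 4's "never commit weight files" clause is the kind of rule that can be checked mechanically. The following is a minimal sketch of such a check, not something the repo is known to ship: the function name `find_forbidden_paths` and the idea of feeding it a list of staged paths (for example from a pre-commit hook) are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Extensions that rule 4 says must never be committed.
FORBIDDEN_SUFFIXES = {".safetensors", ".gguf", ".bin", ".pt"}


def find_forbidden_paths(staged_paths):
    """Return the staged paths whose final extension marks them as weights.

    Hypothetical helper: only the last suffix is inspected, so an index file
    such as model.safetensors.index.json is not flagged.
    """
    return [
        path
        for path in staged_paths
        if PurePosixPath(path).suffix.lower() in FORBIDDEN_SUFFIXES
    ]
```

A hook could call this with the staged file list and refuse the commit when the result is non-empty.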
+
+---
+
+## Project shape
+
+Single-process Gradio 5.50 app, flat top-level Python layout, ~3.5 k LOC excluding the ComfyUI submodule. ComfyUI itself is vendored as a git submodule locally and runtime-cloned into `~/comfyui` on HF Spaces.
+
+```
+app.py             Gradio Blocks entry · _bootstrap · _on_generate · header drawer
+backend.py         ComfyUILibraryBackend · @spaces.GPU · _execute_workflow · duration estimator
+modes.py           MODE_REGISTRY + per-mode parameterize_fn + node-id constants
+models.py          MODEL_REGISTRY · walk_workflow_for_models · ensure_models_for_mode
+ui.py              render_status · _render_idle · mode-form layout primitives
+workflow.py        load_template · set_input helpers
+workflows/         Six API-format mode JSONs: DO NOT hand-edit
+assets/            Seed input placeholders for cold-start staging
+tools/             extract_modes.py: regenerate workflows/ from a master export
+docs/superpowers/  Spec + plan + brainstorm artifacts (per feature)
+tests/             L1 + L2 + L3; GPU smoke gated by --gpu
+comfyui/           Submodule (local) / runtime clone target (Spaces)
+```
+
+Same code path everywhere. The only branching is in `_bootstrap()` (cache-mirror dance on Spaces; plain symlink locally) and the `@spaces.GPU` decorator (identity off Spaces).
+
+---
+
+## Locked architecture decisions
+
+These came out of 100+ commits of iteration. Do not relitigate.
+
+| Decision | Why | Code reference |
+|---|---|---|
+| ComfyUI library mode (no subprocess) | Direct executor access; shared Python process for model lifecycle and progress reporting | `backend.ComfyUILibraryBackend.__init__` |
+| Six API-format workflow JSONs | API format (`{node_id: {class_type, inputs}}`) is what `PromptExecutor` + `walk_workflow_for_models` consume. Editor format silently fails. | `workflows/*.json` |
+| Workflow parameterization via patches | The user owns workflow shape via ComfyUI editor exports; we only patch leaf inputs. Never hand-edit JSON. | `modes.parameterize_fn` |
+| Custom nodes pinned to SHAs (not branches) | Reproducible builds; `git clone --branch <SHA>` is unsupported, so we use a `_git_clone()` init+fetch+checkout helper | `app.CUSTOM_NODES_PINNED`, `app._git_clone` |
+| HF cache → `comfyui/models/<type>/` symlinks | Avoids duplicate weight copies; HF cache stays the single source of truth | `models.symlink_hf_cache_to_comfy_layout` |
+| `_mirror_preload_hf_cache()` on Spaces | preload_from_hub files are owned by the build user (root-ish); the runtime user (uid 1000) can't write to them. Hardlink blobs + copy refs into a writable mirror under `~/hf-cache-rw/`. | `app._mirror_preload_hf_cache` |
+| `_comfy_dir()` per-platform | `~/comfyui` on Spaces (writable HOME); `<repo>/comfyui` locally. Both `app.py` and `backend.py` must agree. | `app._comfy_dir`, `backend._comfy_dir` |
+| `@spaces.GPU(duration=callable)` applied module-level | Runtime decoration isn't detected by ZeroGPU's startup analyzer. Static decorator + dynamic-duration callable is the supported pattern. | `backend._execute_workflow` |
+| Per-call duration estimator | A one-size-fits-all 600 s caps everything in the slow queue lane. Estimator reads frames + mode + preset + cold-cache buffer, clamps `[60, 900] s`. | `backend._duration_for` |
+| Auto-retry once at 2× on timeout | `"GPU task aborted"` is the queue-eviction signal; one retry catches transient busy queues. | `app._on_generate` |
+| `allowed_paths=[output_dir]` on launch | Gradio 5 refuses files outside cwd / temp / `allowed_paths`. ComfyUI writes to `~/comfyui/output/...` on Spaces, which is outside the app cwd. | `app.app.launch(...)` |
+| Header `z-index: 15` default / `60` elevated | HF injects `#huggingface-space-header` at fixed z-index 20. Default keeps our header below it (HF widget visible); drawer-open bumps above the scrim. | `_CUSTOM_CSS` `.aio-header` |
+| Click-outside dismisser in `gr.Blocks(head=…)` | Gradio strips `<script>` tags inside `gr.HTML`. Inline scripts have to live in `<head>` to actually run. | `app._HEAD_HTML` |
+| Mode tag in header via inline `onclick` | `onclick="…"` attributes on plain HTML buttons survive sanitization (unlike inline `<script>`). Lets us update the tag without a server round-trip. | mode buttons in `build_app` |
+| Topaz Cinema Slate theme | Locked from brainstorming round. Slate `#1A1F26` + amber accent `#E0A458` + IBM Plex Sans. | `app._TOPAZ_THEME`, `_CUSTOM_CSS` |
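
The "init + fetch + checkout" pattern named in the pinned-SHA row can be sketched as below. This is an illustration of the general git technique, not the repo's actual `_git_clone` helper: the function names, argument order, and the use of a shallow fetch are assumptions.

```python
import subprocess


def clone_at_sha_commands(url, sha, dest):
    """Build the git commands that check out one exact commit into dest."""
    return [
        ["git", "init", dest],
        ["git", "-C", dest, "remote", "add", "origin", url],
        # Fetch only the pinned commit (shallow). This relies on the server
        # allowing a fetch by commit SHA.
        ["git", "-C", dest, "fetch", "--depth", "1", "origin", sha],
        # FETCH_HEAD now names that commit; check it out (detached HEAD).
        ["git", "-C", dest, "checkout", "FETCH_HEAD"],
    ]


def clone_at_sha(url, sha, dest):
    """Run the commands in order, stopping at the first failure."""
    for command in clone_at_sha_commands(url, sha, dest):
        subprocess.run(command, check=True)
```

Splitting command construction from execution keeps the sequence easy to unit-test without network access.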
+
+---
+
+## Commit rules
+
+- **Conventional Commits:** `<type>(<scope>): <subject>`
+  - types: `feat`, `fix`, `chore`, `docs`, `test`, `refactor`, `ci`, `perf`
+- Subject is **imperative**, lowercase, **no trailing period**.
+- Body explains **why** when not obvious. Reference plan task IDs when implementing a specific plan step.
+- Frequent small commits; one logical change per commit.
+- **No agent attribution** in commit message or body. See rule 1.
+- Don't `git push --force` to `master` / `main` unless the user explicitly says so.
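
The subject-line rules above can be approximated with a small checker. This is a sketch under assumptions: `check_subject` is a hypothetical name, the scope is treated as optional (this commit's own title, `docs: revamp README + ...`, has none), "lowercase" is read as "starts with a lowercase letter", and imperative mood is not checked because it cannot be verified mechanically.

```python
import re

ALLOWED_TYPES = ("feat", "fix", "chore", "docs", "test", "refactor", "ci", "perf")

# <type>(<scope>): <subject>, with the scope part optional (assumption).
SUBJECT_RE = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?: (?P<subject>.+)$"
)


def check_subject(line):
    """Return a list of problems with a commit subject line (empty if fine)."""
    match = SUBJECT_RE.match(line)
    if match is None:
        return ["does not match <type>(<scope>): <subject>"]
    problems = []
    if match["type"] not in ALLOWED_TYPES:
        problems.append("unknown type " + repr(match["type"]))
    subject = match["subject"]
    if subject[0].isupper():
        problems.append("subject should start lowercase")
    if subject.endswith("."):
        problems.append("subject should not end with a period")
    return problems
```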
+
+---
+
+## Verification rules
+
+- **Tests must pass before committing.** `python -m pytest tests/ -q` from the project root. Default skips GPU markers.
+- **Ruff must be clean.** `ruff check . && ruff format --check .` (both no-op).
+- **Smoke import after backend / app edits.** `python -c "import app; b = app.build_app(); print(type(b).__name__)"` should print `Blocks` without traceback. Catches most syntax / import-cycle issues without spinning up the full server.
+- **For UI changes:** start the local server (`python app.py` → http://127.0.0.1:7860), click through the affected mode, verify visually. Don't trust a green test run + clean ruff as proof the UI works: the test suite doesn't render Gradio Blocks.
+- **For deployment changes:** push to HF Space, watch the build stage transitions (`BUILDING` → `APP_STARTING` → `RUNNING`), verify the runtime stage hits `RUNNING` before claiming success.
+
+If a change requires breaking these rules, write the reason in the commit body.
+
+---
+
+## Testing conventions
+
+- **TDD per the plan.** Failing test first, then implementation.
+- **L1**: unit tests on pure Python (mode registry, parameterize_fn, walker, extractor). Runs in CI without GPU.
+- **L2**: graph validation against the bundled ComfyUI (`pytest --comfy-real`). Optional; runs locally + nightly.
+- **L3**: backend smoke tests with real workflow JSONs but stubbed HTTP / filesystem boundaries. Runs in CI without GPU.
+- **L4**: HF Space smoke. Manual click-through on the live Space after each deploy.
+- **No mocks for ComfyUI core.** Tests run against real workflow JSONs. Stub only HTTP boundaries (HF Hub) and filesystem (use `tmp_path` and the `fake_hf_cache` fixture).
+- `pyproject.toml` declares the `gpu` marker; pass `--gpu` to opt into GPU smoke.
+
+---
+
+## Out of scope (v1: don't add without asking)
+
+The spec at `docs/superpowers/specs/2026-04-30-ltx23-aio-generator-design.md` § 11 calls these out as v1.1+. If you find yourself "while I'm here"-ing into one of them, stop.
+
+- **Lite mode** (`LTX23_AIO_LITE=1`) for the free HF Spaces tier
+- **Custom LoRA** add/remove rows (Power-Lora-Loader clone)
+- **GGUF Q4 transformer** / "Low VRAM" preset (currently always BF16-served)
+- **Auto-launch the user's external ComfyUI** (`LTX23_AIO_COMFYUI_URL`)
+- **Multi-prompt queueing**
+- **Output history persistence** across sessions
+- **Visual regression tests** for the Gradio UI
+- **Property-based / fuzz testing** of workflow parameters
+- **Persistent Storage add-on** integration (see `docs/future_improvements.md` item 6)
+- **Telemetry-driven duration estimator** (requires persistent storage)
+
+If a feature you're adding requires one of these as a sub-step, **ask the user.**
+
+---
+
+## When you're not sure
+
+1. Read `docs/superpowers/specs/2026-04-30-ltx23-aio-generator-design.md`; that's the architectural source of truth.
+2. Read `docs/superpowers/plans/2026-04-30-ltx23-aio-generator.md` for the task-by-task breakdown.
+3. Read `SKILLS.md` for process rules, debugging patterns, deployment workflow, useful one-liners.
+4. Read `CLAUDE.md` for the gotchas catalogue and what-not-to-do.
+5. `git log --oneline`: every non-obvious decision has a fix-commit explaining the reasoning.
+6. **Ask the user.** A clarifying question costs the user ten seconds. A wrong implementation costs everyone an hour.
README.md
CHANGED
@@ -21,42 +21,211 @@ preload_from_hub:
 - google/gemma-3-12b-it-qat-q4_0-unquantized gemma-3-12b-it/model-00001-of-00005.safetensors,gemma-3-12b-it/model-00002-of-00005.safetensors,gemma-3-12b-it/model-00003-of-00005.safetensors,gemma-3-12b-it/model-00004-of-00005.safetensors,gemma-3-12b-it/model-00005-of-00005.safetensors,gemma-3-12b-it/model.safetensors.index.json,gemma-3-12b-it/preprocessor_config.json,gemma-3-12b-it/tokenizer.model
 ---

-# LTX 2.3

-A Gradio app

-

-
-2. **Audio → Video** (Text + Audio → Video + Audio)
-3. **Image → Video** (+ optional Audio)
-4. **Lipsync** (Image + Audio → Video + Audio)
-5. **First / Last Frame → Video** (keyframe interpolation)
-6. **Style Transfer** (Video → Video, motion control)

-

-

 ```bash
-git clone --recurse-submodules https://github.com/
 cd ltx2.3-AIO-generator
-bash setup.sh
 source .venv/bin/activate
-python app.py
 ```

-The first run

-## HF Spaces

-This repo is a Gradio Space. The

 ```bash
-git remote add space https://huggingface.co/spaces/<your-handle>/
-git push space main
 ```

 ## License

-MIT for the AIO app code
 - google/gemma-3-12b-it-qat-q4_0-unquantized gemma-3-12b-it/model-00001-of-00005.safetensors,gemma-3-12b-it/model-00002-of-00005.safetensors,gemma-3-12b-it/model-00003-of-00005.safetensors,gemma-3-12b-it/model-00004-of-00005.safetensors,gemma-3-12b-it/model-00005-of-00005.safetensors,gemma-3-12b-it/model.safetensors.index.json,gemma-3-12b-it/preprocessor_config.json,gemma-3-12b-it/tokenizer.model
 ---

+# LTX 2.3 Studio

+A single-process Gradio app that wraps [LTX-2.3](https://huggingface.co/Lightricks/LTX-2.3), Lightricks' open 22B video generation model, under one focused UI. Six modes (text · image · audio · lipsync · keyframe · style) sharing the same ComfyUI All-In-One workflow. Runs locally on Apple Silicon (MPS) or NVIDIA (CUDA), deploys to Hugging Face Spaces (ZeroGPU).

+[](https://huggingface.co/spaces/techfreakworm/LTX2.3-Studio)
+[](https://github.com/techfreakworm/ltx2.3-AIO-generator/stargazers)
+[](LICENSE)
+[](pyproject.toml)
+[](https://github.com/comfyanonymous/ComfyUI)
+[](https://huggingface.co/Lightricks/LTX-2.3)

+**Live demo:** https://huggingface.co/spaces/techfreakworm/LTX2.3-Studio

+---
+
+## What's inside
+
+Six modes wired through the same ComfyUI All-In-One workflow. Each mode exposes only the inputs it actually consumes, so the form stays short and focused.
+
+| Mode | Inputs | Output | Notes |
+|---|---|---|---|
+| **Text → Video** | Prompt (+ optional audio prompt) | mp4 (+ optional wav) | The core mode. Camera-control LoRAs auto-applied by keyword. |
+| **Audio → Video** | Prompt + audio track | mp4 with the input audio preserved | Conditions motion on the audio waveform. |
+| **Image → Video** | Image + prompt | mp4 (+ optional audio) | Image-conditioned generation. |
+| **Lipsync** | Image + audio | mp4 with audio | Viseme-aligned mouth motion. |
+| **Keyframe** | First + last frames + prompt | mp4 | Latent interpolation between two anchors. |
+| **Style Transfer** | Source video + style image | mp4 | IC-LoRA restyle; motion preserved from source. |
+
+Every mode carries **Fast / Balanced / Quality** presets (steps × 1, × 1.5, × 3). A per-mode ZeroGPU duration estimator adapts the call timeout to the requested workload.
+
+---

+## Quick start (local)
+
+Requires **Python 3.11**, ~80 GB free disk for the weight set, and ~24 GB VRAM (CUDA) or ~32 GB unified memory (Apple Silicon).

 ```bash
+git clone --recurse-submodules https://github.com/techfreakworm/ltx2.3-AIO-generator
 cd ltx2.3-AIO-generator
+bash setup.sh  # creates .venv, installs ComfyUI + pinned custom nodes + app deps
 source .venv/bin/activate
+python app.py  # http://127.0.0.1:7860
 ```

+The first run resolves model weights into your HF cache (`~/.cache/huggingface/hub/`) and symlinks them into `comfyui/models/<comfy_type>/`. Subsequent starts skip the download. Expect ~70 GB of weights pulled on a cold first run.
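
The symlink step described above could look roughly like the sketch below. It is an illustration only: `link_into_comfy` is a hypothetical name (the repo's helper is referred to elsewhere as `models.symlink_hf_cache_to_comfy_layout`, whose signature is not shown), and refusing to overwrite a real file is a design choice made here, not documented behaviour.

```python
from pathlib import Path


def link_into_comfy(cached_file, comfy_models_dir, comfy_type):
    """Symlink one cached weight file into <comfy_models_dir>/<comfy_type>/."""
    cached_file = Path(cached_file)
    # Keep the human-readable file name: entries in an HF cache snapshot are
    # themselves links to hash-named blobs, so take the name before resolving.
    name = cached_file.name
    source = cached_file.resolve()

    target_dir = Path(comfy_models_dir) / comfy_type
    target_dir.mkdir(parents=True, exist_ok=True)
    link = target_dir / name

    if link.is_symlink():
        link.unlink()  # refresh a stale link left by an earlier run
    elif link.exists():
        raise FileExistsError(str(link) + " is a real file; refusing to replace it")
    link.symlink_to(source)
    return link
```

Because only a link is created, the weights exist once on disk and the HF cache remains the single copy.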
+
+**Apple Silicon notes.** `PYTORCH_ENABLE_MPS_FALLBACK=1` is set automatically so the few MPS-unsupported ops fall back to CPU. ComfyUI's VRAM autodetect picks the right tier; override with `LTX23_AIO_VRAM=lowvram|normalvram|highvram` if you need to force one.
+
+**LAN access** (phone / tablet on the same WiFi): `python app.py` binds `0.0.0.0:7860`. Visit `http://<your-LAN-IP>:7860` from another device. On macOS, allow inbound for `python` in System Settings → Network → Firewall if the connection refuses.

+## Quick start (HF Spaces)

+This repo is a Gradio Space. The Pro tier provides ZeroGPU (A10G) access and the per-call duration budget needed for the Balanced and Quality presets.

 ```bash
+git remote add space https://huggingface.co/spaces/<your-handle>/LTX2.3-Studio
+git push space master:main  # local branch is master; HF Space deploys from main
 ```

+> The refspec `master:main` matters. The local default branch is `master` (GitHub convention); the HF Space deploys from `main`. A bare `git push space master` creates an orphan remote branch that does NOT trigger a deploy.
+
+The Space's `preload_from_hub` directive (see the YAML at the top of this file) bakes ~111 GB of weights into the build image. `app.py:_bootstrap()` then:
+
+1. Clones ComfyUI + pinned custom nodes into `~/comfyui` on cold start (ZeroGPU container freezes preserve them across calls)
+2. Mirrors the read-only preload cache into `~/hf-cache-rw/`, which works around the build-user-vs-runtime-user permissions trap (preloaded files are root-owned; we run as uid 1000 and can't write to them, so any lazy download to the cache would fail with `Permission denied`)
+3. Stages seed input files into `comfyui/input/` so workflow loaders don't error before any user upload arrives
+
+Subsequent requests hit warm cache, so there is no network traffic on inference 2+.
+
+**ZeroGPU duration estimator.** Each generate call carries a dynamic `@spaces.GPU(duration=N)` calculated from mode, preset, and frame count. Clamped at `[60, 900] s`. On timeout (`"GPU task aborted"`), the handler auto-retries once at 2× duration.
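
To make the clamping concrete, here is a minimal sketch of an estimator of this shape. Only the `[60, 900] s` clamp and the preset step multipliers (× 1, × 1.5, × 3) come from this document; the per-frame costs, the cold-cache buffer, the mode keys, and the assumption that duration scales linearly with steps and frames are all invented placeholders, and `estimate_duration` is a hypothetical name.

```python
MIN_DURATION_S = 60
MAX_DURATION_S = 900

# Step multipliers per preset, as listed in the presets description.
PRESET_STEP_MULTIPLIER = {"fast": 1.0, "balanced": 1.5, "quality": 3.0}

# Placeholder per-frame cost in seconds at the Fast preset (made-up numbers).
SECONDS_PER_FRAME = {"t2v": 1.0, "i2v": 1.0, "a2v": 1.5}

# Placeholder allowance for loading models on a cold cache (made-up number).
COLD_CACHE_BUFFER_S = 90


def estimate_duration(mode, preset, frames, cold_cache=False):
    """Estimate a per-call GPU duration in seconds, clamped to [60, 900]."""
    estimate = SECONDS_PER_FRAME[mode] * frames * PRESET_STEP_MULTIPLIER[preset]
    if cold_cache:
        estimate += COLD_CACHE_BUFFER_S
    return int(min(MAX_DURATION_S, max(MIN_DURATION_S, estimate)))
```

Short jobs are raised to the 60 s floor and very long ones are capped at 900 s, so the value passed as `duration` always stays inside the documented window.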
+
+---
+
+## Architecture
+
+```
+                 ┌──────────────────────────────────┐
+    browser ──▶  │ app.py: Gradio Blocks            │
+                 │ header · drawer · 6 mode tabs    │
+                 └─────────────────┬────────────────┘
+                                   │
+                                   ▼
+                 ┌──────────────────────────────────┐
+                 │ backend.py                       │
+                 │ ComfyUILibraryBackend            │
+                 │ @spaces.GPU(duration=callable)   │
+                 │ calls PromptExecutor directly    │
+                 └─────────────────┬────────────────┘
+                                   │
+  ┌─────────────┬─────────────┬────┴────────────┬───────────────┐
+  ▼             ▼             ▼                 ▼               ▼
+modes.py      models.py     workflow.py       ui.py           tools/
+per-mode      walk + ensure load + patch      per-mode form   extract_modes.py
+parameterize  from HF cache API-format JSON   builders        (regen workflows/)
+                                   │
+                                   ▼
+                 ┌──────────────────────────────────┐
+                 │ comfyui/                         │
+                 │ submodule (local)                │
+                 │ runtime clone at ~/comfyui       │
+                 │ on HF Spaces                     │
+                 │                                  │
+                 │ ├── custom_nodes/ (pinned SHAs)  │
+                 │ └── models/ → HF cache symlinks  │
+                 └──────────────────────────────────┘
+```
+
+**One backend, one process.** The `@spaces.GPU` decorator is the only divergence between local and Spaces runtime. ComfyUI manages VRAM via its tiered presets; no `empty_cache()` sprinkling needed elsewhere.
+
+**Workflow as data.** Each of the six modes is a user-exported API-format JSON in `workflows/`. The mode handler patches a deep-copied template (`modes.parameterize_fn`) and hands it to ComfyUI's `PromptExecutor`. Updating the master workflow is a three-step ritual: edit in the ComfyUI editor → export → `python tools/extract_modes.py --master ... --out workflows`.
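
The "patch a deep-copied template" idea can be sketched as follows. The API-format shape `{node_id: {class_type, inputs}}` is taken from this document; the helper names `is_api_format` and `patch_inputs`, the `(node_id, input_name)` patch keys, and the example node id are assumptions for illustration and are not the repo's `parameterize_fn` or `set_input` signatures.

```python
import copy


def is_api_format(workflow):
    """True if every top-level entry looks like {class_type, inputs}.

    An editor-format export (top-level keys such as "nodes" and "links")
    fails this check.
    """
    return bool(workflow) and all(
        isinstance(node, dict) and "class_type" in node and "inputs" in node
        for node in workflow.values()
    )


def patch_inputs(template, patches):
    """Return a deep copy of template with selected leaf inputs overridden.

    patches maps (node_id, input_name) to the new value. Only inputs that
    already exist may be patched, so the workflow's shape never changes.
    """
    if not is_api_format(template):
        raise ValueError("expected an API-format workflow")
    workflow = copy.deepcopy(template)
    for (node_id, name), value in patches.items():
        inputs = workflow[node_id]["inputs"]
        if name not in inputs:
            raise KeyError("node " + node_id + " has no input " + repr(name))
        inputs[name] = value
    return workflow
```

Because the template is copied before patching, the loaded JSON can be reused across requests without one call's prompt leaking into the next.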
+
+---
+
+## Project layout
+
+```
+.
+├── app.py                      # Gradio Blocks entry, _bootstrap, _on_generate, mode tabs
+├── backend.py                  # ComfyUILibraryBackend, @spaces.GPU, duration estimator
+├── modes.py                    # MODE_REGISTRY + per-mode parameterize_fn + node-id constants
+├── models.py                   # MODEL_REGISTRY, walk_workflow_for_models, ensure_models
+├── ui.py                       # render_status, _render_idle, mode-form layout primitives
+├── workflow.py                 # load_template, set_input helpers
+├── workflows/                  # API-format mode JSONs (do not hand-edit)
+│   ├── t2v.json
+│   ├── i2v.json
+│   ├── a2v.json
+│   ├── lipsync.json
+│   ├── keyframe.json
+│   └── style.json
+├── assets/seed_inputs/         # placeholder image / audio / video for cold-start staging
+├── tools/
+│   └── extract_modes.py        # regenerate workflows/ from a master ComfyUI export
+├── docs/
+│   ├── future_improvements.md
+│   └── superpowers/{specs,plans}/  # spec + implementation plans per feature
+├── tests/                      # L1 + L3 in CI; L2 with --comfy-real; L4 GPU smoke
+├── README.md                   # this file (HF Space YAML + project intro)
+├── CLAUDE.md                   # project facts + gotchas (what & why)
+├── AGENTS.md                   # tool-agnostic agent rulebook
+├── SKILLS.md                   # process / debugging / deployment (how)
+├── requirements.txt            # pinned deps
+├── pyproject.toml              # ruff + pytest config (py311)
+├── setup.sh                    # venv + ComfyUI + custom nodes bootstrap
+└── comfyui/                    # git submodule (local) / runtime clone target (Spaces)
+```
+
+---
+
+## Tech stack
+
+- **[Gradio 5.50](https://gradio.app/)**: UI shell, native components, `gr.Progress(track_tqdm=True)`
+- **[ComfyUI](https://github.com/comfyanonymous/ComfyUI)**: library-mode `PromptExecutor` (pinned commit; submodule locally, runtime-cloned on Spaces)
+- **[LTX-2.3 22B](https://huggingface.co/Lightricks/LTX-2.3)** by Lightricks: primary diffusion transformer (BF16 weights via [Kijai/LTX2.3_comfy](https://huggingface.co/Kijai/LTX2.3_comfy))
+- **[Gemma 3 12B](https://huggingface.co/google/gemma-3-12b-it)** by Google: multimodal text encoder (requires the full 5-shard model; text-only checkpoints crash on meta-tensor allocation in SDPA)
+- **Custom nodes** (pinned SHAs in `app.CUSTOM_NODES_PINNED`):
+  - [Lightricks/ComfyUI-LTXVideo](https://github.com/Lightricks/ComfyUI-LTXVideo): LTX sampler / decoder nodes
+  - [kijai/ComfyUI-KJNodes](https://github.com/kijai/ComfyUI-KJNodes): utility nodes
+  - [rgthree/rgthree-comfy](https://github.com/rgthree/rgthree-comfy): Power-Lora-Loader
+  - [Kosinkadink/ComfyUI-VideoHelperSuite](https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite): video I/O
+  - [pythongosssss/ComfyUI-Custom-Scripts](https://github.com/pythongosssss/ComfyUI-Custom-Scripts): string / dict helpers
+  - [city96/ComfyUI-GGUF](https://github.com/city96/ComfyUI-GGUF): GGUF transformer loader
+  - [Fannovel16/comfyui_controlnet_aux](https://github.com/Fannovel16/comfyui_controlnet_aux): DWPose for Lipsync/Style preprocessors
+  - [evanspearman/ComfyMath](https://github.com/evanspearman/ComfyMath): math nodes for the workflow's keyframe path
+  - [Smirnov75/ComfyUI-mxToolkit](https://github.com/Smirnov75/ComfyUI-mxToolkit): utility nodes
+  - [DoctorDiffusion/ComfyUI-MediaMixer](https://github.com/DoctorDiffusion/ComfyUI-MediaMixer): `FinalFrameSelector`
+- **[HF Spaces ZeroGPU](https://huggingface.co/zero-gpu)** (A10G): `@spaces.GPU(duration=…)` for queue-priority signalling and per-call timeout
+
+---
+
+## Design
+
+Theme: **Topaz Cinema Slate** (slate substrate `#1A1F26`, warm amber accent `#E0A458` used sparingly, IBM Plex Sans throughout). Defined as `_TOPAZ_THEME` + `_CUSTOM_CSS` in `app.py`.
+
+Layout: hamburger drawer. Pinned 220 px sidebar at ≥1024 px (mode buttons + model status + settings); below 1024 px it slides in as a fixed overlay via the `.aio-shell.drawer-open` class. The header carries a live mode tag (T2V/A2V/I2V/LIPSYNC/KEY/STYLE) updated by JS without a server round-trip.
+
+Spec, plan, and design rationale live under `docs/superpowers/specs/` and `docs/superpowers/plans/`.
+
+---
+
+## Notes on running
+
+- **First inference is slow.** Cold-start workflow validation + model load on the active node graph takes ~30-90 s. Subsequent calls within the same session reuse loaded models.
+- **VRAM tier** is auto-detected; override with `LTX23_AIO_VRAM=lowvram|normalvram|highvram`.
+- **ZeroGPU duration cap.** The per-call estimator clamps to `[60, 900] s`. If a generation aborts with `"GPU task aborted"`, the handler retries once at 2× duration. The duration field is the queue-priority signal, not a billing cap.
+- **Output directory.** Local: `comfyui/output/LTX2.3/`. Spaces: `~/comfyui/output/LTX2.3/`. Both are whitelisted via `allowed_paths=` on launch (Gradio 5 file-access policy).
+- **Local LAN testing.** Bound to `0.0.0.0:7860`. macOS firewall: allow inbound for `python` if a connection from your phone refuses.
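
The retry-once behaviour in the ZeroGPU note above can be sketched like this. It is a hedged illustration, not the handler in `app._on_generate`: the wrapper name, the `run(duration_s)` callable shape, and the assumption that the abort surfaces as a `RuntimeError` whose message contains `"GPU task aborted"` are all hypothetical.

```python
def run_with_one_retry(run, duration_s):
    """Call run(duration_s); on a queue-eviction error, retry once at 2x.

    Any other error, and a failure of the retry itself, propagates unchanged.
    """
    try:
        return run(duration_s)
    except RuntimeError as exc:
        # Assumption: the eviction signal is recognisable from the message.
        if "GPU task aborted" not in str(exc):
            raise
    return run(duration_s * 2)
```

Limiting this to a single retry keeps a genuinely oversized request from looping in the queue.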
+
+---
+
 ## License

+MIT for the AIO app code (see `LICENSE`).
+
+- [ComfyUI](https://github.com/comfyanonymous/ComfyUI) is GPL-3.0.
+- LTX-2.3 and Lightricks-published LoRAs / auxiliaries retain Lightricks' open-source licensing; see the individual model cards on Hugging Face.
+- Gemma 3 weights are subject to Google's [Gemma Terms of Use](https://ai.google.dev/gemma/terms).
+- Each pinned custom node retains its own license; see the linked repositories.
+
+## Credits
+
+- **LTX-2.3** by [Lightricks](https://github.com/Lightricks)
+- **ComfyUI** by [comfyanonymous](https://github.com/comfyanonymous)
+- **Gemma 3** by [Google DeepMind](https://github.com/google-deepmind)
+- **All-In-One ComfyUI workflow** that this app wraps, by [Danielle Falco](https://www.youtube.com/@FutuTek) (FutuTek)
+- **Workflow nodes** by Lightricks, [kijai](https://github.com/kijai), [rgthree](https://github.com/rgthree), [Kosinkadink](https://github.com/Kosinkadink), [pythongosssss](https://github.com/pythongosssss), [city96](https://github.com/city96), [Fannovel16](https://github.com/Fannovel16), [evanspearman](https://github.com/evanspearman), [Smirnov75](https://github.com/Smirnov75), [DoctorDiffusion](https://github.com/DoctorDiffusion)
+
+Built by [@techfreakworm](https://huggingface.co/techfreakworm). Drop a ♥ on the [Space](https://huggingface.co/spaces/techfreakworm/LTX2.3-Studio) if it's useful, and follow there for what's next.
SKILLS.md
CHANGED
@@ -1,8 +1,14 @@
-#

-Process rules and habits for

-

 ---

@@ -152,12 +158,15 @@ lsof -nP -iTCP:7860 -sTCP:LISTEN | awk 'NR>1 {print $2}' | xargs -r kill -9
 ### Two remotes

 ```bash
-git push origin master
-
-git push "https://techfreakworm:${HF_TOKEN}@huggingface.co/spaces/techfreakworm/LTX2.3-Studio" master:main
 ```

-

 ### When to push

@@ -261,9 +270,10 @@ Do not loop on patches when you've patched twice and it's still broken.
 │   └── future_improvements.md
 ├── tools/extract_modes.py      # regenerate workflows/ from master
 ├── tests/
-├── README.md                   # HF Space YAML + project
-├──
-├──
 ├── requirements.txt
 └── comfyui/                    # git submodule (local) / runtime clone target (Spaces)
 ```
+# SKILLS.md: how to make changes in this project

+Process rules and habits for agents working on this repo. Sits alongside:

+- `AGENTS.md`: the tool-agnostic rulebook (locked decisions, out-of-scope list, commit + verification rules).
+- `CLAUDE.md`: Claude-specific extensions + full gotchas catalogue (*what & why*).
+- `README.md`: public-facing intro (different audience).
+
+This file is the *how*: debugging patterns, verification habits, deployment workflow, useful one-liners.
+
+> **Default rule when in doubt:** stop and ask the user. The user prefers a question over wrong work.

 ---


 ### Two remotes

 ```bash
+git push origin master       # GitHub: techfreakworm/ltx2.3-AIO-generator
+git push space master:main   # HF Space: techfreakworm/LTX2.3-Studio (deploys from main)
 ```

+The repo has both remotes pre-configured (`origin` + `space`). HF credentials live in `~/.cache/huggingface/token`; git's credential helper picks them up automatically, so there is no need to embed the token in the URL.
+
+> **Refspec matters for the Space push.** Local default branch is `master`; the HF Space deploys from `main`. A bare `git push space master` succeeds but creates an orphan `refs/heads/master` on the remote that does NOT trigger a deploy; the Space silently stays on the old build. Always push with the `master:main` refspec form.
+
+If unsure, verify with `git ls-remote space`; `HEAD` should point at `refs/heads/main`.

 ### When to push


 │   └── future_improvements.md
 ├── tools/extract_modes.py      # regenerate workflows/ from master
 ├── tests/
+├── README.md                   # HF Space YAML + project intro (public-facing)
+├── AGENTS.md                   # tool-agnostic agent rulebook (locked decisions, OoS)
+├── CLAUDE.md                   # what & why: full gotchas catalogue
+├── SKILLS.md                   # how: process, debugging, deployment (this file)
 ├── requirements.txt
 └── comfyui/                    # git submodule (local) / runtime clone target (Spaces)
 ```