techfreakworm committed on
Commit
723293f
·
unverified ·
1 Parent(s): 55921ac

docs: design spec for v1 — Gradio + bundled ComfyUI library backend

Decision log: 6 modes, preset+accordion settings, ComfyUI as headless backend
in library mode, six mode-specific JSON templates, categorized LoRA chrome,
sidebar nav layout, bundled ComfyUI (submodule local + runtime clone Spaces),
HF cache symlinks local + lazy /data on Spaces, Pro tier deploy.

.gitignore ADDED
@@ -0,0 +1,45 @@
+ # Superpowers brainstorming session artifacts
+ .superpowers/
+
+ # Python
+ .venv/
+ venv/
+ __pycache__/
+ *.pyc
+ *.pyo
+ *.egg-info/
+ .pytest_cache/
+ .mypy_cache/
+ .ruff_cache/
+
+ # Models (downloaded to HF cache, never to repo)
+ models/
+ checkpoints/
+ *.safetensors
+ *.gguf
+
+ # Outputs
+ outputs/
+ generated/
+ *.mp4
+ *.wav
+ *.webm
+ !demo/**/*.mp4
+ !demo/**/*.wav
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # IDE
+ .vscode/
+ .idea/
+
+ # Env
+ .env
+ .env.local
+ *.log
+
+ # Gradio cache
+ gradio_cached_examples/
+ flagged/
CLAUDE.md ADDED
@@ -0,0 +1,62 @@
1
+ # Claude / Agent Working Notes — ltx2.3-AIO-generator
2
+
3
+ Project guidelines for AI assistants working in this repo.
4
+
5
+ ## Git authorship
6
+
7
+ Mayank Gupta is the **sole author** on every commit. Never:
8
+ - Append a `Co-Authored-By: Claude ...` trailer.
9
+ - Set `--author` to anything other than the user's existing git config.
10
+ - Add "Generated with Claude Code", "🤖", or any similar attribution lines to commit messages.
11
+ - Add similar attribution to PR descriptions.
12
+
13
+ If asked to amend or re-commit, strip any prior Claude attribution.
14
+
15
+ ## Project at a glance
16
+
17
+ Gradio app wrapping the existing ComfyUI LTX 2.3 All-In-One workflow. Same code runs locally (Apple Silicon MPS or NVIDIA CUDA) and deploys to Hugging Face Spaces (ZeroGPU, Pro tier).
18
+
19
+ **Key architectural facts (do not relitigate):**
20
+
21
+ 1. **Backend is ComfyUI in library mode**, always. We do not call `ltx-pipelines` directly. We call `execution.PromptExecutor` (ComfyUI's top-level `execution.py` module) with workflow JSONs we parameterize. ComfyUI is bundled (git submodule locally, runtime clone on Spaces).
22
+ 2. **Six mode-specific workflow JSON files** in `workflows/`. They are derived from the master workflow at `~/Projects/comfyui/user/default/workflows/1. LTX 2.3 All-In-One 260406-05.json` via `tools/extract_modes.py`. Do not hand-edit the JSON files unless re-extracting from a new master.
23
+ 3. **Models live in HF cache (local) or `/data` (Spaces)**, never in this repo. `comfyui/models/` contains symlinks (local) or downloaded files (Spaces). Do not commit any `*.safetensors` / `*.gguf`.
24
+ 4. **Library mode means single-process.** No subprocess for ComfyUI. The `@spaces.GPU` decorator is the only difference between local and Spaces runtime.
25
+ 5. **VRAM management is ComfyUI's job.** Don't write `torch.cuda.empty_cache()` calls outside the `try/finally` in `backend.py`. Don't second-guess ComfyUI's offload tiers.
26
+
27
+ See `docs/superpowers/specs/2026-04-30-ltx23-aio-generator-design.md` for the full design.
28
+
29
+ ## Coding conventions
30
+
31
+ - **Python 3.11.** No `match` statements (compatibility with the Spaces Python pin).
32
+ - **Flat structure.** No `src/` layout, no nested packages. Each top-level `.py` is one module with one job.
33
+ - **No conda.** Use `python3.11 -m venv .venv`. Use `brew` for system binaries.
34
+ - **HF cache, not project-local.** Use `hf download <repo>` (the `hf` CLI, not deprecated `huggingface-cli`) without `--local-dir`. Symlink resolved snapshot paths.
35
+ - **No mocks for ComfyUI.** Tests against real workflow JSONs. Stubs only for HTTP / filesystem boundaries.
36
+ - **No emojis** in code or commit messages unless explicitly requested.
37
+ - **Comments only when WHY is non-obvious.** Don't narrate WHAT.
38
+
39
+ ## Editing the master workflow
40
+
41
+ When the user updates `~/Projects/comfyui/user/default/workflows/1. LTX 2.3 All-In-One 260406-05.json` (e.g., new LoRA, tweaked sampler), re-run:
42
+
+ ```bash
+ python tools/extract_modes.py --master ~/Projects/comfyui/user/default/workflows/"1. LTX 2.3 All-In-One 260406-05.json"
+ ```
46
+
47
+ This regenerates all six `workflows/<mode>.json` files. L2 graph-validation tests will catch any node that became invalid.
48
+
49
+ ## Out of scope (do not implement without asking)
50
+
51
+ - "Lite mode" for free HF Spaces tier (`LTX23_AIO_LITE=1`).
52
+ - Custom LoRA add/remove rows (Power-Lora-Loader clone).
53
+ - GGUF Q4 transformer / "Low VRAM" preset.
54
+ - Auto-launch of user's external ComfyUI install (`LTX23_AIO_COMFYUI_URL`).
55
+ - Multi-prompt queueing.
56
+ - Output history persistence across sessions.
57
+
58
+ These are documented as v1.1+ in the spec. Do not pre-build them.
59
+
60
+ ## When in doubt
61
+
62
+ Read `docs/superpowers/specs/2026-04-30-ltx23-aio-generator-design.md`. If still unclear, ask before changing architectural shape.
docs/superpowers/specs/2026-04-30-ltx23-aio-generator-design.md ADDED
@@ -0,0 +1,483 @@
1
+ # LTX 2.3 All-In-One Generator — Design Spec
2
+
3
+ **Date:** 2026-04-30
4
+ **Status:** Design approved, awaiting implementation plan
5
+ **Repo:** `~/Projects/llm/ltx2.3-AIO-generator`
6
+
7
+ ## 1. Overview
8
+
9
+ A Gradio app that wraps the existing ComfyUI LTX 2.3 All-In-One workflow into a polished, mode-specific UI. Same code runs locally on Apple Silicon (MPS) or NVIDIA (CUDA) and deploys to Hugging Face Spaces with ZeroGPU. The Gradio frontend is a thin layer; ComfyUI is the inference engine — bundled and called as a Python library — so all of ComfyUI's smart model management, MPS handling, and node correctness are inherited rather than reimplemented.
10
+
11
+ Six generation modes ship in v1, mirroring the groups in `1. LTX 2.3 All-In-One 260406-05.json`:
12
+
13
+ | # | Mode | LTX-2 pipeline class |
14
+ |---|---|---|
15
+ | 1 | Text → Video (+optional Audio) | `TI2VidTwoStagesPipeline` / `DistilledPipeline` |
16
+ | 2 | Audio → Video (Text + Audio → Video + Audio) | `A2VidPipelineTwoStage` |
17
+ | 3 | Image → Video (+optional Audio) | `TI2VidTwoStagesPipeline` |
18
+ | 4 | Lipsync (Image + Audio → Video + Audio) | `A2VidPipelineTwoStage` |
19
+ | 5 | First / Last Frame → Video | `KeyframeInterpolationPipeline` |
20
+ | 6 | Style Transfer (Video → Video, motion control) | `ICLoraPipeline` |
21
+
22
+ ## 2. Decisions log (Q1–Q8 + path)
23
+
24
+ | # | Question | Decision | Rationale |
25
+ |---|---|---|---|
26
+ | Q1 | Modes scope | All 6 | Marginal cost per mode is small; the differentiator vs other Gradio LTX demos is the unified shell. |
27
+ | Q2 | Settings exposure | Preset (Fast/Balanced/Quality) + Advanced accordion | Clean Spaces demo without sacrificing local power-user control. |
28
+ | Q3 | Backend | ComfyUI as headless backend (library mode) | ComfyUI is the production path on MPS; pure-Python `ltx-pipelines` has known crashes (TI2Vid OOM, A2Vid stage 2 SIGUSR1). Re-using ComfyUI's path inherits the fixes. |
29
+ | Q4 | Workflow templates | Six mode-specific JSON files | Smaller diff surface, easier tests, evolves per mode. `tools/extract_modes.py` regenerates them from the master workflow. |
30
+ | Q5 | LoRA UI | Categorized chrome (Camera dropdown · Detailer toggle · IC-LoRA mode-specific) | Mode-aware, no rope to misconfigure. Custom LoRA escape hatch deferred to v1.1. |
31
+ | Q6 | Layout shell | Sidebar nav + 2-column body | Six tab labels are too wide horizontally; sidebar gives mode names room and accommodates global panels. |
32
+ | Q7 | ComfyUI install | Bundled (git submodule locally, runtime clone on Spaces) | Self-contained, no dependence on user's existing ComfyUI install. |
33
+ | Q8 | Model storage | Local: HF cache → symlinks. Spaces: lazy `hf_hub_download` to `/data`. | Honors HF cache preference; no duplicate downloads; lazy strategy keeps Spaces `/data` budget under control. |
34
+ | Path | Spaces tier | Path A — Pro tier | ~70 GB minimum model footprint exceeds free tier `/data`; Balanced preset needs longer per-call duration. |
35
+
36
+ ## 3. Architecture
37
+
38
+ ```
39
+ ┌────────────────────────────────────────────────────────────────┐
40
+ │ Gradio UI (sidebar nav · 2-col body · per-mode inputs) │
41
+ │ ─ Mode tabs: T2V · A2V · I2V · Lipsync · Keyframe · Style │
42
+ │ ─ Categorized LoRA chrome inside each mode's Advanced ▾ │
43
+ │ ─ Models / Settings / History panels in sidebar │
44
+ └────────────────────────────────┬───────────────────────────────┘
45
+ │ parameterize 1 of 6 templates
46
+
47
+ ┌────────────────────────────────────────────────────────────────┐
48
+ │ Workflow Builder (workflows/<mode>.json + UI parameters) │
49
+ │ ─ load_template(mode) → patch nodes → return JSON │
50
+ │ ─ Validates inputs against the mode's required nodes │
51
+ └────────────────────────────────┬───────────────────────────────┘
52
+ │ workflow JSON dict
53
+
54
+ ┌────────────────────────────────────────────────────────────────┐
55
+ │ Backend (single impl) ComfyUILibraryBackend │
56
+ │ ─ comfy.execution.PromptExecutor.execute(workflow) │
57
+ │ ─ Hooks comfy.utils.PROGRESS_BAR_HOOK → yields ProgressEvent │
58
+ │ ─ On Spaces: wrapped in @spaces.GPU(duration=N) │
59
+ │ ─ Locally: runs in a worker thread, GIL-released by torch │
60
+ └────────────────────────────────┬───────────────────────────────┘
61
+ │ progress events + outputs
62
+
63
+ ┌────────────────────────────────────────────────────────────────┐
64
+ │ Bundled ComfyUI (vendored as a git submodule) │
65
+ │ ─ ComfyUI core + ComfyUI-LTXVideo + KJNodes + rgthree │
66
+ │ ─ models/ symlinks → ~/.cache/huggingface/hub (local) │
67
+ │ ─ models/ files on /data persistent volume (Spaces) │
68
+ └────────────────────────────────────────────────────────────────┘
69
+ ```
70
+
71
+ ### 3.1 Key invariants
72
+
73
+ 1. **One backend interface, single implementation.** Library mode everywhere (`execution.PromptExecutor`, from ComfyUI's top-level `execution.py`). The `@spaces.GPU` decorator is the only divergence between local and Spaces.
74
+ 2. **Workflow JSON is the contract.** Six small templates, parameterized at the leaves only. We don't reinvent ComfyUI's node graph.
75
+ 3. **Models are never owned by the AIO repo.** Always either symlinked from HF cache (local) or downloaded to `/data` (Spaces). The bundled ComfyUI's `models/` is purely a view onto the cache.
76
+ 4. **Auto MPS/CUDA dispatch.** The bundled ComfyUI handles device selection and dtype casting. The AIO layer writes no device code.
77
+
78
+ ## 4. File structure
79
+
80
+ ```
81
+ ltx2.3-AIO-generator/
82
+ ├── app.py # Gradio entry — sidebar nav, mode rendering, generate handler
83
+ ├── backend.py # ComfyUI library backend; PromptExecutor wrapper; progress streaming
84
+ ├── workflow.py # load + parameterize a workflow JSON template
85
+ ├── modes.py # MODE_REGISTRY: 6 modes × (inputs, defaults, parameterize fn)
86
+ ├── models.py # symlink HF cache (local) / hf_hub_download to /data (Spaces)
87
+ ├── ui.py # reusable Gradio components: LoRA chrome, preset bar, status banner
88
+ ├── workflows/ # six mode-specific JSON templates (≤50 nodes each)
89
+ │ ├── t2v.json
90
+ │ ├── a2v.json
91
+ │ ├── i2v.json
92
+ │ ├── lipsync.json
93
+ │ ├── keyframe.json
94
+ │ └── style.json
95
+ ├── tools/
96
+ │ ├── extract_modes.py # rebuild templates from your master workflow
97
+ │ └── refresh_models.py # refresh HF cache symlinks if snapshot SHAs change
98
+ ├── tests/
99
+ │ ├── conftest.py
100
+ │ ├── test_workflow.py
101
+ │ └── test_modes.py
102
+ ├── comfyui/ # git submodule pinned to a known-good ComfyUI commit
103
+ ├── setup.sh # init submodule, venv, install reqs, symlink models
104
+ ├── requirements.txt # gradio, spaces, huggingface-hub, torch, comfyui's own reqs
105
+ ├── README.md # incl. HF Space front matter for one-touch deploy
106
+ ├── CLAUDE.md # project guidelines (incl. sole-author commit rule)
107
+ └── .gitignore
108
+ ```
109
+
110
+ ### 4.1 Module responsibilities
111
+
112
+ | File | Responsibility | LOC est. |
113
+ |---|---|---|
114
+ | `app.py` | Gradio Blocks; sidebar navigation; per-mode input forms; calls `backend.submit()` | ~400 |
115
+ | `backend.py` | One class `ComfyUILibraryBackend`. Constructor adds `comfyui/` to `sys.path`, loads custom nodes, instantiates `PromptExecutor`. `submit(workflow)` is an async generator yielding `ProgressEvent`s. Handles ZeroGPU detection: wraps `_execute()` in `@spaces.GPU` when the `SPACES_ZERO_GPU` env var is set. | ~200 |
116
+ | `workflow.py` | `load_template(mode)`, `set_input(workflow, node_id, field, value)`, `validate(workflow)`. Pure functions over dicts. | ~120 |
117
+ | `modes.py` | One `Mode` dataclass (name, icon, input_specs, parameterize_fn). `MODE_REGISTRY = {"t2v": Mode(...), ...}`. The `parameterize_fn` is the only mode-specific code. | ~300 |
118
+ | `models.py` | `ensure_models_for_mode(mode)`: walks the mode's workflow, finds loader nodes, identifies HF repo+filename, downloads via `hf_hub_download`, symlinks into `comfyui/models/...`. On Spaces, downloads to `/data`. | ~150 |
119
+ | `ui.py` | `lora_chrome(mode)` returns the categorized LoRA component group. `preset_bar()` returns the Fast/Balanced/Quality radio. `status_banner()` returns the `gr.HTML` for progress + stage text. | ~200 |
120
+
121
+ Total app code (excluding ComfyUI submodule and workflow JSONs): **~1,400 LOC** across 6 modules.
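The `workflow.py` row describes pure functions over dicts. A minimal sketch of what `set_input` and `validate` might look like follows, assuming the templates are stored in ComfyUI's API ("prompt") format, where each node id maps to a `class_type` and an `inputs` dict and a link is written as `[source_node_id, output_index]`. The two-node template, the empty loader inputs, and the prompt text are illustrative placeholders, not taken from the real workflows.

```python
import copy
from typing import Any

Workflow = dict[str, dict[str, Any]]


def set_input(workflow: Workflow, node_id: str, field: str, value: Any) -> Workflow:
    """Return a copy of `workflow` with one leaf input replaced."""
    if node_id not in workflow:
        raise KeyError(f"unknown node id: {node_id}")
    if field not in workflow[node_id]["inputs"]:
        raise KeyError(f"node {node_id} has no input named {field!r}")
    patched = copy.deepcopy(workflow)  # keep the function pure
    patched[node_id]["inputs"][field] = value
    return patched


def validate(workflow: Workflow) -> list[str]:
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    for node_id, node in workflow.items():
        if "class_type" not in node:
            problems.append(f"node {node_id}: missing class_type")
        for field, value in node.get("inputs", {}).items():
            # Heuristic: a two-element list starting with a string is a link.
            is_link = isinstance(value, list) and len(value) == 2 and isinstance(value[0], str)
            if is_link and value[0] not in workflow:
                problems.append(f"node {node_id}.{field}: dangling link to {value[0]}")
    return problems


# Illustrative template; real templates live in workflows/<mode>.json.
template: Workflow = {
    "1": {"class_type": "CLIPTextEncode", "inputs": {"text": "", "clip": ["2", 0]}},
    "2": {"class_type": "LTXVGemmaCLIPModelLoader", "inputs": {}},
}
patches = [("1", "text", "a red fox running through snow")]

wf = template
for node_id, field, value in patches:
    wf = set_input(wf, node_id, field, value)
```

Returning a patched copy rather than mutating in place keeps the loaded template reusable across requests and makes the L1 unit tests trivial to write.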
122
+
123
+ ### 4.2 ComfyUI submodule + custom nodes
124
+
125
+ Pinned at a known-good commit. Custom nodes installed during `setup.sh` (local) or during runtime bootstrap (Spaces):
126
+
127
+ - `Lightricks/ComfyUI-LTXVideo` (LTX node implementations: `LTXICLoRALoaderModelOnly`, `LTXVChunkFeedForward`, `LTXVGemmaCLIPModelLoader`)
128
+ - `kijai/ComfyUI-KJNodes` (`VAELoaderKJ`, `ResizeImageMaskNode`, `INTConstant`, GetNode/SetNode helpers)
129
+ - `rgthree/rgthree-comfy` (`Power Lora Loader`, `Any Switch`, `Fast Groups Bypasser`, `Label`)
130
+ - `Kosinkadink/ComfyUI-VideoHelperSuite` (`VHS_VideoCombine`, `VHS_LoadVideo`, `VHS_LoadAudioUpload`)
131
+ - `pythongosssss/ComfyUI-Custom-Scripts` (`MathExpression|pysssss` — used by the master workflow for derived dimensions)
132
+
133
+ ## 5. Data flow
134
+
135
+ User clicks **Generate** in the I2V tab. The path:
136
+
137
+ ```
138
+ [1] app.py: on_generate(mode="i2v", **inputs)
139
+ │ Pulls Mode("i2v") from MODE_REGISTRY
140
+
141
+ [2] modes.i2v.parameterize_fn(inputs) → list[(node_id, field, value)]
142
+
143
+ [3] workflow.load_template("i2v") → dict
144
+ workflow.set_input(wf, *patch) for each patch
145
+ workflow.validate(wf)
146
+
147
+ [4] models.ensure_models_for_mode(wf)
148
+ yields DownloadEvent(filename, mb_done, mb_total)
149
+
150
+ [5] backend.submit(wf) — async generator
151
+ On Spaces: wrapped in @spaces.GPU(duration=preset_budget)
152
+ Calls comfy.execution.PromptExecutor.execute(wf)
153
+
154
+ [6] PromptExecutor walks node graph
155
+ Per-node: yields ProgressEvent(stage, step, total_steps)
156
+
157
+ [7] app.py: async for event in backend.submit(...):
158
+ status_banner.html = render(event)
159
+
160
+ [8] Final node (VHS_VideoCombine) writes /tmp/out_<ts>.mp4
161
+ yields OutputEvent(path)
162
+
163
+ [9] Gradio video component renders the file
164
+ History panel adds row: timestamp · seed · duration
165
+ ```
166
+
167
+ ### 5.1 Three event types
168
+
+ ```python
+ from dataclasses import dataclass
+ from typing import Optional
+
+ @dataclass
+ class DownloadEvent: filename: str; mb_done: float; mb_total: float
+ @dataclass
+ class ProgressEvent: stage: int; stage_label: str; step: int; total_steps: int
+ @dataclass
+ class OutputEvent: video_path: str; audio_path: Optional[str]; meta: dict
+ ```
177
+
178
+ The Gradio handler is one async generator that consumes these and yields `(status_html, video, history)` tuples. Failures arrive as a fourth type, `ErrorEvent`, shown in section 8.1.
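The consuming side can be sketched as follows. This is a hedged illustration rather than the planned `app.py` handler: the event classes are restated so the block runs on its own, `fake_backend` is a hypothetical stand-in for `backend.submit()`, and the status strings are plain text where the real banner would be HTML.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, Optional, Union


@dataclass
class DownloadEvent:
    filename: str
    mb_done: float
    mb_total: float


@dataclass
class ProgressEvent:
    stage: int
    stage_label: str
    step: int
    total_steps: int


@dataclass
class OutputEvent:
    video_path: str
    audio_path: Optional[str] = None
    meta: dict = field(default_factory=dict)


Event = Union[DownloadEvent, ProgressEvent, OutputEvent]


async def fake_backend() -> AsyncIterator[Event]:
    """Hypothetical stand-in for backend.submit(); emits a canned sequence."""
    yield DownloadEvent("model.safetensors", 10.0, 20.0)
    yield ProgressEvent(stage=4, stage_label="Diffusion", step=18, total_steps=30)
    yield OutputEvent(video_path="/tmp/out_0.mp4", meta={"seed": 1})


async def on_generate(events: AsyncIterator[Event]):
    """Translate backend events into (status, video, history) tuples."""
    history: list = []
    async for ev in events:
        if isinstance(ev, DownloadEvent):
            status = f"Downloading {ev.filename}: {ev.mb_done:.0f}/{ev.mb_total:.0f} MB"
            yield status, None, list(history)
        elif isinstance(ev, ProgressEvent):
            status = f"Stage {ev.stage} · {ev.stage_label} · step {ev.step}/{ev.total_steps}"
            yield status, None, list(history)
        else:
            history.append(ev.meta)
            yield "Done", ev.video_path, list(history)


async def _collect():
    return [update async for update in on_generate(fake_backend())]


updates = asyncio.run(_collect())
```

Each yielded tuple lines up with one Gradio output update, so the video slot stays `None` until the `OutputEvent` arrives.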
179
+
180
+ ### 5.2 Cancellation
181
+
182
+ A Stop button wired with Gradio's `Button.click(..., cancels=[generate_event])` cancels the running generate event and calls `backend.interrupt()` → `comfy.model_management.interrupt_current_processing()`. The async generator's `finally:` block frees GPU memory before the exception propagates.
183
+
184
+ ## 6. Model loading & VRAM management
185
+
186
+ ComfyUI's `comfy.model_management` handles the heavy lifting — we write zero code for it.
187
+
188
+ **Inherited from ComfyUI:**
189
+ - Smart offload tiers (tracks total/free VRAM continuously; offloads largest non-live model when next load would overflow).
190
+ - Per-node load via `ModelPatcher`; LoRA patching applies deltas in-place without double-loading the base model.
191
+ - Automatic device dispatch and dtype casting (BF16/FP16/FP8 per `--force-*` args).
192
+ - ComfyUI-LTXVideo's existing MPS edge-case handling.
193
+
194
+ **AIO layer adds:**
195
+
196
+ | Concern | Implementation |
197
+ |---|---|
198
+ | Pre-flight download | `models.ensure_models_for_mode(wf)` walks loader nodes, resolves filenames via a `MODEL_REGISTRY` map, downloads via `hf_hub_download`, symlinks into `comfyui/models/<type>/<name>`. |
199
+ | VRAM tier hint | `comfy.cli_args.args.lowvram\|normalvram\|highvram` set at backend init based on detected GPU memory. Override via env var `LTX23_AIO_VRAM`. |
200
+ | Memory status badge | `ui.status_banner()` polls `comfy.model_management.get_free_memory()` every 2 s while idle. |
201
+ | Manual unload | Sidebar button **Unload all models** → `unload_all_models()` + `empty_cache()`. |
202
+ | Inter-mode caching | Single in-process ComfyUI keeps loaded models warm across mode switches. Free for us — ComfyUI's cache does it. |
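The local half of the pre-flight row (symlinking a cached snapshot file into `comfyui/models/<type>/<name>`) could look roughly like the sketch below. It assumes the snapshot path has already been obtained (for example from `hf_hub_download`, which returns the cached file path); `link_model` and the temporary directory layout are hypothetical names used only for this illustration.

```python
import tempfile
from pathlib import Path


def link_model(snapshot_file: Path, models_dir: Path, model_type: str) -> Path:
    """Symlink one cached file into <models_dir>/<model_type>/<name>."""
    target_dir = models_dir / model_type
    target_dir.mkdir(parents=True, exist_ok=True)
    link = target_dir / snapshot_file.name
    if link.is_symlink() or link.exists():
        link.unlink()  # replace a stale link, e.g. after a snapshot SHA change
    link.symlink_to(snapshot_file.resolve())
    return link


# Demonstration against a throwaway directory that mimics the HF cache.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    cached = root / "hub" / "snapshots" / "abc123" / "ltx.safetensors"
    cached.parent.mkdir(parents=True)
    cached.write_bytes(b"placeholder")

    link = link_model(cached, root / "comfyui" / "models", "checkpoints")
    link_is_symlink = link.is_symlink()
    link_points_at_cache = link.resolve() == cached.resolve()

    # Calling it again must not fail: the old link is replaced.
    relinked = link_model(cached, root / "comfyui" / "models", "checkpoints")
    relink_ok = relinked.read_bytes() == b"placeholder"
```

Replacing an existing link instead of failing is what lets `tools/refresh_models.py` be re-run safely.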
203
+
204
+ ### 6.1 Memory math (BF16)
205
+
206
+ | Component | Size | Loaded when |
207
+ |---|---|---|
208
+ | Distilled 22B transformer | ~44 GB | Diffusion stages |
209
+ | Gemma 3 12B text encoder | ~24 GB | Prompt encoding |
210
+ | Video VAE | ~2 GB | Encode (i2v/keyframe) + final decode |
211
+ | Audio VAE | ~0.5 GB | A2V/Lipsync only |
212
+ | LoRAs | <1 GB each | Patched into transformer |
213
+ | Latents | ~3 GB at 512×768/81f | Diffusion |
214
+
215
+ Realistic peak resident: ~70 GB on MPS unified memory; ~45 GB GPU + 24 GB system RAM on H200 80 GB ZeroGPU.
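The weight sizes in the table follow from two bytes per parameter in BF16. A small sketch of that arithmetic, using the table's own figures for the non-weight components, is below; the sum is an upper bound with every component resident at once, which is somewhat above the ~70 GB realistic peak quoted in the text.

```python
BYTES_PER_PARAM_BF16 = 2


def weights_gb(params_billion: float) -> float:
    """Weight size in GB (10^9 bytes) for a model stored in BF16."""
    return params_billion * 1e9 * BYTES_PER_PARAM_BF16 / 1e9


components_gb = {
    "transformer": weights_gb(22),   # 22B parameters
    "text_encoder": weights_gb(12),  # Gemma 3 12B
    "video_vae": 2.0,                # from the table
    "audio_vae": 0.5,                # from the table
    "latents": 3.0,                  # 512x768, 81 frames, from the table
}
total_gb = sum(components_gb.values())
```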
216
+
217
+ ### 6.2 Out-of-scope (v1.1)
218
+
219
+ `UnetLoaderGGUF` for <24 GB consumer NVIDIA GPUs. The workflow templates already accommodate the GGUF node; v1.1 adds a "Low VRAM" preset that swaps the loader.
220
+
221
+ ## 7. Progress reporting
222
+
223
+ Two surfaces, layered:
224
+
225
+ ```
226
+ ┌── Status Banner (gr.HTML) ────────────────────────────────────┐
227
+ │ ⠋ Stage 4/6 · Diffusion (Stage 1) │
228
+ │ Step 18/30 · 1m 12s elapsed · ~2m 41s remaining │
229
+ │ MPS · 47 / 128 GB · transformer + gemma resident │
230
+ │ ████████████████░░░░░░░░░░░░░░ 60% │
231
+ └────────────────────────────────────────────────────────────────┘
232
+ ```
233
+
234
+ Below: a `gr.Progress(track_tqdm=True)` picks up ComfyUI's sampler tqdm bars natively.
235
+
236
+ ### 7.1 Stage map per mode
237
+
238
+ For each mode, `modes.py` declares the stage list mapping ComfyUI node ids → human-readable stage labels.
239
+
240
+ I2V Balanced preset stage map:
241
+
242
+ | # | Stage | ComfyUI node(s) | Typical share |
243
+ |---|---|---|---|
244
+ | 1 | Download missing models | (pre-flight) | 0–60s, only on first run |
245
+ | 2 | Encode prompt | `LTXVGemmaCLIPModelLoader` + `CLIPTextEncode` | ~5% |
246
+ | 3 | Encode image | `LoadImage` + image VAE encode | ~3% |
247
+ | 4 | Diffusion (Stage 1, half-res) | `KSampler` × N steps | ~55% |
248
+ | 5 | Spatial upscale (×2) | `LatentUpscaleModelLoader` + sampler | ~7% |
249
+ | 6 | Diffusion (Stage 2, full-res, 4 distilled steps) | `KSampler` × 4 | ~20% |
250
+ | 7 | Decode video | Video VAE decode + `VHS_VideoCombine` | ~10% |
251
+
252
+ T2V is shorter (no image encode); Lipsync adds audio encode; Style Transfer is single-stage.
253
+
254
+ ### 7.2 Plumbing
255
+
256
+ ComfyUI's `PromptExecutor` calls a per-node hook before each node runs. The backend translates `node_id → stage_index` via the mode's stage map. Within sampler nodes, `comfy.utils.PROGRESS_BAR_HOOK` fires per step. ETA: `(elapsed / progress) - elapsed`, where `progress` is the completed fraction in (0, 1], clamped to a sensible minimum.
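A minimal sketch of the two pieces of plumbing, the stage lookup and the ETA formula, is shown here. The node ids in `STAGE_BY_NODE` and the one-second floor are assumptions made for illustration; real ids would come from each mode's workflow JSON.

```python
from typing import Optional

# Hypothetical node ids mapped to stage indices for one mode.
STAGE_BY_NODE = {"12": 2, "20": 3, "31": 4, "40": 5, "44": 6, "50": 7}


def stage_for_node(node_id: str, current_stage: int) -> int:
    """Map a node id to its stage; unknown nodes keep the current stage."""
    return STAGE_BY_NODE.get(node_id, current_stage)


def eta_seconds(elapsed: float, progress: float, floor: float = 1.0) -> Optional[float]:
    """Remaining time as (elapsed / progress) - elapsed, never below `floor`."""
    if progress <= 0.0:
        return None  # nothing to extrapolate from yet
    if progress >= 1.0:
        return 0.0
    return max(floor, elapsed / progress - elapsed)
```

For example, 72 s elapsed at 60% overall progress extrapolates to a 120 s total, leaving about 48 s.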
257
+
258
+ ## 8. Error handling
259
+
260
+ | # | Category | Surface | Recovery |
261
+ |---|---|---|---|
262
+ | 1 | Setup / install (`comfyui/` missing, custom node import failure, no torch CUDA/MPS) | Startup banner replaces the UI; red card with the failing component and exact `setup.sh` command. App refuses to start. | Local: `bash setup.sh`. Spaces: surfaces in build log. |
263
+ | 2 | Model download (network, HF auth, disk full) | Status banner inline error with retry button. Auth errors prompt for `HF_TOKEN`. | Auto-retry once with backoff for transient. Auth/disk are user-actionable. |
264
+ | 3 | Workflow validation (input not provided, frame count not 8k+1, resolution not /32, image too large) | Caught client-side; Gradio inline validation; generate button disabled. | Auto-snap where unambiguous (frame count to nearest 8k+1, resolution to nearest /32). |
265
+ | 4 | ComfyUI execution (node not found, shape mismatch, file format) | Status banner shows failing stage in red; collapsible `View full traceback ▾`. | Suggests `tools/refresh_models.py` for symlink issues, `bash setup.sh --update-comfy` for node issues. |
266
+ | 5 | OOM | Status banner with stage + memory at failure; **Try Fast preset** button. | On catch: `unload_all_models()` + `empty_cache()`. Next click starts clean. |
267
+ | 6 | ZeroGPU duration exceeded (Spaces) | Status banner: "Generation exceeded GPU budget"; suggests **Switch to Fast preset**. Partial output (if decoded) still shown. | `@spaces.GPU(duration=N)` raises a specific exception we catch and translate. |
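The auto-snap recovery in row 3 can be sketched with integer arithmetic. The function names are hypothetical, and the choices to round ties upward and to enforce a minimum of 9 frames and 32 pixels are assumptions, since the spec only says "nearest".

```python
def snap_frames(n: int) -> int:
    """Nearest frame count of the form 8k + 1 with k >= 1; ties round up."""
    k = max(1, (n + 3) // 8)  # same as floor((n - 1) / 8 + 0.5)
    return 8 * k + 1


def snap_dim(px: int) -> int:
    """Nearest multiple of 32, at least 32; ties round up."""
    return 32 * max(1, (px + 16) // 32)
```

Integer division avoids the round-half-to-even behaviour of Python's built-in `round`, so the result is predictable at the midpoints.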
268
+
269
+ ### 8.1 try/finally discipline
270
+
+ ```python
+ import comfy.model_management
+ import spaces
+ import torch
+
+ async def submit(self, workflow):
+     try:
+         async for event in self._execute_with_progress(workflow):
+             yield event
+     except torch.cuda.OutOfMemoryError as e:
+         yield ErrorEvent(category="oom", stage=self._current_stage, ...)
+     # Exception name is an assumption; check the installed `spaces` package.
+     except spaces.exceptions.GPUDurationExceededError as e:
+         yield ErrorEvent(category="zerogpu_timeout", ...)
+     except Exception as e:
+         yield ErrorEvent(category="execution", traceback=fmt(e), ...)
+     finally:
+         comfy.model_management.unload_all_models()
+         # Release cached allocator memory on whichever device is in use.
+         if torch.backends.mps.is_available():
+             torch.mps.empty_cache()
+         elif torch.cuda.is_available():
+             torch.cuda.empty_cache()
+ ```
286
+
287
+ The `finally` block is the single most important line for VRAM hygiene. Cancellation takes the same path: `interrupt_current_processing()` sets a flag, and ComfyUI raises `comfy.model_management.InterruptProcessingException` at its next interrupt check.
288
+
289
+ ### 8.2 Logging
290
+
291
+ - Local: `comfyui/comfyui.log` + `logs/aio.log` (10 MB rotation).
292
+ - Spaces: stderr → Space logs panel; no file logging (Space disk is ephemeral except `/data`).
293
+ - Status banner's traceback expander reads the last error from `logs/aio.log` (local) or stderr buffer (Spaces).
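For the local case, the 10 MB rotation could be set up with the standard library's `RotatingFileHandler`. The sketch below is one possible arrangement: the logger name `aio`, the `backupCount=3`, and the log format are assumptions, and the demo writes to a temporary directory rather than `logs/`.

```python
import logging
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path


def make_logger(log_dir: Path, max_bytes: int = 10 * 1024 * 1024) -> logging.Logger:
    """Logger writing to <log_dir>/aio.log, rotated once it reaches `max_bytes`."""
    log_dir.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger("aio")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking duplicate handlers on re-import
        handler = RotatingFileHandler(
            log_dir / "aio.log", maxBytes=max_bytes, backupCount=3
        )
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


log_dir = Path(tempfile.mkdtemp()) / "logs"
log = make_logger(log_dir)
log.info("backend initialised")
for h in log.handlers:
    h.flush()
```

On Spaces the same logger would get a `StreamHandler` on stderr instead, since only `/data` survives restarts.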
294
+
295
+ ### 8.3 Deliberate non-goals
296
+
297
+ No silent retries on ambiguous errors. Surface loudly with a traceback rather than mask real bugs.
298
+
299
+ ## 9. Deployment
300
+
301
+ ### 9.1 Local
302
+
+ ```bash
+ git clone https://github.com/<your-handle>/ltx2.3-AIO-generator
+ cd ltx2.3-AIO-generator
+ bash setup.sh
+ source .venv/bin/activate
+ python app.py
+ ```
310
+
311
+ `setup.sh` (idempotent):
312
+
+ ```bash
+ #!/usr/bin/env bash
+ set -euo pipefail
+
+ python3.11 -m venv .venv
+ source .venv/bin/activate
+ pip install -U pip
+
+ git submodule update --init --recursive
+ pip install -r comfyui/requirements.txt
+
+ cd comfyui/custom_nodes
+ for repo in \
+     Lightricks/ComfyUI-LTXVideo \
+     kijai/ComfyUI-KJNodes \
+     rgthree/rgthree-comfy \
+     Kosinkadink/ComfyUI-VideoHelperSuite \
+     pythongosssss/ComfyUI-Custom-Scripts ; do
+   name="${repo##*/}"
+   [[ -d "$name" ]] || git clone "https://github.com/$repo.git" "$name"
+   [[ -f "$name/requirements.txt" ]] && pip install -r "$name/requirements.txt"
+ done
+ cd ../..
+
+ pip install -r requirements.txt
+ python tools/refresh_models.py
+
+ echo "Setup complete. Run: source .venv/bin/activate && python app.py"
+ ```
342
+
343
+ ### 9.2 HF Spaces (ZeroGPU, Pro tier)
344
+
345
+ `README.md` front matter:
346
+
+ ```yaml
+ ---
+ title: LTX 2.3 All-in-One Video Generator
+ emoji: 🎬
+ colorFrom: purple
+ colorTo: blue
+ sdk: gradio
+ sdk_version: "5.0"
+ app_file: app.py
+ python_version: "3.11"
+ suggested_hardware: zero-gpu
+ hf_oauth: false
+ ---
+ ```
361
+
362
+ Bootstrap inside `app.py` runs once on cold start:
363
+
+ ```python
+ import os
+ import pathlib
+ import sys
+
+ # COMFYUI_REPO, COMFYUI_COMMIT, CUSTOM_NODES_PINNED, _git_clone and
+ # _pip_install_custom_node_reqs are assumed to be defined elsewhere in app.py.
+ def _bootstrap():
+     on_spaces = bool(os.environ.get("SPACES_ZERO_GPU"))
+     comfy_dir = pathlib.Path("/data/comfyui" if on_spaces else "comfyui")
+
+     if on_spaces and not comfy_dir.exists():
+         _git_clone(COMFYUI_REPO, comfy_dir, ref=COMFYUI_COMMIT)
+         for node_repo, node_ref in CUSTOM_NODES_PINNED:
+             _git_clone(node_repo, comfy_dir / "custom_nodes" / node_repo.split("/")[-1], ref=node_ref)
+         _pip_install_custom_node_reqs(comfy_dir)
+
+     sys.path.insert(0, str(comfy_dir))
+     os.environ["COMFY_MODELS_DIR"] = str(
+         pathlib.Path("/data/models") if on_spaces else (comfy_dir / "models")
+     )
+ ```
380
+
381
+ Storage budget: `/data` ~50 GB on Pro. Lazy per-mode download keeps usage under budget when only some modes are exercised.
382
+
383
+ Per-call duration: `@spaces.GPU(duration=...)` per preset:
384
+
385
+ | Preset | Duration |
386
+ |---|---|
387
+ | Fast | 60 s |
388
+ | Balanced | 120 s |
389
+ | Quality | 300 s |
390
+
391
+ UI auto-greys out presets whose duration exceeds the detected `SPACES_GPU_DURATION_LIMIT`.
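The preset filtering described above might be sketched as follows. The durations come from the table; `SPACES_GPU_DURATION_LIMIT` is the variable name used in this spec, while the 120 s fallback and the helper names are assumptions for illustration.

```python
PRESET_DURATION_S = {"Fast": 60, "Balanced": 120, "Quality": 300}


def duration_limit(env: dict, default_s: int = 120) -> int:
    """Read the per-call limit in seconds; `default_s` is an assumed fallback."""
    return int(env.get("SPACES_GPU_DURATION_LIMIT", default_s))


def available_presets(limit_s: int) -> list:
    """Presets whose GPU budget fits within the per-call limit."""
    return [name for name, secs in PRESET_DURATION_S.items() if secs <= limit_s]


# In the app this would be called with os.environ.
enabled = available_presets(duration_limit({"SPACES_GPU_DURATION_LIMIT": "120"}))
```

Presets missing from `enabled` are the ones the UI would grey out.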
392
+
393
+ ### 9.3 One-touch deploy (optional)
394
+
395
+ `.github/workflows/deploy-space.yml`:
396
+
+ ```yaml
+ on: { push: { branches: [main] } }
+ jobs:
+   push-to-space:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+         with:
+           fetch-depth: 0  # full history; a shallow clone cannot be pushed to the Space
+           lfs: true
+       - name: Push to HF Space
+         env:
+           HF_TOKEN: ${{ secrets.HF_TOKEN }}
+         run: |
+           git remote add space https://user:$HF_TOKEN@huggingface.co/spaces/<you>/ltx2.3-aio
+           git push --force space main
+ ```
411
+
412
+ ### 9.4 Local vs Spaces — what's identical, what differs
413
+
414
+ | Concern | Local | Spaces |
415
+ |---|---|---|
416
+ | Backend code | `ComfyUILibraryBackend` | `ComfyUILibraryBackend` (same class) |
417
+ | GPU decorator | none (worker thread) | `@spaces.GPU(duration=preset_budget)` |
418
+ | ComfyUI install | git submodule | runtime git clone to `/data` |
419
+ | Models location | symlinks → `~/.cache/huggingface` | direct files in `/data/models` |
420
+ | Logging | `logs/aio.log` + `comfyui/comfyui.log` | stderr → Space logs panel |
421
+ | First-run latency | seconds (deps installed by setup.sh) | minutes (clone + first-mode download) |
422
+ | Custom nodes update | re-run `setup.sh` | push commit; rebuild Space |
423
+
424
+ ## 10. Testing
425
+
426
+ Layered so most tests run on CPU in seconds; only smoke touches GPU.
427
+
428
+ | Layer | What it verifies | GPU? | Time |
429
+ |---|---|---|---|
430
+ | L1 — Unit | `workflow.load_template`, `set_input`, `validate` (pure functions over JSON dicts) | No | < 1 s |
431
+ | L1 — Unit | Each mode's `parameterize_fn`: known input → expected patch list | No | < 1 s |
432
+ | L1 — Unit | `MODEL_REGISTRY` lookups: every model in every workflow resolves to an HF repo+filename | No | < 1 s |
433
+ | L2 — Graph validation | `load_template + parameterize_fn(canonical_inputs)` produces a workflow that ComfyUI's `validate_prompt` accepts | No | < 5 s |
434
+ | L3 — Integration (CPU) | `models.ensure_models_for_mode()` against a fake HF cache; symlinks created correctly | No | < 2 s |
435
+ | L4 — Smoke (GPU, opt-in) | One end-to-end generation per mode at minimum viable settings (Fast preset, lowest legal resolution, 1 step). `pytest --gpu`. | Yes | ~3 min for all 6 |
436
+
437
+ ### 10.1 Fixtures
438
+
439
+ - `canonical_inputs(mode)` — known-good Gradio input dict per mode.
440
+ - `fake_hf_cache(tmp_path)` — fake `~/.cache/huggingface/hub` with placeholder files.
441
+ - `--gpu` flag enables L4. Default skips with a reason.
442
+ - `--comfy-real` flag uses bundled ComfyUI for L2; default uses a stubbed validator.
443
+
444
+ ### 10.2 CI
445
+
446
+ `.github/workflows/ci.yml` runs L1 + L2 + L3 on `ubuntu-latest`, Python 3.11, every push. ~30 s wall time. No GPU runner. Lint: `ruff check` + `ruff format --check`.
447
+
448
+ ### 10.3 Deliberate non-goals
449
+
450
+ - No mocks for ComfyUI itself.
451
+ - No visual regression tests for Gradio UI.
452
+ - No property-based / fuzz testing for workflow params.
453
+
454
+ ## 11. Out of scope (v1)
455
+
456
+ - **Lite mode for free Spaces tier** — `LTX23_AIO_LITE=1` env var that filters MODE_REGISTRY to T2V+I2V, locks Fast preset, swaps GGUF transformer. Designed in but not built in v1.
457
+ - **Custom LoRA escape hatch** — Power-Lora-Loader-style add/remove rows. Categorized chrome covers v1; custom is a v1.1 toggle.
458
+ - **GGUF Q4 transformer (`UnetLoaderGGUF`)** — for <24 GB consumer NVIDIA GPUs. Workflow templates accommodate the node; v1.1 adds the "Low VRAM" preset.
459
+ - **Auto-launch user's existing ComfyUI** — current design uses bundled ComfyUI exclusively. v1.1 could add `LTX23_AIO_COMFYUI_URL` env var to point at an external server.
460
+ - **Multi-prompt queueing** — Gradio default single-shot is fine. ComfyUI's queue isn't exposed.
461
+ - **History persistence across sessions** — sidebar history is in-memory. Local could read `outputs/` on startup; Spaces session storage is ephemeral.
462
+
463
+ ## 12. Open questions / follow-ups
464
+
465
+ - **Pinned ComfyUI commit:** select after a manual end-to-end run on the user's `~/Projects/comfyui/` install. Capture the commit SHA in `setup.sh` and the Spaces bootstrap.
466
+ - **Spaces secrets:** HF Space front matter doesn't include any secrets; `HF_TOKEN` is needed only if a gated repo is used (not currently the case). Document in README.
467
+ - **Output retention on Spaces:** decide whether `/tmp/out_*.mp4` should also copy to `/data/outputs/` for download-after-restart. v1 default: no, ephemeral.
468
+ - **`MODEL_REGISTRY` source of truth:** the registry maps filename → HF repo. We populate it once at v1 from Lightricks' README + Kijai's repo and freeze it; updates require a code change + tests.
469
+
470
+ ## 13. Implementation order (preview — full breakdown in implementation plan)
471
+
472
+ 1. **Repo skeleton** — directory layout, `.gitignore`, `CLAUDE.md`, `README.md` stub, `requirements.txt`.
473
+ 2. **`tools/extract_modes.py`** — extract six mode templates from the master workflow. Validates by re-loading each in ComfyUI's parser.
474
+ 3. **`workflow.py`** — pure-function library with L1 + L2 tests.
475
+ 4. **`modes.py`** — MODE_REGISTRY with `parameterize_fn` per mode + L1 tests.
476
+ 5. **`models.py`** — registry + `ensure_models_for_mode` + L3 tests with fake HF cache.
477
+ 6. **`backend.py`** — ComfyUILibraryBackend, async submit, progress hook plumbing. Local smoke test (L4) for Fast/T2V.
478
+ 7. **`ui.py`** — LoRA chrome, preset bar, status banner.
479
+ 8. **`app.py`** — Gradio Blocks, sidebar nav, mode rendering, generate handler. Manual end-to-end on Mac for all 6 modes.
480
+ 9. **`setup.sh`** — idempotent local bootstrap.
481
+ 10. **`README.md` + Spaces front matter** — push to a test Space, verify cold-start and one Fast generation.
482
+ 11. **CI workflow** — L1 + L2 + L3 on push.
483
+ 12. **Optional `.github/workflows/deploy-space.yml`** — push-to-Space CI.