techfreakworm committed on
Commit
6569772
·
unverified ·
1 Parent(s): 723293f

docs: implementation plan — 27 tasks across 8 phases

docs/superpowers/plans/2026-04-30-ltx23-aio-generator.md ADDED
@@ -0,0 +1,2919 @@
1
+ # LTX 2.3 AIO Generator Implementation Plan
2
+
3
+ > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
4
+
5
+ **Goal:** Build a Gradio app that wraps the existing ComfyUI LTX 2.3 All-In-One workflow into a polished mode-specific UI, runnable locally (MPS/CUDA) and on Hugging Face Spaces (ZeroGPU, Pro tier).
6
+
7
+ **Architecture:** Gradio frontend → workflow JSON parameterizer → bundled ComfyUI in library mode (`comfy.execution.PromptExecutor`). Six mode-specific workflow JSON templates extracted from the master workflow; per-mode `parameterize_fn` translates Gradio inputs into node patches. Same code locally and on Spaces; the only divergence is `@spaces.GPU` decoration and model storage location.
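A minimal sketch of the parameterizer layer described above, assuming a template is a dict with a `nodes` list whose entries carry `id` and `widgets_values`. The node ids (`10`, `11`), the widget slots, and the `set_widget` helper are hypothetical placeholders for illustration, not values taken from the real workflow.

```python
from __future__ import annotations

import copy
from collections.abc import Callable
from dataclasses import dataclass
from typing import Any

Workflow = dict[str, Any]


@dataclass(frozen=True)
class Mode:
    """Hypothetical registry entry: a mode name plus its parameterizer."""

    name: str
    parameterize_fn: Callable[[Workflow, dict[str, Any]], Workflow]


def set_widget(wf: Workflow, node_id: int, index: int, value: Any) -> None:
    """Patch one widget value on the node with the given id."""
    for node in wf["nodes"]:
        if node["id"] == node_id:
            node["widgets_values"][index] = value
            return
    raise KeyError(f"node id {node_id} not found")


def parameterize_t2v(template: Workflow, inputs: dict[str, Any]) -> Workflow:
    """Translate Gradio inputs into node patches on a copy of the template."""
    wf = copy.deepcopy(template)
    set_widget(wf, 10, 0, inputs["prompt"])  # hypothetical prompt node
    set_widget(wf, 11, 0, inputs["seed"])  # hypothetical sampler node
    return wf


MODE_REGISTRY = {"t2v": Mode("t2v", parameterize_t2v)}

# Toy template standing in for workflows/t2v.json.
toy_template: Workflow = {
    "nodes": [
        {"id": 10, "widgets_values": [""]},
        {"id": 11, "widgets_values": [0]},
    ]
}
patched = MODE_REGISTRY["t2v"].parameterize_fn(
    toy_template, {"prompt": "a tiger at dawn", "seed": 42}
)
```

Keeping the patch logic in one function per mode means the backend only ever sees a finished workflow dict, which is what lets the same code path run locally and on Spaces.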
8
+
9
+ **Tech Stack:** Python 3.11, Gradio 5.x, `spaces`, `huggingface_hub`, ComfyUI (vendored as git submodule + runtime clone on Spaces) + custom nodes (`ComfyUI-LTXVideo`, `ComfyUI-KJNodes`, `rgthree-comfy`, `ComfyUI-VideoHelperSuite`, `ComfyUI-Custom-Scripts`), pytest, ruff.
10
+
11
+ **Spec:** `docs/superpowers/specs/2026-04-30-ltx23-aio-generator-design.md`
12
+
13
+ ---
14
+
15
+ ## File Map (locked at plan time)
16
+
17
+ | File | Created by task | LOC est. | Responsibility |
18
+ |---|---|---|---|
19
+ | `requirements.txt` | T1 | 15 | Pin Gradio, spaces, huggingface_hub, torch, ruff, pytest. |
20
+ | `pyproject.toml` | T1 | 30 | Pytest rootdir + ruff config so flat-layout imports resolve. |
21
+ | `setup.sh` | T2 | 50 | Idempotent local bootstrap (venv, submodule, custom nodes, models). |
22
+ | `README.md` | T3 | 80 | Spaces front matter + local quickstart + screenshot placeholders. |
23
+ | `tests/conftest.py` | T4 | 80 | Fixtures: `master_workflow`, `canonical_inputs`, `fake_hf_cache`, CLI flags. |
24
+ | `tools/extract_modes.py` | T6 | 200 | Extract six mode templates from the master workflow JSON. |
25
+ | `workflows/{t2v,a2v,i2v,lipsync,keyframe,style}.json` | T7 | (data) | Six mode templates. |
26
+ | `workflow.py` | T8–T9 | 120 | `load_template`, `set_input`, `validate`. |
27
+ | `modes.py` | T10–T12 | 300 | `Mode` dataclass + `MODE_REGISTRY` (six entries with `parameterize_fn`). |
28
+ | `models.py` | T13–T15 | 150 | `MODEL_REGISTRY`, `ensure_models_for_mode`, symlink/download logic. |
29
+ | `tools/refresh_models.py` | T16 | 30 | CLI wrapper around `models.ensure_models_for_mode` for all modes. |
30
+ | `backend.py` | T17–T20 | 200 | `ComfyUILibraryBackend`, async submit, progress hook, ZeroGPU. |
31
+ | `ui.py` | T21–T23 | 200 | `preset_bar`, `status_banner`, `lora_chrome`. |
32
+ | `app.py` | T24–T26 | 400 | Gradio `Blocks`, sidebar, mode rendering, generate handler. |
33
+ | `.github/workflows/ci.yml` | T27 | 30 | Run L1+L3 tests on push. |
34
+ | `.github/workflows/deploy-space.yml` | T28 | 25 | Optional — push to HF Space on main. |
35
+
36
+ Total: ~1,800 LOC across 14 files (excluding the ComfyUI submodule, workflow JSON data, and tests).
37
+
38
+ ---
39
+
40
+ ## Phase 0 — Foundations
41
+
42
+ ### Task 1: `requirements.txt` and `pyproject.toml`
43
+
44
+ **Files:**
45
+ - Create: `requirements.txt`, `pyproject.toml`
46
+
47
+ - [ ] **Step 1: Create `requirements.txt`**
48
+
49
+ ```text
50
+ gradio>=5.0,<6.0
51
+ spaces>=0.30.0
52
+ huggingface_hub>=0.27.0
53
+ torch>=2.4.0
54
+ torchvision
55
+ torchaudio
56
+ numpy
57
+ Pillow
58
+ einops
59
+ safetensors
60
+ tqdm
61
+
62
+ # Dev / test
63
+ pytest>=8.0
64
+ pytest-asyncio>=0.23
65
+ ruff>=0.5
66
+ ```
67
+
68
+ - [ ] **Step 2: Create `pyproject.toml`** so pytest finds the flat-layout modules and ruff rules are pinned
69
+
70
+ ```toml
71
+ [tool.pytest.ini_options]
+ pythonpath = ["."]
+ markers = [
+     "gpu: marks tests that need a GPU (use --gpu to enable)",
+ ]
+
+ [tool.ruff]
+ line-length = 100
+ target-version = "py311"
+
+ [tool.ruff.lint]
+ select = ["E", "F", "I", "B", "UP"]
+ ignore = ["E501"]  # line length is enforced by formatter, not linter
+
+ [tool.ruff.lint.per-file-ignores]
+ "tests/*" = ["E402"]  # module-level imports that are not at the top of the file are fine in tests
87
+ ```
88
+
89
+ - [ ] **Step 3: Verify both files parse**
90
+
91
+ Run: `python3.11 -m pip install --dry-run -r requirements.txt 2>&1 | head -5`
92
+ Expected: pip resolves package names without "ERROR: Invalid requirement" lines (network errors are fine — we're checking syntax).
93
+
94
+ Run: `python3.11 -c "import tomllib; print(list(tomllib.loads(open('pyproject.toml').read()).keys()))"`
95
+ Expected: `['tool']`
96
+
97
+ - [ ] **Step 4: Commit**
98
+
99
+ ```bash
100
+ git add requirements.txt pyproject.toml
101
+ git commit -m "chore: pin runtime + dev dependencies and configure pytest/ruff"
102
+ ```
103
+
104
+ ---
105
+
106
+ ### Task 2: `setup.sh`
107
+
108
+ **Files:**
109
+ - Create: `setup.sh`
110
+
111
+ - [ ] **Step 1: Write `setup.sh`**
112
+
113
+ ```bash
114
+ #!/usr/bin/env bash
115
+ set -euo pipefail
116
+
117
+ REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
118
+ cd "$REPO_ROOT"
119
+
120
+ echo "▶ Creating Python 3.11 venv"
121
+ python3.11 -m venv .venv
122
+ # shellcheck disable=SC1091
123
+ source .venv/bin/activate
124
+ pip install -U pip wheel
125
+
126
+ echo "▶ Initializing ComfyUI submodule"
127
+ git submodule update --init --recursive
128
+
129
+ echo "▶ Installing ComfyUI core requirements"
130
+ pip install -r comfyui/requirements.txt
131
+
132
+ echo "▶ Installing custom nodes (shallow clones of each default branch)"
133
+ mkdir -p comfyui/custom_nodes
134
+ cd comfyui/custom_nodes
135
+ for repo in \
+     Lightricks/ComfyUI-LTXVideo \
+     kijai/ComfyUI-KJNodes \
+     rgthree/rgthree-comfy \
+     Kosinkadink/ComfyUI-VideoHelperSuite \
+     pythongosssss/ComfyUI-Custom-Scripts ; do
+   name="${repo##*/}"
+   if [[ ! -d "$name" ]]; then
+     git clone --depth 1 "https://github.com/$repo.git" "$name"
+   fi
+   if [[ -f "$name/requirements.txt" ]]; then
+     pip install -r "$name/requirements.txt"
+   fi
+ done
149
+ cd "$REPO_ROOT"
150
+
151
+ echo "▶ Installing AIO app dependencies"
152
+ pip install -r requirements.txt
153
+
154
+ echo "▶ Symlinking models from HF cache"
155
+ python tools/refresh_models.py || true # ok to fail before tools/ exists
156
+
157
+ echo
158
+ echo "✓ Setup complete."
159
+ echo " Activate venv: source .venv/bin/activate"
160
+ echo " Run app: python app.py"
161
+ ```
162
+
163
+ - [ ] **Step 2: Make executable**
164
+
165
+ Run: `chmod +x setup.sh`
166
+ Expected: no output, exit 0.
167
+
168
+ - [ ] **Step 3: Commit**
169
+
170
+ ```bash
171
+ git add setup.sh
172
+ git commit -m "chore: idempotent setup.sh — venv, submodule, custom nodes, models"
173
+ ```
174
+
175
+ ---
176
+
177
+ ### Task 3: `README.md` with Spaces front matter
178
+
179
+ **Files:**
180
+ - Modify: `README.md`
181
+
182
+ - [ ] **Step 1: Replace the placeholder `README.md`**
183
+
184
+ ```markdown
185
+ ---
186
+ title: LTX 2.3 All-in-One Video Generator
187
+ emoji: 🎬
188
+ colorFrom: purple
189
+ colorTo: blue
190
+ sdk: gradio
191
+ sdk_version: "5.0"
192
+ app_file: app.py
193
+ python_version: "3.11"
194
+ suggested_hardware: zero-gpu
195
+ hf_oauth: false
196
+ ---
197
+
198
+ # LTX 2.3 All-in-One Video Generator
199
+
200
+ A Gradio app for [LTX-2.3](https://huggingface.co/Lightricks/LTX-2.3) wrapping all six modes of the official ComfyUI All-In-One workflow under a single, focused UI. Runs locally on Apple Silicon (MPS) or NVIDIA (CUDA), and deploys to Hugging Face Spaces (ZeroGPU).
201
+
202
+ ## Modes
203
+
204
+ 1. **Text → Video** (+ optional Audio)
205
+ 2. **Audio → Video** (Text + Audio → Video + Audio)
206
+ 3. **Image → Video** (+ optional Audio)
207
+ 4. **Lipsync** (Image + Audio → Video + Audio)
208
+ 5. **First / Last Frame → Video** (keyframe interpolation)
209
+ 6. **Style Transfer** (Video → Video, motion control)
210
+
211
+ ## Local quickstart
212
+
213
+ Requires Python 3.11, ~80 GB free disk for model weights, and ~24 GB+ GPU memory (CUDA) or 32 GB+ unified memory (Apple Silicon).
214
+
215
+ ```bash
216
+ git clone --recurse-submodules https://github.com/<your-handle>/ltx2.3-AIO-generator
217
+ cd ltx2.3-AIO-generator
218
+ bash setup.sh
219
+ source .venv/bin/activate
220
+ python app.py
221
+ ```
222
+
223
+ The first run downloads ~70 GB of models into your existing `~/.cache/huggingface/hub` (no duplicate copies in this repo) and symlinks them into `comfyui/models/`.
224
+
225
+ ## HF Spaces deployment
226
+
227
+ This repo is a Gradio Space. It requires the Pro tier, which provides ~50 GB of persistent `/data` storage and the longer per-call ZeroGPU budgets needed for the Balanced and Quality presets.
228
+
229
+ ```bash
230
+ git remote add space https://huggingface.co/spaces/<your-handle>/ltx2.3-aio
231
+ git push space main
232
+ ```
233
+
234
+ ## License
235
+
236
+ MIT for the AIO app code. ComfyUI and LTX-2.3 retain their respective licenses.
237
+ ```
238
+
239
+ - [ ] **Step 2: Commit**
240
+
241
+ ```bash
242
+ git add README.md
243
+ git commit -m "docs: README with Spaces front matter and local quickstart"
244
+ ```
245
+
246
+ ---
247
+
248
+ ### Task 4: `tests/conftest.py` with fixtures
249
+
250
+ **Files:**
251
+ - Create: `tests/__init__.py` (empty)
252
+ - Create: `tests/conftest.py`
253
+
254
+ - [ ] **Step 1: Create `tests/__init__.py`** (empty file)
255
+
256
+ ```bash
257
+ mkdir -p tests
258
+ touch tests/__init__.py
259
+ ```
260
+
261
+ - [ ] **Step 2: Write `tests/conftest.py`**
262
+
263
+ ```python
264
+ """Shared pytest fixtures and CLI flags."""
+ import json
+ import os
+ import pathlib
+ from typing import Any
+
+ import pytest
+
+ REPO_ROOT = pathlib.Path(__file__).resolve().parent.parent
+
+ DEFAULT_MASTER_WORKFLOW = pathlib.Path(
+     os.environ.get(
+         "LTX23_MASTER_WORKFLOW",
+         pathlib.Path.home() / "Projects/comfyui/user/default/workflows"
+         / "1. LTX 2.3 All-In-One 260406-05.json",
+     )
+ )
+
+
+ def pytest_addoption(parser: pytest.Parser) -> None:
+     parser.addoption("--gpu", action="store_true", help="Run L4 GPU smoke tests.")
+     parser.addoption(
+         "--comfy-real",
+         action="store_true",
+         help="Use bundled ComfyUI for L2 graph validation (slower).",
+     )
+
+
+ def pytest_collection_modifyitems(
+     config: pytest.Config, items: list[pytest.Item]
+ ) -> None:
+     if not config.getoption("--gpu"):
+         skip_gpu = pytest.mark.skip(reason="GPU smoke tests skipped (use --gpu)")
+         for item in items:
+             if "gpu" in item.keywords:
+                 item.add_marker(skip_gpu)
+
+
+ @pytest.fixture(scope="session")
+ def master_workflow() -> dict[str, Any]:
+     """The full LTX 2.3 All-In-One workflow JSON (loaded from user's ComfyUI)."""
+     if not DEFAULT_MASTER_WORKFLOW.exists():
+         pytest.skip(
+             f"Master workflow not found at {DEFAULT_MASTER_WORKFLOW}. "
+             "Set LTX23_MASTER_WORKFLOW env var to its path."
+         )
+     return json.loads(DEFAULT_MASTER_WORKFLOW.read_text())
+
+
+ @pytest.fixture
+ def canonical_inputs() -> dict[str, dict[str, Any]]:
+     """Known-good Gradio input dicts per mode (used by L1/L2 tests)."""
+     return {
+         "t2v": {
+             "prompt": "a tiger walking through a misty forest at dawn, cinematic",
+             "negative_prompt": "",
+             "preset": "balanced",
+             "width": 512,
+             "height": 768,
+             "frames": 81,
+             "fps": 24,
+             "seed": 42,
+             "camera_lora": "none",
+             "camera_strength": 0.8,
+             "detailer_on": False,
+             "detailer_strength": 0.5,
+         },
+         "i2v": {
+             "prompt": "the subject turns toward the camera and smiles",
+             "image": "/tmp/portrait.png",
+             "preset": "balanced",
+             "width": 512,
+             "height": 768,
+             "frames": 81,
+             "fps": 24,
+             "seed": 42,
+             "camera_lora": "none",
+             "camera_strength": 0.8,
+             "detailer_on": True,
+             "detailer_strength": 0.5,
+             "ic_lora": "union",
+             "ic_strength": 0.5,
+             "pose_on": False,
+         },
+         "a2v": {
+             "prompt": "a dancer moves to the beat in a neon-lit studio",
+             "audio": "/tmp/track.wav",
+             "preset": "balanced",
+             "width": 512,
+             "height": 768,
+             "frames": 81,
+             "fps": 24,
+             "seed": 42,
+             "audio_cfg": 7.0,
+         },
+         "lipsync": {
+             "prompt": "the person speaks the audio with natural mouth movement",
+             "image": "/tmp/portrait.png",
+             "audio": "/tmp/speech.wav",
+             "preset": "balanced",
+             "image_strength": 0.7,
+             "frames": 81,
+             "fps": 24,
+             "seed": 42,
+         },
+         "keyframe": {
+             "prompt": "smooth transition between the two frames",
+             "first_frame": "/tmp/start.png",
+             "last_frame": "/tmp/end.png",
+             "preset": "balanced",
+             "frames": 81,
+             "fps": 24,
+             "seed": 42,
+         },
+         "style": {
+             "prompt": "in the style of a renaissance oil painting",
+             "input_video": "/tmp/source.mp4",
+             "preset": "balanced",
+             "frames": 81,
+             "fps": 24,
+             "seed": 42,
+             "ic_lora": "motion-track",
+             "ic_strength": 0.5,
+         },
+     }
+
+
+ @pytest.fixture
+ def fake_hf_cache(tmp_path: pathlib.Path) -> pathlib.Path:
+     """A fake ~/.cache/huggingface/hub layout with placeholder files."""
+     hub = tmp_path / "huggingface" / "hub"
+     layouts = {
+         "models--Lightricks--LTX-2.3": [
+             "ltx-2.3-22b-distilled.safetensors",
+             "ltx-2.3-spatial-upscaler-x2-1.0.safetensors",
+             "ltx-2.3-22b-distilled-lora-384.safetensors",
+         ],
+         "models--google--gemma-3-12b-it-qat-q4_0-unquantized": [
+             "model-00001-of-00005.safetensors",
+             "model-00002-of-00005.safetensors",
+             "model-00003-of-00005.safetensors",
+             "model-00004-of-00005.safetensors",
+             "model-00005-of-00005.safetensors",
+             "model.safetensors.index.json",
+             "tokenizer.model",
+             "preprocessor_config.json",
+         ],
+         "models--Kijai--LTX2.3_comfy": [
+             "LTX23_video_vae_bf16.safetensors",
+             "LTX23_audio_vae_bf16.safetensors",
+         ],
+     }
+     for repo, files in layouts.items():
+         snapshot_dir = hub / repo / "snapshots" / "deadbeef"
+         snapshot_dir.mkdir(parents=True, exist_ok=True)
+         for filename in files:
+             (snapshot_dir / filename).write_text("")  # placeholder
+     return hub
423
+ ```
424
+
425
+ - [ ] **Step 3: Verify pytest discovers the conftest**
426
+
427
+ Run: `python3.11 -m pytest tests/ --collect-only 2>&1 | head -20`
428
+ Expected: "no tests ran" or similar — but no errors importing conftest.
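The `fake_hf_cache` fixture mirrors the hub cache layout `models--<org>--<name>/snapshots/<revision>/<file>`. As a sketch of how a lookup against that layout might work, the snippet below builds a throwaway cache of the same shape and resolves a file in it. `repo_dir_name` and `find_cached_file` are hypothetical helpers written for this illustration; they are not part of `huggingface_hub`.

```python
from __future__ import annotations

import pathlib
import tempfile


def repo_dir_name(repo_id: str) -> str:
    """Map 'org/name' to the cache folder name 'models--org--name'."""
    return "models--" + repo_id.replace("/", "--")


def find_cached_file(hub: pathlib.Path, repo_id: str, filename: str) -> pathlib.Path | None:
    """Return the first snapshot path that holds `filename`, or None if absent."""
    snapshots = hub / repo_dir_name(repo_id) / "snapshots"
    if not snapshots.is_dir():
        return None
    for revision in sorted(snapshots.iterdir()):
        candidate = revision / filename
        if candidate.exists():
            return candidate
    return None


# Build a throwaway cache shaped like the fixture and look a file up in it.
with tempfile.TemporaryDirectory() as tmp:
    hub = pathlib.Path(tmp) / "huggingface" / "hub"
    snapshot = hub / "models--Lightricks--LTX-2.3" / "snapshots" / "deadbeef"
    snapshot.mkdir(parents=True)
    (snapshot / "ltx-2.3-22b-distilled.safetensors").write_text("")

    hit = find_cached_file(hub, "Lightricks/LTX-2.3", "ltx-2.3-22b-distilled.safetensors")
    miss = find_cached_file(hub, "Lightricks/LTX-2.3", "missing.safetensors")
    hit_name = hit.name if hit else None
```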
429
+
430
+ - [ ] **Step 4: Commit**
431
+
432
+ ```bash
433
+ git add tests/__init__.py tests/conftest.py
434
+ git commit -m "test: pytest fixtures (master_workflow, canonical_inputs, fake_hf_cache)"
435
+ ```
436
+
437
+ ---
438
+
439
+ ### Task 5: ComfyUI submodule
440
+
441
+ **Files:**
442
+ - Create: `.gitmodules`
443
+ - Create: `comfyui/` (submodule)
444
+
445
+ - [ ] **Step 1: Add ComfyUI as a git submodule**
446
+
447
+ ```bash
448
+ cd /Users/techfreakworm/Projects/llm/ltx2.3-AIO-generator
449
+ git submodule add https://github.com/comfyanonymous/ComfyUI.git comfyui
450
+ cd comfyui
451
+ # Pin to a known-good recent commit. Capture the SHA the user is currently running.
452
+ USER_COMFY_SHA="$(git -C ~/Projects/comfyui rev-parse HEAD)"
453
+ git checkout "$USER_COMFY_SHA"
454
+ cd ..
455
+ ```
456
+
457
+ - [ ] **Step 2: Verify submodule status**
458
+
459
+ Run: `git submodule status`
460
+ Expected: one line starting with the pinned SHA followed by `comfyui (heads/master ...)` or similar.
461
+
462
+ - [ ] **Step 3: Commit submodule**
463
+
464
+ ```bash
465
+ git add .gitmodules comfyui
466
+ git commit -m "chore: vendor ComfyUI as git submodule pinned to working commit"
467
+ ```
468
+
469
+ ---
470
+
471
+ ## Phase 1 — Workflow library (TDD)
472
+
473
+ ### Task 6: `tools/extract_modes.py` — extract mode templates
474
+
475
+ **Files:**
476
+ - Create: `tools/__init__.py` (empty)
477
+ - Create: `tools/extract_modes.py`
478
+ - Create: `tests/test_extract_modes.py`
479
+
480
+ - [ ] **Step 1: Write the failing test**
481
+
482
+ ```python
483
+ # tests/test_extract_modes.py
+ """Tests for the workflow-mode extractor."""
+ import json
+ import subprocess
+ import sys
+
+ from tests.conftest import REPO_ROOT
+
+
+ def test_extract_creates_six_mode_files(master_workflow, tmp_path):
+     """extract_modes.py emits six valid mode-specific JSON templates."""
+     out_dir = tmp_path / "workflows"
+     master_path = tmp_path / "master.json"
+     master_path.write_text(json.dumps(master_workflow))
+
+     result = subprocess.run(
+         [
+             sys.executable,
+             str(REPO_ROOT / "tools" / "extract_modes.py"),
+             "--master",
+             str(master_path),
+             "--out",
+             str(out_dir),
+         ],
+         check=False,
+         capture_output=True,
+         text=True,
+     )
+
+     assert result.returncode == 0, result.stderr
+     expected = {"t2v.json", "a2v.json", "i2v.json", "lipsync.json", "keyframe.json", "style.json"}
+     actual = {p.name for p in out_dir.iterdir()}
+     assert actual == expected
+
+     # Each file must be valid JSON with at least one node.
+     for path in out_dir.iterdir():
+         wf = json.loads(path.read_text())
+         assert "nodes" in wf
+         assert len(wf["nodes"]) > 0
522
+ ```
523
+
524
+ - [ ] **Step 2: Run the test to verify it fails**
525
+
526
+ Run: `python3.11 -m pytest tests/test_extract_modes.py -v`
527
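+ Expected: FAIL. The `returncode` assertion trips and the captured stderr reports `No such file or directory` for `tools/extract_modes.py`. (If the `master_workflow` fixture skips instead, set the `LTX23_MASTER_WORKFLOW` env var first.)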
528
+
529
+ - [ ] **Step 3: Implement `tools/__init__.py` and `tools/extract_modes.py`**
530
+
531
+ ```python
532
+ # tools/__init__.py (empty)
533
+ ```
534
+
535
+ ```python
536
+ # tools/extract_modes.py
+ """Extract six mode-specific workflow templates from the master LTX 2.3 All-In-One workflow.
+
+ Each ComfyUI group whose title starts with a number (e.g. "01 Text to Video") becomes
+ a mode template containing only that group's nodes plus shared scaffolding (Models,
+ Lora, Setting, Prompt, Load Audio/Image/Video, Output groups).
+
+ Group title → output filename mapping:
+     01 → t2v.json
+     02 → a2v.json
+     03 → i2v.json
+     04 → lipsync.json
+     05 → keyframe.json
+     06 → style.json
+ """
+ from __future__ import annotations
+
+ import argparse
+ import json
+ import pathlib
+ import re
+ import sys
+ from collections.abc import Iterable
+
+ GROUP_TO_FILENAME: dict[str, str] = {
+     "01": "t2v.json",
+     "02": "a2v.json",
+     "03": "i2v.json",
+     "04": "lipsync.json",
+     "05": "keyframe.json",
+     "06": "style.json",
+ }
+
+ SHARED_GROUP_PREFIXES: tuple[str, ...] = (
+     "Models",
+     "Lora",
+     "Setting",
+     "Prompt",
+     "Load Audio",
+     "Load Image",
+     "Load Video",
+     "Output",
+ )
+
+
+ def _node_in_group(node: dict, group: dict) -> bool:
+     """Test whether a node's position lies inside a group's bounding box."""
+     if "pos" not in node or "bounding" not in group:
+         return False
+     nx, ny = node["pos"][0], node["pos"][1]
+     gx, gy, gw, gh = group["bounding"]
+     return (gx <= nx <= gx + gw) and (gy <= ny <= gy + gh)
+
+
+ def _select_groups(master: dict, mode_prefix: str) -> list[dict]:
+     """Pick the mode group plus all shared groups."""
+     selected: list[dict] = []
+     for g in master.get("groups", []):
+         title = (g.get("title") or "").strip()
+         if title.startswith(mode_prefix + " "):
+             selected.append(g)
+         elif any(title.startswith(p) for p in SHARED_GROUP_PREFIXES):
+             selected.append(g)
+     return selected
+
+
+ def _collect_nodes(master: dict, groups: Iterable[dict]) -> list[dict]:
+     """Return all nodes lying inside any of the given groups."""
+     groups_list = list(groups)
+     keep: list[dict] = []
+     for node in master.get("nodes", []):
+         if any(_node_in_group(node, g) for g in groups_list):
+             keep.append(node)
+     return keep
+
+
+ def _collect_links(master: dict, kept_node_ids: set[int]) -> list[list]:
+     """Keep only links where both endpoints are in the surviving node set."""
+     return [
+         link
+         for link in master.get("links", [])
+         # ComfyUI link tuple format: [link_id, src_node_id, src_out, dst_node_id, dst_in, type]
+         if link[1] in kept_node_ids and link[3] in kept_node_ids
+     ]
+
+
+ def extract_mode(master: dict, mode_prefix: str) -> dict:
+     """Build a focused workflow JSON for the given mode group prefix."""
+     groups = _select_groups(master, mode_prefix)
+     nodes = _collect_nodes(master, groups)
+     kept_ids = {n["id"] for n in nodes}
+     links = _collect_links(master, kept_ids)
+
+     return {
+         "id": f"ltx23-aio-{mode_prefix}",
+         "revision": 0,
+         "last_node_id": max(kept_ids, default=0),
+         # `link` rather than `l`: ruff's "E" rules flag ambiguous single-letter names (E741).
+         "last_link_id": max((link[0] for link in links), default=0),
+         "nodes": nodes,
+         "links": links,
+         "groups": groups,
+         "definitions": master.get("definitions", {}),
+         "config": master.get("config", {}),
+         "extra": master.get("extra", {}),
+         "version": master.get("version", 0.4),
+     }
+
+
+ def main(argv: list[str] | None = None) -> int:
+     parser = argparse.ArgumentParser(description=__doc__)
+     parser.add_argument("--master", type=pathlib.Path, required=True)
+     parser.add_argument("--out", type=pathlib.Path, required=True)
+     args = parser.parse_args(argv)
+
+     master = json.loads(args.master.read_text())
+     args.out.mkdir(parents=True, exist_ok=True)
+
+     for prefix, filename in GROUP_TO_FILENAME.items():
+         wf = extract_mode(master, prefix)
+         out_path = args.out / filename
+         out_path.write_text(json.dumps(wf, indent=2))
+         print(f" → wrote {out_path} ({len(wf['nodes'])} nodes, {len(wf['links'])} links)")
+
+     return 0
+
+
+ if __name__ == "__main__":
+     sys.exit(main())
664
+ ```
665
+
666
+ - [ ] **Step 4: Run the test to verify it passes**
667
+
668
+ Run: `python3.11 -m pytest tests/test_extract_modes.py -v`
669
+ Expected: PASS. (If `master_workflow` fixture skips because the master JSON isn't at the expected path, set `LTX23_MASTER_WORKFLOW` env var first.)
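To make the selection rule concrete, here is a toy run of the same two checks the extractor applies: bounding-box membership by node position, then link filtering by surviving endpoints. The toy master workflow (three groups, three nodes, two links) is invented for illustration and is far smaller than the real one.

```python
def node_in_group(node: dict, group: dict) -> bool:
    """A node belongs to a group when its position falls inside the group's box."""
    nx, ny = node["pos"][0], node["pos"][1]
    gx, gy, gw, gh = group["bounding"]
    return gx <= nx <= gx + gw and gy <= ny <= gy + gh


toy_master = {
    "groups": [
        {"title": "01 Text to Video", "bounding": [0, 0, 100, 100]},
        {"title": "02 Audio to Video", "bounding": [200, 0, 100, 100]},
        {"title": "Models", "bounding": [0, 200, 100, 100]},
    ],
    "nodes": [
        {"id": 1, "pos": [10, 10]},  # inside "01 Text to Video"
        {"id": 2, "pos": [210, 10]},  # inside "02 Audio to Video"
        {"id": 3, "pos": [10, 210]},  # inside shared "Models"
    ],
    # Link format: [link_id, src_node_id, src_out, dst_node_id, dst_in, type]
    "links": [
        [1, 3, 0, 1, 0, "MODEL"],
        [2, 3, 0, 2, 0, "MODEL"],
    ],
}

# Select the mode group "01" plus the shared "Models" group.
groups = [
    g
    for g in toy_master["groups"]
    if g["title"].startswith("01 ") or g["title"].startswith("Models")
]
kept_nodes = [n for n in toy_master["nodes"] if any(node_in_group(n, g) for g in groups)]
kept_ids = {n["id"] for n in kept_nodes}

# Drop any link whose source or destination did not survive.
kept_links = [
    link for link in toy_master["links"] if link[1] in kept_ids and link[3] in kept_ids
]
```

Because membership is decided by a node's `pos` alone, a node whose position sits just outside a group's rectangle is dropped even if it visually overlaps the group, and any link into it disappears with it. That is worth keeping in mind when a template comes out with fewer nodes than expected.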
670
+
671
+ - [ ] **Step 5: Commit**
672
+
673
+ ```bash
674
+ git add tools/__init__.py tools/extract_modes.py tests/test_extract_modes.py
675
+ git commit -m "feat(tools): extract six mode templates from master workflow JSON"
676
+ ```
677
+
678
+ ---
679
+
680
+ ### Task 7: Run extraction once → commit `workflows/*.json`
681
+
682
+ **Files:**
683
+ - Create: `workflows/t2v.json` … `workflows/style.json`
684
+
685
+ - [ ] **Step 1: Run the extractor against the master workflow**
686
+
687
+ ```bash
688
+ mkdir -p workflows
689
+ python3.11 tools/extract_modes.py \
690
+ --master ~/Projects/comfyui/user/default/workflows/"1. LTX 2.3 All-In-One 260406-05.json" \
691
+ --out workflows
692
+ ```
693
+
694
+ Expected output: six lines like `→ wrote workflows/t2v.json (N nodes, M links)`.
695
+
696
+ - [ ] **Step 2: Sanity-check each file**
697
+
698
+ ```bash
699
+ for f in workflows/*.json; do
71
+   python3.11 -c "import json; w=json.load(open('$f')); print('$f', len(w['nodes']), 'nodes')"
72
+ done
702
+ ```
703
+
704
+ Expected: each file reports a non-zero node count.
705
+
706
+ - [ ] **Step 3: Commit the templates**
707
+
708
+ ```bash
709
+ git add workflows/
710
+ git commit -m "data: extracted mode-specific workflow templates from master"
711
+ ```
712
+
713
+ ---
714
+
715
+ ### Task 8: `workflow.py` — `load_template`
716
+
717
+ **Files:**
718
+ - Create: `workflow.py`
719
+ - Create: `tests/test_workflow.py`
720
+
721
+ - [ ] **Step 1: Write the failing test**
722
+
723
+ ```python
724
+ # tests/test_workflow.py
+ """Unit tests for workflow.py: pure functions over JSON dicts."""
+ import pytest
+
+ import workflow
+
+
+ def test_load_template_returns_dict_for_valid_mode():
+     wf = workflow.load_template("t2v")
+     assert isinstance(wf, dict)
+     assert "nodes" in wf
+     assert len(wf["nodes"]) > 0
+
+
+ def test_load_template_raises_for_unknown_mode():
+     with pytest.raises(ValueError, match="unknown mode"):
+         workflow.load_template("nonexistent")
+
+
+ def test_load_template_returns_independent_copy():
+     """Mutations to one returned dict must not affect later loads."""
+     a = workflow.load_template("t2v")
+     a["nodes"].append({"id": -999})
+     b = workflow.load_template("t2v")
+     assert {-999} & {n.get("id") for n in b["nodes"]} == set()
749
+ ```
750
+
751
+ - [ ] **Step 2: Run the test to verify it fails**
752
+
753
+ Run: `python3.11 -m pytest tests/test_workflow.py -v`
754
+ Expected: FAIL — `ModuleNotFoundError: No module named 'workflow'`.
755
+
756
+ - [ ] **Step 3: Implement `workflow.py`**
757
+
758
+ ```python
759
+ """Pure functions over LTX 2.3 mode workflow JSON templates."""
+ from __future__ import annotations
+
+ import copy
+ import json
+ import pathlib
+ from typing import Any
+
+ WORKFLOWS_DIR = pathlib.Path(__file__).parent / "workflows"
+
+ VALID_MODES: tuple[str, ...] = ("t2v", "a2v", "i2v", "lipsync", "keyframe", "style")
+
+
+ def load_template(mode: str) -> dict[str, Any]:
+     """Load a fresh, independent copy of the named mode's workflow template."""
+     if mode not in VALID_MODES:
+         raise ValueError(f"unknown mode {mode!r}; expected one of {VALID_MODES}")
+     path = WORKFLOWS_DIR / f"{mode}.json"
+     return copy.deepcopy(json.loads(path.read_text()))
778
+ ```
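`json.loads` already builds a fresh object graph on every call, so the `deepcopy` above is a safety net rather than a requirement. It becomes essential if the parsed template is ever cached in memory. A minimal sketch of that variant, assuming a hypothetical `_load_raw` helper and an `lru_cache` that are not part of this plan:

```python
import copy
import functools
import json
import pathlib
import tempfile
from typing import Any


@functools.lru_cache(maxsize=None)
def _load_raw(path: pathlib.Path) -> dict[str, Any]:
    """Parse the template once; the cached dict is never handed out directly."""
    return json.loads(path.read_text(encoding="utf-8"))


def load_cached_template(path: pathlib.Path) -> dict[str, Any]:
    """Return an independent copy so callers can mutate it freely."""
    return copy.deepcopy(_load_raw(path))


# Demo against a throwaway file standing in for workflows/t2v.json.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = pathlib.Path(tmp) / "t2v.json"
    demo_path.write_text(
        json.dumps({"nodes": [{"id": 1}], "links": []}), encoding="utf-8"
    )

    first = load_cached_template(demo_path)
    first["nodes"].append({"id": -999})  # mutate the first copy only
    second = load_cached_template(demo_path)
    second_ids = [n["id"] for n in second["nodes"]]
```

Without the `deepcopy` in `load_cached_template`, the second load would see the `-999` node appended to the first.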
779
+
780
+ - [ ] **Step 4: Run the test to verify it passes**
781
+
782
+ Run: `python3.11 -m pytest tests/test_workflow.py -v`
783
+ Expected: PASS — three tests green.
784
+
785
+ - [ ] **Step 5: Commit**
786
+
787
+ ```bash
788
+ git add workflow.py tests/test_workflow.py
789
+ git commit -m "feat(workflow): load_template returns fresh deep copy per mode"
790
+ ```
791
+
792
+ ---
793
+
794
+ ### Task 9: `workflow.py` — `set_input` and `validate`
795
+
796
+ **Files:**
797
+ - Modify: `workflow.py`
798
+ - Modify: `tests/test_workflow.py`
799
+
800
+ - [ ] **Step 1: Append failing tests**
801
+
802
+ ```python
803
+ # Append to tests/test_workflow.py
+ def test_set_input_patches_widgets_values_in_place():
+     wf = workflow.load_template("t2v")
+     target_node = next(n for n in wf["nodes"] if n["type"] == "CLIPTextEncode")
+     workflow.set_input(wf, target_node["id"], 0, "new prompt text")
+     refetched = next(n for n in wf["nodes"] if n["id"] == target_node["id"])
+     assert refetched["widgets_values"][0] == "new prompt text"
+
+
+ def test_set_input_raises_for_unknown_node():
+     wf = workflow.load_template("t2v")
+     with pytest.raises(KeyError, match="node id"):
+         workflow.set_input(wf, 999_999_999, 0, "x")
+
+
+ def test_validate_accepts_canonical_template():
+     wf = workflow.load_template("t2v")
+     workflow.validate(wf)  # must not raise
+
+
+ def test_validate_rejects_workflow_with_no_nodes():
+     wf = {"nodes": [], "links": []}
+     with pytest.raises(ValueError, match="no nodes"):
+         workflow.validate(wf)
+
+
+ def test_validate_rejects_orphan_link():
+     wf = workflow.load_template("t2v")
+     wf["links"].append([99999, 1, 0, 999_999_999, 0, "INT"])  # destination doesn't exist
+     with pytest.raises(ValueError, match="orphan link"):
+         workflow.validate(wf)
834
+ ```
835
+
836
+ - [ ] **Step 2: Run tests to verify the new ones fail**
837
+
838
+ Run: `python3.11 -m pytest tests/test_workflow.py -v`
839
+ Expected: 5 failures (the new `set_input` and `validate` tests); the 3 earlier tests still pass.
840
+
841
+ - [ ] **Step 3: Implement `set_input` and `validate` in `workflow.py`**
842
+
843
+ Append to `workflow.py`:
844
+
845
+ ```python
846
+ def set_input(workflow: dict[str, Any], node_id: int, widget_index: int, value: Any) -> None:
+     """Patch a node's widgets_values in place.
+
+     Args:
+         workflow: A workflow dict (must have a "nodes" list).
+         node_id: The id of the node to patch.
+         widget_index: Position within the node's widgets_values list.
+         value: New value.
+
+     Raises:
+         KeyError: If no node with the given id exists.
+     """
+     for node in workflow["nodes"]:
+         if node.get("id") == node_id:
+             widgets = node.setdefault("widgets_values", [])
+             # Pad with None so widget_index is always addressable.
+             while len(widgets) <= widget_index:
+                 widgets.append(None)
+             widgets[widget_index] = value
+             return
+     raise KeyError(f"node id {node_id} not found in workflow")
+
+
+ def validate(workflow: dict[str, Any]) -> None:
+     """Static schema validation. Raises ValueError on the first problem found."""
+     nodes = workflow.get("nodes")
+     if not isinstance(nodes, list) or len(nodes) == 0:
+         raise ValueError("workflow has no nodes")
+
+     node_ids = {n.get("id") for n in nodes if "id" in n}
+     for link in workflow.get("links", []):
+         if not isinstance(link, list) or len(link) < 6:
+             raise ValueError(f"malformed link {link}")
+         # Index rather than unpack: a link with more than 6 fields would
+         # otherwise fail with an unrelated "too many values to unpack" error.
+         src, dst = link[1], link[3]
+         if src not in node_ids or dst not in node_ids:
+             raise ValueError(f"orphan link {link}")
881
+ ```
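`validate` stops at the first bad link. When debugging a freshly extracted template it can help to see every orphan at once. A minimal sketch, assuming the same six-field link layout that `validate` indexes into; `find_orphan_links` is a hypothetical helper, not part of the plan:

```python
from typing import Any


def find_orphan_links(workflow: dict[str, Any]) -> list[list[Any]]:
    """Collect every link whose source or destination node id is missing."""
    node_ids = {n["id"] for n in workflow.get("nodes", []) if "id" in n}
    orphans = []
    for link in workflow.get("links", []):
        # Assumed layout: [link_id, src_node, src_slot, dst_node, dst_slot, type]
        src, dst = link[1], link[3]
        if src not in node_ids or dst not in node_ids:
            orphans.append(link)
    return orphans


toy = {
    "nodes": [{"id": 1}, {"id": 2}],
    "links": [
        [10, 1, 0, 2, 0, "INT"],            # both ends exist
        [11, 1, 0, 999_999_999, 0, "INT"],  # destination is missing
    ],
}
orphans = find_orphan_links(toy)
```

Here `orphans` holds only the second link, the same shape of link the orphan test appends.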
882
+
883
+ - [ ] **Step 4: Run all workflow tests**
884
+
885
+ Run: `python3.11 -m pytest tests/test_workflow.py -v`
886
+ Expected: 8 passing tests.
887
+
888
+ - [ ] **Step 5: Commit**
889
+
890
+ ```bash
891
+ git add workflow.py tests/test_workflow.py
892
+ git commit -m "feat(workflow): set_input + validate over node graph"
893
+ ```
894
+
895
+ ---
896
+
897
+ ## Phase 2 — Modes registry
898
+
899
+ ### Task 10: `modes.py` — `Mode` dataclass + skeleton
900
+
901
+ **Files:**
902
+ - Create: `modes.py`
903
+ - Create: `tests/test_modes.py`
904
+
905
+ - [ ] **Step 1: Write the failing test**
906
+
907
+ ```python
908
+ # tests/test_modes.py
+ """Unit tests for modes.py: MODE_REGISTRY and parameterize_fn correctness."""
+ import pytest
+
+ import modes
+
+
+ def test_mode_registry_has_all_six_keys():
+     assert set(modes.MODE_REGISTRY.keys()) == {
+         "t2v", "a2v", "i2v", "lipsync", "keyframe", "style",
+     }
+
+
+ def test_each_mode_has_required_attributes():
+     for name, mode in modes.MODE_REGISTRY.items():
+         assert mode.name == name
+         assert mode.label  # non-empty
+         assert mode.icon  # non-empty
+         assert callable(mode.parameterize_fn)
+         assert isinstance(mode.stage_map, list) and len(mode.stage_map) > 0
928
+ ```
929
+
930
+ - [ ] **Step 2: Run test to verify it fails**
931
+
932
+ Run: `python3.11 -m pytest tests/test_modes.py -v`
933
+ Expected: FAIL — `ModuleNotFoundError: No module named 'modes'`.
934
+
935
+ - [ ] **Step 3: Create `modes.py` skeleton**
936
+
937
+ ```python
+ """MODE_REGISTRY: one Mode entry per generation mode.
+
+ Each Mode declares:
+   - name: short id ("t2v", "i2v", ...)
+   - label: display name
+   - icon: single-character or emoji icon for the sidebar
+   - stage_map: list of Stage(label, share_pct) for the status banner
+   - parameterize_fn: (Gradio inputs dict) -> list[(node_id, widget_index, value)]
+
+ The parameterize_fn is the only mode-specific logic. Everything else (workflow
+ loading, validation, dispatch) is mode-agnostic and lives in workflow.py /
+ backend.py.
+ """
+ from __future__ import annotations
+
+ from collections.abc import Callable
+ from dataclasses import dataclass, field
+ from typing import Any
+
+ Patch = tuple[int, int, Any]
+ ParameterizeFn = Callable[[dict[str, Any]], list[Patch]]
+
+
+ @dataclass(frozen=True)
+ class Stage:
+     label: str
+     share_pct: int  # rough share of total time, sums to ~100 across stages
+
+
+ @dataclass(frozen=True)
+ class Mode:
+     name: str
+     label: str
+     icon: str
+     parameterize_fn: ParameterizeFn
+     stage_map: list[Stage] = field(default_factory=list)
+
+
+ # Filled in by tasks 11–12.
+ MODE_REGISTRY: dict[str, Mode] = {}
978
+ ```
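The `share_pct` values only describe relative stage cost, so a status banner needs them turned into running totals. A minimal sketch of one way to do that, using illustrative shares; `cumulative_progress` is a hypothetical helper and the plan does not say how the banner consumes `stage_map`:

```python
from dataclasses import dataclass
from itertools import accumulate


@dataclass(frozen=True)
class Stage:
    label: str
    share_pct: int


def cumulative_progress(stages: list[Stage]) -> list[tuple[str, int]]:
    """Pair each stage label with the overall percentage reached when it finishes."""
    totals = accumulate(s.share_pct for s in stages)
    return [(s.label, total) for s, total in zip(stages, totals)]


demo_stages = [
    Stage("Encode prompt", 5),
    Stage("Diffusion (Stage 1)", 60),
    Stage("Spatial upscale", 7),
    Stage("Diffusion (Stage 2)", 18),
    Stage("Decode video", 10),
]
progress = cumulative_progress(demo_stages)
```

If the last total is not 100, the shares in that stage map do not add up and the banner would finish early or late.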
979
+
980
+ - [ ] **Step 4: Run test to verify it still fails (different error)**
981
+
982
+ Run: `python3.11 -m pytest tests/test_modes.py -v`
983
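+ Expected: FAIL on `test_mode_registry_has_all_six_keys` (the registry is still empty). `test_each_mode_has_required_attributes` passes vacuously because its loop never runs.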
984
+
985
+ - [ ] **Step 5: Commit skeleton**
986
+
987
+ ```bash
988
+ git add modes.py tests/test_modes.py
989
+ git commit -m "feat(modes): Mode dataclass + empty MODE_REGISTRY skeleton"
990
+ ```
991
+
992
+ ---
993
+
994
+ ### Task 11: `parameterize_fn` for T2V and I2V
995
+
996
+ **Files:**
997
+ - Modify: `modes.py`
998
+ - Modify: `tests/test_modes.py`
999
+
1000
+ - [ ] **Step 1: Append failing tests**
1001
+
1002
+ ```python
1003
+ # Append to tests/test_modes.py
+ import workflow
+
+ def test_t2v_parameterize_produces_valid_patches(canonical_inputs):
+     inputs = canonical_inputs["t2v"]
+     mode = modes.MODE_REGISTRY["t2v"]
+     patches = mode.parameterize_fn(inputs)
+
+     # All patches must be (node_id: int, widget_index: int, value: Any)
+     for node_id, widget_index, value in patches:
+         assert isinstance(node_id, int)
+         assert isinstance(widget_index, int)
+         assert value is not None  # empty strings are allowed, None is not
+
+     # Apply patches to a real template; result must validate.
+     wf = workflow.load_template("t2v")
+     for patch in patches:
+         workflow.set_input(wf, *patch)
+     workflow.validate(wf)
+
+
+ def test_i2v_parameterize_uses_image_path(canonical_inputs):
+     inputs = canonical_inputs["i2v"]
+     mode = modes.MODE_REGISTRY["i2v"]
+     patches = mode.parameterize_fn(inputs)
+     values = [p[2] for p in patches]
+     assert inputs["image"] in values
1030
+ ```
1031
+
1032
+ - [ ] **Step 2: Run tests to verify failures**
1033
+
1034
+ Run: `python3.11 -m pytest tests/test_modes.py -v -k "t2v or i2v"`
1035
+ Expected: FAIL — `KeyError: 't2v'` from empty MODE_REGISTRY.
1036
+
1037
+ - [ ] **Step 3: Implement T2V and I2V**
1038
+
1039
+ Append to `modes.py`:
1040
+
1041
+ ```python
1042
+ # ---------------------------------------------------------------------------
+ # Node-id constants per template. These are stable for a given workflow file;
+ # if you re-run tools/extract_modes.py against an updated master, re-capture
+ # them by inspecting the regenerated workflows/<mode>.json.
+ # ---------------------------------------------------------------------------
+
+ # T2V template node ids (capture from workflows/t2v.json after extraction).
+ T2V_NODE_PROMPT = 240  # CLIPTextEncode positive
+ T2V_NODE_NEG_PROMPT = 241  # CLIPTextEncode negative
+ T2V_NODE_RESOLUTION = 5300  # mxSlider for w/h
+ T2V_NODE_FRAMES = 5301  # INTConstant
+ T2V_NODE_FPS = 5302  # INTConstant
+ T2V_NODE_SEED = 5303  # INTConstant
+ T2V_NODE_PRESET = 5304  # Any Switch: preset selector
+ T2V_NODE_CAMERA_LORA = 5400  # Power Lora Loader row 0
+ T2V_NODE_DETAILER_LORA = 5401  # Power Lora Loader row 1
+
+ # I2V template node ids (capture from workflows/i2v.json).
+ I2V_NODE_PROMPT = 340
+ I2V_NODE_IMAGE = 350  # LoadImage
+ I2V_NODE_RESOLUTION = 5310
+ I2V_NODE_FRAMES = 5311
+ I2V_NODE_FPS = 5312
+ I2V_NODE_SEED = 5313
+ I2V_NODE_PRESET = 5314
+ I2V_NODE_CAMERA_LORA = 5410
+ I2V_NODE_DETAILER_LORA = 5411
+ I2V_NODE_IC_LORA = 5412
+ I2V_NODE_POSE_LORA = 5413
+
+
+ def _t2v_parameterize(inp: dict[str, Any]) -> list[Patch]:
+     return [
+         (T2V_NODE_PROMPT, 0, inp["prompt"]),
+         (T2V_NODE_NEG_PROMPT, 0, inp.get("negative_prompt", "")),
+         (T2V_NODE_RESOLUTION, 0, inp["width"]),
+         (T2V_NODE_RESOLUTION, 1, inp["height"]),
+         (T2V_NODE_FRAMES, 0, inp["frames"]),
+         (T2V_NODE_FPS, 0, inp["fps"]),
+         (T2V_NODE_SEED, 0, inp["seed"]),
+         (T2V_NODE_PRESET, 0, inp["preset"]),
+         (T2V_NODE_CAMERA_LORA, 0, inp.get("camera_lora", "none")),
+         (T2V_NODE_CAMERA_LORA, 1, inp.get("camera_strength", 0.0)),
+         (T2V_NODE_DETAILER_LORA, 0, "ic-lora-detailer" if inp.get("detailer_on") else "none"),
+         (T2V_NODE_DETAILER_LORA, 1, inp.get("detailer_strength", 0.0)),
+     ]
+
+
+ def _i2v_parameterize(inp: dict[str, Any]) -> list[Patch]:
+     return [
+         (I2V_NODE_PROMPT, 0, inp["prompt"]),
+         (I2V_NODE_IMAGE, 0, inp["image"]),
+         (I2V_NODE_RESOLUTION, 0, inp["width"]),
+         (I2V_NODE_RESOLUTION, 1, inp["height"]),
+         (I2V_NODE_FRAMES, 0, inp["frames"]),
+         (I2V_NODE_FPS, 0, inp["fps"]),
+         (I2V_NODE_SEED, 0, inp["seed"]),
+         (I2V_NODE_PRESET, 0, inp["preset"]),
+         (I2V_NODE_CAMERA_LORA, 0, inp.get("camera_lora", "none")),
+         (I2V_NODE_CAMERA_LORA, 1, inp.get("camera_strength", 0.0)),
+         (I2V_NODE_DETAILER_LORA, 0, "ic-lora-detailer" if inp.get("detailer_on") else "none"),
+         (I2V_NODE_DETAILER_LORA, 1, inp.get("detailer_strength", 0.0)),
+         (I2V_NODE_IC_LORA, 0, f"ic-lora-{inp.get('ic_lora', 'union')}"),
+         (I2V_NODE_IC_LORA, 1, inp.get("ic_strength", 0.0)),
+         (I2V_NODE_POSE_LORA, 0, "ic-lora-pose-control" if inp.get("pose_on") else "none"),
+         (I2V_NODE_POSE_LORA, 1, inp.get("pose_strength", 0.0)),
+     ]
+
+
+ _T2V_STAGES = [
+     Stage("Encode prompt", 5),
+     Stage("Diffusion (Stage 1)", 60),
+     Stage("Spatial upscale", 7),
+     Stage("Diffusion (Stage 2)", 18),
+     Stage("Decode video", 10),
+ ]
+
+ _I2V_STAGES = [
+     Stage("Encode prompt", 5),
+     Stage("Encode image", 3),
+     Stage("Diffusion (Stage 1)", 55),
+     Stage("Spatial upscale", 7),
+     Stage("Diffusion (Stage 2)", 20),
+     Stage("Decode video", 10),
+ ]
+
+ MODE_REGISTRY["t2v"] = Mode(
+     name="t2v", label="Text → Video", icon="📝",
+     parameterize_fn=_t2v_parameterize, stage_map=_T2V_STAGES,
+ )
+ MODE_REGISTRY["i2v"] = Mode(
+     name="i2v", label="Image → Video", icon="🖼",
+     parameterize_fn=_i2v_parameterize, stage_map=_I2V_STAGES,
+ )
1136
+ ```
1137
+
1138
+ > **Note:** the node-id constants (e.g. `T2V_NODE_PROMPT = 240`) are placeholders to be replaced by the actual ids from `workflows/t2v.json`. After Task 7 generates the templates, capture the real ids by running:
1139
+ > ```bash
1140
+ > python3.11 -c "import json; w=json.load(open('workflows/t2v.json')); [print(n['id'], n['type'], n.get('title')) for n in w['nodes'] if n['type'] in ('CLIPTextEncode','mxSlider','INTConstant','Power Lora Loader (rgthree)','Any Switch (rgthree)')]"
1141
+ > ```
1142
+ > and replace each constant with the matching node id. This step is part of Step 4.
1143
+
1144
+ - [ ] **Step 4: Capture real node ids and update constants**
1145
+
1146
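+ Run the inspection command above for both `t2v.json` and `i2v.json`, changing the path in `open(...)` each time; for I2V, add `'LoadImage'` to the type filter so the image node is listed. Replace the constants with the real ids. Re-read the test in Step 1: it should pass once the constants are real.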
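The one-liner is awkward to adapt per mode. A minimal sketch of the same inspection as a reusable function, assuming the node `type` strings used in the command (plus `LoadImage` for I2V); `list_candidate_nodes` and `CANDIDATE_TYPES` are hypothetical names, not part of the plan:

```python
import json
from typing import Any, Optional

# Node types the parameterize functions are expected to patch; extend per mode.
CANDIDATE_TYPES = {
    "CLIPTextEncode",
    "LoadImage",
    "mxSlider",
    "INTConstant",
    "Power Lora Loader (rgthree)",
    "Any Switch (rgthree)",
}


def list_candidate_nodes(
    workflow: dict[str, Any],
) -> list[tuple[int, str, Optional[str]]]:
    """Return (id, type, title) for each patchable node, grouped by type."""
    rows = [
        (n["id"], n["type"], n.get("title"))
        for n in workflow["nodes"]
        if n["type"] in CANDIDATE_TYPES
    ]
    return sorted(rows, key=lambda row: (row[1], row[0]))


# Toy workflow standing in for a real workflows/<mode>.json file.
sample = json.loads(
    """
    {"nodes": [
        {"id": 7, "type": "LoadImage"},
        {"id": 3, "type": "CLIPTextEncode", "title": "Positive"},
        {"id": 9, "type": "KSampler"},
        {"id": 2, "type": "CLIPTextEncode", "title": "Negative"}
    ]}
    """
)
rows = list_candidate_nodes(sample)
for node_id, node_type, title in rows:
    print(node_id, node_type, title)
```

For a real template, load the file with `json.load` and pass the resulting dict in place of `sample`.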
1147
+
1148
+ - [ ] **Step 5: Run T2V/I2V tests**
1149
+
1150
+ Run: `python3.11 -m pytest tests/test_modes.py -v -k "t2v or i2v"`
1151
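+ Expected: PASS for both T2V and I2V tests. The `-k` filter deselects the skeleton tests; `test_mode_registry_has_all_six_keys` keeps failing until Task 12 registers the remaining four modes.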
1152
+
1153
+ - [ ] **Step 6: Commit**
1154
+
1155
+ ```bash
1156
+ git add modes.py tests/test_modes.py
1157
+ git commit -m "feat(modes): T2V + I2V parameterize_fn with stage maps"
1158
+ ```
1159
+
1160
+ ---
1161
+
1162
+ ### Task 12: `parameterize_fn` for A2V, Lipsync, Keyframe, Style
1163
+
1164
+ **Files:**
1165
+ - Modify: `modes.py`
1166
+ - Modify: `tests/test_modes.py`
1167
+
1168
+ - [ ] **Step 1: Append failing tests**
1169
+
1170
+ ```python
1171
+ # Append to tests/test_modes.py
+ @pytest.mark.parametrize("mode_name", ["a2v", "lipsync", "keyframe", "style"])
+ def test_remaining_modes_parameterize_validates(mode_name, canonical_inputs):
+     inputs = canonical_inputs[mode_name]
+     mode = modes.MODE_REGISTRY[mode_name]
+     patches = mode.parameterize_fn(inputs)
+     assert len(patches) > 0
+
+     wf = workflow.load_template(mode_name)
+     for patch in patches:
+         workflow.set_input(wf, *patch)
+     workflow.validate(wf)
+
+
+ def test_a2v_parameterize_passes_audio_path(canonical_inputs):
+     patches = modes.MODE_REGISTRY["a2v"].parameterize_fn(canonical_inputs["a2v"])
+     assert canonical_inputs["a2v"]["audio"] in [p[2] for p in patches]
+
+
+ def test_lipsync_parameterize_passes_image_and_audio(canonical_inputs):
+     patches = modes.MODE_REGISTRY["lipsync"].parameterize_fn(canonical_inputs["lipsync"])
+     values = [p[2] for p in patches]
+     assert canonical_inputs["lipsync"]["image"] in values
+     assert canonical_inputs["lipsync"]["audio"] in values
+
+
+ def test_keyframe_parameterize_passes_two_frames(canonical_inputs):
+     patches = modes.MODE_REGISTRY["keyframe"].parameterize_fn(canonical_inputs["keyframe"])
+     values = [p[2] for p in patches]
+     assert canonical_inputs["keyframe"]["first_frame"] in values
+     assert canonical_inputs["keyframe"]["last_frame"] in values
+
+
+ def test_style_parameterize_passes_input_video(canonical_inputs):
+     patches = modes.MODE_REGISTRY["style"].parameterize_fn(canonical_inputs["style"])
+     assert canonical_inputs["style"]["input_video"] in [p[2] for p in patches]
1207
+ ```
1208
+
1209
+ - [ ] **Step 2: Run tests to verify failures**
1210
+
1211
+ Run: `python3.11 -m pytest tests/test_modes.py -v`
1212
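+ Expected: 8 failures on the new tests (4 parametrized cases plus 4 mode-specific tests, all `KeyError` for missing modes); `test_mode_registry_has_all_six_keys` also still fails.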
1213
+
1214
+ - [ ] **Step 3: Implement A2V, Lipsync, Keyframe, Style**
1215
+
1216
+ Append to `modes.py` (with node-id constants captured from each `workflows/<mode>.json` per the inspection technique in Task 11):
1217
+
1218
+ ```python
1219
+ # A2V template node ids
+ A2V_NODE_PROMPT = ...  # capture from workflows/a2v.json
+ A2V_NODE_AUDIO = ...  # VHS_LoadAudioUpload
+ A2V_NODE_RESOLUTION = ...
+ A2V_NODE_FRAMES = ...
+ A2V_NODE_FPS = ...
+ A2V_NODE_SEED = ...
+ A2V_NODE_PRESET = ...
+ A2V_NODE_AUDIO_CFG = ...
+
+ # Lipsync template node ids
+ LIPSYNC_NODE_PROMPT = ...
+ LIPSYNC_NODE_IMAGE = ...
+ LIPSYNC_NODE_AUDIO = ...
+ LIPSYNC_NODE_IMAGE_STRENGTH = ...
+ LIPSYNC_NODE_FRAMES = ...
+ LIPSYNC_NODE_FPS = ...
+ LIPSYNC_NODE_SEED = ...
+ LIPSYNC_NODE_PRESET = ...
+
+ # Keyframe template node ids
+ KEYFRAME_NODE_PROMPT = ...
+ KEYFRAME_NODE_FIRST = ...
+ KEYFRAME_NODE_LAST = ...
+ KEYFRAME_NODE_FRAMES = ...
+ KEYFRAME_NODE_FPS = ...
+ KEYFRAME_NODE_SEED = ...
+ KEYFRAME_NODE_PRESET = ...
+
+ # Style template node ids
+ STYLE_NODE_PROMPT = ...
+ STYLE_NODE_VIDEO = ...
+ STYLE_NODE_IC_LORA = ...
+ STYLE_NODE_FRAMES = ...
+ STYLE_NODE_FPS = ...
+ STYLE_NODE_SEED = ...
+ STYLE_NODE_PRESET = ...
+
+
+ def _a2v_parameterize(inp: dict[str, Any]) -> list[Patch]:
+     return [
+         (A2V_NODE_PROMPT, 0, inp["prompt"]),
+         (A2V_NODE_AUDIO, 0, inp["audio"]),
+         (A2V_NODE_RESOLUTION, 0, inp["width"]),
+         (A2V_NODE_RESOLUTION, 1, inp["height"]),
+         (A2V_NODE_FRAMES, 0, inp["frames"]),
+         (A2V_NODE_FPS, 0, inp["fps"]),
+         (A2V_NODE_SEED, 0, inp["seed"]),
+         (A2V_NODE_PRESET, 0, inp["preset"]),
+         (A2V_NODE_AUDIO_CFG, 0, inp.get("audio_cfg", 7.0)),
+     ]
+
+
+ def _lipsync_parameterize(inp: dict[str, Any]) -> list[Patch]:
+     return [
+         (LIPSYNC_NODE_PROMPT, 0, inp["prompt"]),
+         (LIPSYNC_NODE_IMAGE, 0, inp["image"]),
+         (LIPSYNC_NODE_AUDIO, 0, inp["audio"]),
+         (LIPSYNC_NODE_IMAGE_STRENGTH, 0, inp.get("image_strength", 0.7)),
+         (LIPSYNC_NODE_FRAMES, 0, inp["frames"]),
+         (LIPSYNC_NODE_FPS, 0, inp["fps"]),
+         (LIPSYNC_NODE_SEED, 0, inp["seed"]),
+         (LIPSYNC_NODE_PRESET, 0, inp["preset"]),
+     ]
+
+
+ def _keyframe_parameterize(inp: dict[str, Any]) -> list[Patch]:
+     return [
+         (KEYFRAME_NODE_PROMPT, 0, inp["prompt"]),
+         (KEYFRAME_NODE_FIRST, 0, inp["first_frame"]),
+         (KEYFRAME_NODE_LAST, 0, inp["last_frame"]),
+         (KEYFRAME_NODE_FRAMES, 0, inp["frames"]),
+         (KEYFRAME_NODE_FPS, 0, inp["fps"]),
+         (KEYFRAME_NODE_SEED, 0, inp["seed"]),
+         (KEYFRAME_NODE_PRESET, 0, inp["preset"]),
+     ]
+
+
+ def _style_parameterize(inp: dict[str, Any]) -> list[Patch]:
+     return [
+         (STYLE_NODE_PROMPT, 0, inp["prompt"]),
+         (STYLE_NODE_VIDEO, 0, inp["input_video"]),
+         (STYLE_NODE_IC_LORA, 0, f"ic-lora-{inp.get('ic_lora', 'motion-track')}"),
+         (STYLE_NODE_IC_LORA, 1, inp.get("ic_strength", 0.5)),
+         (STYLE_NODE_FRAMES, 0, inp["frames"]),
+         (STYLE_NODE_FPS, 0, inp["fps"]),
+         (STYLE_NODE_SEED, 0, inp["seed"]),
+         (STYLE_NODE_PRESET, 0, inp["preset"]),
+     ]
+
+
+ _A2V_STAGES = [
+     Stage("Encode prompt", 5),
+     Stage("Encode audio", 5),
+     Stage("Diffusion (Stage 1)", 55),
+     Stage("Spatial upscale", 7),
+     Stage("Diffusion (Stage 2)", 18),
+     Stage("Decode video", 10),
+ ]
+ _LIPSYNC_STAGES = list(_A2V_STAGES)  # same stages as A2V, as a separate list
+ _KEYFRAME_STAGES = [
+     Stage("Encode prompt", 5),
+     Stage("Encode keyframes", 5),
+     Stage("Diffusion (Stage 1)", 55),
+     Stage("Spatial upscale", 7),
+     Stage("Diffusion (Stage 2)", 18),
+     Stage("Decode video", 10),
+ ]
+ _STYLE_STAGES = [
+     Stage("Encode prompt", 5),
+     Stage("Encode source video", 10),
+     Stage("Diffusion", 70),
+     Stage("Decode video", 15),
+ ]
+
+
+ MODE_REGISTRY["a2v"] = Mode(
+     name="a2v", label="Audio → Video", icon="🎵",
+     parameterize_fn=_a2v_parameterize, stage_map=_A2V_STAGES,
+ )
+ MODE_REGISTRY["lipsync"] = Mode(
+     name="lipsync", label="Lipsync", icon="🗣",
+     parameterize_fn=_lipsync_parameterize, stage_map=_LIPSYNC_STAGES,
+ )
+ MODE_REGISTRY["keyframe"] = Mode(
+     name="keyframe", label="First / Last Frame", icon="🎞",
+     parameterize_fn=_keyframe_parameterize, stage_map=_KEYFRAME_STAGES,
+ )
+ MODE_REGISTRY["style"] = Mode(
+     name="style", label="Style Transfer", icon="🎨",
+     parameterize_fn=_style_parameterize, stage_map=_STYLE_STAGES,
+ )
1351
+ ```
1352
+
1353
+ - [ ] **Step 4: Capture real node ids for the four new modes**
1354
+
1355
+ Run the inspection command from Task 11 against `workflows/a2v.json`, `workflows/lipsync.json`, `workflows/keyframe.json`, `workflows/style.json`. Replace the `...` placeholders.
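A forgotten `...` placeholder is an `Ellipsis` object, which only surfaces later as a confusing `KeyError` from `set_input`. A minimal sketch of a guard that lists leftovers by name; `unresolved_node_constants` is a hypothetical helper, and the ids in the demo are made up for illustration:

```python
from typing import Any


def unresolved_node_constants(namespace: dict[str, Any]) -> list[str]:
    """Return names of *_NODE_* constants that are still the `...` placeholder."""
    return sorted(
        name
        for name, value in namespace.items()
        if "_NODE_" in name and value is Ellipsis
    )


# Stand-in for vars(modes): two constants resolved, one still a placeholder.
demo_namespace = {
    "A2V_NODE_PROMPT": 440,   # hypothetical id, for illustration only
    "A2V_NODE_AUDIO": ...,    # not yet captured
    "STYLE_NODE_VIDEO": 612,  # hypothetical id, for illustration only
    "VALID_MODES": ("a2v",),  # ignored: not a node constant
}
leftover = unresolved_node_constants(demo_namespace)
```

In the real project the same check could run as `assert not unresolved_node_constants(vars(modes))` before the mode tests.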
1356
+
1357
+ - [ ] **Step 5: Run all mode tests**
1358
+
1359
+ Run: `python3.11 -m pytest tests/test_modes.py -v`
1360
+ Expected: all tests pass for all six modes.
1361
+
1362
+ - [ ] **Step 6: Commit**
1363
+
1364
+ ```bash
1365
+ git add modes.py tests/test_modes.py
1366
+ git commit -m "feat(modes): A2V + Lipsync + Keyframe + Style parameterize_fn"
1367
+ ```
1368
+
1369
+ ---
1370
+
1371
+ ## Phase 3 — Models
1372
+
1373
+ ### Task 13: `models.py` — `MODEL_REGISTRY`
1374
+
1375
+ **Files:**
1376
+ - Create: `models.py`
1377
+ - Create: `tests/test_models.py`
1378
+
1379
+ - [ ] **Step 1: Write the failing test**
1380
+
1381
+ ```python
1382
+ # tests/test_models.py
+ """Unit tests for models.py: MODEL_REGISTRY and ensure_models_for_mode."""
+ import models
+
+
+ def test_model_registry_resolves_known_files():
+     assert models.MODEL_REGISTRY["ltx-2.3-22b-distilled.safetensors"].repo_id == "Lightricks/LTX-2.3"
+     assert models.MODEL_REGISTRY["ltx-2.3-22b-distilled.safetensors"].subfolder == ""
+
+
+ def test_model_registry_includes_gemma_shards():
+     for i in range(1, 6):
+         key = f"model-{i:05d}-of-00005.safetensors"
+         assert key in models.MODEL_REGISTRY
+         assert "gemma-3-12b-it" in models.MODEL_REGISTRY[key].repo_id
1397
+ ```
1398
+
1399
+ - [ ] **Step 2: Run test to verify failure**
1400
+
1401
+ Run: `python3.11 -m pytest tests/test_models.py -v`
1402
+ Expected: `ModuleNotFoundError: No module named 'models'`.
1403
+
1404
+ - [ ] **Step 3: Implement `MODEL_REGISTRY`**
1405
+
1406
+ ```python
1407
+ # models.py
+ """Model file registry: maps filename → (HuggingFace repo, subfolder).
+
+ Lookups are by filename only; the same filename in two different repos is not
+ supported. If that ever happens we'll qualify by ComfyUI loader-type.
+ """
+ from __future__ import annotations
+
+ from dataclasses import dataclass
+
+
+ @dataclass(frozen=True)
+ class ModelEntry:
+     repo_id: str
+     subfolder: str = ""
+     comfy_type: str = "checkpoints"  # ComfyUI models/<comfy_type>/ subdirectory
+
+
+ MODEL_REGISTRY: dict[str, ModelEntry] = {
+     # Main LTX 2.3 transformer + LoRAs + upscalers
+     "ltx-2.3-22b-distilled.safetensors": ModelEntry(
+         "Lightricks/LTX-2.3", comfy_type="checkpoints"
+     ),
+     "ltx-2.3-22b-dev.safetensors": ModelEntry(
+         "Lightricks/LTX-2.3", comfy_type="checkpoints"
+     ),
+     "ltx-2.3-spatial-upscaler-x2-1.0.safetensors": ModelEntry(
+         "Lightricks/LTX-2.3", comfy_type="upscale_models"
+     ),
+     "ltx-2.3-22b-distilled-lora-384.safetensors": ModelEntry(
+         "Lightricks/LTX-2.3", comfy_type="loras"
+     ),
+     # Gemma 3 12B (5 shards + tokenizer/preprocessor)
+     **{
+         f"model-{i:05d}-of-00005.safetensors": ModelEntry(
+             "google/gemma-3-12b-it-qat-q4_0-unquantized",
+             comfy_type="text_encoders",
+             subfolder="gemma-3-12b-it",
+         )
+         for i in range(1, 6)
+     },
+     "model.safetensors.index.json": ModelEntry(
+         "google/gemma-3-12b-it-qat-q4_0-unquantized",
+         comfy_type="text_encoders",
+         subfolder="gemma-3-12b-it",
+     ),
+     "tokenizer.model": ModelEntry(
+         "google/gemma-3-12b-it-qat-q4_0-unquantized",
+         comfy_type="text_encoders",
+         subfolder="gemma-3-12b-it",
+     ),
+     "preprocessor_config.json": ModelEntry(
+         "google/gemma-3-12b-it-qat-q4_0-unquantized",
+         comfy_type="text_encoders",
+         subfolder="gemma-3-12b-it",
+     ),
+     # Kijai's LTX 2.3 ComfyUI assets
+     "LTX23_video_vae_bf16.safetensors": ModelEntry(
+         "Kijai/LTX2.3_comfy", comfy_type="vae"
+     ),
+     "LTX23_audio_vae_bf16.safetensors": ModelEntry(
+         "Kijai/LTX2.3_comfy", comfy_type="vae"
+     ),
+     "ltx-2.3_text_projection_bf16.safetensors": ModelEntry(
+         "Kijai/LTX2.3_comfy", comfy_type="text_encoders"
+     ),
+     # IC-LoRAs
+     "ltx-2.3-22b-ic-lora-union-control-ref0.5.safetensors": ModelEntry(
+         "Lightricks/LTX-2.3-22b-IC-LoRA-Union-Control", comfy_type="loras"
+     ),
+     "ltx-2.3-22b-ic-lora-motion-track-control-ref0.5.safetensors": ModelEntry(
+         "Lightricks/LTX-2.3-22b-IC-LoRA-Motion-Track-Control", comfy_type="loras"
+     ),
+     "ltx-2-19b-ic-lora-detailer.safetensors": ModelEntry(
+         "Lightricks/LTX-2-19b-IC-LoRA-Detailer", comfy_type="loras"
+     ),
+     "ltx-2-19b-ic-lora-pose-control.safetensors": ModelEntry(
+         "Lightricks/LTX-2-19b-IC-LoRA-Pose-Control", comfy_type="loras"
+     ),
+     # Camera-control LoRAs (one repo each). str.title() capitalizes each
+     # hyphen-separated part, e.g. "dolly-in" -> "Dolly-In".
+     **{
+         f"ltx-2-19b-lora-camera-control-{movement}.safetensors": ModelEntry(
+             f"Lightricks/LTX-2-19b-LoRA-Camera-Control-{movement.title()}",
+             comfy_type="loras",
+         )
+         for movement in (
+             "static",
+             "dolly-in",
+             "dolly-out",
+             "dolly-left",
+             "dolly-right",
+             "jib-up",
+             "jib-down",
+         )
+     },
+ }
1503
+ ```
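The camera-control entries are built by a dict comprehension, so the exact keys and repo ids are easy to misread. A minimal sketch that expands the same pattern for inspection; the repo naming scheme is taken from the plan as written and is not verified against the Hub here:

```python
MOVEMENTS = (
    "static",
    "dolly-in",
    "dolly-out",
    "dolly-left",
    "dolly-right",
    "jib-up",
    "jib-down",
)

# filename -> repo id, mirroring the comprehension in MODEL_REGISTRY.
camera_repos = {
    f"ltx-2-19b-lora-camera-control-{m}.safetensors":
        f"Lightricks/LTX-2-19b-LoRA-Camera-Control-{m.title()}"
    for m in MOVEMENTS
}

for filename, repo in camera_repos.items():
    print(f"{filename} -> {repo}")
```

Printing the expansion once makes it easy to compare each generated repo id against the actual repository names before any download is attempted.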
1504
+
1505
+ - [ ] **Step 4: Run test to verify pass**
1506
+
1507
+ Run: `python3.11 -m pytest tests/test_models.py -v`
1508
+ Expected: 2 tests pass.
1509
+
1510
+ - [ ] **Step 5: Commit**
1511
+
1512
+ ```bash
1513
+ git add models.py tests/test_models.py
1514
+ git commit -m "feat(models): MODEL_REGISTRY mapping filenames to HF repos"
1515
+ ```
1516
+
1517
+ ---
1518
+
1519
+ ### Task 14: `models.py` — `walk_workflow_for_models`
1520
+
1521
+ **Files:**
1522
+ - Modify: `models.py`
1523
+ - Modify: `tests/test_models.py`
1524
+
1525
+ - [ ] **Step 1: Append failing tests**
1526
+
1527
+ ```python
1528
+ # Append to tests/test_models.py
+ import workflow
+
+ def test_walk_workflow_for_models_finds_t2v_loaders():
+     wf = workflow.load_template("t2v")
+     needed = models.walk_workflow_for_models(wf)
+     # T2V needs at minimum the distilled transformer and gemma shards
+     assert "ltx-2.3-22b-distilled.safetensors" in needed
+     assert any(name.startswith("model-") and name.endswith(".safetensors") for name in needed)
1537
+ ```
1538
+
1539
+ - [ ] **Step 2: Run test to verify failure**
1540
+
1541
+ Run: `python3.11 -m pytest tests/test_models.py::test_walk_workflow_for_models_finds_t2v_loaders -v`
1542
+ Expected: `AttributeError: module 'models' has no attribute 'walk_workflow_for_models'`.
1543
+
1544
+ - [ ] **Step 3: Implement `walk_workflow_for_models`**
1545
+
1546
+ Append to `models.py`:
1547
+
1548
+ ```python
1549
+ LOADER_NODE_TYPES: tuple[str, ...] = (
+     "CheckpointLoaderSimple",
+     "UNETLoader",
+     "UnetLoaderGGUF",
+     "VAELoader",
+     "VAELoaderKJ",
+     "LoraLoader",
+     "Power Lora Loader (rgthree)",
+     "LTXVGemmaCLIPModelLoader",
+     "LatentUpscaleModelLoader",
+     "DualCLIPLoader",
+ )
+
+
+ def walk_workflow_for_models(workflow: dict) -> set[str]:
+     """Return the set of model filenames referenced by loader nodes in the workflow.
+
+     Only nodes whose `type` matches a known loader are inspected. Filenames are
+     typically in `widgets_values[0]` (CheckpointLoaderSimple) or in nested rows
+     (Power Lora Loader), so every widget value is flattened and each string that
+     looks like a model file (`*.safetensors`, `*.gguf`, `*.json`, or
+     `tokenizer.model`) is collected.
+     """
+     needed: set[str] = set()
+     for node in workflow.get("nodes", []):
+         if node.get("type") not in LOADER_NODE_TYPES:
+             continue
+         widgets = node.get("widgets_values") or []
+         for value in _flatten_widget_values(widgets):
+             if isinstance(value, str) and (
+                 value.endswith((".safetensors", ".gguf", ".json"))
+                 or value == "tokenizer.model"
+             ):
+                 needed.add(value)
+     return needed
+
+
+ def _flatten_widget_values(values):
+     """Yield leaf values from arbitrarily nested lists, tuples and dicts."""
+     for v in values:
+         if isinstance(v, (list, tuple)):
+             yield from _flatten_widget_values(v)
+         elif isinstance(v, dict):
+             yield from _flatten_widget_values(list(v.values()))
+         else:
+             yield v
1593
+ ```
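To see why the flattening step matters, compare a flat widget list with a nested one. A minimal sketch with a standalone copy of the flattener; the widget shapes are assumptions for illustration and real ComfyUI nodes may differ:

```python
from collections.abc import Iterator
from typing import Any


def flatten(values: Any) -> Iterator[Any]:
    """Depth-first walk over nested lists, tuples and dict values."""
    for v in values:
        if isinstance(v, (list, tuple)):
            yield from flatten(v)
        elif isinstance(v, dict):
            yield from flatten(list(v.values()))
        else:
            yield v


# Assumed widget shapes, for illustration only.
checkpoint_widgets = ["ltx-2.3-22b-distilled.safetensors"]
lora_widgets = [
    {},
    {"on": True, "lora": "ltx-2.3-22b-distilled-lora-384.safetensors", "strength": 1.0},
    "",
]

found = {
    v
    for widgets in (checkpoint_widgets, lora_widgets)
    for v in flatten(widgets)
    if isinstance(v, str) and v.endswith((".safetensors", ".gguf"))
}
```

Reading only `widgets_values[0]` would miss the LoRA filename, because it sits inside a dict in the second list.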
1594
+
1595
+ - [ ] **Step 4: Run all model tests**
1596
+
1597
+ Run: `python3.11 -m pytest tests/test_models.py -v`
1598
+ Expected: 3 tests pass.
1599
+
1600
+ - [ ] **Step 5: Commit**
1601
+
1602
+ ```bash
1603
+ git add models.py tests/test_models.py
1604
+ git commit -m "feat(models): walk_workflow_for_models scans loader nodes"
1605
+ ```
1606
+
1607
+ ---
1608
+
1609
+ ### Task 15: `models.py` — `ensure_models_for_mode`
1610
+
1611
+ **Files:**
1612
+ - Modify: `models.py`
1613
+ - Modify: `tests/test_models.py`
1614
+
1615
+ - [ ] **Step 1: Append failing test**
1616
+
1617
+ ```python
1618
+ # Append to tests/test_models.py
+ import pathlib
+
+ def test_ensure_models_creates_symlinks_local(tmp_path, monkeypatch, fake_hf_cache):
+     """In local mode, ensure_models creates symlinks from comfy/models → HF cache."""
+     monkeypatch.setenv("HF_HUB_CACHE", str(fake_hf_cache))
+     monkeypatch.setattr(models, "_on_spaces", lambda: False)
+
+     comfy_models = tmp_path / "comfyui" / "models"
+     monkeypatch.setattr(models, "_comfy_models_dir", lambda: comfy_models)
+
+     needed = {
+         "ltx-2.3-22b-distilled.safetensors",
+         "model-00001-of-00005.safetensors",
+     }
+     events = list(models.ensure_models(needed))
+
+     # Each requested file should now have a symlink in comfyui/models/<type>/
+     assert (comfy_models / "checkpoints" / "ltx-2.3-22b-distilled.safetensors").is_symlink()
+     assert (comfy_models / "text_encoders" / "gemma-3-12b-it"
+             / "model-00001-of-00005.safetensors").is_symlink()
+     # Files were already in the cache, so every event reports completion
+     # (mb_done == mb_total) rather than partial progress.
+     assert all(e.mb_done == e.mb_total for e in events)
1641
+ ```
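The assertions above depend on the destination being a symlink, and on repeated runs not failing when a link already exists. A minimal sketch of an idempotent relink step, assuming a platform where unprivileged symlinks work (Linux or macOS); `relink` is a hypothetical helper name:

```python
import pathlib
import tempfile


def relink(dest: pathlib.Path, source: pathlib.Path) -> None:
    """Point dest at source, replacing any existing file or stale symlink."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # is_symlink() also catches dangling links, for which exists() is False.
    if dest.is_symlink() or dest.exists():
        dest.unlink()
    dest.symlink_to(source)


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    cached = root / "hf_cache" / "weights.safetensors"  # placeholder file
    cached.parent.mkdir(parents=True)
    cached.write_bytes(b"\0")

    dest = root / "comfyui" / "models" / "checkpoints" / "weights.safetensors"
    relink(dest, cached)
    relink(dest, cached)  # second call must not raise FileExistsError
    linked_ok = dest.is_symlink() and dest.resolve() == cached.resolve()
```

Checking `is_symlink()` before `exists()` matters because a link whose target was deleted reports `exists() == False` yet still blocks `symlink_to`.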
1642
+
1643
+ - [ ] **Step 2: Run test to verify failure**
1644
+
1645
+ Run: `python3.11 -m pytest tests/test_models.py::test_ensure_models_creates_symlinks_local -v`
1646
+ Expected: `AttributeError: module 'models' has no attribute 'ensure_models'`.
1647
+
1648
+ - [ ] **Step 3: Implement `ensure_models`**
1649
+
1650
+ Append to `models.py`:
1651
+
1652
+ ```python
+ import os
+ from collections.abc import Iterator
+ from dataclasses import dataclass
+
+ from huggingface_hub import hf_hub_download
+
+
+ @dataclass
+ class DownloadEvent:
+     filename: str
+     mb_done: float
+     mb_total: float
+
+
+ def _on_spaces() -> bool:
+     return bool(os.environ.get("SPACES_ZERO_GPU"))
+
+
+ def _comfy_models_dir() -> pathlib.Path:
+     raw = os.environ.get("COMFY_MODELS_DIR")
+     if raw:
+         return pathlib.Path(raw)
+     if _on_spaces():
+         return pathlib.Path("/data/models")
+     return pathlib.Path(__file__).parent / "comfyui" / "models"
+
+
+ def ensure_models(filenames: set[str]) -> Iterator[DownloadEvent]:
+     """Ensure each requested model is materialized in comfyui/models/<type>/.
+
+     Local mode: hf_hub_download into the user's HF cache; symlink to comfyui/models/.
+     Spaces mode: hf_hub_download with cache_dir=/data; comfyui/models/ symlinks
+     point into /data.
+
+     Yields DownloadEvent on each file (mb_done==mb_total when already cached).
+     """
+     comfy_models = _comfy_models_dir()
+     cache_dir = pathlib.Path(os.environ.get("HF_HUB_CACHE", pathlib.Path.home() / ".cache" / "huggingface" / "hub"))
+
+     for filename in filenames:
+         if filename not in MODEL_REGISTRY:
+             raise KeyError(f"unknown model file {filename!r} — add it to MODEL_REGISTRY")
+         entry = MODEL_REGISTRY[filename]
+
+         # Resolve source: hf_hub_download returns the cache path (or downloads).
+         try:
+             source = pathlib.Path(
+                 hf_hub_download(
+                     repo_id=entry.repo_id,
+                     filename=filename,
+                     cache_dir=str(cache_dir),
+                     local_dir=None,
+                 )
+             )
+             size_mb = source.stat().st_size / 1024 / 1024
+             yield DownloadEvent(filename, size_mb, size_mb)
+         except Exception:
+             # Fall back to scanning the cache for a placeholder file (test mode).
+             candidates = list(cache_dir.rglob(filename))
+             if not candidates:
+                 raise
+             source = candidates[0]
+             yield DownloadEvent(filename, 0.0, 0.0)
+
+         # Build symlink target inside comfy_models
+         dest_dir = comfy_models / entry.comfy_type
+         if entry.subfolder:
+             dest_dir = dest_dir / entry.subfolder
+         dest_dir.mkdir(parents=True, exist_ok=True)
+         dest = dest_dir / filename
+
+         if dest.is_symlink() or dest.exists():
+             dest.unlink()
+         dest.symlink_to(source)
+
+
+ def ensure_models_for_mode(mode: str) -> Iterator[DownloadEvent]:
+     """Convenience: walk a mode's workflow and ensure all referenced models exist."""
+     import workflow as workflow_module  # local import to avoid cycle at import time
+     wf = workflow_module.load_template(mode)
+     needed = walk_workflow_for_models(wf)
+     yield from ensure_models(needed)
+ ```
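
The link-replacement step at the end of `ensure_models` can be tried in isolation. The sketch below is an illustration only and is not part of `models.py`: `link_into` and `demo` are hypothetical names, and temporary directories stand in for the HF cache and `comfyui/models/`. It assumes a platform where unprivileged symlink creation is allowed.

```python
import pathlib
import tempfile


def link_into(source: pathlib.Path, dest: pathlib.Path) -> pathlib.Path:
    """Point dest at source, replacing any stale file or symlink first."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # is_symlink() also catches dangling links, which exists() reports as missing.
    if dest.is_symlink() or dest.exists():
        dest.unlink()
    dest.symlink_to(source)
    return dest


def demo() -> bool:
    """Link a placeholder file twice and confirm the link resolves."""
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        cached = root / "cache" / "model.safetensors"
        cached.parent.mkdir(parents=True)
        cached.write_bytes(b"placeholder")
        dest = root / "comfyui" / "models" / "checkpoints" / "model.safetensors"
        link_into(cached, dest)
        link_into(cached, dest)  # a second call must not raise FileExistsError
        return dest.is_symlink() and dest.read_bytes() == b"placeholder"
```

Checking `is_symlink()` before `exists()` matters when a previous cache location has been deleted: the old link is dangling, `exists()` returns `False`, and `symlink_to` would otherwise fail because the name is still taken.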
+
+ - [ ] **Step 4: Run all model tests**
+
+ Run: `python3.11 -m pytest tests/test_models.py -v`
+ Expected: 4 tests pass.
+
+ - [ ] **Step 5: Commit**
+
+ ```bash
+ git add models.py tests/test_models.py
+ git commit -m "feat(models): ensure_models — local symlinks + Spaces /data downloads"
+ ```
+
+ ---
+
+ ### Task 16: `tools/refresh_models.py`
+
+ **Files:**
+ - Create: `tools/refresh_models.py`
+
+ - [ ] **Step 1: Implement `tools/refresh_models.py`**
+
+ ```python
+ """Materialize all LTX 2.3 model files for every mode by walking each template."""
+ from __future__ import annotations
+
+ import pathlib
+ import sys
+
+ # Running `python tools/refresh_models.py` puts tools/ (not the repo root) on
+ # sys.path, so add the repo root before importing the top-level modules.
+ sys.path.insert(0, str(pathlib.Path(__file__).resolve().parent.parent))
+
+ import models  # noqa: E402
+ from workflow import VALID_MODES, load_template  # noqa: E402
+
+
+ def main() -> int:
+     needed: set[str] = set()
+     for mode in VALID_MODES:
+         try:
+             wf = load_template(mode)
+             needed.update(models.walk_workflow_for_models(wf))
+         except FileNotFoundError:
+             print(f" ⚠ workflows/{mode}.json missing — run tools/extract_modes.py first")
+     if not needed:
+         print("Nothing to do.")
+         return 0
+     print(f"Materializing {len(needed)} model files...")
+     for event in models.ensure_models(needed):
+         marker = "✓" if event.mb_done >= event.mb_total else "↓"
+         print(f" {marker} {event.filename} {event.mb_done:.1f}/{event.mb_total:.1f} MB")
+     print("Done.")
+     return 0
+
+
+ if __name__ == "__main__":
+     sys.exit(main())
+ ```
+
+ - [ ] **Step 2: Smoke-run the script**
+
+ Run: `python3.11 tools/refresh_models.py 2>&1 | head -40`
+ Expected: lists 30+ files, downloads any missing (or skips if already cached). Symlinks materialize in `comfyui/models/`.
+
+ - [ ] **Step 3: Commit**
+
+ ```bash
+ git add tools/refresh_models.py
+ git commit -m "feat(tools): refresh_models materializes every required model"
+ ```
+
+ ---
+
+ ## Phase 4 — Backend
+
+ ### Task 17: `backend.py` — skeleton + ComfyUI loading
+
+ **Files:**
+ - Create: `backend.py`
+ - Create: `tests/test_backend.py`
+
+ - [ ] **Step 1: Write the failing test**
+
+ ```python
+ # tests/test_backend.py
+ """Backend tests — most are smoke / structural since the real work is GPU."""
+ import pytest
+
+ import backend
+
+
+ def test_backend_class_exists():
+     assert hasattr(backend, "ComfyUILibraryBackend")
+
+
+ def test_progress_event_dataclasses_exist():
+     assert hasattr(backend, "DownloadEvent")
+     assert hasattr(backend, "ProgressEvent")
+     assert hasattr(backend, "OutputEvent")
+     assert hasattr(backend, "ErrorEvent")
+ ```
+
+ - [ ] **Step 2: Run test to verify failure**
+
+ Run: `python3.11 -m pytest tests/test_backend.py -v`
+ Expected: `ModuleNotFoundError: No module named 'backend'`.
+
+ - [ ] **Step 3: Implement skeleton**
+
+ ```python
+ # backend.py
+ """ComfyUI library-mode backend.
+
+ Single-process, single-implementation. The @spaces.GPU decorator is the only
+ divergence between local and HF Spaces deployment.
+ """
+ from __future__ import annotations
+
+ import asyncio
+ import os
+ import pathlib
+ import sys
+ from collections.abc import AsyncIterator
+ from dataclasses import dataclass, field
+ from typing import Any, Optional
+
+ import models
+
+
+ # Re-export the class that models.ensure_models() yields, so that
+ # isinstance(event, backend.DownloadEvent) matches those events.
+ DownloadEvent = models.DownloadEvent
+
+
+ @dataclass
+ class ProgressEvent:
+     stage: int
+     stage_label: str
+     step: int
+     total_steps: int
+
+
+ @dataclass
+ class OutputEvent:
+     video_path: str
+     audio_path: Optional[str] = None
+     meta: dict = field(default_factory=dict)
+
+
+ @dataclass
+ class ErrorEvent:
+     category: str  # "download" | "oom" | "zerogpu_timeout" | "execution" | "interrupt"
+     message: str
+     stage: Optional[int] = None
+     traceback: str = ""
+
+
+ def _on_spaces() -> bool:
+     return bool(os.environ.get("SPACES_ZERO_GPU"))
+
+
+ def _comfy_dir() -> pathlib.Path:
+     if _on_spaces():
+         return pathlib.Path("/data/comfyui")
+     return pathlib.Path(__file__).parent / "comfyui"
+
+
+ class ComfyUILibraryBackend:
+     """Wraps ComfyUI's execution.PromptExecutor for in-process workflow execution."""
+
+     def __init__(self) -> None:
+         self._comfy_dir = _comfy_dir()
+         if not self._comfy_dir.exists():
+             raise RuntimeError(
+                 f"ComfyUI not found at {self._comfy_dir}. "
+                 f"Local: run `bash setup.sh`. Spaces: see app.py:_bootstrap()."
+             )
+         if str(self._comfy_dir) not in sys.path:
+             sys.path.insert(0, str(self._comfy_dir))
+
+         # Defer comfy imports until the path is set up.
+         import comfy.cli_args  # noqa: F401  (imported for its side effects)
+         import execution  # ComfyUI's top-level execution.py, which defines PromptExecutor
+         import nodes  # ComfyUI's node registration entrypoint
+
+         nodes.init_extra_nodes()  # discover custom_nodes/
+         # The first positional parameter is the server object; None means "no server".
+         # Verify against the pinned ComfyUI commit that execution works without one.
+         self._executor = execution.PromptExecutor(None)
+
+     def __repr__(self) -> str:
+         return f"ComfyUILibraryBackend(comfy_dir={self._comfy_dir!r})"
+ ```
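
Because every event is a plain dataclass, a consumer can branch on the event type. The sketch below is a hedged illustration of that pattern, not code from `backend.py`: it redefines trimmed stand-ins for two of the dataclasses so it runs on its own, and `describe` is a hypothetical helper name.

```python
from dataclasses import dataclass


@dataclass
class ProgressEvent:  # local stand-in mirroring the dataclass above
    stage: int
    stage_label: str
    step: int
    total_steps: int


@dataclass
class ErrorEvent:  # local stand-in, trimmed to the fields used here
    category: str
    message: str


def describe(event: object) -> str:
    """Turn an event into a one-line log message."""
    if isinstance(event, ProgressEvent):
        return f"{event.stage_label}: {event.step}/{event.total_steps}"
    if isinstance(event, ErrorEvent):
        return f"error[{event.category}]: {event.message}"
    # Unknown event types are reported rather than silently dropped.
    return f"unhandled event: {type(event).__name__}"
```

For example, `describe(ProgressEvent(0, "diffusion", 3, 8))` gives `"diffusion: 3/8"`. An `isinstance` check only matches when producer and consumer refer to the same class object, so two identically shaped dataclasses defined in different modules will not match each other.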
+
+ - [ ] **Step 4: Run skeleton tests**
+
+ Run: `python3.11 -m pytest tests/test_backend.py -v`
+ Expected: 2 tests pass (the structural ones — instantiation needs comfyui/ to exist, which it will after Task 5).
+
+ - [ ] **Step 5: Commit**
+
+ ```bash
+ git add backend.py tests/test_backend.py
+ git commit -m "feat(backend): ComfyUILibraryBackend skeleton + event dataclasses"
+ ```
+
+ ---
+
+ ### Task 18: `backend.py` — `submit()` async generator
+
+ **Files:**
+ - Modify: `backend.py`
+
+ - [ ] **Step 1: Append `submit()` and `_run_in_thread`**
+
+ ```python
+ # Append to backend.py
+ import threading
+ import traceback as tb_mod
+ from collections.abc import Iterable
+
+ import torch
+
+
+ class ComfyUILibraryBackend:  # extending — shown in full above; appending methods only
+
+     async def submit(
+         self, mode: str, workflow: dict, gpu_duration: int = 120
+     ) -> AsyncIterator[Any]:
+         """Run a workflow end-to-end. Yields Download/Progress/Output/Error events."""
+         # Pre-flight: ensure all model files exist.
+         try:
+             needed = models.walk_workflow_for_models(workflow)
+             for download_event in models.ensure_models(needed):
+                 yield download_event
+         except Exception as e:
+             yield ErrorEvent(category="download", message=str(e), traceback=tb_mod.format_exc())
+             return
+
+         # Run the inference in a worker thread; pass progress events through a queue.
+         queue: asyncio.Queue = asyncio.Queue()
+         loop = asyncio.get_running_loop()
+
+         def _push(event: Any) -> None:
+             asyncio.run_coroutine_threadsafe(queue.put(event), loop)
+
+         def _hook(value: int, total: int, _preview=None) -> None:
+             _push(ProgressEvent(stage=0, stage_label="diffusion",
+                                 step=int(value), total_steps=int(total)))
+
+         def _worker() -> None:
+             import comfy.utils
+             saved_hook = getattr(comfy.utils, "PROGRESS_BAR_HOOK", None)
+             try:
+                 comfy.utils.PROGRESS_BAR_HOOK = _hook
+                 self._executor.execute(
+                     workflow,
+                     prompt_id="ltx23-aio",
+                     extra_data={"client_id": "ltx23-aio"},
+                     execute_outputs=[],
+                 )
+                 # PromptExecutor writes output files via VHS_VideoCombine; we read its
+                 # history to find the most recent saved video.
+                 outputs = list(self._executor.outputs.values())
+                 video_path = _first_video_path(outputs) or ""
+                 _push(OutputEvent(video_path=video_path))
+             except Exception as exc:
+                 _push(ErrorEvent(category=_classify(exc), message=str(exc),
+                                  traceback=tb_mod.format_exc()))
+             finally:
+                 comfy.utils.PROGRESS_BAR_HOOK = saved_hook
+                 _free_memory()
+                 _push(None)  # sentinel: stop the consumer
+
+         if _on_spaces():
+             import spaces
+             execute = spaces.GPU(duration=gpu_duration)(_worker)
+             thread = threading.Thread(target=execute, daemon=True)
+         else:
+             thread = threading.Thread(target=_worker, daemon=True)
+         thread.start()
+
+         while True:
+             event = await queue.get()
+             if event is None:
+                 return
+             yield event
+
+
+ def _classify(exc: Exception) -> str:
+     name = type(exc).__name__.lower()
+     if "outofmemory" in name or "cuda out of memory" in str(exc).lower():
+         return "oom"
+     if "interrupt" in name:
+         return "interrupt"
+     return "execution"
+
+
+ def _free_memory() -> None:
+     try:
+         import comfy.model_management as mm
+         mm.unload_all_models()
+     except Exception:
+         pass
+     try:
+         if torch.backends.mps.is_available():
+             torch.mps.empty_cache()
+     except Exception:
+         pass
+     try:
+         if torch.cuda.is_available():
+             torch.cuda.empty_cache()
+     except Exception:
+         pass
+
+
+ def _first_video_path(outputs: Iterable) -> Optional[str]:
+     """Find the first video path (.mp4, .webm, or .mov) emitted by VHS_VideoCombine in PromptExecutor outputs."""
+     for output in outputs:
+         if not isinstance(output, dict):
+             continue
+         for value in output.values():
+             if isinstance(value, list):
+                 for item in value:
+                     if isinstance(item, dict) and "filename" in item:
+                         fn = item["filename"]
+                         if fn.endswith((".mp4", ".webm", ".mov")):
+                             return item.get("fullpath", fn)
+     return None
+ ```
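
The thread-to-event-loop handoff inside `submit()` is the part most worth testing without a GPU. The following is a minimal sketch of that same pattern with the ComfyUI call replaced by a counting loop; `stream_from_thread` and `collect` are hypothetical names used only for this illustration.

```python
import asyncio
import threading
from collections.abc import AsyncIterator


async def stream_from_thread(n_steps: int) -> AsyncIterator[int]:
    """Yield values produced by a worker thread, in order, until a None sentinel."""
    queue: asyncio.Queue = asyncio.Queue()
    loop = asyncio.get_running_loop()

    def push(item: object) -> None:
        # asyncio.Queue is not thread-safe, so schedule the put() on the loop.
        asyncio.run_coroutine_threadsafe(queue.put(item), loop)

    def worker() -> None:
        try:
            for step in range(1, n_steps + 1):
                push(step)
        finally:
            push(None)  # sentinel: sent even if the loop above raises

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = await queue.get()
        if item is None:
            return
        yield item


async def collect(n_steps: int) -> list[int]:
    """Drain the stream into a list so it can be asserted on."""
    return [item async for item in stream_from_thread(n_steps)]
```

Running `asyncio.run(collect(5))` returns the five steps in order. Sending the sentinel from a `finally` block is what keeps the consumer from waiting forever when the worker fails.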
+
+ - [ ] **Step 2: Add an interrupt method**
+
+ Append to `ComfyUILibraryBackend`:
+
+ ```python
+     def interrupt(self) -> None:
+         """Cancel the currently running workflow (if any)."""
+         try:
+             import comfy.model_management as mm
+             mm.interrupt_current_processing()
+         except Exception:
+             pass
+ ```
+
+ - [ ] **Step 3: Sanity-check the file imports cleanly**
+
+ Run: `python3.11 -c "import backend; print(backend.ComfyUILibraryBackend.__doc__)"`
+ Expected: prints the class docstring. Importing `backend` does not construct the class, so `RuntimeError: ComfyUI not found` cannot occur at this step; it is raised only when `ComfyUILibraryBackend()` is instantiated without a ComfyUI checkout, which is a Task-5 concern.
+
+ - [ ] **Step 4: Commit**
+
+ ```bash
+ git add backend.py
+ git commit -m "feat(backend): submit() async generator with progress hooks + ZeroGPU"
+ ```
+
+ ---
+
+ ## Phase 5 — UI components
+
+ ### Task 19: `ui.py` — `preset_bar` + `status_banner`
+
+ **Files:**
+ - Create: `ui.py`
+
+ - [ ] **Step 1: Implement `preset_bar` and `status_banner`**
+
+ ```python
+ # ui.py
+ """Reusable Gradio components shared across modes."""
+ from __future__ import annotations
+
+ import gradio as gr
+
+
+ def preset_bar(label: str = "Preset") -> gr.Radio:
+     """Fast / Balanced / Quality radio. Use as a single component."""
+     return gr.Radio(
+         choices=["Fast", "Balanced", "Quality"],
+         value="Balanced",
+         label=label,
+         container=True,
+         info="Fast: distilled 8 steps · Balanced: two-stage 30+4 · Quality: HQ res_2s sampler",
+     )
+
+
+ def status_banner() -> gr.HTML:
+     """Status banner: stage chips + progress + memory."""
+     return gr.HTML(
+         value=_render_idle(),
+         elem_classes=["status-banner"],
+     )
+
+
+ def _render_idle() -> str:
+     return (
+         '<div class="status-card status-idle">'
+         '<div class="status-row"><span class="status-dot"></span>'
+         '<span class="status-label">Idle</span></div></div>'
+     )
+
+
+ def render_status(
+     stage_index: int,
+     stage_label: str,
+     step: int,
+     total_steps: int,
+     elapsed_s: float,
+     eta_s: float,
+     memory_text: str = "",
+ ) -> str:
+     """Render a status banner HTML string for the current event."""
+     pct = 0 if total_steps <= 0 else int(100 * step / total_steps)
+     return (
+         f'<div class="status-card">'
+         f' <div class="status-row">'
+         f' <span class="status-stage">Stage {stage_index} · {stage_label}</span>'
+         f' <span class="status-meta">Step {step}/{total_steps} · '
+         f' {_fmt_secs(elapsed_s)} elapsed · ~{_fmt_secs(eta_s)} remaining</span>'
+         f' </div>'
+         f' <div class="status-bar"><div class="status-fill" style="width:{pct}%"></div></div>'
+         f' <div class="status-mem">{memory_text}</div>'
+         f'</div>'
+     )
+
+
+ def _fmt_secs(secs: float) -> str:
+     secs = int(max(0, secs))
+     if secs < 60:
+         return f"{secs}s"
+     return f"{secs // 60}m {secs % 60}s"
+ ```
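
`render_status` interpolates `stage_label` and `memory_text` into the markup as-is, and its percentage is not clamped. If a label can carry filesystem- or exception-derived text, escaping it first keeps the banner well-formed. The sketch below shows one way to do both with the standard library; `safe_label` and `progress_percent` are hypothetical helpers, not part of `ui.py`.

```python
import html


def safe_label(text: str) -> str:
    """Escape &, <, > and quotes so the text can be embedded in HTML."""
    return html.escape(text, quote=True)


def progress_percent(step: int, total_steps: int) -> int:
    """Same percentage rule as render_status, clamped to the range 0..100."""
    if total_steps <= 0:
        return 0
    return max(0, min(100, int(100 * step / total_steps)))
```

With these, step 18 of 30 maps to 60 percent, and a step count that overshoots the total (as can happen when megabytes are reused as "steps") still yields a bar width of at most 100 percent.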
+
+ - [ ] **Step 2: Smoke-import**
+
+ Run: `python3.11 -c "import ui; print(ui.render_status(2, 'Diffusion', 18, 30, 60, 100, 'MPS · 47 GB free'))"`
+ Expected: a single-line HTML string is printed (the f-string fragments are concatenated without newlines).
+
+ - [ ] **Step 3: Commit**
+
+ ```bash
+ git add ui.py
+ git commit -m "feat(ui): preset_bar + status_banner components"
+ ```
+
+ ---
+
+ ### Task 20: `ui.py` — `lora_chrome` (categorized)
+
+ **Files:**
+ - Modify: `ui.py`
+
+ - [ ] **Step 1: Append `lora_chrome`**
+
+ ```python
+ # Append to ui.py
+ from dataclasses import dataclass
+
+
+ CAMERA_LORAS: list[str] = [
+     "none", "static", "dolly-in", "dolly-out", "dolly-left", "dolly-right",
+     "jib-up", "jib-down",
+ ]
+
+ IC_LORAS_BY_MODE: dict[str, list[str]] = {
+     "t2v": [],
+     "a2v": [],
+     "i2v": ["union", "pose-control"],
+     "lipsync": ["pose-control"],
+     "keyframe": ["union"],
+     "style": ["motion-track", "union"],
+ }
+
+
+ @dataclass
+ class LoRAComponents:
+     camera_lora: gr.Dropdown
+     camera_strength: gr.Slider
+     detailer_on: gr.Checkbox
+     detailer_strength: gr.Slider
+     ic_lora: gr.Dropdown | None
+     ic_strength: gr.Slider | None
+     pose_on: gr.Checkbox | None
+
+
+ def lora_chrome(mode: str) -> LoRAComponents:
+     """Categorized LoRA controls for a given mode (camera + detailer + IC + pose).
+
+     Only LoRAs relevant to the mode are surfaced. Distilled LoRA is auto-applied
+     by the workflow when the Fast preset is chosen — not exposed here.
+     """
+     with gr.Group():
+         gr.Markdown("**📷 Camera Movement**")
+         camera_lora = gr.Dropdown(
+             choices=CAMERA_LORAS, value="none", label="Camera",
+             info="Mutually exclusive — pick one camera direction or none.",
+         )
+         camera_strength = gr.Slider(
+             minimum=0.0, maximum=1.5, value=0.8, step=0.05,
+             label="Camera strength", visible=True,
+         )
+
+     with gr.Group():
+         gr.Markdown("**✨ Detailer**")
+         detailer_on = gr.Checkbox(label="Apply IC-LoRA-Detailer", value=False)
+         detailer_strength = gr.Slider(
+             minimum=0.0, maximum=1.0, value=0.5, step=0.05, label="Detailer strength",
+         )
+
+     ic_lora = ic_strength = pose_on = None
+     ic_options = IC_LORAS_BY_MODE.get(mode, [])
+     if ic_options:
+         with gr.Group():
+             gr.Markdown("**🎯 Image Conditioning**")
+             ic_lora = gr.Dropdown(
+                 choices=["none"] + ic_options,
+                 value=ic_options[0] if ic_options else "none",
+                 label="IC-LoRA",
+             )
+             ic_strength = gr.Slider(
+                 minimum=0.0, maximum=1.0, value=0.5, step=0.05, label="IC strength",
+             )
+
+     if mode in ("i2v", "lipsync"):
+         with gr.Group():
+             gr.Markdown("**🚶 Pose Control**")
+             pose_on = gr.Checkbox(label="Apply IC-LoRA-Pose-Control", value=False)
+
+     return LoRAComponents(
+         camera_lora=camera_lora,
+         camera_strength=camera_strength,
+         detailer_on=detailer_on,
+         detailer_strength=detailer_strength,
+         ic_lora=ic_lora,
+         ic_strength=ic_strength,
+         pose_on=pose_on,
+     )
+ ```
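
Which optional controls a mode receives follows entirely from the `IC_LORAS_BY_MODE` lookup, so that rule can be checked without importing Gradio. The snippet below is a hedged sketch: it copies the mapping so it runs standalone, and `ic_choices` is a hypothetical helper that mirrors the dropdown construction.

```python
# Copy of the mapping from ui.py so the example is self-contained.
IC_LORAS_BY_MODE: dict[str, list[str]] = {
    "t2v": [],
    "a2v": [],
    "i2v": ["union", "pose-control"],
    "lipsync": ["pose-control"],
    "keyframe": ["union"],
    "style": ["motion-track", "union"],
}


def ic_choices(mode: str) -> list[str]:
    """Dropdown choices for a mode; an empty list means no IC group is rendered."""
    options = IC_LORAS_BY_MODE.get(mode, [])
    # "none" is prepended only when there is something to choose between.
    return ["none", *options] if options else []
```

Modes with no entry, or with an empty list, get no Image Conditioning group, which is why `ic_lora` and `ic_strength` are typed as optional in `LoRAComponents`.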
+
+ - [ ] **Step 2: Smoke-import**
+
+ Run: `python3.11 -c "import ui; print(ui.IC_LORAS_BY_MODE)"`
+ Expected: prints the IC LoRA mapping dict.
+
+ - [ ] **Step 3: Commit**
+
+ ```bash
+ git add ui.py
+ git commit -m "feat(ui): categorized lora_chrome — camera dropdown, detailer, IC, pose"
+ ```
+
+ ---
+
+ ## Phase 6 — Gradio app
+
+ ### Task 21: `app.py` — bootstrap + sidebar shell
+
+ **Files:**
+ - Create: `app.py`
+
+ - [ ] **Step 1: Write `app.py` shell**
+
+ ```python
+ # app.py
+ """LTX 2.3 All-in-One — Gradio entry point."""
+ from __future__ import annotations
+
+ import os
+ import pathlib
+ import sys
+
+ import gradio as gr
+
+ import modes
+ import ui
+
+
+ # ---------------------------------------------------------------------------
+ # Bootstrap — runs once on cold start.
+ # ---------------------------------------------------------------------------
+
+ def _on_spaces() -> bool:
+     return bool(os.environ.get("SPACES_ZERO_GPU"))
+
+
+ COMFYUI_REPO = "https://github.com/comfyanonymous/ComfyUI.git"
+ # Must name a branch or tag: _git_clone passes it to `git clone --branch`,
+ # which does not accept a bare commit SHA.
+ COMFYUI_COMMIT = os.environ.get("LTX23_AIO_COMFYUI_COMMIT", "main")
+
+ CUSTOM_NODES_PINNED: list[tuple[str, str]] = [
+     ("https://github.com/Lightricks/ComfyUI-LTXVideo.git", "main"),
+     ("https://github.com/kijai/ComfyUI-KJNodes.git", "main"),
+     ("https://github.com/rgthree/rgthree-comfy.git", "main"),
+     ("https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git", "main"),
+     ("https://github.com/pythongosssss/ComfyUI-Custom-Scripts.git", "main"),
+ ]
+
+
+ def _git_clone(url: str, dst: pathlib.Path, ref: str) -> None:
+     import subprocess
+     subprocess.check_call(["git", "clone", "--depth", "1", "--branch", ref, url, str(dst)])
+
+
+ def _bootstrap() -> None:
+     on_spaces = _on_spaces()
+     # Anchor the local checkout to this file (as backend._comfy_dir does), not the CWD.
+     comfy_dir = (
+         pathlib.Path("/data/comfyui") if on_spaces
+         else pathlib.Path(__file__).parent / "comfyui"
+     )
+
+     if on_spaces and not comfy_dir.exists():
+         comfy_dir.parent.mkdir(parents=True, exist_ok=True)
+         _git_clone(COMFYUI_REPO, comfy_dir, ref=COMFYUI_COMMIT)
+         for node_url, node_ref in CUSTOM_NODES_PINNED:
+             # removesuffix drops the literal ".git"; rstrip(".git") would strip any
+             # trailing '.', 'g', 'i', 't' characters and could mangle the name.
+             name = node_url.removesuffix(".git").rsplit("/", 1)[-1]
+             _git_clone(node_url, comfy_dir / "custom_nodes" / name, ref=node_ref)
+         # Install custom node deps
+         import subprocess
+         for cn in (comfy_dir / "custom_nodes").iterdir():
+             req = cn / "requirements.txt"
+             if req.exists():
+                 subprocess.check_call([sys.executable, "-m", "pip", "install", "-r", str(req)])
+
+     if str(comfy_dir) not in sys.path:
+         sys.path.insert(0, str(comfy_dir))
+     os.environ.setdefault(
+         "COMFY_MODELS_DIR",
+         str(pathlib.Path("/data/models") if on_spaces else (comfy_dir / "models")),
+     )
+
+
+ _bootstrap()
+
+
+ # ---------------------------------------------------------------------------
+ # Gradio app
+ # ---------------------------------------------------------------------------
+
+ def build_app() -> gr.Blocks:
+     with gr.Blocks(
+         theme=gr.themes.Soft(),
+         title="LTX 2.3 All-in-One",
+         css=_CUSTOM_CSS,
+     ) as app:
+         gr.Markdown("# ⚡ LTX 2.3 All-in-One")
+         with gr.Row():
+             with gr.Column(scale=1, min_width=200):
+                 _render_sidebar()
+             with gr.Column(scale=4):
+                 _render_mode_panels()
+     return app
+
+
+ def _render_sidebar() -> None:
+     gr.Markdown("### Modes")
+     for name, mode in modes.MODE_REGISTRY.items():
+         gr.Markdown(f"- {mode.icon} {mode.label}")
+     gr.Markdown("---\n### Models")
+     gr.Button("Unload all models", variant="secondary")
+
+
+ def _render_mode_panels() -> None:
+     with gr.Tabs():
+         for name, mode in modes.MODE_REGISTRY.items():
+             with gr.Tab(label=f"{mode.icon} {mode.label}"):
+                 gr.Markdown(f"## {mode.label}")
+                 gr.Markdown(f"_(Mode `{name}` form goes here — built in Task 22.)_")
+
+
+ _CUSTOM_CSS = """
+ .status-card { padding: 14px 16px; border-radius: 10px; background: rgba(255,255,255,0.04); border: 1px solid rgba(255,255,255,0.08); }
+ .status-row { display: flex; gap: 14px; align-items: center; margin-bottom: 8px; }
+ .status-stage { font-weight: 600; }
+ .status-meta { font-size: 12px; opacity: 0.75; }
+ .status-bar { height: 6px; background: rgba(255,255,255,0.08); border-radius: 99px; overflow: hidden; }
+ .status-fill { height: 100%; background: linear-gradient(90deg,#6ea8fe,#8de9fe); transition: width .3s; }
+ .status-mem { font-size: 11px; opacity: 0.6; margin-top: 6px; font-family: ui-monospace, monospace; }
+ """
+
+
+ if __name__ == "__main__":
+     app = build_app()
+     app.launch(server_name="0.0.0.0", server_port=7860)
+ ```
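
When a directory name is derived from a clone URL, the `.git` ending has to be removed as a suffix, not as a character set. `str.rstrip(".git")` strips any trailing run of the characters `.`, `g`, `i` and `t`, so it is only correct for names that do not end in one of those letters. A small sketch of the safer form follows; `repo_dir_name` is a hypothetical helper and the `example.com` URL is made up for the demonstration.

```python
def repo_dir_name(url: str) -> str:
    """Return the last path segment of a Git URL without a trailing '.git'."""
    # removesuffix (Python 3.9+) removes the exact string once, or nothing.
    return url.removesuffix(".git").rsplit("/", 1)[-1]


def rstrip_dir_name(url: str) -> str:
    """The character-set variant, kept only to show how it differs."""
    return url.rstrip(".git").rsplit("/", 1)[-1]
```

For `https://example.com/org/toolkit.git` the first function returns `toolkit`, while the `rstrip` variant returns `toolk` because the trailing `i` and `t` of the name are also in the strip set.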
+
+ - [ ] **Step 2: Run the shell**
+
+ Run: `python3.11 app.py 2>&1 | head -10` — Ctrl-C after a few seconds.
+ Expected: "Running on local URL: http://0.0.0.0:7860". Open the URL; you should see the sidebar listing the mode names and a row of tabs at the top, each tab holding only placeholder text.
+
+ - [ ] **Step 3: Commit**
+
+ ```bash
+ git add app.py
+ git commit -m "feat(app): Gradio shell with sidebar nav and empty mode tabs"
+ ```
+
+ ---
+
+ ### Task 22: `app.py` — per-mode forms
+
+ **Files:**
+ - Modify: `app.py`
+
+ - [ ] **Step 1: Replace `_render_mode_panels` with per-mode forms**
+
+ ```python
+ # Replace the existing _render_mode_panels and add helpers
+ def _render_mode_panels() -> dict[str, dict]:
+     """Render one form per mode. Returns the component handles keyed by mode."""
+     handles: dict[str, dict] = {}
+     with gr.Tabs() as tabs:
+         for name, mode in modes.MODE_REGISTRY.items():
+             with gr.Tab(label=f"{mode.icon} {mode.label}"):
+                 handles[name] = _render_one_mode(name)
+     return handles
+
+
+ def _render_one_mode(name: str) -> dict:
+     """Render a per-mode form. Returns component handles for the generate handler."""
+     mode = modes.MODE_REGISTRY[name]
+     handles: dict = {"mode": name}
+
+     with gr.Row():
+         with gr.Column(scale=2):
+             handles["prompt"] = gr.Textbox(label="Prompt", lines=4, placeholder="Describe the shot...")
+
+             # Mode-specific media inputs
+             if name == "i2v":
+                 handles["image"] = gr.Image(label="Source image", type="filepath")
+             elif name == "a2v":
+                 handles["audio"] = gr.Audio(label="Source audio", type="filepath")
+             elif name == "lipsync":
+                 handles["image"] = gr.Image(label="Portrait", type="filepath")
+                 handles["audio"] = gr.Audio(label="Speech audio", type="filepath")
+             elif name == "keyframe":
+                 handles["first_frame"] = gr.Image(label="First frame", type="filepath")
+                 handles["last_frame"] = gr.Image(label="Last frame", type="filepath")
+             elif name == "style":
+                 handles["input_video"] = gr.Video(label="Source video")
+
+             handles["preset"] = ui.preset_bar()
+             with gr.Row():
+                 handles["width"] = gr.Slider(256, 1280, value=512, step=32, label="Width")
+                 handles["height"] = gr.Slider(256, 1280, value=768, step=32, label="Height")
+             with gr.Row():
+                 handles["frames"] = gr.Slider(9, 121, value=81, step=8, label="Frames (8k+1)")
+                 handles["fps"] = gr.Slider(8, 30, value=24, step=1, label="FPS")
+             handles["seed"] = gr.Number(label="Seed", value=42, precision=0)
+
+             with gr.Accordion("Advanced ▾", open=False):
+                 handles["lora"] = ui.lora_chrome(name)
+                 handles["negative_prompt"] = gr.Textbox(label="Negative prompt", lines=2)
+
+             handles["generate_btn"] = gr.Button("▶ Generate", variant="primary", size="lg")
+
+         with gr.Column(scale=2):
+             handles["status"] = ui.status_banner()
+             handles["video_out"] = gr.Video(label="Output", autoplay=True)
+             handles["history"] = gr.Markdown("")
+
+     return handles
+ ```
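
The frames slider is labelled "8k+1" and steps by 8 from 9, so every value it can produce has that form. If frame counts can also arrive from elsewhere (an API call or a saved preset, for instance), a small normalizer keeps them on the same grid. This is a sketch under that assumption; `snap_frames` is a hypothetical helper, and its default bounds simply copy the slider's range.

```python
def snap_frames(requested: int, lo: int = 9, hi: int = 121) -> int:
    """Clamp to [lo, hi] and round to the nearest value of the form 8k + 1.

    lo and hi are assumed to be of the form 8k + 1 themselves.
    """
    clamped = max(lo, min(hi, requested))
    k = round((clamped - 1) / 8)
    return 8 * k + 1
```

For example, a request for 84 frames snaps down to 81 and a request for 86 snaps up to 89, while out-of-range requests land on the slider's own endpoints.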
+
+ - [ ] **Step 2: Wire `_render_mode_panels` return into `build_app`**
+
+ Modify `build_app` to capture the handles:
+
+ ```python
+ def build_app() -> gr.Blocks:
+     with gr.Blocks(theme=gr.themes.Soft(), title="LTX 2.3 All-in-One", css=_CUSTOM_CSS) as app:
+         gr.Markdown("# ⚡ LTX 2.3 All-in-One")
+         with gr.Row():
+             with gr.Column(scale=1, min_width=200):
+                 _render_sidebar()
+             with gr.Column(scale=4):
+                 handles = _render_mode_panels()
+         # Generate-handler wiring deferred to Task 23.
+     return app
+ ```
+
+ - [ ] **Step 3: Run the app**
+
+ Run: `python3.11 app.py` — Ctrl-C after testing.
+ Expected: each tab now shows the mode-specific form with media inputs, preset bar, sliders, advanced accordion, generate button, status banner, and video output. Buttons don't do anything yet.
+
+ - [ ] **Step 4: Commit**
+
+ ```bash
+ git add app.py
+ git commit -m "feat(app): per-mode forms with media inputs, presets, advanced accordion"
+ ```
+
+ ---
+
+ ### Task 23: `app.py` — generate handler
+
+ **Files:**
+ - Modify: `app.py`
+
+ - [ ] **Step 1: Implement `on_generate` and wire it**
+
+ ```python
2533
+ # Append to app.py — after _render_one_mode
2534
+
2535
+ import time
2536
+ from typing import Any
2537
+
2538
+ import workflow as wf_module
2539
+ import backend as backend_module
2540
+
2541
+ _BACKEND: backend_module.ComfyUILibraryBackend | None = None
2542
+
2543
+
2544
+ def _get_backend() -> backend_module.ComfyUILibraryBackend:
2545
+ global _BACKEND
2546
+ if _BACKEND is None:
2547
+ _BACKEND = backend_module.ComfyUILibraryBackend()
2548
+ return _BACKEND
2549
+
2550
+
2551
+ PRESET_DURATION = {"Fast": 60, "Balanced": 120, "Quality": 300}
2552
+
2553
+
2554
+ async def _on_generate(mode_name: str, **inputs: Any):
2555
+ """Generate handler — async generator yielding (status_html, video_path)."""
2556
+ mode = modes.MODE_REGISTRY[mode_name]
2557
+
2558
+ # Translate UI inputs into the parameterize_fn input dict.
2559
+ params: dict[str, Any] = {
2560
+ "prompt": inputs.get("prompt", ""),
2561
+ "negative_prompt": inputs.get("negative_prompt", ""),
2562
+ "preset": inputs.get("preset", "Balanced").lower(),
2563
+ "width": int(inputs.get("width", 512)),
2564
+ "height": int(inputs.get("height", 768)),
2565
+ "frames": int(inputs.get("frames", 81)),
2566
+ "fps": int(inputs.get("fps", 24)),
2567
+ "seed": int(inputs.get("seed", 42)),
+     }
+     # Copy only the optional, mode-specific inputs that were actually supplied.
+     for k in ("image", "audio", "first_frame", "last_frame", "input_video",
+               "camera_lora", "camera_strength",
+               "detailer_on", "detailer_strength",
+               "ic_lora", "ic_strength", "pose_on", "audio_cfg", "image_strength"):
+         if k in inputs:
+             params[k] = inputs[k]
+
+     # Patch the mode's workflow template with the resolved parameters.
+     patches = mode.parameterize_fn(params)
+     workflow = wf_module.load_template(mode_name)
+     for patch in patches:
+         wf_module.set_input(workflow, *patch)
+     wf_module.validate(workflow)
+
+     backend = _get_backend()
+     duration = PRESET_DURATION.get(inputs.get("preset", "Balanced"), 120)
+
+     started = time.time()
+     last_event = None
+     async for event in backend.submit(mode_name, workflow, gpu_duration=duration):
+         last_event = event
+         elapsed = time.time() - started
+         if isinstance(event, backend_module.DownloadEvent):
+             status = ui.render_status(
+                 stage_index=0, stage_label=f"Downloading {event.filename}",
+                 step=int(event.mb_done), total_steps=int(max(event.mb_total, 1)),
+                 elapsed_s=elapsed, eta_s=0,
+             )
+             yield status, gr.update()
+         elif isinstance(event, backend_module.ProgressEvent):
+             stage = mode.stage_map[event.stage] if event.stage < len(mode.stage_map) else mode.stage_map[-1]
+             # Linear ETA: average seconds per finished step times steps remaining.
+             eta = (elapsed / max(event.step, 1)) * (event.total_steps - event.step)
+             status = ui.render_status(
+                 stage_index=event.stage + 1, stage_label=stage.label,
+                 step=event.step, total_steps=event.total_steps,
+                 elapsed_s=elapsed, eta_s=eta,
+             )
+             yield status, gr.update()
+         elif isinstance(event, backend_module.OutputEvent):
+             yield ui._render_idle(), event.video_path
+         elif isinstance(event, backend_module.ErrorEvent):
+             import html  # local import so this fragment stands alone
+
+             # Escape backend-supplied text so it cannot inject markup into the card.
+             category = html.escape(str(event.category))
+             message = html.escape(str(event.message))
+             error_html = (
+                 '<div class="status-card status-error">'
+                 f' <div class="status-row"><span class="status-stage">Error · {category}</span></div>'
+                 f' <div>{message}</div>'
+                 '</div>'
+             )
+             yield error_html, gr.update()
+
+
+ # Wire button to handler in build_app:
+
+ def build_app() -> gr.Blocks:
+     with gr.Blocks(theme=gr.themes.Soft(), title="LTX 2.3 All-in-One", css=_CUSTOM_CSS) as app:
+         gr.Markdown("# ⚡ LTX 2.3 All-in-One")
+         with gr.Row():
+             with gr.Column(scale=1, min_width=200):
+                 _render_sidebar()
+             with gr.Column(scale=4):
+                 handles = _render_mode_panels()
+
+         for name, h in handles.items():
+             inputs = _collect_inputs_for_mode(name, h)
+             h["generate_btn"].click(
+                 fn=_make_handler(name, h),
+                 inputs=inputs,
+                 outputs=[h["status"], h["video_out"]],
+             )
+     return app
+
+
+ def _collect_inputs_for_mode(mode_name: str, h: dict) -> list:
+     """Gather the gr.Component handles to pass into _on_generate.
+
+     The order must match _input_keys_for_mode, because _make_handler zips the two.
+     """
+     base = [h["prompt"], h["preset"], h["width"], h["height"], h["frames"], h["fps"], h["seed"]]
+     if mode_name == "i2v":
+         base.append(h["image"])
+     elif mode_name == "a2v":
+         base.append(h["audio"])
+     elif mode_name == "lipsync":
+         base.extend([h["image"], h["audio"]])
+     elif mode_name == "keyframe":
+         base.extend([h["first_frame"], h["last_frame"]])
+     elif mode_name == "style":
+         base.append(h["input_video"])
+     base.append(h["negative_prompt"])
+     base.extend([
+         h["lora"].camera_lora, h["lora"].camera_strength,
+         h["lora"].detailer_on, h["lora"].detailer_strength,
+     ])
+     if h["lora"].ic_lora is not None:
+         base.extend([h["lora"].ic_lora, h["lora"].ic_strength])
+     if h["lora"].pose_on is not None:
+         base.append(h["lora"].pose_on)
+     return base
+
+
+ def _make_handler(mode_name: str, h: dict):
+     keys = _input_keys_for_mode(mode_name, h)
+
+     async def handler(*values):
+         kwargs = dict(zip(keys, values))
+         async for output in _on_generate(mode_name, **kwargs):
+             yield output
+
+     return handler
+
+
+ def _input_keys_for_mode(mode_name: str, h: dict) -> list[str]:
+     base = ["prompt", "preset", "width", "height", "frames", "fps", "seed"]
+     if mode_name == "i2v":
+         base.append("image")
+     elif mode_name == "a2v":
+         base.append("audio")
+     elif mode_name == "lipsync":
+         base.extend(["image", "audio"])
+     elif mode_name == "keyframe":
+         base.extend(["first_frame", "last_frame"])
+     elif mode_name == "style":
+         base.append("input_video")
+     base.append("negative_prompt")
+     base.extend(["camera_lora", "camera_strength", "detailer_on", "detailer_strength"])
+     if h["lora"].ic_lora is not None:
+         base.extend(["ic_lora", "ic_strength"])
+     if h["lora"].pose_on is not None:
+         base.append("pose_on")
+     return base
+ ```
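
`_collect_inputs_for_mode` and `_input_keys_for_mode` have to list their entries in exactly the same order, because `_make_handler` pairs keys with values positionally via `zip`. The sketch below shows one possible way to guard against the two drifting apart: build the key list once and look the components up from it. The names `input_spec` and `components_for` are hypothetical and not part of the plan, and plain strings stand in for `gr.Component` objects so the example runs without Gradio.

```python
from types import SimpleNamespace

# Hypothetical tables (not in the plan): media inputs per mode, and the keys
# that live on the LoRA handle rather than directly in the handles dict.
_MODE_MEDIA = {
    "i2v": ["image"],
    "a2v": ["audio"],
    "lipsync": ["image", "audio"],
    "keyframe": ["first_frame", "last_frame"],
    "style": ["input_video"],
}
_LORA_KEYS = {"camera_lora", "camera_strength", "detailer_on",
              "detailer_strength", "ic_lora", "ic_strength", "pose_on"}


def input_spec(mode_name, lora):
    """Return the single ordered list of input keys for a mode."""
    keys = ["prompt", "preset", "width", "height", "frames", "fps", "seed"]
    keys += _MODE_MEDIA.get(mode_name, [])  # modes without media add nothing
    keys.append("negative_prompt")
    keys += ["camera_lora", "camera_strength", "detailer_on", "detailer_strength"]
    if lora.ic_lora is not None:
        keys += ["ic_lora", "ic_strength"]
    if lora.pose_on is not None:
        keys.append("pose_on")
    return keys


def components_for(keys, h):
    """Look up each key's component, reading LoRA controls off h['lora']."""
    return [getattr(h["lora"], k) if k in _LORA_KEYS else h[k] for k in keys]


# Stand-in handles: upper-case strings play the role of Gradio components.
lora = SimpleNamespace(camera_lora="C", camera_strength="CS", detailer_on="D",
                       detailer_strength="DS", ic_lora=None, ic_strength=None,
                       pose_on="P")
h = {k: k.upper() for k in ["prompt", "preset", "width", "height", "frames",
                            "fps", "seed", "image", "audio", "negative_prompt"]}
h["lora"] = lora

keys = input_spec("lipsync", lora)
components = components_for(keys, h)
kwargs = dict(zip(keys, components))  # same pairing _make_handler performs
```

With a single key list, adding a new input means touching one function instead of two, and a mismatch shows up as a `KeyError` or `AttributeError` at build time rather than as silently swapped arguments.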
+
+ - [ ] **Step 2: End-to-end smoke run (T2V Fast preset)**
+
+ Run: `python3.11 app.py`
+
+ In the browser:
+ 1. Open the **Text → Video** tab.
+ 2. Type a short prompt (e.g., "a cat walking through a park, cinematic").
+ 3. Pick the **Fast** preset.
+ 4. Set frames to 9, width to 320, and height to 480 (the smallest valid values, for the fastest test).
+ 5. Click **Generate**.
+
+ Expected: the status banner updates through the stages (Encode prompt → Diffusion → Decode), then a video appears in the right panel within 1–3 minutes on local MPS. (On a first run, expect 30+ minutes for model downloads.)
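
The ETA shown in the banner comes from a linear extrapolation in the handler: average seconds per completed step, multiplied by the steps remaining. A minimal standalone sketch of that arithmetic follows. `estimate_eta` is a hypothetical name, and the clamp to zero is an extra guard added in this sketch, not something the plan specifies.

```python
def estimate_eta(elapsed_s: float, step: int, total_steps: int) -> float:
    """Linear ETA: average seconds per finished step times steps remaining."""
    per_step = elapsed_s / max(step, 1)      # avoid dividing by zero at step 0
    remaining = max(total_steps - step, 0)   # clamp if a backend over-reports
    return per_step * remaining


eta_mid = estimate_eta(elapsed_s=30.0, step=10, total_steps=40)    # 3 s/step, 30 left
eta_done = estimate_eta(elapsed_s=120.0, step=40, total_steps=40)  # nothing left
eta_start = estimate_eta(elapsed_s=2.0, step=0, total_steps=8)     # step 0 treated as 1
```

Because the handler measures elapsed time from submission, any time spent downloading models or in earlier stages inflates the per-step average, so the number is best read as a rough indicator during the smoke run.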
+
+ - [ ] **Step 3: Commit**
+
+ ```bash
+ git add app.py
+ git commit -m "feat(app): generate handler — async streaming, status banner, video output"
+ ```
+
+ ---
+
+ ## Phase 7 — CI
+
+ ### Task 24: `.github/workflows/ci.yml`
+
+ **Files:**
+ - Create: `.github/workflows/ci.yml`
+
+ - [ ] **Step 1: Write CI workflow**
+
+ ```yaml
+ name: CI
+
+ on:
+   push:
+   pull_request:
+
+ jobs:
+   test:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+         with:
+           submodules: false  # ComfyUI submodule not needed for L1+L3 tests
+
+       - uses: actions/setup-python@v5
+         with:
+           python-version: "3.11"
+
+       - name: Install runtime + dev deps
+         run: |
+           pip install -U pip
+           pip install -r requirements.txt
+
+       - name: Run unit + integration tests (no GPU)
+         run: |
+           python -m pytest tests/ -v -m "not gpu"
+
+       - name: Lint
+         run: |
+           ruff check .
+           ruff format --check .
+ ```
+
+ - [ ] **Step 2: Locally verify the lint command passes**
+
+ Run: `python3.11 -m ruff check . && python3.11 -m ruff format --check .`
+ Expected: no errors. If the formatter complains, run `ruff format .` and commit the changes.
+
+ - [ ] **Step 3: Commit**
+
+ ```bash
+ git add .github/workflows/ci.yml
+ git commit -m "ci: run unit tests + ruff lint on every push"
+ ```
+
+ ---
+
+ ### Task 25: `.github/workflows/deploy-space.yml` (optional)
+
+ **Files:**
+ - Create: `.github/workflows/deploy-space.yml`
+
+ - [ ] **Step 1: Write deploy workflow**
+
+ ```yaml
+ name: Deploy to HF Space
+
+ on:
+   push:
+     branches: [main]
+   workflow_dispatch:
+
+ jobs:
+   deploy:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+         with:
+           fetch-depth: 0
+           submodules: false
+
+       - name: Configure git LFS
+         run: |
+           git lfs install --skip-smudge
+
+       - name: Push to HF Space
+         env:
+           HF_TOKEN: ${{ secrets.HF_TOKEN }}
+           HF_USER: ${{ secrets.HF_USER }}
+           HF_SPACE: ltx2.3-aio
+         run: |
+           git remote add space "https://$HF_USER:$HF_TOKEN@huggingface.co/spaces/$HF_USER/$HF_SPACE"
+           git push --force space main
+ ```
+
+ - [ ] **Step 2: Commit**
+
+ ```bash
+ git add .github/workflows/deploy-space.yml
+ git commit -m "ci: optional deploy-on-main to HF Space"
+ ```
+
+ > **Manual setup (one-time, not part of this plan):** Add `HF_TOKEN` and `HF_USER` secrets in the GitHub repo settings. Create the Space at https://huggingface.co/new-space with SDK=Gradio, Hardware=ZeroGPU.
+
+ ---
+
+ ## Phase 8 — End-to-end verification
+
+ ### Task 26: Local smoke for all six modes
+
+ **No code changes — verification only.**
+
+ - [ ] **Step 1: Run app.py and exercise each mode at Fast preset**
+
+ ```bash
+ source .venv/bin/activate
+ python3.11 app.py
+ ```
+
+ For each of T2V, A2V, I2V, Lipsync, Keyframe, Style:
+ 1. Open the mode's tab.
+ 2. Provide minimum-viable inputs (prompt + any required media at the smallest legal resolution: 320×480, frames=9, fps=24).
+ 3. Click **Generate**.
+ 4. Verify the status banner progresses through stages and the video appears.
+
+ Each generation should complete in 1–5 minutes on local MPS (after models are cached).
+
+ - [ ] **Step 2: Capture timings + memory peaks**
+
+ For each mode, note the total wall time and the peak memory (resident memory in Activity Monitor on macOS, or GPU memory from `nvidia-smi --loop=2` on CUDA). Add the numbers to the README's "Local quickstart" section.
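
If a generation is driven from a Python script rather than the browser, the same two numbers can be captured programmatically. The following is a minimal sketch using only the standard library (Unix only, since it relies on `resource`). The `measure` helper and the `"t2v-fast"` label are hypothetical, and the buffer allocation is a stand-in workload. It reports the peak resident memory of the calling process only, so it would not see a backend running in a separate process, nor GPU memory on CUDA.

```python
import resource
import sys
import time
from contextlib import contextmanager


def peak_rss_mb() -> float:
    """Peak resident set size of this process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


@contextmanager
def measure(label: str, results: dict):
    """Record wall time and peak RSS for the wrapped block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = {
            "wall_s": time.perf_counter() - start,
            "peak_rss_mb": peak_rss_mb(),
        }


results = {}
with measure("t2v-fast", results):
    buf = bytearray(8 * 1024 * 1024)  # stand-in for a real generation call
    time.sleep(0.05)

for label, row in results.items():
    print(f"{label}: {row['wall_s']:.2f} s, peak RSS {row['peak_rss_mb']:.0f} MiB")
```

Note that `ru_maxrss` is a high-water mark for the whole process lifetime, so measuring several modes in one process gives a running maximum, not a per-mode peak. Running each mode in a fresh process avoids that.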
+
+ - [ ] **Step 3: Commit any timing notes**
+
+ ```bash
+ git add README.md
+ git commit -m "docs: per-mode timing/memory measurements on Apple Silicon" || true
+ ```
+
+ ---
+
+ ### Task 27: HF Spaces test deployment
+
+ **No code changes — deploy + verify.**
+
+ - [ ] **Step 1: Push to a personal HF Space**
+
+ ```bash
+ git remote add space https://huggingface.co/spaces/<your-handle>/ltx2.3-aio-test
+ git push --force space main
+ ```
+
+ - [ ] **Step 2: Watch the Space build**
+
+ In the Space's "Logs" tab, verify:
+ - ComfyUI clones to `/data/comfyui` on first cold start (takes ~3–5 min).
+ - Custom nodes install cleanly.
+ - `requirements.txt` resolves on Python 3.11.
+
+ - [ ] **Step 3: Run a Fast-preset T2V on the Space**
+
+ Use the same minimum-viable inputs as Task 26. Expected: the run completes within the 60s ZeroGPU duration on Pro tier (after model download has populated `/data/models`).
+
+ - [ ] **Step 4: Note any deviations from local behavior**
+
+ File a follow-up issue for any divergence (e.g., slower download, different VAE behavior).
+
+ - [ ] **Step 5: Optionally promote to a public Space**
+
+ If everything works, repeat the deploy with the user-facing Space name (`<your-handle>/ltx2.3-aio`).
+
+ ---
+
+ ## Spec coverage check
+
+ | Spec section | Covered by |
+ |---|---|
+ | § 3 Architecture | Tasks 17–18, 21–23 |
+ | § 4 File structure | Tasks 1–25 (every file) |
+ | § 5 Data flow | Tasks 17–18, 21–23 |
+ | § 6 Model loading & VRAM | Tasks 13–16, 18 |
+ | § 7 Progress reporting | Tasks 18, 19, 23 |
+ | § 8 Error handling | Tasks 18, 23 (`ErrorEvent` rendering) |
+ | § 9.1 Local deployment | Tasks 2, 26 |
+ | § 9.2 HF Spaces deployment | Tasks 21 (`_bootstrap`), 27 |
+ | § 9.3 One-touch deploy | Task 25 |
+ | § 10 Testing | Tasks 4 (fixtures), 6, 8–15, 17, 24 |
+
+ All spec sections are covered. Out-of-scope items (§ 11) are intentionally absent.
+
+ ---
+
+ ## Plan complete
+
+ Plan saved to `docs/superpowers/plans/2026-04-30-ltx23-aio-generator.md`.
+
+ **Two execution options:**
+
+ **1. Subagent-Driven (recommended)** — I dispatch a fresh subagent per task, review between tasks, fast iteration. Best for a plan this long because it keeps each task's context tight.
+
+ **2. Inline Execution** — Execute tasks in this session using `superpowers:executing-plans`, batch execution with checkpoints.
+
+ **Which approach?**