techfreakworm committed · Commit dc32ce0 · unverified · 1 Parent(s): 3801f4d

docs: polish README + write AGENTS.md + SKILLS.md, refresh CLAUDE.md

Four meta-docs aligned for a real open-source project:

- README.md: badges (HF Space / GitHub stars / MIT / Python 3.11 /
DiffSynth-Studio), demo link, features table, quick-start (local +
HF), architecture diagram, project layout, tech stack, design
philosophy (Soft Dark Restraint), license + credits.
- CLAUDE.md: refreshed with current architecture facts (one pipeline,
two transformers, swap via pool index, MPS vram_limit=None,
PYTORCH_ENABLE_MPS_FALLBACK, hf cache -> ./models/<repo>/ symlink).
Adds 'Gotchas we already paid for' covering every footgun caught
during the 40+ commit iteration (model_pool discard, visible=False
DOM removal, transformers<5 pin, torchaudio + basicsr shims, model
slug mismatches, depth_midas vs midas, etc.).
- AGENTS.md (new): tool-agnostic version of CLAUDE.md. TL;DR five
rules, locked architecture decisions table, commit + verify +
testing conventions, v1 scope cap.
- SKILLS.md (new): process rules. Investigation before fix, HF log
fetching, stage check, sequential thinking for repeated failures,
local server lifecycle, verification before commit, deployment
workflow, adding new models/modes, subagent dispatch rules.

Files changed (4)
  1. AGENTS.md +123 -0
  2. CLAUDE.md +119 -26
  3. README.md +115 -15
  4. SKILLS.md +230 -0
AGENTS.md ADDED
@@ -0,0 +1,123 @@
1
+ # AGENTS.md
2
+
3
+ Tool-agnostic agent guidance for the Z-Image Studio repo. If you're driving Claude Code, Cursor, Aider, Codex, or anything else with file-edit + shell access, **start here**.
4
+
5
+ This file is the authoritative project rulebook. `CLAUDE.md` is Claude-specific extensions; `SKILLS.md` is workflow rules. README.md is the public-facing project intro — different audience.
6
+
7
+ ---
8
+
9
+ ## TL;DR — the five rules
10
+
11
+ 1. **Mayank Gupta is sole author on every commit.** No agent co-author trailers. No "generated with…" footers. No `--author=` flag. Strip any tool-suggested attribution.
12
+ 2. **Backend = DiffSynth-Studio, not ComfyUI.** Don't add a ComfyUI dependency under any guise.
13
+ 3. **Both transformers live in one pool.** `pipe._zis_pool` is our handle; `pipe.dit` swaps by index. Don't refactor to one-pipeline-per-model — it doubles memory and breaks LoRA-revert.
14
+ 4. **Don't pin `spaces` in `requirements.txt`.** HF Spaces' ZeroGPU build injects its own version. A pin causes pip-resolve failure.
15
+ 5. **Local is the source of truth.** After every change, restart `python app.py` and verify on http://127.0.0.1:7860 BEFORE pushing to HF. The Space rebuild is ~5–10 min; iterate locally.
16
+
17
+ If you can't satisfy these without changing architectural shape, **ask the user before proceeding**.
18
+
19
+ ---
20
+
21
+ ## Project shape
22
+
23
+ Single-process Gradio 5.50 app, flat top-level Python layout, ~2.7k LOC including tests.
24
+
25
+ ```
26
+ app.py Gradio Blocks entry + bootstrap + event handlers + CTA banner
27
+ backend.py ZImageStudioBackend; @spaces.GPU; duration_for; generate_with_retry
28
+ modes.py call_t2i / call_controlnet / call_upscale (pure handlers)
29
+ models.py auto_device, MODEL_CONFIGS, vram_limit_for, HF→DiffSynth symlink helper
30
+ lora.py safetensors header sniff + applied_lora context manager
31
+ preprocessors.py Canny (cv2), Depth (controlnet_aux "depth_midas"), Pose ("openpose")
32
+ upscale.py RealESRGAN x4 + 0.5 resize bridge (with basicsr→torchvision shim)
33
+ ui.py Three per-tab builders, gr.Radio model selector, soon-row links
34
+ theme.py Soft Dark Restraint palette + minimal CSS (~175 lines)
35
+ tooltips.py Centralised `info=` strings — single source of truth
36
+ tests/ 70 passing tests + 4 GPU-deselected smoke
37
+ docs/superpowers/ spec + plan + brainstorm artifacts
38
+ ```
39
+
40
+ Same code path locally (MPS / CUDA) and on HF Spaces. The only branching is whether `_bootstrap()` does the cache-mirror dance (Spaces) or just the symlink step (local).
41
+
42
+ ---
43
+
44
+ ## Locked architecture decisions
45
+
46
+ These came out of brainstorming + 40+ commits of iteration. Do not relitigate.
47
+
48
+ | Decision | Why | Code reference |
49
+ |---|---|---|
50
+ | One `ZImagePipeline` instance, both transformers preloaded | Avoids ~30 s pipeline rebuild per model swap; LoRA revert is cleaner | `backend._build_pipeline` |
51
+ | Transformer swap = `pipe.dit = pool.model[idx]` | DiffSynth's `fetch_model("z_image_dit")` returns the first match; both base + turbo register under the same name. Index by load order. | `modes._swap_transformer` |
52
+ | MPS `vram_limit = None` | `torch.mps` has no `mem_get_info`; DiffSynth's `check_free_vram` raises AttributeError otherwise | `models.vram_limit_for` |
53
+ | `PYTORCH_ENABLE_MPS_FALLBACK=1` set at app import | A few MPS-unsupported ops crash mid-pipeline without it | `app.py` top-of-file |
54
+ | HF cache → `./models/<repo>/` symlink at boot | DiffSynth's `ModelConfig.download` looks at `local_model_path/<model_id>/`, NOT in the HF cache `models--<org>--<repo>/snapshots/<sha>/` layout | `app._bootstrap` + `models.symlink_hf_cache_to_diffsynth_layout` |
55
+ | Native `gr.Radio` for model selector (not a custom HTML card grid) | Gradio reactivity + accessibility free; nothing to debug | `ui.build_t2i_tab` |
56
+ | Native `gr.Progress(track_tqdm=True)` for progress bar | DiffSynth + RealESRGAN both use `tqdm`; one parameter auto-captures both | `app.on_*_generate` signatures |
57
+ | Soft Dark Restraint theme | Locked from brainstorming round 2 (round 1 was over-designed) | `theme.py` |
58
+ | Single output meta block under the image | The first redesign duplicated meta in Advanced; users flagged it | `ui.build_*_tab` |
59
+
60
+ ---
61
+
62
+ ## Commit rules
63
+
64
+ - **Conventional Commits:** `<type>(<scope>): <subject>`
65
+ - types: `feat`, `fix`, `chore`, `docs`, `test`, `refactor`, `ci`, `perf`
66
+ - Subject is imperative, lowercase, **no trailing period**.
67
+ - Body explains **why** when not obvious. Reference plan task IDs (Task 7, Task A, etc.) when the change implements a specific plan step.
68
+ - Frequent small commits; one logical change per commit.
69
+ - **No agent attribution** in commit message or body. See rule 1.
70
+ - Don't `git push --force` to `main` unless the user explicitly says so. Force-push to a feature branch is fine; the seed commits + spec doc are on `main` and protected by convention only.
71
+
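The commit rules above can be checked mechanically. The following is a minimal sketch, not a hook that ships with the repo: the function name `check_commit_message`, the optional-scope reading of `<type>(<scope>)`, and the two attribution patterns are assumptions made for illustration.

```python
import re

# Allowed types, copied from the rule list above.
ALLOWED_TYPES = ("feat", "fix", "chore", "docs", "test", "refactor", "ci", "perf")

# <type>(<scope>): <subject>. The scope is treated as optional here (assumption),
# because subjects such as "docs: polish README" carry no scope.
SUBJECT_RE = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(\((?P<scope>[a-z0-9_\-./]+)\))?"
    r": (?P<subject>.+)$"
)

# Hypothetical patterns for agent attribution; extend as needed.
ATTRIBUTION_RE = re.compile(r"co-authored-by:|generated with", re.IGNORECASE)


def check_commit_message(message: str) -> list[str]:
    """Return a list of rule violations; an empty list means the message passes."""
    problems = []
    lines = message.splitlines()
    first_line = lines[0] if lines else ""
    match = SUBJECT_RE.match(first_line)
    if match is None:
        problems.append("first line is not '<type>(<scope>): <subject>'")
    else:
        subject = match.group("subject")
        if subject.endswith("."):
            problems.append("subject has a trailing period")
        if subject[0].isupper():
            problems.append("subject should start lowercase")
    if ATTRIBUTION_RE.search(message):
        problems.append("message contains agent attribution")
    return problems
```

A function like this could be called from a `commit-msg` hook; whether to enforce it automatically is a project decision that the rules above do not settle.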
72
+ ---
73
+
74
+ ## Verification rules
75
+
76
+ - **Tests must pass before committing.** `python -m pytest tests/ -q` from the project root. Target: 70/70 + 4 deselected.
77
+ - **Ruff must be clean.** `ruff check . && ruff format --check .`
78
+ - **The local app must boot.** `python app.py` → http://127.0.0.1:7860 reachable, no import error in `/tmp/zimage-studio.log`.
79
+ - **For UI changes:** open the URL in a browser (or Playwright eval) and verify the change is rendered. Don't trust a clean test run + clean ruff as proof that the UI works.
80
+ - **For deployment changes:** push to HF Space, watch the build, verify the runtime stage transitions to `RUNNING` before claiming success.
81
+
82
+ If a change requires breaking these rules, write the reason in the commit body.
83
+
84
+ ---
85
+
86
+ ## Testing conventions
87
+
88
+ - **TDD per the plan.** Failing test first, then implementation.
89
+ - **L1 + L2 in CI** (no GPU). The mode handlers are tested with a mocked pipeline — patches on `preprocessors.run`, `upscale.realesrgan_2x`, and direct injection of fake dits into `pipe._zis_pool.model`. We do NOT mock DiffSynth internals.
90
+ - **L3 GPU smoke** is opt-in (`pytest -m gpu`). Lives in `tests/test_smoke_gpu.py`. Loads the real pipeline (~30 GB cache hit on a warm machine).
91
+ - **L4 HF Space smoke** is manual. Push, wait, click each tab, verify the image renders.
92
+
93
+ `pyproject.toml` has `addopts = -m 'not gpu'` so the default `pytest` invocation skips GPU. Add the marker before any test that touches DiffSynth weights.
94
+
95
+ ---
96
+
97
+ ## Out of scope (v1 cap — don't add without asking)
98
+
99
+ The spec lists these as deferred. If you find yourself "while I'm here"-ing into one of them, stop.
100
+
101
+ - Multi-prompt queueing
102
+ - Output history across sessions
103
+ - Telemetry-driven duration estimator
104
+ - Persistent storage add-on
105
+ - Custom LoRA add/remove rows (single LoRA per tab is the cap)
106
+ - LoRA on the Upscale refinement pass
107
+ - ControlNet on Z-Image base
108
+ - Z-Image-Edit and Z-Image-Omni-Base (intentionally placeholders linking to GitHub Model Zoo)
109
+ - Display-font customization beyond Inter
110
+ - Visual regression tests
111
+ - Property-based / fuzz testing of generation params
112
+
113
+ If a feature you're adding requires one of these as a sub-step, **ask the user** before proceeding.
114
+
115
+ ---
116
+
117
+ ## When you're not sure
118
+
119
+ 1. Read `docs/superpowers/specs/2026-05-13-z-image-studio-design.md` — that's the architectural source of truth.
120
+ 2. Read `docs/superpowers/plans/2026-05-13-z-image-studio.md` — the task-by-task breakdown.
121
+ 3. Read `SKILLS.md` — process rules, debugging patterns, deployment workflow.
122
+ 4. `git log --oneline` — every non-obvious decision has a fix-commit explaining the reasoning.
123
+ 5. **Ask the user.** A clarifying question costs the user ten seconds. A wrong implementation costs everyone an hour.
CLAUDE.md CHANGED
@@ -1,44 +1,137 @@
1
- # Project Guidelines — z-image-studio
2
 
3
- Working notes for AI assistants implementing this project.
4
 
5
- ## Sole-author rule (non-negotiable)
6
 
7
- Mayank Gupta is the sole author on every commit. NO `Co-Authored-By: Claude...`, NO "Generated with Claude Code" footer, NO `--author=...` flag. Treat any tooling suggesting a Claude trailer as a bug.
8
 
9
- ## Architecture facts (locked; see spec)
10
 
11
  Spec: `docs/superpowers/specs/2026-05-13-z-image-studio-design.md`
12
  Plan: `docs/superpowers/plans/2026-05-13-z-image-studio.md`
13
 
14
- 1. Backend is DiffSynth-Studio's `ZImagePipeline` — not ComfyUI.
15
- 2. Three tabs (T2I dual-model, ControlNet turbo-only, Upscale turbo-only).
16
- 3. One pipeline instance, shared across modes; transformer swap is the only model-pool change.
17
- 4. `@spaces.GPU` applied module-level; identity off-Spaces.
18
- 5. DiffSynth handles VRAM management; do not sprinkle `empty_cache()` calls.
19
- 6. Models live in HF cache; on Spaces mirrored into `~/hf-cache-rw/` (build-vs-runtime user permissions).
20
 
21
  ## Coding conventions
22
 
23
- - Python 3.11 (HF Spaces base image is 3.11)
24
- - Flat top-level layout: no `src/`, no nested packages.
25
- - No conda: `python3.11 -m venv .venv` + brew for system binaries.
26
- - No emojis in code or commits unless explicitly asked.
27
- - Type hints on public functions.
28
- - Imports at top of file unless breaking circular deps.
29
- - `ruff format` + `ruff check` must pass in CI.
30
 
31
  ## Commits
32
 
33
- - Conventional Commits: `<type>(<scope>): <subject>` — types: `feat`, `fix`, `chore`, `docs`, `test`, `refactor`, `ci`, `perf`.
34
- - Subject is imperative, lowercase, no trailing period.
35
- - Body explains WHY when non-obvious. Reference plan task if relevant.
36
  - Frequent small commits — one logical change per commit.
37
- - NO Claude trailer (see above).
38
 
39
  ## Testing
40
 
41
- - TDD per the plan: failing test first, then implementation.
42
- - L1 + L2 run in CI without GPU. L3 + L4 require GPU/HF Space and are manual.
43
- - No mocks for DiffSynth internals; mock only the `pipe(...)` call boundary.
44
- - Use `pytest --gpu` to opt into L3 smoke tests.
1
+ # Project Guidelines — Z-Image Studio
2
 
3
+ Working notes for AI assistants editing this repo. This file is the *what & why* — the locked architecture, the gotchas, the sole-author rule. Companion to `SKILLS.md` (the *how* — process, debugging, deployment workflow) and `AGENTS.md` (tool-agnostic version of this file).
4
 
5
+ ---
6
 
7
+ ## Sole-author rule (non-negotiable)
8
 
9
+ **Mayank Gupta is the sole author on every commit in this repo.** No exceptions.
10
+
11
+ When committing:
12
+
13
+ - **NO** `Co-Authored-By: Claude…` (or any agent name) trailer.
14
+ - **NO** "Generated with Claude Code" / "🤖 Generated with…" footers.
15
+ - **NO** `--author=…` flag — let git use the user's configured identity.
16
+ - **NO** attribution in PR descriptions.
17
+
18
+ If asked to amend, re-commit, or rebase, strip any prior agent attribution from the commit message. Treat any tooling that suggests adding a Claude trailer as a bug to ignore.
19
+
20
+ ---
21
+
22
+ ## Architecture facts (locked — do not relitigate)
23
 
24
  Spec: `docs/superpowers/specs/2026-05-13-z-image-studio-design.md`
25
  Plan: `docs/superpowers/plans/2026-05-13-z-image-studio.md`
26
 
27
+ 1. **Backend is DiffSynth-Studio's `ZImagePipeline`** — not ComfyUI. Installed from git (the package isn't on PyPI). The repo lives at `/Users/techfreakworm/Projects/llm/lora-training-zimage-base/DiffSynth-Studio/` for local development and is `git+https://github.com/modelscope/DiffSynth-Studio.git` in `requirements.txt`.
28
+ 2. **Three tabs.** T2I has the Base/Turbo radio; ControlNet and Upscale are hard-locked to Turbo.
29
+ 3. **One pipeline instance, two transformers in the pool.** `backend._build_pipeline` does NOT call `ZImagePipeline.from_pretrained` (which discards its `ModelPool` locally). Instead it instantiates the pipeline manually, runs `download_and_load_models`, attaches the pool to `pipe._zis_pool`, and indexes the two `z_image_dit` entries by load order (Base = `pool.model[0]`, Turbo = `pool.model[1]`). Swap is `pipe.dit = dits[idx]` in `modes._swap_transformer`.
30
+ 4. **`@spaces.GPU` is applied at module load time.** Identity decorator off Spaces. The decorator's `duration=` parameter takes a callable that estimates per-call timeout from `(mode, params, multiplier)`. Estimator clamps at `[60, 180] s`.
31
+ 5. **DiffSynth handles VRAM management.** Do **not** sprinkle `empty_cache()` calls. The only place we touch this is `models.vram_limit_for()` which returns `None` for MPS (CUDA-only `mem_get_info` API would crash otherwise) and a numeric cap for CUDA.
32
+ 6. **HF cache → DiffSynth `./models/<repo>/` symlink.** DiffSynth's `ModelConfig.download()` looks for files at `local_model_path/<model_id>/...`, NOT in `~/.cache/huggingface/hub/models--<org>--<repo>/snapshots/<sha>/`. `app._bootstrap()` symlinks every cached snapshot into `./models/<org>/<repo>/` so the preloaded weights are findable. On Spaces, the build-user-owned `~/.cache/huggingface/hub` is mirrored to runtime-writable `~/hf-cache-rw/` first, then symlinked.
33
+ 7. **One Gradio process. Lazy backend singleton.** `get_backend()` constructs the pipeline on the first request (~30 – 60 s warm-up). Module import is fast.
34
+
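Fact 3 above (swap by pool index) can be sketched as follows. This is an illustration with stand-in objects, not the code in `modes._swap_transformer`: the `DIT_INDEX` mapping and the function name `swap_transformer` are assumptions, and the pool is modelled as a plain list holding only the two transformer entries in load order.

```python
from types import SimpleNamespace

# Load order taken from the text: Base is loaded first, Turbo second.
DIT_INDEX = {"base": 0, "turbo": 1}


def swap_transformer(pipe, model_key: str):
    """Point pipe.dit at the preloaded transformer for model_key.

    Nothing is reloaded; the pipeline attribute is rebound to an object
    that already lives in the pool.
    """
    dits = pipe._zis_pool.model
    idx = DIT_INDEX[model_key]  # KeyError on an unknown model key
    if idx >= len(dits):
        raise IndexError(f"pool holds {len(dits)} transformers, need index {idx}")
    pipe.dit = dits[idx]
    return pipe.dit


# Stand-in pipeline: strings take the place of the real transformer modules.
fake_pipe = SimpleNamespace(
    _zis_pool=SimpleNamespace(model=["base-dit", "turbo-dit"]),
    dit=None,
)
swap_transformer(fake_pipe, "turbo")
```

Because the swap is only an attribute rebind, the shared encoder, VAE, and tokenizer stay in place, which is the property the locked decision relies on.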
35
+ ---
36
+
37
+ ## Gotchas we already paid for (don't re-discover)
38
+
39
+ Each of these cost a debug cycle. Read once.
40
+
41
+ ### Model selector swap
42
+
43
+ - `pipe.model_pool` does NOT exist after `ZImagePipeline.from_pretrained` — DiffSynth builds the pool locally and discards it. **Fix:** we keep our own reference on `pipe._zis_pool`. See architecture fact #3.
44
+ - A hidden `gr.Textbox(visible=False)` is removed from the DOM entirely in Gradio 5, so a JS shim can't write to it. We use `elem_classes=["zis-hidden"]` + CSS `display:none` when we need an off-screen value carrier. As of the v2 redesign we use `gr.Radio` directly and don't need a carrier textbox.
45
+
46
+ ### MPS / Apple Silicon
47
+
48
+ - `torch.mps` has no `mem_get_info`. DiffSynth's `AutoWrappedModule.check_free_vram` calls that method and raises AttributeError when `vram_limit` is set. **Fix:** `vram_limit_for("mps")` returns `None` so the gate short-circuits.
49
+ - Several DiffSynth ops aren't implemented on the MPS backend (SDPA variants, some index ops). `app.py` sets `PYTORCH_ENABLE_MPS_FALLBACK=1` so they degrade to CPU instead of crashing.
50
+
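The two MPS fixes above can be sketched together. This is a hedged illustration rather than the repo's code: the `cuda_cap_gb` parameter and its default value are invented for the example, and the real `models.vram_limit_for` may compute its CUDA cap differently.

```python
import os

# Set the fallback flag early, before torch is imported elsewhere in the
# process. The document says app.py sets it at the top of the file.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"


def vram_limit_for(device: str, cuda_cap_gb: float = 20.0):
    """Return a VRAM cap in GB for CUDA, or None when no cap should be applied.

    Returning None on MPS means the free-VRAM check that depends on
    mem_get_info is never reached. The 20.0 default is a placeholder.
    """
    if device == "cuda":
        return cuda_cap_gb
    return None
```

Returning `None` for every non-CUDA device (including CPU) is a choice made in this sketch; the text only states the MPS and CUDA cases.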
51
+ ### Dependency footguns
52
+
53
+ - `diffsynth-studio` (kebab) is NOT a PyPI package. The pip-installable name is `diffsynth` and only via `git+https://github.com/modelscope/DiffSynth-Studio.git`.
54
+ - `transformers >= 5` removes `SiglipVisionTransformer` from `transformers.models.siglip.modeling_siglip`. DiffSynth 2.0.7 imports it. **Pin:** `transformers>=4.45,<5.0`.
55
+ - DiffSynth blanket-imports `torchaudio` in `diffsynth.core.data.operators`. Add `torchaudio>=2.4` to requirements even though we don't use audio.
56
+ - `basicsr` (a `realesrgan` dep) imports `torchvision.transforms.functional_tensor`, removed in `torchvision >= 0.17`. **Fix:** `upscale.py` aliases `torchvision.transforms.functional` into `sys.modules["torchvision.transforms.functional_tensor"]` BEFORE the basicsr import.
57
+
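The `basicsr` shim described in the last bullet follows a general pattern: register an existing module under the name that a dependency still imports. A minimal sketch of that pattern, using a hypothetical helper `alias_module` and a stdlib module for the demonstration so it runs without torchvision installed:

```python
import importlib
import sys


def alias_module(missing_name: str, existing_name: str) -> bool:
    """Expose existing_name under missing_name if missing_name cannot be imported.

    Returns True when an alias was installed, False when the name was
    already importable and nothing was changed.
    """
    if missing_name in sys.modules:
        return False
    try:
        importlib.import_module(missing_name)
        return False
    except ImportError:
        sys.modules[missing_name] = importlib.import_module(existing_name)
        return True


# Demonstration with a made-up module name aliased to the stdlib json module.
alias_module("zis_demo_missing_module", "json")

# In upscale.py the equivalent call would run BEFORE the basicsr import:
# alias_module("torchvision.transforms.functional_tensor",
#              "torchvision.transforms.functional")
```

The alias only helps if the replacement module still exposes the names the dependency imports, so it is worth re-checking after a torchvision upgrade.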
58
+ ### Model name slugs
59
+
60
+ - `PAI/Z-Image-Turbo-Fun-Controlnet-Union-2.1` is the **ModelScope** slug. On HuggingFace the same model is at `alibaba-pai/...`. We use the HF slug + `DIFFSYNTH_DOWNLOAD_SOURCE=huggingface` env var.
61
+ - `xinntao/Real-ESRGAN` doesn't exist on HF (returns 401). We use `lllyasviel/Annotators` which mirrors `RealESRGAN_x4plus.pth`.
62
+ - `controlnet_aux.Processor` registers depth as `depth_midas`, **not** `midas`. The plain name raises KeyError.
63
+
64
+ ### Gradio 5 quirks
65
+
66
+ - Don't put `<script>` tags inside `gr.HTML` blocks — they get stripped. JS goes in `gr.Blocks(head=…)`.
67
+ - `gr.File`'s default drop zone is ~400 px tall. CSS in `theme.py` (`.zis-lora-file .upload-container`) tightens it to 56 px.
68
+ - The Gradio 6.0 deprecation warnings about `theme=` / `css=` / `head=` on `Blocks` are benign on 5.50. Ignore until upgrade.
69
+
70
+ ### HF Spaces deployment
71
+
72
+ - `preload_from_hub` is build-time only. Runtime falls back to network if any required file isn't preloaded. Use broad globs (`transformer/*` not `transformer/*.safetensors`) so configs + index.json files come along. Our current preload totals ~47 GB (cap is 150 GB).
73
+ - ZeroGPU build injects `spaces==0.50.0`. If `requirements.txt` pins `spaces==0.30.0`, pip resolution fails. **Don't pin `spaces` at all** — let HF provide it.
74
+ - The `@spaces.GPU` decorator must be applied at module load. Runtime decoration isn't detected by ZeroGPU's startup analyzer.
75
+ - Per-call `duration=` is a queue-priority signal AND a hard cap. Auto-retry once at 2× on `"GPU task aborted"`.
76
+
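The clamp and the single retry described above can be sketched like this. The signatures are assumptions: the real `duration_for` and `generate_with_retry` in `backend.py` are not shown in this diff, the exception type raised on an aborted GPU task is assumed to be `RuntimeError` here, and whether the retry duration is itself clamped is not stated, so this sketch leaves it unclamped.

```python
def clamp_duration(estimate_s: float, lo: int = 60, hi: int = 180) -> int:
    """Clamp a per-call duration estimate to the [60, 180] s window."""
    return int(min(hi, max(lo, estimate_s)))


def generate_with_retry(run, duration_s: int):
    """Call run(duration_s); on "GPU task aborted", retry once at 2x duration.

    Any other error, or a second abort, propagates to the caller.
    """
    try:
        return run(duration_s)
    except RuntimeError as exc:  # assumed exception type
        if "GPU task aborted" not in str(exc):
            raise
        return run(duration_s * 2)
```

For example, an estimate of 12.5 s is raised to the 60 s floor, and a call that aborts at 90 s is retried once at 180 s.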
77
+ ### Brand vs filename casing
78
+
79
+ - Repo / directory / Python package: `z-image-studio` (kebab-case).
80
+ - User-visible brand: `Z-Image Studio` (title-case) — header, browser tab, README title. Do not propagate the kebab into UI strings.
81
+
82
+ ---
83
 
84
  ## Coding conventions
85
 
86
+ - **Python 3.11.** HF Spaces base image is 3.11; code that sticks to older syntax (for example, avoiding `match`) is fine.
87
+ - **Flat top-level layout.** No `src/`, no nested packages. One `.py` per responsibility.
88
+ - **No conda.** `python3.11 -m venv .venv`; `brew` for system binaries.
89
+ - **No emojis** in code or commits unless explicitly requested. UI strings (CTA banner, button labels) are OK because they're user-facing copy, not code.
90
+ - **Type hints on public functions.** Internal helpers can skip them when obvious.
91
+ - **Imports at the top of the file.** Inline imports only to break circular deps OR to defer heavy modules (DiffSynth, torch, basicsr) for fast CI startup.
92
+ - **`ruff format` + `ruff check`** both pass in CI. No exceptions.
93
+
94
+ ---
95
 
96
  ## Commits
97
 
98
+ - **Conventional Commits:** `<type>(<scope>): <subject>` — types: `feat`, `fix`, `chore`, `docs`, `test`, `refactor`, `ci`, `perf`.
99
+ - Subject is **imperative**, lowercase, no trailing period.
100
+ - Body explains **why** when not obvious. Reference the spec / plan section if relevant.
101
  - Frequent small commits — one logical change per commit.
102
+ - **NO Claude trailer.** See top of file.
103
+
104
+ ---
105
 
106
  ## Testing
107
 
108
+ - **TDD per the plan.** Each implementation task has the failing test first.
109
+ - **L1 + L2 in CI** (no GPU): module structure, mocked pipeline call boundaries, ruff. `tests/test_smoke_gpu.py` is the GPU smoke; it's marked with `@pytest.mark.gpu` and skipped by default (pyproject `addopts = -m 'not gpu'`).
110
+ - **No mocks for DiffSynth internals.** Mock only the `pipe(...)` call boundary so the mode-handler logic is verified at the boundary.
111
+ - **Use `pytest -m gpu`** to opt into the GPU smoke (~30 GB download on a cold cache; runs full t2i base/turbo + controlnet + upscale at 384²).
112
+
113
+ ---
114
+
115
+ ## Out of scope for v1 (don't add without asking)
116
+
117
+ - Multi-prompt queueing
118
+ - Output history persistence across sessions
119
+ - Telemetry / duration estimator that learns from logs
120
+ - Persistent storage add-on integration
121
+ - Custom LoRA add/remove rows (single LoRA per tab is the v1 cap)
122
+ - LoRA on the Upscale refinement pass (locked to vanilla Turbo refinement)
123
+ - ControlNet on Z-Image base (no released ControlNet weights for base)
124
+ - Z-Image-Edit and Z-Image-Omni-Base (placeholders link to GitHub Model Zoo)
125
+ - Display-font customization beyond Inter (locked by Soft Dark Restraint)
126
+ - Visual regression tests for the Gradio UI
127
+
128
+ If a task feels like it needs one of these, stop and ask the user.
129
+
130
+ ---
131
+
132
+ ## When in doubt
133
+
134
+ 1. Read the spec + plan. Fifteen minutes of reading vs a day of wrong implementation.
135
+ 2. Read `SKILLS.md` for the process side — debugging, deployment, when to commit, when to verify.
136
+ 3. `git log --oneline` — most non-obvious decisions have a fix-commit explaining the reasoning.
137
+ 4. **Ask the user** before changing architectural shape or adding scope outside the v1 list.
README.md CHANGED
@@ -16,39 +16,139 @@ preload_from_hub:
16
  - lllyasviel/Annotators RealESRGAN_x4plus.pth
17
  ---
18
 
19
- # z-image-studio
20
 
21
- Gradio app for [Z-Image](https://huggingface.co/Tongyi-MAI/Z-Image) and [Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) wrapping three modes under a single, focused UI:
22
 
23
- 1. **Text → Image** — pick Base (25 steps, cfg=4) or Turbo (8 steps, cfg=1)
24
- 2. **ControlNet** — Z-Image-Turbo-Fun-Controlnet-Union-2.1 with Canny / Depth / Pose preprocessors
25
- 3. **Upscale** — RealESRGAN x4 + Z-Image-Turbo img2img refinement (effective 2× with detail restoration)
26
 
27
- Each tab supports an optional LoRA upload + strength slider. Runs on Apple Silicon (MPS) or NVIDIA (CUDA) locally, deploys to Hugging Face Spaces (ZeroGPU H200).
28
 
29
- ## Local quickstart
30
 
31
- Requires Python 3.11 and ~35 GB free disk for model weights.
32
 
33
  ```bash
34
- git clone https://github.com/<your-handle>/z-image-studio
35
  cd z-image-studio
36
- bash setup.sh
37
  source .venv/bin/activate
38
- python app.py
39
  ```
40
 
41
- First run downloads ~30 GB into `~/.cache/huggingface/hub` (one-time). Subsequent starts are fast.
42
 
43
- ## HF Spaces deployment
44
 
45
  ```bash
46
  git remote add space https://huggingface.co/spaces/<your-handle>/z-image-studio
47
  git push space main
48
  ```
49
 
50
- The Space's `preload_from_hub` directive pre-downloads the weights at build time; the `_bootstrap()` in `app.py` mirrors them into a writable tree at runtime.
51
 
52
  ## License
53
 
54
- MIT for the app code. DiffSynth-Studio (Apache-2.0), Z-Image, and RealESRGAN retain their respective licenses.
16
  - lllyasviel/Annotators RealESRGAN_x4plus.pth
17
  ---
18
 
19
+ # Z-Image Studio
20
 
21
+ A single-process Gradio app that wraps [Tongyi-MAI Z-Image](https://huggingface.co/Tongyi-MAI/Z-Image) and [Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) with ControlNet and an upscaler under one focused UI. Runs locally on Apple Silicon (MPS) or NVIDIA (CUDA), and deploys to Hugging Face Spaces (ZeroGPU).
22
 
23
+ [![Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Spaces-Live-FFB02E?style=flat-square)](https://huggingface.co/spaces/techfreakworm/z-image-studio)
24
+ [![GitHub stars](https://img.shields.io/github/stars/techfreakworm/z-image-studio?style=flat-square&color=FFB02E)](https://github.com/techfreakworm/z-image-studio/stargazers)
25
+ [![License: MIT](https://img.shields.io/badge/License-MIT-FFB02E?style=flat-square)](LICENSE)
26
+ [![Python 3.11](https://img.shields.io/badge/Python-3.11-FFB02E?style=flat-square&logo=python&logoColor=white)](pyproject.toml)
27
+ [![Backed by DiffSynth-Studio](https://img.shields.io/badge/backend-DiffSynth--Studio-FFB02E?style=flat-square)](https://github.com/modelscope/DiffSynth-Studio)
28
 
29
+ **Live demo:** https://huggingface.co/spaces/techfreakworm/z-image-studio
30
 
31
+ ---
32
+
33
+ ## What's inside
34
+
35
+ Three tabs. Same DiffSynth `ZImagePipeline` underneath. Progressive disclosure — the form starts short and reveals controls only when you ask for them.
36
+
37
+ | Mode | Model | What it does |
38
+ |---|---|---|
39
+ | **Text → Image** | Z-Image (25 steps, cfg=4) · Z-Image-Turbo (8 steps, cfg=1) | Prompt-to-image. Toggle the model on the fly; the form swaps Steps / CFG / Negative-Prompt defaults to match. |
40
+ | **ControlNet** | Z-Image-Turbo + Fun-Controlnet-Union 2.1 | Canny / Depth / Pose preprocessors with a **live preview** of the processed control image. |
41
+ | **Upscale** | RealESRGAN x4 → Z-Image-Turbo refinement | Effective 2× upscale with diffusion-based detail restoration (5-step img2img at denoise 0.33). |
42
+
43
+ Each tab carries an optional LoRA toggle. When enabled, it exposes a compact `.safetensors` slot + strength slider. The toggle label tells you which model's LoRA is accepted (Z-Image vs Z-Image-Turbo) and updates as you flip the radio.
44
+
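For context on how a LoRA file can be inspected before it is applied, here is a minimal sketch that reads a `.safetensors` header with the standard library only. The on-disk layout (an 8-byte little-endian length followed by a JSON header) is the published safetensors format; the function names and the size guard are assumptions for illustration, and this is not the code in `lora.py`.

```python
import json
import struct


def read_safetensors_header(path, max_header_bytes: int = 100 * 1024 * 1024) -> dict:
    """Return the JSON header of a .safetensors file without loading any tensors."""
    with open(path, "rb") as fh:
        raw_len = fh.read(8)
        if len(raw_len) != 8:
            raise ValueError("file is too short to be a safetensors file")
        # First 8 bytes: unsigned little-endian 64-bit header length.
        (header_len,) = struct.unpack("<Q", raw_len)
        if header_len > max_header_bytes:
            raise ValueError("header length is implausibly large")
        return json.loads(fh.read(header_len).decode("utf-8"))


def tensor_names(header: dict) -> list[str]:
    """List tensor keys, skipping the optional __metadata__ entry."""
    return sorted(key for key in header if key != "__metadata__")
```

Only the header is read, so the check stays cheap even for multi-gigabyte files. How the tensor names map to Z-Image versus Z-Image-Turbo is project-specific and not covered here.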
45
+ ---
46
+
47
+ ## Quick start (local)
48
 
49
+ Requires **Python 3.11**, ~50 GB free disk for the weight set, and ~24 GB VRAM (CUDA) or ~32 GB unified memory (Apple Silicon).
50
 
51
  ```bash
52
+ git clone https://github.com/techfreakworm/z-image-studio
53
  cd z-image-studio
54
+ bash setup.sh # creates .venv, installs requirements
55
  source .venv/bin/activate
56
+ python app.py # http://127.0.0.1:7860
57
  ```
58
 
59
+ The first run resolves model weights into your HF cache (`~/.cache/huggingface/hub/`). Subsequent starts are fast — the app symlinks the cache snapshots into DiffSynth's expected `./models/<repo>/` layout so nothing re-downloads.
60
 
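The cache-to-layout step described above can be sketched with `pathlib`. This is a simplified illustration, not `models.symlink_hf_cache_to_diffsynth_layout`: the function name `symlink_snapshots` is hypothetical, repos without an org prefix are skipped, and picking the lexicographically last snapshot directory is an arbitrary shortcut (the real cache records the active revision under `refs/`).

```python
from pathlib import Path


def symlink_snapshots(hub_dir: Path, models_dir: Path) -> list[Path]:
    """Link hub_dir/models--<org>--<repo>/snapshots/<sha> to models_dir/<org>/<repo>.

    Returns the links created on this call; existing links are left untouched.
    """
    created = []
    for repo_dir in sorted(hub_dir.glob("models--*")):
        parts = repo_dir.name.split("--", 2)
        if len(parts) != 3:
            continue  # no org component; skipped in this sketch
        _, org, repo = parts
        snapshots = sorted(p for p in (repo_dir / "snapshots").glob("*") if p.is_dir())
        if not snapshots:
            continue
        link = models_dir / org / repo
        if link.is_symlink() or link.exists():
            continue
        link.parent.mkdir(parents=True, exist_ok=True)
        link.symlink_to(snapshots[-1].resolve(), target_is_directory=True)
        created.append(link)
    return created
```

Calling it a second time creates nothing, which matches the described behaviour of fast subsequent starts with no re-download.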
61
+ **Apple Silicon notes:** `PYTORCH_ENABLE_MPS_FALLBACK=1` is set automatically so the few MPS-unsupported ops fall back to CPU. DiffSynth's free-VRAM check (CUDA-only) is bypassed on MPS — module swapping still works.
62
+
63
+ ## Quick start (HF Spaces)
64
 
65
  ```bash
66
  git remote add space https://huggingface.co/spaces/<your-handle>/z-image-studio
67
  git push space main
68
  ```
69
 
70
+ The Space's `preload_from_hub` directive pre-downloads the ~47 GB weight set at build time. `app.py:_bootstrap()` mirrors the read-only build cache into `~/hf-cache-rw/` and symlinks every snapshot into `./models/<repo>/`. Pipeline construction at the first request finds everything locally, so nothing touches the network from the second inference onward.
71
+
72
+ ## Architecture
73
+
74
+ ```
75
+ ┌──────────────────────────────┐
76
+ browser ──▶ │ app.py — Gradio Blocks │
77
+ │ (header + CTA + 3 tabs) │
78
+ └──────────────┬───────────────┘
79
+
80
+
81
+ ┌──────────────────────────────┐
82
+ │ backend.py │
83
+ │ ZImageStudioBackend │
84
+ │ @spaces.GPU(duration=…) │
85
+ │ one DiffSynth pipeline, │
86
+ │ two transformers in pool │
87
+ └──────────────┬───────────────┘
88
+
89
+ ┌───────────────┬───────────┴────────┬──────────────────┐
90
+ ▼ ▼ ▼ ▼
91
+ modes.py preprocessors.py upscale.py lora.py
92
+ 3 handlers Canny/Depth/Pose RealESRGAN x4 safetensors
93
+ (controlnet_aux) + 0.5 resize sniff + apply/revert
94
+ ```
95
+
96
+ **One pipeline instance**, both transformers (Base + Turbo) preloaded into the pool, swapped per request by indexing into `pool.model`. Shared encoder + VAE + tokenizer between Base and Turbo — no duplication.
97
+
98
+ `@spaces.GPU(duration=callable)` decorates the generate method at module load time on Spaces. The duration estimator clamps to `[60, 180] s` based on mode, model, steps, and image area. When ZeroGPU surfaces "GPU task aborted", the handler auto-retries once at 2× duration.
99
+
100
+ ## Project layout
101
+
102
+ ```
103
+ .
104
+ ├── app.py # Gradio Blocks entry, bootstrap, event handlers, CTA
105
+ ├── backend.py # ZImageStudioBackend; @spaces.GPU; duration estimator
106
+ ├── modes.py # call_t2i / call_controlnet / call_upscale pure handlers
107
+ ├── models.py # device autodetect, MODEL_CONFIGS, cache mirror + symlink
108
+ ├── lora.py # safetensors header sniff + apply/revert ctx
109
+ ├── preprocessors.py # Canny (cv2) + Depth (depth_midas) + Pose (openpose)
110
+ ├── upscale.py # RealESRGAN x4 wrapper + basicsr/torchvision shim
111
+ ├── ui.py # Per-tab Gradio component builders
112
+ ├── theme.py # Soft Dark Restraint palette + minimal CSS
113
+ ├── tooltips.py # Centralised info= strings
114
+ ├── requirements.txt # pinned deps
115
+ ├── pyproject.toml # ruff + pytest config (py311)
116
+ ├── setup.sh # venv bootstrap
117
+ └── tests/ # 70 passing (L1+L2 in CI); GPU smoke in -m gpu
118
+ ```
119
+
120
+ ## Tech stack
121
+
122
+ - **[Gradio 5.50](https://gradio.app/)** — UI shell, native components, `gr.Progress(track_tqdm=True)`
123
+ - **[DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio)** — Z-Image pipeline + model pool + VRAM management
124
+ - **[Z-Image / Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image)** by Tongyi-MAI
125
+ - **[Z-Image-Turbo-Fun-Controlnet-Union-2.1](https://huggingface.co/alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union-2.1)** by Alibaba PAI
126
+ - **[RealESRGAN](https://github.com/xinntao/Real-ESRGAN)** weights via [`lllyasviel/Annotators`](https://huggingface.co/lllyasviel/Annotators)
127
+ - **[controlnet_aux](https://github.com/huggingface/controlnet_aux)** for Depth (MiDaS) and Pose (OpenPose)
128
+ - **HF Spaces ZeroGPU** (A10G) — `@spaces.GPU(duration=…)` queue priority
129
+
130
+ ## Design
131
+
132
+ Theme: **Soft Dark Restraint** — warm dark substrate `#1A1614`, cream ink `#F0E8DD`, one accent `#FFB02E` used sparingly (live radio dot, slider fill, primary button, progress fill, brand period). Inter throughout. No display fonts, no shadows, no gradients. The accent is rationed so the generated image stays the visual focus.
133
+
134
+ Disclosure patterns — controls appear when they're needed:
135
+
136
+ - `Use a LoRA` checkbox → file slot + strength slider appear inline
137
+ - Model = Base → Negative Prompt + CFG slider appear (Turbo runs cfg=1 so they'd be no-ops)
138
+ - `Advanced` accordion → Width / Height / Seed live inside, collapsed by default
139
+
140
+ Spec + plan + design rationale live under `docs/superpowers/`.
141
+
142
+ ## Notes on running
143
+
144
+ - **First inference is slow.** Cold-start pipeline construction (~30 – 60 s on MPS, ~10 – 20 s on CUDA) is amortised across the whole session. Subsequent requests hit warm cache.
145
+ - **MPS Macs:** Z-Image-Turbo at 8 steps + 1024² produces an image in ~30 – 60 s. Base at 25 steps is closer to 2 min. Upscale on 1024² → 2048² adds ~30 s on the refinement pass.
146
+ - **ZeroGPU duration cap.** The estimator clamps at 180 s. If a generation aborts, the handler retries once at 2× duration. The duration field per call is the queue-priority signal, not a billing cap.
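The duration behaviour in the last bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the repo's estimator: the names `estimate_duration` and `retry_duration`, the per-step cost, and the fixed overhead are hypothetical. Only the 180 s clamp and the single retry at 2× come from the note above. The note does not say whether the retry is itself clamped, so the sketch leaves it unclamped.

```python
MAX_DURATION_S = 180  # clamp stated in the note above


def estimate_duration(steps: int,
                      seconds_per_step: float = 2.0,  # hypothetical cost per step
                      overhead_s: float = 15.0) -> int:  # hypothetical fixed overhead
    """Return a requested duration in seconds, never above MAX_DURATION_S."""
    raw = overhead_s + steps * seconds_per_step
    return int(min(MAX_DURATION_S, max(1, round(raw))))


def retry_duration(first_duration_s: int) -> int:
    """Duration for the single retry: twice the first request (assumed unclamped)."""
    return 2 * first_duration_s
```

With these made-up constants an 8-step Turbo request asks for 31 s, while a very long request is capped at 180 s before any retry.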
147
 
148
  ## License
149
 
150
+ MIT for the app code (see `LICENSE`). DiffSynth-Studio is Apache-2.0. Z-Image and Z-Image-Turbo retain their respective Tongyi-MAI licenses. RealESRGAN weights are BSD-3-Clause via the xinntao/Real-ESRGAN repository.
151
+
152
+ ## Credits
153
+
154
+ Z-Image and Z-Image-Turbo by [Tongyi-MAI](https://github.com/Tongyi-MAI). DiffSynth-Studio by the [ModelScope](https://github.com/modelscope) team. ControlNet Union 2.1 by [Alibaba PAI](https://github.com/alibaba). Built by [@techfreakworm](https://huggingface.co/techfreakworm) — drop a ♥ on the [Space](https://huggingface.co/spaces/techfreakworm/z-image-studio) if it's useful.
SKILLS.md ADDED
@@ -0,0 +1,230 @@
1
+ # SKILLS.md — how to work in this repo
2
+
3
+ Process rules and habits for editing Z-Image Studio. Companion to `CLAUDE.md` (which is *what & why*); this file is *how* — debugging, verification, deployment, when to commit, when to ship.
4
+
5
+ > **Default rule when in doubt:** stop and ask the user. The user prefers a question over wrong work.
6
+
7
+ ---
8
+
9
+ ## Investigation before fix
10
+
11
+ ### Reproduce the bug before patching
12
+
13
+ When the user reports a layout, color, click, or visibility issue, **the first action is to verify, not to code**. Open the local app (http://127.0.0.1:7860) in a browser or via Playwright (`mcp__playwright__browser_*`) and reproduce the issue. Take a screenshot. THEN diagnose.
14
+
15
+ Skipping the visual repro twice in a row will produce a patch that fixes a different symptom than the one the user is seeing.
16
+
17
+ For shape / data bugs: read the stack trace fully, identify the line, then read the function — don't trust the line number alone.
18
+
19
+ ### Pull HF Space logs when something runs there
20
+
21
+ For Spaces failures, the run logs are the source of truth.
22
+
23
+ ```bash
24
+ HF_TOKEN=$(cat ~/.cache/huggingface/token)
25
+ curl -s -H "Authorization: Bearer ${HF_TOKEN}" \
26
+ "https://huggingface.co/api/spaces/techfreakworm/z-image-studio/logs/run" \
27
+ > /tmp/hf-runtime.log
28
+
29
+ # Decode the SSE-style `data: {...}` lines
30
+ python3 << 'PY'
31
+ import json
32
+ msgs = []
33
+ for line in open('/tmp/hf-runtime.log'):
33
+     if line.startswith('data:'):
34
+         try: msgs.append(json.loads(line[5:].strip()).get('data', '').rstrip())
35
+         except Exception: pass
36
+ with open('/tmp/hf-runtime-decoded.log', 'w') as f:
37
+     f.write('\n'.join(msgs))
39
+ print(f'Decoded {len(msgs)} lines')
40
+ PY
41
+
42
+ tail -100 /tmp/hf-runtime-decoded.log
43
+ ```
44
+
45
+ `/logs/run` is runtime container output. `/logs/build` is the image-build phase (pip install, preload, etc.). Different problems, different endpoints.
46
+
47
+ ### Stage check before action
48
+
49
+ ```bash
50
+ curl -s https://huggingface.co/api/spaces/techfreakworm/z-image-studio/runtime | python3 -m json.tool
51
+ ```
52
+
53
+ Terminal stages: `RUNNING`, `RUNTIME_ERROR`, `BUILD_ERROR`. Transient: `BUILDING`, `APP_STARTING`, `RUNNING_BUILDING` (live serving while a new build runs). Always check `errorMessage` first when stage is non-RUNNING.
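The stage rules above could be captured in a small helper like the one below. It is a sketch, not code from the repo: `summarize_runtime` is a hypothetical name, and it assumes `stage` and `errorMessage` are top-level keys of the `/runtime` JSON (the polling script in this file reads `stage` that way; the position of `errorMessage` is an assumption).

```python
TERMINAL_STAGES = frozenset({"RUNNING", "RUNTIME_ERROR", "BUILD_ERROR"})
TRANSIENT_STAGES = frozenset({"BUILDING", "APP_STARTING", "RUNNING_BUILDING"})


def summarize_runtime(payload: dict) -> str:
    """Turn a decoded /runtime response into a one-line status."""
    stage = payload.get("stage", "?")
    # Check errorMessage first whenever the stage is not RUNNING.
    if stage != "RUNNING" and payload.get("errorMessage"):
        return f"{stage}: {payload['errorMessage']}"
    if stage in TERMINAL_STAGES:
        return f"{stage} (terminal)"
    if stage in TRANSIENT_STAGES:
        return f"{stage} (transient, keep polling)"
    return f"{stage} (unknown stage)"
```

Feeding it the parsed output of the `curl` command above gives a single line that is easy to log or grep.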
54
+
55
+ ### Sequential thinking for repeated failures
56
+
57
+ The user has called this out: when fixes keep failing, **stop patching**. Use the `superpowers:sequential-thinking` MCP and the `superpowers:systematic-debugging` skill. Two failed fixes is the signal: go back to root-cause investigation before attempting fix #3.
58
+
59
+ Pattern that means you're guessing:
60
+ - "Just try changing X and see if it works"
61
+ - "I see another thing it could be — fix that too"
62
+ - Multiple changes in one commit chasing a symptom
63
+
64
+ Pattern that means you're investigating:
65
+ - One hypothesis per cycle
66
+ - Each hypothesis has a falsifying experiment
67
+ - Experiments produce evidence before code changes
68
+
69
+ ---
70
+
71
+ ## Running locally
72
+
73
+ ```bash
74
+ cd /Users/techfreakworm/Projects/llm/z-image-studio
75
+ source .venv/bin/activate
76
+ # Restart cleanly (kill anything on 7860)
77
+ kill -9 $(lsof -ti:7860 2>/dev/null) 2>/dev/null || true
78
+ sleep 1
79
+ nohup .venv/bin/python app.py > /tmp/zimage-studio.log 2>&1 &
80
+ disown
81
+ # Wait for ready
82
+ for i in $(seq 1 30); do curl -sf http://127.0.0.1:7860/ -o /dev/null && echo "ready ${i}s" && break; sleep 1; done
83
+ ```
84
+
85
+ `/tmp/zimage-studio.log` is the live log. Tail it during development. The Monitor tool with a `grep -E "ERROR|Traceback|Exception"` filter is the right way to watch it across many turns without blowing context.
86
+
87
+ LAN access for phone / tablet testing: `http://192.168.0.10:7860` (the LAN IP of the dev machine). Gradio binds to `0.0.0.0:7860` by default in `app.py`.
88
+
89
+ ---
90
+
91
+ ## Verification before committing
92
+
93
+ Before every commit:
94
+
95
+ 1. **Tests pass.** `python -m pytest tests/ -q` → target 70/70 + 4 deselected. New code adds new tests.
96
+ 2. **Ruff clean.** `ruff check . && ruff format --check .`: both must pass with nothing to fix.
97
+ 3. **App boots.** Restart the local server (kill 7860, relaunch). Confirm "ready" within ~5 seconds and no traceback in `/tmp/zimage-studio.log`.
98
+ 4. **The change is visible.** For UI changes, click through the affected tab in the browser. For backend changes, click Generate and verify the output matches expectation.
99
+
100
+ Tests + ruff alone are not proof that the UI works: the test suite mocks `pipe(...)` and doesn't exercise the Gradio render tree.
101
+
102
+ ---
103
+
104
+ ## When to commit
105
+
106
+ - **One logical change per commit.** A fix and a refactor are TWO commits, not one.
107
+ - After a test goes red → green, commit.
108
+ - After fixing a regression, commit BEFORE adding the next feature.
109
+ - Don't bundle "while I'm here" changes — they hide the actual fix in the diff.
110
+
111
+ Conventional Commits format:
112
+
113
+ ```
114
+ <type>(<scope>): <subject>
115
+
116
+ <body — explains WHY, not what>
117
+ ```
118
+
119
+ Types in use: `feat`, `fix`, `chore`, `docs`, `test`, `refactor`, `ci`, `perf`.
120
+
121
+ NO Claude trailer. NO "Generated with…" footer. See `CLAUDE.md` rule 1.
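The subject-line format and type list above can be checked mechanically. The snippet below is a minimal sketch, not a hook that exists in the repo: `check_subject` is a hypothetical name, and it treats the `(<scope>)` part as optional, an assumption based on commit subjects such as `docs: polish README + …` that omit it.

```python
import re

ALLOWED_TYPES = ("feat", "fix", "chore", "docs", "test", "refactor", "ci", "perf")

# <type>(<scope>): <subject>, with the scope treated as optional (assumption).
SUBJECT_RE = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?: (?P<subject>\S.*)$"
)


def check_subject(line: str) -> bool:
    """Return True if the first line of a commit message follows the convention."""
    match = SUBJECT_RE.match(line)
    return bool(match) and match.group("type") in ALLOWED_TYPES
```

A check like this only covers the subject line; the rule that the body explains why rather than what still needs a human reader.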
122
+
123
+ ---
124
+
125
+ ## Deployment workflow
126
+
127
+ The repo has two remotes:
128
+
129
+ ```
130
+ origin → git@github.com:techfreakworm/z-image-studio.git
131
+ space → https://huggingface.co/spaces/techfreakworm/z-image-studio
132
+ ```
133
+
134
+ To push:
135
+
136
+ ```bash
137
+ git push origin main
138
+ git push space main
139
+ ```
140
+
141
+ After the `space` push, HF starts rebuilding. Watch:
142
+
143
+ ```bash
144
+ TOKEN=$(cat ~/.cache/huggingface/token)
145
+ while true; do
146
+   STATE=$(curl -s -H "Authorization: Bearer $TOKEN" \
147
+     https://huggingface.co/api/spaces/techfreakworm/z-image-studio/runtime \
148
+     | python3 -c "import json,sys; print(json.load(sys.stdin).get('stage','?'))")
149
+   echo "$(date +%H:%M:%S) $STATE"
150
+   case "$STATE" in
151
+     RUNNING|BUILD_ERROR|RUNTIME_ERROR) break ;;
152
+   esac
153
+   sleep 30
154
+ done
155
+ ```
156
+
157
+ Typical build time: ~5 min after weights are cached. First build with new preload globs: ~15 – 20 min.
158
+
159
+ ### Don't push during HF testing
160
+
161
+ When the user is actively testing on the live Space, hold local commits — don't push mid-test. They'll explicitly say "push it now" when they're ready.
162
+
163
+ ---
164
+
165
+ ## Adding a new model / weight
166
+
167
+ 1. Add a `ModelConfig(...)` entry to `models.MODEL_CONFIGS`.
168
+ 2. Add the file (or glob) to `preload_from_hub:` in `README.md`'s YAML frontmatter.
169
+ 3. If it's the optional kind DiffSynth fetches lazily (siglip / dinov3 / image2lora), it appears in `_build_pipeline`'s `pool.fetch_model("…")` calls — those return `None` when absent and don't crash.
170
+ 4. If the file is on ModelScope only (e.g. `PAI/…`), find the HF mirror first. The repo uses HF exclusively (`DIFFSYNTH_DOWNLOAD_SOURCE=huggingface`). Common mirror patterns: `PAI/X` → `alibaba-pai/X`. `xinntao/Real-ESRGAN` → `lllyasviel/Annotators`.
171
+ 5. Run tests, restart server, verify in browser, then commit.
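The mirror patterns in step 4 can be written down as a lookup. This is a sketch only: `hf_mirror` and `KNOWN_MIRRORS` are hypothetical names, and the result is a candidate repo id to verify on the Hub by hand, not a guarantee that the mirror exists.

```python
# Explicit one-off mirrors named in step 4.
KNOWN_MIRRORS = {"xinntao/Real-ESRGAN": "lllyasviel/Annotators"}


def hf_mirror(repo_id: str) -> str:
    """Suggest a Hugging Face repo id for a ModelScope-style repo id."""
    if repo_id in KNOWN_MIRRORS:
        return KNOWN_MIRRORS[repo_id]
    org, _, name = repo_id.partition("/")
    if org == "PAI" and name:
        # Pattern from step 4: PAI/X -> alibaba-pai/X
        return f"alibaba-pai/{name}"
    # No known pattern: return the id unchanged and check it manually.
    return repo_id
```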
172
+
173
+ ---
174
+
175
+ ## Adding a new mode / tab
176
+
177
+ 1. Spec the new mode in `docs/superpowers/specs/` first. Don't skip this.
178
+ 2. Add a `call_<mode>(pipe, params)` to `modes.py`. Same shape as the existing three.
179
+ 3. Add a `build_<mode>_tab()` to `ui.py`. Use the existing tabs as template — gr.Radio / gr.Checkbox / gr.Accordion patterns are already proven Gradio-friendly.
180
+ 4. Wire `on_<mode>_generate()` in `app.py` with `progress=gr.Progress(track_tqdm=True)`. Connect `c["generate_btn"].click(...)`.
181
+ 5. Add tests in `tests/test_modes.py` mocking the `pipe` boundary.
182
+ 6. Update tooltips dict in `tooltips.py`.
183
+ 7. Update the spec + plan to reflect the new mode.
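Steps 2 and 5 might look roughly like the sketch below. Everything specific in it is an assumption: `call_example` is a hypothetical mode, `params` is assumed to be a plain dict, and the keyword names passed to `pipe` are placeholders rather than the real pipeline signature. The point is only the shape of a test that mocks the `pipe` boundary.

```python
from unittest.mock import MagicMock


def call_example(pipe, params: dict):
    """Hypothetical mode: forward a few params to the pipeline, return its result."""
    return pipe(
        prompt=params["prompt"],
        seed=params.get("seed", 0),                      # assumed keyword name
        num_inference_steps=params.get("steps", 8),      # assumed keyword name
    )


def test_call_example_forwards_params():
    # The pipeline is replaced by a mock, so no weights are loaded.
    pipe = MagicMock(return_value="fake-image")
    out = call_example(pipe, {"prompt": "a red fox", "steps": 4})
    assert out == "fake-image"
    pipe.assert_called_once_with(prompt="a red fox", seed=0, num_inference_steps=4)
```

Because the pipeline is mocked, a test like this says nothing about the Gradio render tree, which is why the browser check in the verification section still applies.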
184
+
185
+ ---
186
+
187
+ ## When you have 2+ failed fixes
188
+
189
+ This is a process signal, not a coding signal. Stop coding.
190
+
191
+ 1. Read `superpowers:systematic-debugging` (the Iron Law: no fixes without root-cause investigation).
192
+ 2. Use `mcp__sequential-thinking__sequentialthinking` to walk through hypotheses one at a time.
193
+ 3. Each hypothesis needs a falsifying experiment (a log line, a Playwright eval, a test). Run the experiment before writing code.
194
+ 4. If 3+ fixes have failed, the architecture is wrong — escalate to the user, don't attempt fix #4.
195
+
196
+ This rule has saved several hours of thrashing in this repo. Honour it.
197
+
198
+ ---
199
+
200
+ ## Brainstorm + visual companion
201
+
202
+ When making material UI changes, use:
203
+
204
+ - `superpowers:brainstorming` to clarify what's actually being built
205
+ - `superpowers:frontend-design` (or `frontend-design:frontend-design`) for design quality
206
+ - The visual companion server (under `.superpowers/brainstorm/.../content/`) for mockups the user can click through
207
+
208
+ The user's `.superpowers/` directory is git-ignored and persists per project. Don't prematurely re-mockup — confirm with the user that mockups are wanted before generating them.
209
+
210
+ The user has rejected over-designed mockups TWICE. Default to RESTRAINT — single accent, single font, gradio-native shapes, progressive disclosure. The Soft Dark Restraint design in this repo is what landed; future redesigns should match its discipline.
211
+
212
+ ---
213
+
214
+ ## Skills + sub-agents
215
+
216
+ When dispatching subagents (Agent tool):
217
+
218
+ - **Brief them like they walked in cold.** They see none of this conversation. Include file paths, line numbers, what to change, what NOT to change.
219
+ - **Don't make a subagent read the plan file.** Paste the relevant section into the prompt.
220
+ - **Use Opus for design + heavy refactors.** Sonnet for mechanical implementation. Haiku for trivial CSS / config changes.
221
+ - **One subagent per task.** Two parallel subagents touching the same file is a guaranteed merge conflict.
222
+ - **Subagents commit but don't push.** The user pushes when they've reviewed the diff locally. The "don't push during HF testing" rule means the human owns the push button.
223
+
224
+ ---
225
+
226
+ ## When in doubt
227
+
228
+ 1. Re-read the spec at `docs/superpowers/specs/2026-05-13-z-image-studio-design.md`.
229
+ 2. `git log --oneline` — every non-obvious decision has a fix-commit explaining the reasoning.
230
+ 3. Ask the user. They prefer answering a clarifying question to debugging wrong code an hour later.