Spaces:
Running on Zero
AGENTS.md
Tool-agnostic agent guidance for the ACE Music Studio repo. If you're driving Claude Code, Cursor, Aider, Codex, or anything else with file-edit + shell access, start here.
This file is the authoritative project rulebook. CLAUDE.md is Claude-specific extensions; SKILLS.md is workflow rules. README.md is the public-facing project intro β different audience.
TL;DR β the five rules
- Mayank Gupta is sole author on every commit. No agent co-author trailers. No "generated withβ¦" footers. No
--author=flag. Strip any tool-suggested attribution. - Backend = ACE-Step 1.5 XL SFT, not ComfyUI. Don't add a ComfyUI dependency under any guise.
- One pipeline instance for all modes. Generate / Cover / Extend / Edit call different entry points on the same pipeline object. Don't instantiate per-mode β it doubles memory and breaks LoRA state.
- Don't pin
spacesinrequirements.txt. HF Spaces' ZeroGPU build injects its own version. A pin causes pip-resolve failure. - Locally is the source of truth. All changes restart
python app.pyand verify on http://127.0.0.1:7860 BEFORE pushing to HF. The Space rebuild is ~5β10 min; iterate locally.
If you can't satisfy these without changing architectural shape, ask the user before proceeding.
Project shape
Single-process Gradio 6.14 app, flat top-level Python layout. ACE-Step is vendored as a git submodule at vendor/ace-step/ (NOT pip-installed β see CLAUDE.md).
app.py Gradio Blocks entry, sys.path injection, bootstrap, event handlers
backend.py ACEStepStudioBackend; dispatch; meta-dict assembly
modes.py generate / cover / extend / edit / lyrics β pure handlers
ace_pipeline.py ACEStepStudio wrapper around AceStepHandler + LLMHandler
lora_stack.py safetensors header sniff + preset registry + apply_stack
lyrics_lm.py Qwen 2.5 7B inference (mlx-lm on Mac, transformers on CUDA)
post_process.py Demucs htdemucs stems + LUFS normalisation + ffmpeg MP3 320 k
ui.py Per-tab builders (Generate / Cover / Extend / Edit / Lyrics)
+ _build_lora_accordion + _build_advanced_accordion +
_build_output_panel
theme.py Brutalist Mono palette + Gradio CSS overrides
tooltips.py Centralised info= strings β single source of truth
presets/ LoRA preset manifest.json (Chinese Rap + Chinese New Year)
tests/ L1+L2 tests + GPU-deselected smoke (54 tests pass on CPU)
docs/superpowers/ spec + plan + brainstorm artifacts + visual mockups
vendor/ace-step/ Git submodule of the apple-silicon ace-step fork
Same code path locally (MPS / CUDA) and on HF Spaces. The only branching is _bootstrap_spaces_cache() (skipped locally β gated on SPACE_ID env var; runs _symlink_ace_step_checkpoints on Spaces) and _warm_demucs_on_spaces() (also Spaces-only).
Locked architecture decisions
These came out of brainstorming + spec design + the HF deploy push that followed. Do not relitigate.
| Decision | Why | Code reference |
|---|---|---|
| ACE-Step vendored as git submodule, NOT pip-installed | Upstream pyproject pins nano-vllm; sys_platform != "darwin" β not on PyPI, breaks pip-install on Linux. Vendoring sidesteps the dep declaration; nano-vllm imports inside ace-step are all lazy. |
vendor/ace-step/ + app.py sys.path injection |
One ACEStepStudioBackend instance, lazy init |
Avoids ~60 s pipeline rebuild per request; LoRA revert is cleaner | backend.py + app.get_backend |
Mode dispatch = separate handler functions in modes.py |
Clean boundaries; easy to test with mocked pipe | modes.generate/cover/extend/edit/lyrics |
MPS vram_limit = None |
torch.mps has no mem_get_info; any VRAM gate raises AttributeError otherwise |
ace_pipeline.vram_limit_for |
PYTORCH_ENABLE_MPS_FALLBACK=1 set at app import |
A few MPS-unsupported ops crash mid-pipeline without it | app.py top-of-file |
Preload symlinks β vendor/ace-step/checkpoints/ (NOT ./models/<org>/<repo>/) |
The fork's AceStepHandler._get_project_root() ignores its kwarg and resolves checkpoints relative to its own install dir |
app._symlink_ace_step_checkpoints |
| No cache-mirror dance | cp -al fails with EXDEV on ZeroGPU (different filesystems); inference workloads only READ the cache |
app._bootstrap_spaces_cache |
HF_MODULES_CACHE=/tmp/hf-modules at import |
~/.cache/huggingface/modules is read-only at runtime; trust_remote_code=True writes there during model load |
app.py env-var block |
| MLX path for Qwen on Mac, transformers on Linux | mlx-lm is 3-4x faster than transformers on Apple Silicon for text inference | lyrics_lm._get_lm |
_HFLM.generate slices prompt at token level |
tokenizer.decode(skip_special_tokens=True) strips ChatML markers, so string-level startswith(prompt) strip fails and the system + user turns leak into output |
lyrics_lm.py |
| Single-LoRA semantics (one active at a time) | The apple-silicon fork's DiT exposes load_lora/unload_lora/set_use_lora, not the multi-adapter PEFT API. Multi-entry stacks warn + use the first. |
lora_stack.apply_stack |
| Advanced controls accordion | User pain: outputs feel "samey" because ace-step inference_steps defaults to 8 (turbo). Accordion exposes 21 knobs across Diffusion / CFG schedule / 5Hz LM / Music metadata. Defaults tuned for XL SFT. |
ui._build_advanced_accordion |
| Per-mode duration estimator | Cover/Extend have duration_s at positional index 3 (not 2); Extend uses kwarg extra_duration_s; Edit uses segment_end_s β segment_start_s; Lyrics has no audio duration |
app._GPU_DURATION_HINTS + _extract_duration_s |
Deploy state
- GitHub: techfreakworm/ace-music-studio (mirror; canonical history)
- HF Space: techfreakworm/ACE-Music-Studio on
zero-a10ghardware - Remotes:
origin β git@github.com:techfreakworm/ace-music-studio.gitandspace β https://huggingface.co/spaces/techfreakworm/ACE-Music-Studio - HF token storage: macOS keychain via
git credential-osxkeychain. Set up once with:Then push withprintf "protocol=https\nhost=huggingface.co\nusername=techfreakworm\npassword=$(cat ~/.cache/huggingface/stored_tokens | grep hf_token | cut -d'=' -f2 | tr -d ' ')\n\n" \ | git credential-osxkeychain storegit -c credential.helper=osxkeychain push space main. - GPG-signed deploy tag per release. The user signs commits with SSH globally; override per-command for the dated deploy tag:
git -c gpg.format=openpgp -c user.signingkey=8845ABB54D0176AA tag -s deploy-YYYY-MM-DD HEAD -m "..." - Milestone tags (
m0βm7) live on GitHub only β HF's pre-receive hook validates README YAML on every commit a tag points at, and older milestones fail theshort_descriptionβ€60-char rule.
Commit rules
- Conventional Commits:
<type>(<scope>): <subject>- types:
feat,fix,chore,docs,test,refactor,ci,perf
- types:
- Subject is imperative, lowercase, no trailing period.
- Body explains why when not obvious. Reference plan task IDs (Task 7, Task A, etc.) when the change implements a specific plan step.
- Frequent small commits; one logical change per commit.
- No agent attribution in commit message or body. See rule 1.
- Don't
git push --forcetomainunless the user explicitly says so. EXCEPTION: HF Space bootstrap force-push is fine β HF auto-creates a template README and that's what you're overwriting.
Verification rules
- Tests must pass before committing.
python -m pytest tests/ -qfrom the project root. - Ruff must be clean.
ruff check . && ruff format --check . - The local app must boot.
python app.pyβ http://127.0.0.1:7860 reachable, no import error in/tmp/ace-music-studio.log. - For UI changes: open the URL in a browser (or Playwright eval) and verify the change is rendered. Don't trust a clean test run + clean ruff as proof that the UI works.
- For deployment changes: push to HF Space, watch the build, verify the runtime stage transitions to
RUNNINGbefore claiming success.
If a change requires breaking these rules, write the reason in the commit body.
Testing conventions
- TDD per the plan. Failing test first, then implementation.
- L1 + L2 in CI (no GPU). The mode handlers are tested with a mocked pipeline. We do NOT mock ACE-Step internals.
- L3 GPU smoke is opt-in (
pytest -m gpu). Lives intests/test_smoke_gpu.py. Loads the real pipeline (~32 GB cache hit on a warm machine). - L4 HF Space smoke is manual. Push, wait, click each tab, verify audio renders.
pyproject.toml has addopts = -m 'not gpu' so the default pytest invocation skips GPU. Add the marker before any test that touches ACE-Step weights.
Out of scope (v1 cap β don't add without asking)
Per spec Β§13. If you find yourself "while I'm here"-ing into one of them, stop.
- Multi-prompt batch queue
- Persistent generation history
- User accounts
- Telemetry dashboard
- Voice cloning (RVC)
- LoRA training in-app
- ControlNet-style conditioning
- Spectrogram visualization
- Multi-language UI strings
- Watermarking output audio
- Browser audio editing
- Multi-tenant rate limiting
- DAW export
If a feature you're adding requires one of these as a sub-step, ask the user before proceeding.
When you're not sure
- Read
docs/superpowers/specs/2026-05-18-ace-music-studio-design.mdβ that's the architectural source of truth. - Read
docs/superpowers/plans/2026-05-18-ace-music-studio.mdβ the task-by-task breakdown. - Read
SKILLS.mdβ process rules, debugging patterns, deployment workflow. git log --onelineβ every non-obvious decision has a fix-commit explaining the reasoning.- Ask the user. A clarifying question costs the user ten seconds. A wrong implementation costs everyone an hour.