metadata
title: LTX 2.3 Studio
emoji: 🎬
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.50.0
app_file: app.py
python_version: '3.11'
suggested_hardware: zero-a10g
hf_oauth: false
preload_from_hub:
  - Comfy-Org/ltx-2 split_files/text_encoders/gemma_3_12B_it.safetensors
  - >-
    Kijai/LTX2.3_comfy
    diffusion_models/ltx-2.3-22b-dev_transformer_only_bf16.safetensors,loras/ltx-2.3-22b-distilled-lora-dynamic_fro09_avg_rank_105_bf16.safetensors,text_encoders/ltx-2.3_text_projection_bf16.safetensors,vae/LTX23_audio_vae_bf16.safetensors,vae/LTX23_video_vae_bf16.safetensors,vae/taeltx2_3.safetensors
  - Lightricks/LTX-2-19b-IC-LoRA-Detailer ltx-2-19b-ic-lora-detailer.safetensors
  - >-
    Lightricks/LTX-2-19b-LoRA-Camera-Control-Jib-Down
    ltx-2-19b-lora-camera-control-jib-down.safetensors
  - >-
    Lightricks/LTX-2-19b-LoRA-Camera-Control-Jib-Up
    ltx-2-19b-lora-camera-control-jib-up.safetensors
  - >-
    Lightricks/LTX-2-19b-LoRA-Camera-Control-Static
    ltx-2-19b-lora-camera-control-static.safetensors
  - >-
    Lightricks/LTX-2.3
    ltx-2.3-22b-distilled-lora-384.safetensors,ltx-2.3-spatial-upscaler-x2-1.0.safetensors
  - >-
    Lightricks/LTX-2.3-22b-IC-LoRA-Union-Control
    ltx-2.3-22b-ic-lora-union-control-ref0.5.safetensors
  - >-
    google/gemma-3-12b-it-qat-q4_0-unquantized
    gemma-3-12b-it/model-00001-of-00005.safetensors,gemma-3-12b-it/model-00002-of-00005.safetensors,gemma-3-12b-it/model-00003-of-00005.safetensors,gemma-3-12b-it/model-00004-of-00005.safetensors,gemma-3-12b-it/model-00005-of-00005.safetensors,gemma-3-12b-it/model.safetensors.index.json,gemma-3-12b-it/preprocessor_config.json,gemma-3-12b-it/tokenizer.model

LTX 2.3 Studio

A single-process Gradio app that wraps LTX-2.3, Lightricks' open 22B video generation model, under one focused UI. Six modes (text · image · audio · lipsync · keyframe · style) share the same ComfyUI All-In-One workflow. It runs locally on Apple Silicon (MPS) or NVIDIA (CUDA) and deploys to Hugging Face Spaces (ZeroGPU).


→ Live demo: https://huggingface.co/spaces/techfreakworm/LTX2.3-Studio


What's inside

Six modes wired through the same ComfyUI All-In-One workflow. Each mode exposes only the inputs it actually consumes, so the form stays short and focused.

| Mode | Inputs | Output | Notes |
| --- | --- | --- | --- |
| Text → Video | Prompt (+ optional audio prompt) | mp4 (+ optional wav) | The core mode. Camera-control LoRAs auto-applied by keyword. |
| Audio → Video | Prompt + audio track | mp4 with the input audio preserved | Conditions motion on the audio waveform. |
| Image → Video | Image + prompt | mp4 (+ optional audio) | Image-conditioned generation. |
| Lipsync | Image + audio | mp4 with audio | Viseme-aligned mouth motion. |
| Keyframe | First + last frames + prompt | mp4 | Latent interpolation between two anchors. |
| Style Transfer | Source video + style image | mp4 | IC-LoRA restyle; motion preserved from source. |
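As a rough illustration of "auto-applied by keyword", the sketch below maps prompt keywords to the three camera-control LoRA files listed in the Space metadata. The keyword strings, the CAMERA_LORAS table, and the pick_camera_lora helper are assumptions made for this example; the app's actual matching rules may differ.

```python
import re
from typing import Optional

# Filenames come from the preload list; the trigger keywords are hypothetical.
CAMERA_LORAS = {
    "jib down": "ltx-2-19b-lora-camera-control-jib-down.safetensors",
    "jib up": "ltx-2-19b-lora-camera-control-jib-up.safetensors",
    "static": "ltx-2-19b-lora-camera-control-static.safetensors",
}


def pick_camera_lora(prompt: str) -> Optional[str]:
    """Return the LoRA filename for the first camera keyword found, else None."""
    # Treat hyphens, underscores, and runs of whitespace as single spaces,
    # so "jib-down" and "jib  down" both match the "jib down" keyword.
    normalized = re.sub(r"[-_\s]+", " ", prompt.lower())
    for keyword, filename in CAMERA_LORAS.items():
        # Word boundaries keep "static" from matching inside "ecstatic".
        if re.search(rf"\b{re.escape(keyword)}\b", normalized):
            return filename
    return None
```

A prompt with no camera keyword returns None, which a caller could treat as "load no camera LoRA".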

Every mode carries Fast / Balanced / Quality presets (steps × 1, × 1.5, × 3). A per-mode ZeroGPU duration estimator adapts the call timeout to the requested workload.
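A minimal sketch of how the preset multipliers could translate into a sampler step count. The multipliers are the ones stated above; the base step count of 8, the rounding rule, and the names PRESET_MULTIPLIERS and steps_for are assumptions for illustration.

```python
import math

# Multipliers from the preset description; the dictionary name is hypothetical.
PRESET_MULTIPLIERS = {"fast": 1.0, "balanced": 1.5, "quality": 3.0}


def steps_for(preset: str, base_steps: int = 8) -> int:
    """Scale a base step count by the preset multiplier, rounding up."""
    try:
        multiplier = PRESET_MULTIPLIERS[preset.lower()]
    except KeyError:
        raise ValueError(f"unknown preset: {preset!r}") from None
    return math.ceil(base_steps * multiplier)
```

With the assumed base of 8 steps, this yields 8, 12, and 24 steps for Fast, Balanced, and Quality.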


Quick start (local)

Requires Python 3.11, ~80 GB free disk for the weight set, and ~24 GB VRAM (CUDA) or ~32 GB unified memory (Apple Silicon).

git clone --recurse-submodules https://github.com/techfreakworm/ltx2.3-AIO-generator
cd ltx2.3-AIO-generator
bash setup.sh           # creates .venv, installs ComfyUI + pinned custom nodes + app deps
source .venv/bin/activate
python app.py           # http://127.0.0.1:7860

The first run resolves model weights into your HF cache (~/.cache/huggingface/hub/) and symlinks them into comfyui/models/<comfy_type>/. Subsequent starts skip the download. Expect ~70 GB of weights pulled on a cold first run.
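The linking step can be pictured with a small stdlib-only sketch. It assumes the weight file is already in the HF cache; the link_into_comfy helper and its arguments are hypothetical names, not the app's API.

```python
from pathlib import Path


def link_into_comfy(cached_file: Path, comfy_models: Path, comfy_type: str) -> Path:
    """Symlink one cached weight file into <comfy_models>/<comfy_type>/."""
    target_dir = comfy_models / comfy_type
    target_dir.mkdir(parents=True, exist_ok=True)
    link = target_dir / cached_file.name
    source = cached_file.resolve()

    if link.is_symlink():
        if link.resolve() == source:
            return link  # already linked: later starts have nothing to do
        link.unlink()  # stale link pointing somewhere else: replace it
    elif link.exists():
        # Never delete a real file the user may have placed there by hand.
        raise FileExistsError(f"{link} is a regular file; refusing to replace it")

    link.symlink_to(source)
    return link
```

Because the function returns early when the link already points at the cached file, calling it on every start is cheap.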

Apple Silicon notes. PYTORCH_ENABLE_MPS_FALLBACK=1 is set automatically so the few MPS-unsupported ops fall back to CPU. ComfyUI's VRAM autodetect picks the right tier; override with LTX23_AIO_VRAM=lowvram|normalvram|highvram if you need to force one.
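A hedged sketch of the environment handling described above. ComfyUI's command line does accept --lowvram, --normalvram, and --highvram; how this app forwards the tier to ComfyUI is an assumption, as is the comfy_vram_args helper name.

```python
import os

VALID_TIERS = ("lowvram", "normalvram", "highvram")

# Set the fallback only if the user has not chosen a value already.
# This typically needs to happen before torch is imported.
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")


def comfy_vram_args(env=None) -> list:
    """Translate LTX23_AIO_VRAM into a ComfyUI-style flag list (hypothetical helper)."""
    env = os.environ if env is None else env
    tier = env.get("LTX23_AIO_VRAM", "").strip().lower()
    if not tier:
        return []  # unset: leave ComfyUI's autodetect in charge
    if tier not in VALID_TIERS:
        raise ValueError(
            f"LTX23_AIO_VRAM must be one of {', '.join(VALID_TIERS)}; got {tier!r}"
        )
    return [f"--{tier}"]
```

Rejecting unknown values early surfaces a typo in the variable instead of silently falling back to autodetect.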

LAN access (phone / tablet on the same WiFi): python app.py binds 0.0.0.0:7860. Visit http://<your-LAN-IP>:7860 from another device. On macOS, allow inbound connections for python in System Settings → Network → Firewall if the connection is refused.

Quick start (HF Spaces)

This repo is a Gradio Space. The Pro tier provides ZeroGPU (A10G) access and the per-call duration budget needed for the Balanced and Quality presets.

git remote add space https://huggingface.co/spaces/<your-handle>/LTX2.3-Studio
git push space master:main       # local branch is master; HF Space deploys from main

⚠ The refspec master:main matters. The local branch is named master (Git's long-standing default), while the HF Space deploys from main. A bare git push space master creates a separate master branch on the remote that does NOT trigger a deploy.

The Space's preload_from_hub directive (see the YAML at the top of this file) bakes ~111 GB of weights into the build image. app.py:_bootstrap() then:

  1. Clones ComfyUI + pinned custom nodes into ~/comfyui on cold start (ZeroGPU container freezes preserve them across calls)
  2. Mirrors the read-only preload cache into ~/hf-cache-rw/. This works around the build-user-vs-runtime-user permissions trap: preloaded files are root-owned, the app runs as uid 1000 and can't write to them, so any lazy download to the cache would fail with Permission denied.
  3. Stages seed input files into comfyui/input/ so workflow loaders don't error before any user upload arrives
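The mirroring in step 2 can be sketched with the standard library alone. This version recreates the directory tree in a writable location and symlinks each read-only file into it, so new downloads can land next to the preloaded ones. Whether the app symlinks or copies is not stated above; mirror_cache and its arguments are hypothetical.

```python
import os
from pathlib import Path


def mirror_cache(src: Path, dst: Path) -> int:
    """Mirror src's tree under dst using symlinks; return how many links were made."""
    created = 0
    for root, _dirs, files in os.walk(src):
        rel = Path(root).relative_to(src)
        # Directories are created for real, so they are writable by the runtime user.
        (dst / rel).mkdir(parents=True, exist_ok=True)
        for name in files:
            link = dst / rel / name
            # is_symlink() also catches dangling links, which exists() reports as missing.
            if link.exists() or link.is_symlink():
                continue
            link.symlink_to((Path(root) / name).resolve())
            created += 1
    return created
```

The function is idempotent: a second call over an unchanged source creates no new links.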

Subsequent requests hit a warm cache: no network traffic from the second inference onward.

ZeroGPU duration estimator. Each generate call carries a dynamic @spaces.GPU(duration=N) calculated from mode, preset, and frame count, clamped to [60, 900] s. On timeout ("GPU task aborted"), the handler auto-retries once at 2× duration.
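A sketch of the estimate, clamp, and retry-once pattern. The [60, 900] s clamp, the 2× retry, and the "GPU task aborted" message come from the description above. The per-mode overheads, the per-frame cost, the RuntimeError exception type, and every function name are assumptions; the decorator itself is left out so the example runs without the spaces package.

```python
MIN_S, MAX_S = 60, 900

# Hypothetical coefficients, chosen only to make the example concrete.
MODE_OVERHEAD_S = {"t2v": 30, "i2v": 35, "a2v": 40, "lipsync": 45, "keyframe": 40, "style": 60}
PRESET_MULT = {"fast": 1.0, "balanced": 1.5, "quality": 3.0}
SECONDS_PER_FRAME = 0.5


def clamp_duration(seconds: float) -> int:
    """Clamp an estimate to the [60, 900] second window."""
    return int(min(MAX_S, max(MIN_S, round(seconds))))


def estimate_duration(mode: str, preset: str, frames: int) -> int:
    """Estimate a GPU budget from mode, preset, and frame count."""
    raw = MODE_OVERHEAD_S[mode] + frames * SECONDS_PER_FRAME * PRESET_MULT[preset]
    return clamp_duration(raw)


def run_with_retry(generate, duration: int):
    """Call generate(duration); on a GPU abort, retry exactly once at 2x."""
    try:
        return generate(duration)
    except RuntimeError as exc:  # assumed exception type
        if "GPU task aborted" not in str(exc):
            raise
        return generate(duration * 2)
```

A second abort propagates to the caller, which matches the "retries once" behaviour.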


Architecture

                                     ┌──────────────────────────────────┐
                          browser ──▶│   app.py - Gradio Blocks         │
                                     │   header · drawer · 6 mode tabs  │
                                     └──────────────────┬───────────────┘
                                                        │
                                                        ▼
                                     ┌──────────────────────────────────┐
                                     │   backend.py                     │
                                     │   ComfyUILibraryBackend          │
                                     │   @spaces.GPU(duration=callable) │
                                     │   calls PromptExecutor directly  │
                                     └──────────────────┬───────────────┘
                                                        │
       ┌──────────────┬──────────────┬──────────────────┴──────┬──────────────────┐
       ▼              ▼              ▼                         ▼                  ▼
   modes.py       models.py      workflow.py                ui.py              tools/
   per-mode       walk + ensure  load + patch               per-mode form      extract_modes.py
   parameterize   from HF cache  API-format JSON            builders           (regen workflows/)
                                                        │
                                                        ▼
                                     ┌──────────────────────────────────┐
                                     │   comfyui/                       │
                                     │   submodule (local)              │
                                     │   runtime clone at ~/comfyui     │
                                     │   on HF Spaces                   │
                                     │                                  │
                                     │   ├── custom_nodes/ (pinned SHAs)│
                                     │   └── models/ → HF cache symlinks│
                                     └──────────────────────────────────┘

One backend, one process. The @spaces.GPU decorator is the only divergence between local and Spaces runtime. ComfyUI manages VRAM via its tiered presets, so no empty_cache() sprinkling is needed elsewhere.

Workflow as data. Each of the six modes is a user-exported API-format JSON in workflows/. The mode handler patches a deep-copied template (modes.parameterize_fn) and hands it to ComfyUI's PromptExecutor. Updating the master workflow is a three-step ritual: edit in the ComfyUI editor → export → python tools/extract_modes.py --master ... --out workflows.
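To make the "patch a deep-copied template" idea concrete, here is a small sketch over ComfyUI's API-format JSON, where each top-level key is a node id mapping to a class_type and an inputs dictionary. The two-node template, the node ids, and the set_input and parameterize signatures are illustrative assumptions rather than the contents of workflows/ or modes.py.

```python
import copy

# A two-node stand-in for an exported API-format template (ids are made up).
TEMPLATE = {
    "3": {"class_type": "KSampler", "inputs": {"seed": 0, "steps": 8}},
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "", "clip": ["4", 1]}},
}


def set_input(workflow: dict, node_id: str, name: str, value) -> None:
    """Overwrite one existing input on one node, failing loudly on drift."""
    inputs = workflow[node_id]["inputs"]
    if name not in inputs:
        raise KeyError(f"node {node_id} has no input named {name!r}")
    inputs[name] = value


def parameterize(template: dict, prompt: str, seed: int) -> dict:
    """Return a patched copy; the shared template is never mutated."""
    workflow = copy.deepcopy(template)
    set_input(workflow, "6", "text", prompt)
    set_input(workflow, "3", "seed", seed)
    return workflow
```

Refusing to create inputs that the template lacks is a deliberate choice in this sketch: if a re-export renames an input, the patch fails immediately instead of producing a workflow that silently ignores the user's value.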


Project layout

.
├── app.py              # Gradio Blocks entry, _bootstrap, _on_generate, mode tabs
├── backend.py          # ComfyUILibraryBackend, @spaces.GPU, duration estimator
├── modes.py            # MODE_REGISTRY + per-mode parameterize_fn + node-id constants
├── models.py           # MODEL_REGISTRY, walk_workflow_for_models, ensure_models
├── ui.py               # render_status, _render_idle, mode-form layout primitives
├── workflow.py         # load_template, set_input helpers
├── workflows/          # API-format mode JSONs (do not hand-edit)
│   ├── t2v.json
│   ├── i2v.json
│   ├── a2v.json
│   ├── lipsync.json
│   ├── keyframe.json
│   └── style.json
├── assets/seed_inputs/ # placeholder image / audio / video for cold-start staging
├── tools/
│   └── extract_modes.py  # regenerate workflows/ from a master ComfyUI export
├── docs/
│   ├── future_improvements.md
│   └── superpowers/{specs,plans}/  # spec + implementation plans per feature
├── tests/              # L1 + L3 in CI; L2 with --comfy-real; L4 GPU smoke
├── README.md           # this file (HF Space YAML + project intro)
├── CLAUDE.md           # project facts + gotchas (what & why)
├── AGENTS.md           # tool-agnostic agent rulebook
├── SKILLS.md           # process / debugging / deployment (how)
├── requirements.txt    # pinned deps
├── pyproject.toml      # ruff + pytest config (py311)
├── setup.sh            # venv + ComfyUI + custom nodes bootstrap
└── comfyui/            # git submodule (local) / runtime clone target (Spaces)
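The layout lists walk_workflow_for_models in models.py. One plausible reading of that name is a scan of an API-format workflow for string inputs that look like weight files. The sketch below implements that reading; the suffix list and the function body are assumptions, not the project's implementation.

```python
# File suffixes treated as model weights in this sketch (an assumption).
MODEL_SUFFIXES = (".safetensors", ".ckpt", ".pt", ".gguf")


def walk_workflow_for_models(workflow: dict) -> set:
    """Collect every string input that looks like a weight filename."""
    found = set()
    for node in workflow.values():
        for value in node.get("inputs", {}).values():
            # Links between nodes are lists such as ["4", 1]; only strings qualify.
            if isinstance(value, str) and value.lower().endswith(MODEL_SUFFIXES):
                found.add(value)
    return found
```

The resulting set could then be compared against a registry to decide which files still need to be fetched.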

Design

Theme: Topaz Cinema Slate. Slate substrate #1A1F26, warm amber accent #E0A458 used sparingly, IBM Plex Sans throughout. Defined as _TOPAZ_THEME + _CUSTOM_CSS in app.py.

Layout: hamburger drawer. Pinned 220 px sidebar at ≥1024 px (mode buttons + model status + settings); below 1024 px it slides in as a fixed overlay via the .aio-shell.drawer-open class. The header carries a live mode tag (T2V/A2V/I2V/LIPSYNC/KEY/STYLE) updated by JS without a server round-trip.

Spec, plan, and design rationale live under docs/superpowers/specs/ and docs/superpowers/plans/.


Notes on running

  • First inference is slow. Cold-start workflow validation + model load on the active node graph takes ~30 – 90 s. Subsequent calls within the same session reuse loaded models.
  • VRAM tier is auto-detected; override with LTX23_AIO_VRAM=lowvram|normalvram|highvram.
  • ZeroGPU duration cap. The per-call estimator clamps to [60, 900] s. If a generation aborts with "GPU task aborted", the handler retries once at 2× duration. The duration field is the queue-priority signal, not a billing cap.
  • Output directory. Local: comfyui/output/LTX2.3/. Spaces: ~/comfyui/output/LTX2.3/. Both are whitelisted via allowed_paths= on launch (Gradio 5 file-access policy).
  • Local LAN testing. Bound to 0.0.0.0:7860. macOS firewall: allow inbound connections for python if a connection from your phone is refused.
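The output-directory and LAN notes above can be combined into one hedged sketch of the launch arguments. Gradio's launch() accepts server_name, server_port, and allowed_paths. Detecting Spaces through the SPACE_ID environment variable, and the output_dir and launch_kwargs helpers, are assumptions about how the app might be wired.

```python
import os
from pathlib import Path


def output_dir(env=None) -> Path:
    """Pick the output folder: ~/comfyui/... on Spaces, ./comfyui/... locally."""
    env = os.environ if env is None else env
    # Assumption: a non-empty SPACE_ID means the app is running on HF Spaces.
    root = Path.home() / "comfyui" if env.get("SPACE_ID") else Path("comfyui")
    return root / "output" / "LTX2.3"


def launch_kwargs(env=None) -> dict:
    """Build keyword arguments for a Gradio launch() call."""
    return {
        "server_name": "0.0.0.0",  # reachable from other devices on the LAN
        "server_port": 7860,
        "allowed_paths": [str(output_dir(env).resolve())],
    }


# Usage with a gr.Blocks instance named demo (not created in this sketch):
#     demo.launch(**launch_kwargs())
```

Resolving the path before passing it keeps the allow-list entry absolute regardless of the working directory at launch time.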

License

MIT for the AIO app code (see LICENSE).

  • ComfyUI is GPL-3.0.
  • LTX-2.3 and Lightricks-published LoRAs / auxiliaries retain Lightricks' open-source licensing; see the individual model cards on Hugging Face.
  • Gemma 3 weights are subject to Google's Gemma Terms of Use.
  • Each pinned custom node retains its own license; see the linked repositories.

Credits

Built by @techfreakworm. Drop a ♥ on the Space if it's useful, and follow there for what's next.