Space: luh0502/NeAR (Running on Zero)

luh1124 committed 26 days ago

Commit: c98c836

Parent: 1dcc01e

app: CPU preload Hunyuan+NeAR, cuda move in GPU callbacks; drop gsplat warmup

- Background thread loads geometry and NeAR on CPU (NEAR_MODEL_CPU_PRELOAD_AT_START).
- ensure_geometry_on_cuda / ensure_near_on_cuda for first GPU use.
- Remove _warmup_gsplat_rasterization and ensure_gsplat_ready.
- Simplify app_gsplat; gsplat wheel pip uses --no-deps; update docs and tests.

Made-with: Cursor

Files changed (7)

DEPLOY_HF_SPACE.md +2 -7
README.md +1 -1
app.py +86 -312
app_gsplat.py +74 -190
requirements.txt +19 -20
tests/test_app_architecture.py +10 -4
tests/test_app_gsplat_architecture.py +4 -4

DEPLOY_HF_SPACE.md CHANGED

@@ -60,10 +60,7 @@ If you maintain a separate template tree (e.g. `NeAR_space`), copy changes **int
 - **`import spaces`** (optional `try/except` for local runs without the package).
 - Decorate **every Gradio callback that uses CUDA** with **`@spaces.GPU`** (same as [E-RayZer](https://huggingface.co/spaces/qitaoz/E-RayZer): no `duration=` in app code — platform defaults apply). This repo aliases it as **`GPU`** in `app.py` and uses **`@GPU`**; locally, without the `spaces` package, it is a no-op. The decorator is effectively a no-op off ZeroGPU per HF docs.
 - Keep **page-load defaults and HDRI preview off the heavy model path**. This repo now uses a lightweight CPU image-preprocess path and a CPU-only HDRI preview path, so first page load no longer triggers full model initialization.
-- **Lazy-load** large models **inside** GPU callbacks. This repo now splits loading by responsibility:
-  - **`ensure_geometry_pipeline()`** for Hunyuan3D mesh generation
-  - **`ensure_near_pipeline()`** for NeAR SLaT/render/export
-  - **`ensure_gsplat_ready()`** only before the first real render/export path
+- **Model init**: a background thread (when **`NEAR_MODEL_CPU_PRELOAD_AT_START=1`**) loads **Hunyuan + NeAR** on **CPU** at process start (no GPU lease). **`@spaces.GPU`** callbacks call **`ensure_geometry_on_cuda()`** / **`ensure_near_on_cuda()`** to move weights to CUDA once, then run inference. **gsplat** is exercised only when the pipeline renders (first call may still JIT if no prebuilt wheel).
 - **Space Variables**: at the top of `app.py` (before `import spaces`), **`NEAR_ZEROGPU_MAX_SECONDS`** / **`NEAR_ZEROGPU_DURATION_CAP`** are **rewritten in `os.environ`** if they exceed **`NEAR_ZEROGPU_HF_CEILING_S`** (default **90**, max **120**) so values like `300` cannot break the Hub runtime. This does not set per-callback `duration` in Python; it only clamps env vars HF may read.
 ### 2b1. Recommended runtime variable matrix
@@ -75,7 +72,6 @@ If you maintain a separate template tree (e.g. `NeAR_space`), copy changes **int
 | `NEAR_DINO_REPO_SUBDIR` | `dinov2` | `dinov2` |
 | `NEAR_DINO_MODEL_ID` | leave unset unless your mirror renames hubconf entries | leave unset unless your mirror renames hubconf entries |
 | `NEAR_DINO_FILENAME` | optional validation file inside the mirror repo | optional validation file inside the mirror repo |
-| `NEAR_GSPLAT_WARMUP` | `0` | `1` |
 | `NEAR_GSPLAT_SOURCE_SPEC` | unset unless you have a proven build path | optional if you want build-time source compile |
 | `NEAR_ZEROGPU_HF_CEILING_S` | `90` | tune to your tier |
 | `NEAR_HYSHAPE_GEOMETRY_CPU_PRELOAD_AT_START` | `1` when Space entry is **`app_hyshape.py`** (default: background thread runs `from_pretrained(..., device="cpu")` at startup — **no** `@spaces.GPU`) | `0` to defer CPU load until the first **Generate Mesh** click (inside the GPU callback; longer first click) |
@@ -104,7 +100,6 @@ Similar to [E-RayZer](https://huggingface.co/spaces/qitaoz/E-RayZer), the first
 |----------|---------|
 | **`NEAR_GSPLAT_WHEEL_URL`** | If set, `app.py` / `app_gsplat.py` runs `pip install --force-reinstall` on this URL **before** importing gsplat/trellis. Use when you host a **cp310 manylinux** wheel on `near-wheels` built against **your exact** PyTorch/CUDA pin (official gsplat wheels top out at **pt2.4+cu124**; see `requirements.txt`). |
 | **`NEAR_GSPLAT_SOURCE_SPEC`** | If set, `app.py` runs `pip install --no-build-isolation` on this spec **before** importing `trellis` (e.g. `./third_party/gsplat` after vendoring, or `git+https://github.com/nerfstudio-project/gsplat.git@<tag>`). Needs **nvcc** — usually absent on the default Gradio builder. |
-| **`NEAR_GSPLAT_WARMUP`** | Default **on** (`1`). After models load on CUDA, runs one **tiny** `RGB+ED` raster pass so the first user preview/video is less likely to hit JIT alone. Set to **`0`** if the extra time risks **ZeroGPU** timeout on the **first** GPU callback. |
 Alternatively, pin a **VCS** gsplat line in `requirements.txt` (e.g. `gsplat @ git+https://...`) so the **Space build** step compiles once; no `NEAR_GSPLAT_SOURCE_SPEC` needed.
@@ -116,7 +111,7 @@ Alternatively, pin a **VCS** gsplat line in `requirements.txt` (e.g. `gsplat @ g
 **Trade-offs vs fixed GPU + Docker**
 - **ZeroGPU + Gradio builder**: the image may **not** include a full CUDA toolkit (`nvcc`, `CUDA_HOME`). **`git+…nvdiffrast`** (source install) often fails here. This repo uses a **prebuilt `nvdiffrast` wheel** URL in `requirements.txt` (see **§5**) so the builder only downloads a wheel. If that wheel is ABI-incompatible with our PyTorch pin, build your own wheel or add a **`Dockerfile`** with `nvidia/cuda:*-devel` and fixed GPU.
-- **Quota**: visitors consume **daily GPU seconds**. This repo keeps plain `@spaces.GPU` in Python and tunes runtime behavior through environment variables such as `NEAR_GSPLAT_WARMUP` and `NEAR_ZEROGPU_HF_CEILING_S`, rather than setting `duration=` in app code.
+- **Quota**: visitors consume **daily GPU seconds**. This repo keeps plain `@spaces.GPU` in Python and clamps **`NEAR_ZEROGPU_*`** via **`NEAR_ZEROGPU_HF_CEILING_S`** rather than setting `duration=` in app code.
 ### 2c. Example gallery empty on the Space (`assets/` “not deployed”)
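
The **Space Variables** clamp described in the `DEPLOY_HF_SPACE.md` text can be illustrated with a small sketch. It relies only on what the docs state (ceiling default 90, maximum 120, and the two `NEAR_ZEROGPU_*` names); the function name `clamp_zerogpu_env`, the lower bound of 1, and the handling of non-integer values are assumptions, and the actual logic in `app.py` may differ.

```python
from typing import MutableMapping

# Variables the docs say are rewritten when they exceed the ceiling.
_CLAMPED_KEYS = ("NEAR_ZEROGPU_MAX_SECONDS", "NEAR_ZEROGPU_DURATION_CAP")


def clamp_zerogpu_env(env: MutableMapping[str, str]) -> int:
    """Clamp the NEAR_ZEROGPU_* values in `env` and return the ceiling used."""
    try:
        ceiling = int(env.get("NEAR_ZEROGPU_HF_CEILING_S", "90"))
    except ValueError:
        ceiling = 90  # fall back to the documented default
    # Documented maximum is 120; the lower bound of 1 is an assumption.
    ceiling = max(1, min(ceiling, 120))
    for key in _CLAMPED_KEYS:
        raw = env.get(key)
        if raw is None:
            continue
        try:
            if int(raw) > ceiling:
                env[key] = str(ceiling)
        except ValueError:
            pass  # leave unparsable values untouched
    return ceiling
```

In the app this kind of helper would be called as `clamp_zerogpu_env(os.environ)` before `import spaces`, since the docs note the rewrite has to happen first.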

README.md CHANGED

@@ -51,7 +51,7 @@ This repository combines:
 - The Space is temporarily pointed at **`app_gsplat.py`** (gsplat **RGB+ED** raster only) to test JIT / ZeroGPU without NeAR or Hunyuan. Switch **`app_file`** to **`app_hyshape.py`**, **`app.py`**, etc. in the YAML header above as needed.
 - **`app_hyshape.py`** (when used as entry): defaults to **`NEAR_HYSHAPE_GEOMETRY_CPU_PRELOAD_AT_START=1`** — background **CPU** Hunyuan load at start; **Generate Mesh** pays **GPU move + inference** in `@spaces.GPU`.
 - The full `app.py` Space keeps **page-load image defaults** and **HDRI preview** on lightweight CPU paths so the first page visit does not spend the first ZeroGPU allocation on model initialization.
-- Runtime loading is split by responsibility: **Hunyuan3D geometry** is loaded only for mesh generation, **NeAR relighting** is loaded only for SLaT/render/export, and **gsplat warmup** is delayed until the first real render.
+- **`app.py`**: optional background **CPU** preload of Hunyuan + NeAR (`NEAR_MODEL_CPU_PRELOAD_AT_START`); **`@spaces.GPU`** callbacks move each pipeline to CUDA once, then run inference. **gsplat** is used when the pipeline renders (no separate app-level warmup pass).
 - Binary wheels and mirrored auxiliary assets are stored separately:
   - **`luh0502/near-wheels`**: prebuilt wheels such as `nvdiffrast` and optional future `gsplat` wheels
   - **`luh0502/near-assets`**: torch.hub-compatible mirrored auxiliary assets such as the DINOv2 repo used by NeAR/TRELLIS image-conditioning
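
The `@spaces.GPU` callbacks mentioned in the README bullets go through an alias that `app.py` defines as `GPU = spaces.GPU if spaces is not None else (lambda f: f)`. A small sketch of that optional-import pattern is shown here; `double_on_gpu` is a hypothetical callback used only for illustration, and its body does no real GPU work.

```python
# Optional import: the `spaces` package is available on Hugging Face Spaces,
# but may be missing on a local machine.
try:
    import spaces
except Exception:
    spaces = None

# Same alias as in app.py: the real decorator when available, identity otherwise.
GPU = spaces.GPU if spaces is not None else (lambda f: f)


@GPU
def double_on_gpu(x: int) -> int:
    """Hypothetical callback; a real one would run CUDA inference here."""
    return x * 2
```

Keeping the fallback as a plain identity function means the same module can be imported and tested locally without the `spaces` package installed.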

app.py CHANGED

@@ -1,20 +1,12 @@
 import os
 import sys
-# transformers/huggingface_hub authenticate gated repos via HF_TOKEN (or HUGGING_FACE_HUB_TOKEN).
-# Space secrets become env vars named exactly like the secret: a secret named "near" sets "near",
-# not HF_TOKEN, so Hub calls stay anonymous until we mirror it here.
 if not os.environ.get("HF_TOKEN") and not os.environ.get("HUGGING_FACE_HUB_TOKEN"):
     _hub_tok = (os.environ.get("near") or os.environ.get("NEAR") or "").strip()
     if _hub_tok:
         os.environ["HF_TOKEN"] = _hub_tok
-        print(
-            "[NeAR] HF_TOKEN unset; using Space secret 'near' as HF_TOKEN. "
-            "Prefer renaming that secret to HF_TOKEN (standard Hub env name).",
-            flush=True,
-        )
-# ZeroGPU: must run before `import spaces`. Space Variables often leave NEAR_* at 300/1800; HF still rejects those.
 try:
     _raw_zerogpu_cap = int(os.environ.get("NEAR_ZEROGPU_HF_CEILING_S", "90"))
 except ValueError:
@@ -27,12 +19,7 @@ for _ek in ("NEAR_ZEROGPU_MAX_SECONDS", "NEAR_ZEROGPU_DURATION_CAP"):
                 os.environ[_ek] = str(_ZEROGPU_ENV_CAP_S)
         except ValueError:
             pass
-print(
-    f"[NeAR] ZeroGPU: NEAR_ZEROGPU_MAX_SECONDS / NEAR_ZEROGPU_DURATION_CAP clamped to cap {_ZEROGPU_ENV_CAP_S}s "
-    f"(adjust NEAR_ZEROGPU_HF_CEILING_S up to 120 if your tier allows). "
-    f"Gradio callbacks use plain spaces.GPU (platform default duration).",
-    flush=True,
-)
 import shutil
 import subprocess
@@ -71,72 +58,12 @@ except Exception:
 sys.path.insert(0, "./hy3dshape")
 os.environ.setdefault("ATTN_BACKEND", "xformers")
 os.environ.setdefault("SPCONV_ALGO", "native")
-# Cloud GPUs (T4, A10G, L4, …) vs H100: override with TORCH_CUDA_ARCH_LIST if cutlass/spconv complains.
 os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "7.5;8.0;8.6;8.9;9.0")
-def _maybe_reinstall_gsplat_wheel() -> None:
-    """Optional: force-install a prebuilt gsplat wheel (e.g. from ``near-wheels``) before trellis import."""
-    url = (os.environ.get("NEAR_GSPLAT_WHEEL_URL") or "").strip()
-    if not url:
-        return
-    cmd = [
-        sys.executable,
-        "-m",
-        "pip",
-        "install",
-        "--no-cache-dir",
-        "--force-reinstall",
-        url,
-    ]
-    print(f"[NeAR] NEAR_GSPLAT_WHEEL_URL set; pip install: {url}", flush=True)
-    r = subprocess.run(cmd, check=False)
-    if r.returncode != 0:
-        print(
-            f"[NeAR] WARNING: gsplat wheel install failed (exit {r.returncode}).",
-            flush=True,
-        )
-def _maybe_reinstall_gsplat_from_source() -> None:
-    """Optional pip install before importing trellis (E-RayZer-style source build).
-    Set ``NEAR_GSPLAT_SOURCE_SPEC`` to a pip requirement, e.g.:
-    ``./third_party/gsplat`` or ``git+https://github.com/nerfstudio-project/gsplat.git@v1.5.3``
-    Compiles CUDA extensions at container start instead of on first rasterization.
-    """
-    spec = (os.environ.get("NEAR_GSPLAT_SOURCE_SPEC") or "").strip()
-    if not spec:
-        return
-    cmd = [
-        sys.executable,
-        "-m",
-        "pip",
-        "install",
-        "--no-build-isolation",
-        "--no-cache-dir",
-        spec,
-    ]
-    print(f"[NeAR] NEAR_GSPLAT_SOURCE_SPEC set; building gsplat via: {' '.join(cmd)}", flush=True)
-    r = subprocess.run(cmd, check=False)
-    if r.returncode != 0:
-        print(
-            f"[NeAR] WARNING: gsplat install from NEAR_GSPLAT_SOURCE_SPEC failed (exit {r.returncode}).",
-            flush=True,
-        )
-_maybe_reinstall_gsplat_wheel()
-_maybe_reinstall_gsplat_from_source()
 from trellis.pipelines import NeARImageToRelightable3DPipeline
 from hy3dshape.pipelines import Hunyuan3DDiTFlowMatchingPipeline  # pyright: ignore[reportMissingImports]
-# Hugging Face ZeroGPU: same style as E-RayZer — bare ``spaces.GPU`` (no custom duration in app code).
 GPU = spaces.GPU if spaces is not None else (lambda f: f)
 APP_DIR = Path(__file__).resolve().parent
@@ -145,8 +72,6 @@ CACHE_DIR.mkdir(exist_ok=True)
 def _path_is_git_lfs_pointer(p: Path) -> bool:
-    """True if this path is a tiny Git LFS pointer file (real media was never smudged / pushed)."""
     try:
         if not p.is_file():
             return False
@@ -159,8 +84,6 @@ def _path_is_git_lfs_pointer(p: Path) -> bool:
 def _warn_example_assets() -> None:
-    """Log once if bundled examples are missing or still LFS pointer stubs (common Space deploy issue)."""
     img_dir = APP_DIR / "assets/example_image"
     if not img_dir.is_dir():
         print(
@@ -186,10 +109,6 @@ DEFAULT_PORT = 7860
 MAX_SEED = np.iinfo(np.int32).max
-# ---------------------------------------------------------------------------
-# Session helpers
-# ---------------------------------------------------------------------------
 _SESSION_LAST_TOUCH: Dict[str, float] = {}
 _SESSION_TOUCH_LOCK = threading.Lock()
@@ -212,7 +131,6 @@ def ensure_session_dir(req: Optional[gr.Request]) -> Path:
     return d
-@GPU
 def clear_session_dir(req: Optional[gr.Request]) -> str:
     d = ensure_session_dir(req)
     shutil.rmtree(d, ignore_errors=True)
@@ -230,8 +148,6 @@ def end_session(req: gr.Request):
 def _session_dir_latest_mtime(path: Path) -> float:
-    """Latest mtime among path and all nested files (best-effort for 'last activity')."""
     try:
         latest = path.stat().st_mtime
     except OSError:
@@ -321,21 +237,30 @@ def get_file_path(file_obj: Any) -> Optional[str]:
     return None
-# ---------------------------------------------------------------------------
-# Model loading (lazy — ZeroGPU may have no CUDA until @spaces.GPU runs)
-# ---------------------------------------------------------------------------
 _model_lock = threading.Lock()
 PIPELINE: Optional[NeARImageToRelightable3DPipeline] = None
 GEOMETRY_PIPELINE: Optional[Hunyuan3DDiTFlowMatchingPipeline] = None
 _light_preprocess_lock = threading.Lock()
 _light_preprocessor: Any | None = None
-_gsplat_warmup_done = False
-# Dropdown defaults before lazy load; use allow_custom_value for full OCIO view names.
 _FALLBACK_TONE_MAPPER_CHOICES = ["AgX", "False", "Khronos neutrals", "Filmic", "Khronos glTF PBR"]
 def _default_tone_mapper_choices() -> list[str]:
     try:
         views = getattr(ToneMapper(), "available_views", None)
@@ -349,75 +274,6 @@ def _default_tone_mapper_choices() -> list[str]:
 TONE_MAPPER_CHOICES = _default_tone_mapper_choices()
-def _warmup_gsplat_rasterization(device: str) -> None:
-    """One tiny RGB+ED raster pass so first user render does not pay JIT alone.
-    Matches channel layout used in ``trellis/renderers/gaussian_render.py``.
-    Disable with ``NEAR_GSPLAT_WARMUP=0`` if ZeroGPU budget is too tight.
-    """
-    if device != "cuda":
-        return
-    if os.environ.get("NEAR_GSPLAT_WARMUP", "1").strip().lower() in ("0", "false", "no", "off"):
-        return
-    try:
-        from gsplat.rendering import rasterization as _gsplat_rasterization
-    except Exception as exc:
-        print(f"[NeAR] gsplat warmup skipped (import): {exc}", flush=True)
-        return
-    dev = torch.device("cuda")
-    t_w = time.time()
-    n, h, w = 1, 64, 64
-    try:
-        means = torch.zeros(n, 3, device=dev, dtype=torch.float32)
-        quats = torch.tensor([[1.0, 0.0, 0.0, 0.0]], device=dev, dtype=torch.float32)
-        scales = torch.ones(n, 3, device=dev, dtype=torch.float32) * 0.02
-        opacities = torch.ones(n, device=dev, dtype=torch.float32)
-        colors = torch.ones(n, 8, device=dev, dtype=torch.float32)
-        viewmat = torch.eye(4, device=dev, dtype=torch.float32)
-        k = torch.tensor(
-            [[80.0, 0.0, w * 0.5], [0.0, 80.0, h * 0.5], [0.0, 0.0, 1.0]],
-            device=dev,
-            dtype=torch.float32,
-        )
-        backgrounds = torch.zeros(1, 9, device=dev, dtype=torch.float32)
-        _gsplat_rasterization(
-            means=means,
-            quats=quats,
-            scales=scales,
-            opacities=opacities,
-            colors=colors,
-            viewmats=viewmat[None],
-            Ks=k[None],
-            backgrounds=backgrounds,
-            width=w,
-            height=h,
-            near_plane=0.01,
-            far_plane=100.0,
-            distributed=False,
-            render_mode="RGB+ED",
-            rasterize_mode="antialiased",
-            packed=False,
-        )
-        torch.cuda.synchronize()
-        print(f"[NeAR] gsplat warmup (RGB+ED {w}x{h}) done in {time.time() - t_w:.1f}s", flush=True)
-    except Exception as exc:
-        print(f"[NeAR] gsplat warmup skipped (rasterization): {exc}", flush=True)
-def _runtime_device() -> str:
-    return "cuda" if torch.cuda.is_available() else "cpu"
-def _log_timing(stage: str, started_at: float, **extra: object) -> float:
-    elapsed = time.time() - started_at
-    details = ", ".join(f"{k}={v}" for k, v in extra.items() if v is not None)
-    suffix = f" ({details})" if details else ""
-    print(f"[NeAR] timing {stage}: {elapsed:.1f}s{suffix}", flush=True)
-    return elapsed
 def _get_light_image_preprocessor():
     global _light_preprocessor
     if _light_preprocessor is not None:
@@ -427,12 +283,11 @@ def _get_light_image_preprocessor():
             from hy3dshape.rembg import BackgroundRemover  # pyright: ignore[reportMissingImports]
             _light_preprocessor = BackgroundRemover()
-            print("[NeAR] Background remover ready for lightweight image preprocessing.", flush=True)
     return _light_preprocessor
 def _preprocess_image_rgba_light(input_image: Image.Image) -> Image.Image:
-    """Background-remove, crop, and resize without loading the NeAR pipeline."""
     image = _ensure_rgba(input_image)
     has_alpha = False
     if image.mode == "RGBA":
@@ -488,69 +343,79 @@ def _update_tone_mapper_choices(tone_mapper: Any) -> None:
         TONE_MAPPER_CHOICES = [str(v) for v in views]
-def ensure_geometry_pipeline() -> Hunyuan3DDiTFlowMatchingPipeline:
     global GEOMETRY_PIPELINE
     if GEOMETRY_PIPELINE is not None:
-        return GEOMETRY_PIPELINE
-    with _model_lock:
-        if GEOMETRY_PIPELINE is not None:
-            return GEOMETRY_PIPELINE
-        device = _runtime_device()
-        hy_id = os.environ.get("NEAR_HUNYUAN_PRETRAINED", "tencent/Hunyuan3D-2.1")
-        t0 = time.time()
-        print("[NeAR] Loading Hunyuan3D geometry pipeline...", flush=True)
-        gp = Hunyuan3DDiTFlowMatchingPipeline.from_pretrained(hy_id, device="cpu")
-        print(f"[NeAR] Hunyuan3D from_pretrained (cpu): {time.time() - t0:.1f}s", flush=True)
-        t_move = time.time()
-        gp.to(device)
-        print(f"[NeAR] Hunyuan3D moved to {device} in {time.time() - t_move:.1f}s", flush=True)
-        GEOMETRY_PIPELINE = gp
-        print(f"[NeAR] Geometry pipeline ready on {device} ({time.time() - t0:.1f}s total).", flush=True)
-        return GEOMETRY_PIPELINE
-def ensure_near_pipeline() -> NeARImageToRelightable3DPipeline:
     global PIPELINE
     if PIPELINE is not None:
-        return PIPELINE
     with _model_lock:
-        if PIPELINE is not None:
-            return PIPELINE
-        device = _runtime_device()
-        # briaai/RMBG-2.0 is gated: accept the license on the model card, then add HF_TOKEN
-        # (read) in Space Settings -> Secrets. Never commit tokens into git.
-        near_id = os.environ.get("NEAR_PRETRAINED", "luh0502/NeAR")
-        t0 = time.time()
-        print(f"[NeAR] Loading NeAR relighting pipeline from {near_id!r} on target {device}...", flush=True)
-        t_stage = time.time()
-        pipeline = NeARImageToRelightable3DPipeline.from_pretrained(near_id)
-        _log_timing("ensure_near_pipeline.from_pretrained", t_stage, repo=near_id)
-        t_move = time.time()
-        pipeline.to(device)
-        _log_timing("ensure_near_pipeline.to_device", t_move, device=device)
-        PIPELINE = pipeline
-        _update_tone_mapper_choices(pipeline.tone_mapper)
-        _log_timing("ensure_near_pipeline.total", t0, device=device)
-        return PIPELINE
-def ensure_gsplat_ready() -> None:
-    global _gsplat_warmup_done
-    if _gsplat_warmup_done:
-        return
     with _model_lock:
-        if _gsplat_warmup_done:
-            return
-        device = _runtime_device()
         t0 = time.time()
-        print(f"[NeAR] Preparing gsplat on {device}...", flush=True)
-        _warmup_gsplat_rasterization(device)
-        _gsplat_warmup_done = True
-        _log_timing("ensure_gsplat_ready.total", t0, device=device)
 def set_tone_mapper(view_name: str):
-    pipeline = ensure_near_pipeline()
     if view_name:
         pipeline.setup_tone_mapper(view_name)
     return pipeline
@@ -577,7 +442,6 @@ def switch_asset_source(mode: str):
 def _ensure_rgba(img: Image.Image) -> Image.Image:
-    """Normalize to RGBA so alpha is preserved for mesh (white matte) vs SLaT (black matte)."""
     if img.mode == "RGBA":
         return img
     if img.mode == "RGB":
@@ -602,10 +466,6 @@ def save_slat_npz(slat, save_path: Path):
     )
-# ---------------------------------------------------------------------------
-# Core pipeline functions
-# ---------------------------------------------------------------------------
 @GPU
 @torch.inference_mode()
 def generate_mesh(
@@ -613,10 +473,7 @@ def generate_mesh(
     req: gr.Request,
     progress=gr.Progress(track_tqdm=True),
 ):
-    """Step ①: generate Hunyuan3D geometry from an already preprocessed image.
-    Returns: (state, mesh_glb_path, status)
-    """
-    geometry_pipeline = ensure_geometry_pipeline()
     session_dir = ensure_session_dir(req)
     if image_input is None:
@@ -657,7 +514,7 @@ def generate_slat(
     req: gr.Request,
     progress=gr.Progress(track_tqdm=True),
 ):
-    pipeline = ensure_near_pipeline()
     session_dir = ensure_session_dir(req)
     if not asset_state or not asset_state.get("mesh_path"):
@@ -749,30 +606,17 @@ def render_preview(
     t0 = time.time()
     session_dir = ensure_session_dir(req)
     progress(0.1, desc="Loading SLaT and HDRI")
-    t_load = time.time()
     pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
-    load_elapsed = _log_timing("render_preview.load_asset_and_hdri", t_load)
-    t_gsplat = time.time()
-    ensure_gsplat_ready()
-    gsplat_elapsed = _log_timing("render_preview.ensure_gsplat_ready", t_gsplat)
     progress(0.5, desc="Rendering")
-    t_render = time.time()
     views = pipeline.render_view(
         slat, hdri_np,
         yaw_deg=yaw, pitch_deg=pitch, fov=fov, radius=radius,
         hdri_rot_deg=hdri_rot, resolution=int(resolution),
     )
-    render_elapsed = _log_timing("render_preview.render_view", t_render, resolution=int(resolution))
     for key, image in views.items():
         image.save(session_dir / f"preview_{key}.png")
-    _log_timing(
-        "render_preview.total",
-        t0,
-        load_asset_and_hdri=f"{load_elapsed:.1f}s",
-        ensure_gsplat_ready=f"{gsplat_elapsed:.1f}s",
-        render_view=f"{render_elapsed:.1f}s",
-    )
     msg = (
         f"**Preview done** — "
@@ -808,32 +652,18 @@ def render_camera_video(
     t0 = time.time()
     session_dir = ensure_session_dir(req)
     progress(0.1, desc="Loading SLaT and HDRI")
-    t_load = time.time()
     pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
-    load_elapsed = _log_timing("render_camera_video.load_asset_and_hdri", t_load)
-    t_gsplat = time.time()
-    ensure_gsplat_ready()
-    gsplat_elapsed = _log_timing("render_camera_video.ensure_gsplat_ready", t_gsplat)
     progress(0.4, desc="Rendering camera path")
-    t_render = time.time()
     frames = pipeline.render_camera_path_video(
         slat, hdri_np,
         num_views=int(num_views), fov=fov, radius=radius,
         hdri_rot_deg=hdri_rot, full_video=full_video, shadow_video=shadow_video,
         bg_color=(1, 1, 1), verbose=True,
     )
-    render_elapsed = _log_timing("render_camera_video.render_path", t_render, num_views=int(num_views))
     video_path = session_dir / ("camera_path_full.mp4" if full_video else "camera_path.mp4")
     imageio.mimsave(video_path, frames, fps=int(fps))
-    _log_timing(
-        "render_camera_video.total",
-        t0,
-        load_asset_and_hdri=f"{load_elapsed:.1f}s",
-        ensure_gsplat_ready=f"{gsplat_elapsed:.1f}s",
-        render_path=f"{render_elapsed:.1f}s",
-        fps=int(fps),
-    )
     return str(video_path), f"**Camera path video saved**"
@@ -857,34 +687,20 @@ def render_hdri_video(
     t0 = time.time()
     session_dir = ensure_session_dir(req)
     progress(0.1, desc="Loading SLaT and HDRI")
-    t_load = time.time()
     pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
-    load_elapsed = _log_timing("render_hdri_video.load_asset_and_hdri", t_load)
-    t_gsplat = time.time()
-    ensure_gsplat_ready()
-    gsplat_elapsed = _log_timing("render_hdri_video.ensure_gsplat_ready", t_gsplat)
     progress(0.4, desc="Rendering HDRI rotation")
-    t_render = time.time()
     hdri_roll_frames, render_frames = pipeline.render_hdri_rotation_video(
         slat, hdri_np,
         num_frames=int(num_frames), yaw_deg=yaw, pitch_deg=pitch,
         fov=fov, radius=radius, full_video=full_video, shadow_video=shadow_video,
         bg_color=(1, 1, 1), verbose=True,
     )
-    render_elapsed = _log_timing("render_hdri_video.render_rotation", t_render, num_frames=int(num_frames))
     hdri_roll_path = session_dir / "hdri_roll.mp4"
     render_path = session_dir / ("hdri_rotation_full.mp4" if full_video else "hdri_rotation.mp4")
     imageio.mimsave(hdri_roll_path, hdri_roll_frames, fps=int(fps))
     imageio.mimsave(render_path, render_frames, fps=int(fps))
-    _log_timing(
-        "render_hdri_video.total",
-        t0,
-        load_asset_and_hdri=f"{load_elapsed:.1f}s",
-        ensure_gsplat_ready=f"{gsplat_elapsed:.1f}s",
-        render_rotation=f"{render_elapsed:.1f}s",
-        fps=int(fps),
-    )
     return str(hdri_roll_path), str(render_path), "**HDRI rotation video saved**"
@@ -899,46 +715,24 @@ def export_glb(
     req: gr.Request,
     progress=gr.Progress(track_tqdm=True),
 ):
-    """Returns: (glb_path, status)"""
     t0 = time.time()
     session_dir = ensure_session_dir(req)
     progress(0.1, desc="Loading SLaT and HDRI")
-    t_load = time.time()
     pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
-    load_elapsed = _log_timing("export_glb.load_asset_and_hdri", t_load)
-    t_gsplat = time.time()
-    ensure_gsplat_ready()
-    gsplat_elapsed = _log_timing("export_glb.ensure_gsplat_ready", t_gsplat)
     progress(0.6, desc="Baking PBR textures")
-    t_export = time.time()
     glb = pipeline.export_glb_from_slat(
         slat, hdri_np,
         hdri_rot_deg=hdri_rot, base_mesh=None,
         simplify=simplify, texture_size=int(texture_size), fill_holes=True,
     )
-    export_elapsed = _log_timing(
-        "export_glb.export_glb_from_slat",
-        t_export,
-        texture_size=int(texture_size),
-    )
     glb_path = session_dir / "near_pbr.glb"
     glb.export(glb_path)
-    _log_timing(
-        "export_glb.total",
-        t0,
-        load_asset_and_hdri=f"{load_elapsed:.1f}s",
-        ensure_gsplat_ready=f"{gsplat_elapsed:.1f}s",
-        export_glb_from_slat=f"{export_elapsed:.1f}s",
-    )
     return str(glb_path), f"PBR GLB exported: **{glb_path.name}**"
-# ---------------------------------------------------------------------------
-# CSS
-# ---------------------------------------------------------------------------
 CUSTOM_CSS = """
-/* Use full browser width (was max-width:1600px leaving empty margin on the right) */
 .gradio-container { max-width: 100% !important; width: 100% !important; }
 main.gradio-container { max-width: 100% !important; }
 .gradio-wrap { max-width: 100% !important; }
@@ -1163,13 +957,9 @@ NEAR_GRADIO_THEME = gr.themes.Base(
 )
-# ---------------------------------------------------------------------------
-# UI
-# ---------------------------------------------------------------------------
 def build_app() -> gr.Blocks:
     with gr.Blocks(
         title="NeAR",
-        # (600, 600) deletes Gradio /tmp/gradio uploads in ~10m while the UI may still reference paths.
         delete_cache=None,
         fill_width=True,
     ) as demo:
@@ -1208,9 +998,6 @@ def build_app() -> gr.Blocks:
         with gr.Row(equal_height=False):
-            # ════════════════════════════════════════════════════════════════
-            # LEFT — controls only (TRELLIS-style narrow column)
-            # ════════════════════════════════════════════════════════════════
             with gr.Column(scale=1, min_width=360):
                 with gr.Group():
@@ -1242,10 +1029,6 @@ def build_app() -> gr.Blocks:
                     slat_button = gr.Button(
                         "② Generate / Load SLaT", variant="primary", min_width=100,
                     )
-                    # gr.HTML(
-                        # "<div style='font-size:0.78rem;color:#9ca3af;margin-top:0.2rem;'>"
-                        # "Image mode: run ① then ②. SLaT mode: ② loads file directly.</div>"
-                    # )
                 with gr.Group():
                     gr.HTML('<p class="section-kicker">HDRI</p>')
@@ -1277,9 +1060,6 @@ def build_app() -> gr.Blocks:
                 with gr.Row():
                     clear_button = gr.Button("Clear Cache", variant="secondary", min_width=100)
-            # ════════════════════════════════════════════════════════════════
-            # CENTER — status at top, then Camera & HDRI, then tabs
-            # ════════════════════════════════════════════════════════════════
             with gr.Column(scale=10, min_width=560):
                 status_md = gr.Markdown(
@@ -1363,10 +1143,6 @@ def build_app() -> gr.Blocks:
                                 label="HDRI Roll", autoplay=True, loop=True, height=180,
                             )
-            # ════════════════════════════════════════════════════════════════
-            # RIGHT — examples sidebar (TRELLIS-style narrow column)
-            # ════════════════════════════════════════════════════════════════
             with gr.Column(scale=1, min_width=172):
                 with gr.Column(visible=True, elem_classes=["sidebar-examples", "img-gallery"]) as col_img_examples:
                     if _img_ex:
@@ -1403,7 +1179,6 @@ def build_app() -> gr.Blocks:
                     else:
                         gr.Markdown("*No `.exr` examples in `assets/hdris`*")
-        # ── Event wiring ─────────────────────────────────────────────────────
         demo.unload(end_session)
         source_mode.change(switch_asset_source, inputs=[source_mode], outputs=[source_tabs])
@@ -1423,7 +1198,6 @@ def build_app() -> gr.Blocks:
                 outputs=[hdri_preview, status_md],
             )
-        # Same as TRELLIS.2 app.py: only on upload — avoids infinite preprocess loop.
         image_input.upload(
             preprocess_image_only,
             inputs=[image_input],
@@ -1514,11 +1288,11 @@ def _near_launch(*args: Any, **kwargs: Any):
 demo.launch = _near_launch  # type: ignore[method-assign]
 start_tmp_gradio_pruner()
-# ---------------------------------------------------------------------------
-# Entry point
-# ---------------------------------------------------------------------------
 if __name__ == "__main__":
     import argparse

 import os
 import sys
 if not os.environ.get("HF_TOKEN") and not os.environ.get("HUGGING_FACE_HUB_TOKEN"):
     _hub_tok = (os.environ.get("near") or os.environ.get("NEAR") or "").strip()
     if _hub_tok:
         os.environ["HF_TOKEN"] = _hub_tok
+        print("[NeAR] HF_TOKEN from Space secret 'near'.", flush=True)
 try:
     _raw_zerogpu_cap = int(os.environ.get("NEAR_ZEROGPU_HF_CEILING_S", "90"))
 except ValueError:
                 os.environ[_ek] = str(_ZEROGPU_ENV_CAP_S)
         except ValueError:
             pass
+print(f"[NeAR] ZeroGPU cap {_ZEROGPU_ENV_CAP_S}s (NEAR_ZEROGPU_HF_CEILING_S).", flush=True)
 import shutil
 import subprocess
 sys.path.insert(0, "./hy3dshape")
 os.environ.setdefault("ATTN_BACKEND", "xformers")
 os.environ.setdefault("SPCONV_ALGO", "native")
 os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "7.5;8.0;8.6;8.9;9.0")
 from trellis.pipelines import NeARImageToRelightable3DPipeline
 from hy3dshape.pipelines import Hunyuan3DDiTFlowMatchingPipeline  # pyright: ignore[reportMissingImports]
 GPU = spaces.GPU if spaces is not None else (lambda f: f)
 APP_DIR = Path(__file__).resolve().parent
 def _path_is_git_lfs_pointer(p: Path) -> bool:
     try:
         if not p.is_file():
             return False
 def _warn_example_assets() -> None:
     img_dir = APP_DIR / "assets/example_image"
     if not img_dir.is_dir():
         print(
 MAX_SEED = np.iinfo(np.int32).max
 _SESSION_LAST_TOUCH: Dict[str, float] = {}
 _SESSION_TOUCH_LOCK = threading.Lock()
     return d
 def clear_session_dir(req: Optional[gr.Request]) -> str:
     d = ensure_session_dir(req)
     shutil.rmtree(d, ignore_errors=True)
 def _session_dir_latest_mtime(path: Path) -> float:
     try:
         latest = path.stat().st_mtime
     except OSError:
     return None
 _model_lock = threading.Lock()
 PIPELINE: Optional[NeARImageToRelightable3DPipeline] = None
 GEOMETRY_PIPELINE: Optional[Hunyuan3DDiTFlowMatchingPipeline] = None
 _light_preprocess_lock = threading.Lock()
 _light_preprocessor: Any | None = None
+_geometry_on_cuda = False
+_near_on_cuda = False
 _FALLBACK_TONE_MAPPER_CHOICES = ["AgX", "False", "Khronos neutrals", "Filmic", "Khronos glTF PBR"]
+def _truthy_env(name: str, default: str) -> bool:
+    v = (os.environ.get(name) if name in os.environ else default).strip().lower()
+    return v in ("1", "true", "yes", "on")
+_CPU_PRELOAD_AT_START = _truthy_env("NEAR_MODEL_CPU_PRELOAD_AT_START", "1")
+print(
+    f"[NeAR] NEAR_MODEL_CPU_PRELOAD_AT_START={'1' if _CPU_PRELOAD_AT_START else '0'} "
+    "(Hunyuan + NeAR weights on CPU at process start; GPU callbacks only .to(cuda) + infer).",
+    flush=True,
+)
 def _default_tone_mapper_choices() -> list[str]:
     try:
         views = getattr(ToneMapper(), "available_views", None)
 TONE_MAPPER_CHOICES = _default_tone_mapper_choices()
 def _get_light_image_preprocessor():
     global _light_preprocessor
     if _light_preprocessor is not None:
             from hy3dshape.rembg import BackgroundRemover  # pyright: ignore[reportMissingImports]
             _light_preprocessor = BackgroundRemover()
+            print("[NeAR] BackgroundRemover ready.", flush=True)
     return _light_preprocessor
 def _preprocess_image_rgba_light(input_image: Image.Image) -> Image.Image:
     image = _ensure_rgba(input_image)
     has_alpha = False
     if image.mode == "RGBA":
         TONE_MAPPER_CHOICES = [str(v) for v in views]
+def _ensure_geometry_cpu_locked() -> None:
     global GEOMETRY_PIPELINE
     if GEOMETRY_PIPELINE is not None:
+        return
+    hy_id = os.environ.get("NEAR_HUNYUAN_PRETRAINED", "tencent/Hunyuan3D-2.1")
+    t0 = time.time()
+    print(f"[NeAR] Hunyuan geometry on CPU from {hy_id!r}...", flush=True)
+    GEOMETRY_PIPELINE = Hunyuan3DDiTFlowMatchingPipeline.from_pretrained(hy_id, device="cpu")
+    print(f"[NeAR] Hunyuan CPU load {time.time() - t0:.1f}s", flush=True)
+def _ensure_near_cpu_locked() -> None:
     global PIPELINE
     if PIPELINE is not None:
+        return
+    near_id = os.environ.get("NEAR_PRETRAINED", "luh0502/NeAR")
+    t0 = time.time()
+    print(f"[NeAR] NeAR on CPU from {near_id!r}...", flush=True)
+    p = NeARImageToRelightable3DPipeline.from_pretrained(near_id)
+    p.to("cpu")
+    _update_tone_mapper_choices(p.tone_mapper)
+    PIPELINE = p
+    print(f"[NeAR] NeAR CPU load {time.time() - t0:.1f}s", flush=True)
+def ensure_geometry_on_cuda() -> Hunyuan3DDiTFlowMatchingPipeline:
+    global _geometry_on_cuda
     with _model_lock:
+        _ensure_geometry_cpu_locked()
+        assert GEOMETRY_PIPELINE is not None
+        if torch.cuda.is_available() and not _geometry_on_cuda:
+            t0 = time.time()
+            GEOMETRY_PIPELINE.to("cuda")
+            _geometry_on_cuda = True
+            print(f"[NeAR] Hunyuan -> cuda {time.time() - t0:.1f}s", flush=True)
+        return GEOMETRY_PIPELINE
+def ensure_near_on_cuda() -> NeARImageToRelightable3DPipeline:
+    global _near_on_cuda
     with _model_lock:
+        _ensure_near_cpu_locked()
+        assert PIPELINE is not None
+        if torch.cuda.is_available() and not _near_on_cuda:
+            t0 = time.time()
+            PIPELINE.to("cuda")
+            _near_on_cuda = True
+            print(f"[NeAR] NeAR -> cuda {time.time() - t0:.1f}s", flush=True)
+        return PIPELINE
+def _preload_models_cpu_worker() -> None:
+    try:
         t0 = time.time()
+        print("[NeAR] background CPU preload start", flush=True)
+        with _model_lock:
+            _ensure_geometry_cpu_locked()
+            _ensure_near_cpu_locked()
+        print(f"[NeAR] background CPU preload done {time.time() - t0:.1f}s", flush=True)
+    except Exception as exc:
+        print(f"[NeAR] background CPU preload failed: {exc}", flush=True)
+def start_model_cpu_preload_thread() -> None:
+    threading.Thread(
+        target=_preload_models_cpu_worker,
+        daemon=True,
+        name="near-model-cpu-preload",
+    ).start()
 def set_tone_mapper(view_name: str):
+    pipeline = ensure_near_on_cuda()
     if view_name:
         pipeline.setup_tone_mapper(view_name)
     return pipeline
 def _ensure_rgba(img: Image.Image) -> Image.Image:
     if img.mode == "RGBA":
         return img
     if img.mode == "RGB":
     )
 @GPU
 @torch.inference_mode()
 def generate_mesh(
     req: gr.Request,
     progress=gr.Progress(track_tqdm=True),
 ):
+    geometry_pipeline = ensure_geometry_on_cuda()
     session_dir = ensure_session_dir(req)
     if image_input is None:
     req: gr.Request,
     progress=gr.Progress(track_tqdm=True),
 ):
+    pipeline = ensure_near_on_cuda()
     session_dir = ensure_session_dir(req)
     if not asset_state or not asset_state.get("mesh_path"):
     t0 = time.time()
     session_dir = ensure_session_dir(req)
     progress(0.1, desc="Loading SLaT and HDRI")
     pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
     progress(0.5, desc="Rendering")
     views = pipeline.render_view(
         slat, hdri_np,
         yaw_deg=yaw, pitch_deg=pitch, fov=fov, radius=radius,
         hdri_rot_deg=hdri_rot, resolution=int(resolution),
     )
     for key, image in views.items():
         image.save(session_dir / f"preview_{key}.png")
+    print(f"[NeAR] render_preview {time.time() - t0:.1f}s", flush=True)
     msg = (
         f"**Preview done** — "
     t0 = time.time()
     session_dir = ensure_session_dir(req)
     progress(0.1, desc="Loading SLaT and HDRI")
     pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
     progress(0.4, desc="Rendering camera path")
     frames = pipeline.render_camera_path_video(
         slat, hdri_np,
         num_views=int(num_views), fov=fov, radius=radius,
         hdri_rot_deg=hdri_rot, full_video=full_video, shadow_video=shadow_video,
         bg_color=(1, 1, 1), verbose=True,
     )
     video_path = session_dir / ("camera_path_full.mp4" if full_video else "camera_path.mp4")
     imageio.mimsave(video_path, frames, fps=int(fps))
+    print(f"[NeAR] render_camera_video {time.time() - t0:.1f}s", flush=True)
     return str(video_path), f"**Camera path video saved**"
     t0 = time.time()
     session_dir = ensure_session_dir(req)
     progress(0.1, desc="Loading SLaT and HDRI")
     pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
     progress(0.4, desc="Rendering HDRI rotation")
     hdri_roll_frames, render_frames = pipeline.render_hdri_rotation_video(
         slat, hdri_np,
         num_frames=int(num_frames), yaw_deg=yaw, pitch_deg=pitch,
         fov=fov, radius=radius, full_video=full_video, shadow_video=shadow_video,
         bg_color=(1, 1, 1), verbose=True,
     )
     hdri_roll_path = session_dir / "hdri_roll.mp4"
     render_path = session_dir / ("hdri_rotation_full.mp4" if full_video else "hdri_rotation.mp4")
     imageio.mimsave(hdri_roll_path, hdri_roll_frames, fps=int(fps))
     imageio.mimsave(render_path, render_frames, fps=int(fps))
+    print(f"[NeAR] render_hdri_video {time.time() - t0:.1f}s", flush=True)
     return str(hdri_roll_path), str(render_path), "**HDRI rotation video saved**"
     req: gr.Request,
     progress=gr.Progress(track_tqdm=True),
 ):
     t0 = time.time()
     session_dir = ensure_session_dir(req)
     progress(0.1, desc="Loading SLaT and HDRI")
     pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
     progress(0.6, desc="Baking PBR textures")
     glb = pipeline.export_glb_from_slat(
         slat, hdri_np,
         hdri_rot_deg=hdri_rot, base_mesh=None,
         simplify=simplify, texture_size=int(texture_size), fill_holes=True,
     )
     glb_path = session_dir / "near_pbr.glb"
     glb.export(glb_path)
+    print(f"[NeAR] export_glb {time.time() - t0:.1f}s", flush=True)
     return str(glb_path), f"PBR GLB exported: **{glb_path.name}**"
 CUSTOM_CSS = """
 .gradio-container { max-width: 100% !important; width: 100% !important; }
 main.gradio-container { max-width: 100% !important; }
 .gradio-wrap { max-width: 100% !important; }
 )
 def build_app() -> gr.Blocks:
     with gr.Blocks(
         title="NeAR",
         delete_cache=None,
         fill_width=True,
     ) as demo:
         with gr.Row(equal_height=False):
             with gr.Column(scale=1, min_width=360):
                 with gr.Group():
                     slat_button = gr.Button(
                         "② Generate / Load SLaT", variant="primary", min_width=100,
                     )
                 with gr.Group():
                     gr.HTML('<p class="section-kicker">HDRI</p>')
                 with gr.Row():
                     clear_button = gr.Button("Clear Cache", variant="secondary", min_width=100)
             with gr.Column(scale=10, min_width=560):
                 status_md = gr.Markdown(
                                 label="HDRI Roll", autoplay=True, loop=True, height=180,
                             )
             with gr.Column(scale=1, min_width=172):
                 with gr.Column(visible=True, elem_classes=["sidebar-examples", "img-gallery"]) as col_img_examples:
                     if _img_ex:
                     else:
                         gr.Markdown("*No `.exr` examples in `assets/hdris`*")
         demo.unload(end_session)
         source_mode.change(switch_asset_source, inputs=[source_mode], outputs=[source_tabs])
                 outputs=[hdri_preview, status_md],
             )
         image_input.upload(
             preprocess_image_only,
             inputs=[image_input],
 demo.launch = _near_launch  # type: ignore[method-assign]
+if _CPU_PRELOAD_AT_START:
+    start_model_cpu_preload_thread()
 start_tmp_gradio_pruner()
 if __name__ == "__main__":
     import argparse

app_gsplat.py CHANGED

@@ -1,10 +1,12 @@
 """
-Minimal Hugging Face / ZeroGPU probe for gsplat CUDA rasterization only.
-Switch Space entry in README front matter: ``app_file: app_gsplat.py``.
-Does not import NeAR, trellis, or hy3dshape.
 """
 import os
 import subprocess
 import sys
@@ -17,121 +19,53 @@ import numpy as np
 import torch
 if not os.environ.get("HF_TOKEN") and not os.environ.get("HUGGING_FACE_HUB_TOKEN"):
-    _hub_tok = (os.environ.get("near") or os.environ.get("NEAR") or "").strip()
-    if _hub_tok:
-        os.environ["HF_TOKEN"] = _hub_tok
-        print("[GsplatProbe] HF_TOKEN unset; using Space secret 'near' as HF_TOKEN.", flush=True)
-try:
-    _raw_zerogpu_cap = int(os.environ.get("NEAR_ZEROGPU_HF_CEILING_S", "90"))
-except ValueError:
-    _raw_zerogpu_cap = 90
-_ZEROGPU_ENV_CAP_S = min(max(15, _raw_zerogpu_cap), 120)
-for _ek in ("NEAR_ZEROGPU_MAX_SECONDS", "NEAR_ZEROGPU_DURATION_CAP"):
-    if _ek in os.environ:
-        try:
-            if int(os.environ[_ek]) > _ZEROGPU_ENV_CAP_S:
-                os.environ[_ek] = str(_ZEROGPU_ENV_CAP_S)
-        except ValueError:
-            pass
-print(
-    f"[GsplatProbe] ZeroGPU cap set to {_ZEROGPU_ENV_CAP_S}s. Callbacks use plain spaces.GPU.",
-    flush=True,
-)
 try:
     import spaces  # pyright: ignore[reportMissingImports]
 except ImportError:
     spaces = None
-os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "7.5;8.0;8.6;8.9;9.0")
 GPU = spaces.GPU if spaces is not None else (lambda f: f)
-DEFAULT_PORT = 7860
-_LAST_PROBE_ELAPSED_S: Optional[float] = None
-def _maybe_reinstall_gsplat_wheel() -> None:
-    """Force-install a prebuilt wheel (e.g. from ``near-wheels``) before importing gsplat."""
-    url = (os.environ.get("NEAR_GSPLAT_WHEEL_URL") or "").strip()
-    if not url:
-        return
-    cmd = [
-        sys.executable,
-        "-m",
-        "pip",
-        "install",
-        "--no-cache-dir",
-        "--force-reinstall",
-        url,
-    ]
-    print(f"[GsplatProbe] NEAR_GSPLAT_WHEEL_URL set; pip install: {url}", flush=True)
-    r = subprocess.run(cmd, check=False)
-    if r.returncode != 0:
-        print(
-            f"[GsplatProbe] WARNING: gsplat wheel install failed (exit {r.returncode}).",
-            flush=True,
-        )
-def _maybe_reinstall_gsplat_from_source() -> None:
-    spec = (os.environ.get("NEAR_GSPLAT_SOURCE_SPEC") or "").strip()
-    if not spec:
-        return
-    cmd = [
-        sys.executable,
-        "-m",
-        "pip",
-        "install",
-        "--no-build-isolation",
-        "--no-cache-dir",
-        spec,
-    ]
-    print(f"[GsplatProbe] NEAR_GSPLAT_SOURCE_SPEC set; building gsplat via: {' '.join(cmd)}", flush=True)
-    r = subprocess.run(cmd, check=False)
-    if r.returncode != 0:
-        print(
-            f"[GsplatProbe] WARNING: gsplat install failed (exit {r.returncode}).",
-            flush=True,
-        )
-_maybe_reinstall_gsplat_wheel()
-_maybe_reinstall_gsplat_from_source()
-def _log_gsplat_cuda_backend_status() -> None:
-    try:
-        import gsplat.cuda._backend as _gsplat_be  # pyright: ignore[reportMissingImports]
-        c_mod = getattr(_gsplat_be, "_C", None)
-        if c_mod is None:
-            print(
-                "[GsplatProbe] gsplat CUDA backend _C is None — no usable prebuilt extension and no nvcc. "
-                "Use a prebuilt wheel (see requirements.txt / NEAR_GSPLAT_WHEEL_URL).",
-                flush=True,
-            )
-        else:
-            print("[GsplatProbe] gsplat CUDA backend _C loaded.", flush=True)
-    except Exception as exc:
-        print(f"[GsplatProbe] gsplat backend probe failed: {exc}", flush=True)
-_log_gsplat_cuda_backend_status()
-def _placeholder_preview() -> np.ndarray:
     return np.full((64, 64, 3), 48, dtype=np.uint8)
-def _raster_rgb_ed_once(width: int, height: int) -> tuple[Any, float]:
-    """One RGB+ED pass matching ``app.py`` / NeAR renderer layout. Returns (render_colors, elapsed_s)."""
-    from gsplat.rendering import rasterization as _gsplat_rasterization  # pyright: ignore[reportMissingImports]
     dev = torch.device("cuda")
-    t_w = time.time()
     n = 1
     means = torch.zeros(n, 3, device=dev, dtype=torch.float32)
     quats = torch.tensor([[1.0, 0.0, 0.0, 0.0]], device=dev, dtype=torch.float32)
@@ -145,7 +79,7 @@ def _raster_rgb_ed_once(width: int, height: int) -> tuple[Any, float]:
         dtype=torch.float32,
     )
     backgrounds = torch.zeros(1, 9, device=dev, dtype=torch.float32)
-    render_colors, _render_alphas, _meta = _gsplat_rasterization(
         means=means,
         quats=quats,
         scales=scales,
@@ -164,125 +98,75 @@ def _raster_rgb_ed_once(width: int, height: int) -> tuple[Any, float]:
         packed=False,
     )
     torch.cuda.synchronize()
-    elapsed = time.time() - t_w
-    return render_colors, elapsed
-def _tensor_to_preview_rgb_uint8(render_colors: Any) -> np.ndarray:
-    """First camera / batch plane as HWC RGB uint8 (aligned with ``gaussian_render.py``)."""
     t = render_colors
     if t.dim() == 4:
-        img_hwc = t[0, :, :, :3]
     elif t.dim() == 3:
-        img_hwc = t[:, :, :3]
     else:
-        raise ValueError(f"Unexpected render_colors rank/shape: {tuple(t.shape)}")
-    arr = img_hwc.detach().float().cpu().clamp(0.0, 1.0).numpy()
     return (arr * 255.0).astype(np.uint8)
 @GPU
 @torch.inference_mode()
-def run_gsplat_probe(resolution: int):
-    """Single ZeroGPU task: import path + one RGB+ED raster (JIT on first success)."""
-    global _LAST_PROBE_ELAPSED_S
-    print(
-        "[GsplatProbe] run_gsplat_probe entered "
-        f"(cuda_available={torch.cuda.is_available()})",
-        flush=True,
-    )
     try:
-        if not torch.cuda.is_available():
-            return _placeholder_preview(), "CUDA is not available in this callback."
-        side = int(max(16, min(512, resolution)))
-        prev = _LAST_PROBE_ELAPSED_S
-        try:
-            render_colors, elapsed = _raster_rgb_ed_once(side, side)
-        except Exception:
-            tb = traceback.format_exc()
-            print(tb, flush=True)
-            return _placeholder_preview(), (
-                "**Rasterization failed.** Full traceback (check Space logs too):\n\n"
-                f"```\n{tb}\n```"
-            )
-        _LAST_PROBE_ELAPSED_S = elapsed
-        try:
-            preview = _tensor_to_preview_rgb_uint8(render_colors)
-        except Exception:
-            tb = traceback.format_exc()
-            print(tb, flush=True)
-            return _placeholder_preview(), (
-                f"Raster OK in **{elapsed:.2f}s** but preview unpack failed:\n\n```\n{tb}\n```"
-            )
-        msg = (
-            f"**RGB+ED** raster **{side}x{side}** finished in **{elapsed:.2f}s** "
-            f"(cuda synchronized). Same layout as `app.py` warmup."
-        )
-        if prev is not None:
-            msg += f" Previous run: **{prev:.2f}s**."
-        print(f"[GsplatProbe] probe done in {elapsed:.2f}s", flush=True)
-        return preview, msg
-    except BaseException:
-        tb = traceback.format_exc()
-        print(tb, flush=True)
-        return _placeholder_preview(), f"**Unhandled error:**\n\n```\n{tb}\n```"
 def build_app() -> gr.Blocks:
-    with gr.Blocks(title="gsplat ZeroGPU Probe", delete_cache=None) as demo:
         gr.Markdown(
-            """
-## gsplat ZeroGPU probe
-Isolated **`gsplat.rendering.rasterization`** in **RGB+ED** mode (same tensor layout as NeAR `app.py` warmup).
-- One **Generate** click = one `@spaces.GPU` task (import + JIT + raster).
-- Second click usually faster if JIT already compiled.
-- **`NEAR_GSPLAT_WHEEL_URL`**: force `pip install --force-reinstall` of a prebuilt wheel before import (e.g. your own `near-wheels` build for torch 2.8).
-- **`NEAR_GSPLAT_SOURCE_SPEC`**: optional source build at start (needs nvcc — usually not on HF builder).
-            """
         )
-        resolution = gr.Slider(32, 256, value=64, step=16, label="Square resolution (px)")
-        btn = gr.Button("Run gsplat RGB+ED pass", variant="primary")
-        out_img = gr.Image(label="RGB preview (first 3 channels)", interactive=False, height=320)
-        out_md = gr.Markdown("Click **Run** to start.")
-        btn.click(
-            run_gsplat_probe,
-            inputs=[resolution],
-            outputs=[out_img, out_md],
-        )
     return demo
 demo = build_app()
 demo.queue(max_size=4)
 if __name__ == "__main__":
     import argparse
-    parser = argparse.ArgumentParser()
-    parser.add_argument(
-        "--host",
-        type=str,
-        default=os.environ.get("GRADIO_SERVER_NAME", "0.0.0.0"),
-    )
-    parser.add_argument(
         "--port",
         type=int,
         default=int(os.environ.get("PORT", os.environ.get("GRADIO_SERVER_PORT", str(DEFAULT_PORT)))),
     )
-    parser.add_argument("--share", action="store_true")
-    args = parser.parse_args()
-    demo.launch(
-        server_name=args.host,
-        server_port=args.port,
-        share=args.share,
-    )

 """
+Minimal Gradio app: one gsplat RGB+ED raster pass on CUDA.
+HF Space: set ``app_file: app_gsplat.py`` in README front matter.
+No NeAR / trellis / hy3dshape imports.
 """
+from __future__ import annotations
 import os
 import subprocess
 import sys
 import torch
 if not os.environ.get("HF_TOKEN") and not os.environ.get("HUGGING_FACE_HUB_TOKEN"):
+    _t = (os.environ.get("near") or os.environ.get("NEAR") or "").strip()
+    if _t:
+        os.environ["HF_TOKEN"] = _t
 try:
     import spaces  # pyright: ignore[reportMissingImports]
 except ImportError:
     spaces = None
 GPU = spaces.GPU if spaces is not None else (lambda f: f)
+DEFAULT_PORT = 7863
+_LAST_ELAPSED_S: Optional[float] = None
+# def _maybe_pip_gsplat_wheel() -> None:
+#     url = (os.environ.get("NEAR_GSPLAT_WHEEL_URL") or "").strip()
+#     if not url:
+#         return
+#     cmd = [
+#         sys.executable,
+#         "-m",
+#         "pip",
+#         "install",
+#         "--no-cache-dir",
+#         "--no-deps",
+#         "--force-reinstall",
+#         url,
+#     ]
+#     print(f"[gsplat-app] pip install wheel: {url}", flush=True)
+#     r = subprocess.run(cmd, check=False)
+#     if r.returncode != 0:
+#         print(f"[gsplat-app] wheel install failed (exit {r.returncode})", flush=True)
+# _maybe_pip_gsplat_wheel()
+def _gray_preview() -> np.ndarray:
     return np.full((64, 64, 3), 48, dtype=np.uint8)
+def _raster_rgb_ed(width: int, height: int) -> tuple[Any, float]:
+    from gsplat.rendering import rasterization as rasterize  # pyright: ignore[reportMissingImports]
     dev = torch.device("cuda")
+    t0 = time.time()
     n = 1
     means = torch.zeros(n, 3, device=dev, dtype=torch.float32)
     quats = torch.tensor([[1.0, 0.0, 0.0, 0.0]], device=dev, dtype=torch.float32)
         dtype=torch.float32,
     )
     backgrounds = torch.zeros(1, 9, device=dev, dtype=torch.float32)
+    render_colors, _, _ = rasterize(
         means=means,
         quats=quats,
         scales=scales,
         packed=False,
     )
     torch.cuda.synchronize()
+    return render_colors, time.time() - t0
+def _to_rgb_u8(render_colors: Any) -> np.ndarray:
     t = render_colors
     if t.dim() == 4:
+        img = t[0, :, :, :3]
     elif t.dim() == 3:
+        img = t[:, :, :3]
     else:
+        raise ValueError(f"bad render_colors shape: {tuple(t.shape)}")
+    arr = img.detach().float().cpu().clamp(0.0, 1.0).numpy()
     return (arr * 255.0).astype(np.uint8)
 @GPU
 @torch.inference_mode()
+def run_once(resolution: int):
+    """Single GPU task: gsplat RGB+ED raster (matches NeAR warmup layout)."""
+    global _LAST_ELAPSED_S
+    if not torch.cuda.is_available():
+        return _gray_preview(), "CUDA not available."
+    side = int(max(16, min(512, resolution)))
+    prev = _LAST_ELAPSED_S
     try:
+        render_colors, elapsed = _raster_rgb_ed(side, side)
+    except Exception:
+        return _gray_preview(), f"Raster failed:\n```\n{traceback.format_exc()}\n```"
+    _LAST_ELAPSED_S = elapsed
+    try:
+        preview = _to_rgb_u8(render_colors)
+    except Exception:
+        return _gray_preview(), f"Raster {elapsed:.2f}s but preview failed:\n```\n{traceback.format_exc()}\n```"
+    msg = f"**{side}x{side}** in **{elapsed:.2f}s** (RGB+ED, cuda sync)."
+    if prev is not None:
+        msg += f" Previous: **{prev:.2f}s**."
+    return preview, msg
 def build_app() -> gr.Blocks:
+    with gr.Blocks(title="gsplat probe") as demo:
         gr.Markdown(
+            "One **Run** = one `gsplat.rendering.rasterization` pass (RGB+ED). "
+            "Optional env: **`NEAR_GSPLAT_WHEEL_URL`** = `/resolve/...whl` (installed with `--no-deps`)."
         )
+        res = gr.Slider(32, 256, value=64, step=16, label="Size (px)")
+        go = gr.Button("Run", variant="primary")
+        img = gr.Image(label="RGB", interactive=False, height=320)
+        md = gr.Markdown("—")
+        go.click(run_once, [res], [img, md])
     return demo
 demo = build_app()
 demo.queue(max_size=4)
 if __name__ == "__main__":
     import argparse
+    p = argparse.ArgumentParser()
+    p.add_argument("--host", default=os.environ.get("GRADIO_SERVER_NAME", "0.0.0.0"))
+    p.add_argument(
         "--port",
         type=int,
         default=int(os.environ.get("PORT", os.environ.get("GRADIO_SERVER_PORT", str(DEFAULT_PORT)))),
     )
+    p.add_argument("--share", action="store_true")
+    a = p.parse_args()
+    demo.launch(server_name=a.host, server_port=a.port, share=a.share)
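
Reviewer sketch (not part of the commit): the simplified `run_once` above clamps the requested size, times one raster pass, and reports the previous timing on later runs. The snippet below mirrors that control flow without gsplat or CUDA, assuming the raster step can be passed in as a callable. `fake_raster` and `broken_raster` are hypothetical stand-ins, and the status strings only approximate the ones in the diff.

```python
import time
import traceback
from typing import Callable, Optional, Tuple

_last_elapsed_s: Optional[float] = None


def clamp_side(resolution: int) -> int:
    """Keep the square side within [16, 512], as the diff does."""
    return int(max(16, min(512, resolution)))


def run_once(
    resolution: int, raster: Callable[[int, int], object]
) -> Tuple[Optional[object], str]:
    """Run one pass and return (result, status message)."""
    global _last_elapsed_s
    side = clamp_side(resolution)
    prev = _last_elapsed_s
    t0 = time.perf_counter()
    try:
        result = raster(side, side)
    except Exception:
        # Surface the traceback in the UI instead of raising.
        return None, f"Raster failed:\n{traceback.format_exc()}"
    elapsed = time.perf_counter() - t0
    _last_elapsed_s = elapsed
    msg = f"**{side}x{side}** in **{elapsed:.2f}s**."
    if prev is not None:
        msg += f" Previous: **{prev:.2f}s**."
    return result, msg


def fake_raster(width: int, height: int) -> Tuple[int, int]:
    return (width, height)


def broken_raster(width: int, height: int) -> Tuple[int, int]:
    raise RuntimeError("no CUDA backend")


_, first_msg = run_once(4, fake_raster)
_, second_msg = run_once(1000, fake_raster)
failed, fail_msg = run_once(64, broken_raster)
```

A failed pass leaves the stored timing untouched, so the next successful run still compares against the last good measurement.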

requirements.txt CHANGED

@@ -1,23 +1,16 @@
-# PyTorch + CUDA 12.8 (aligned with NeAR setup.sh: torch 2.8 + cu128 + matching xformers).
-# nvdiffrast: custom cp310 Linux wheel (no nvcc on HF builder), ABI-matched to torch 2.8 + cu128 (see wheel URL below).
-# pip must install via https://.../resolve/.../file.whl — not /blob/ — see DEPLOY_HF_SPACE.md.
---extra-index-url https://download.pytorch.org/whl/cu128
-torch==2.8.0
-torchvision==0.23.0
-torchaudio==2.8.0
-xformers==0.0.32.post2
 huggingface_hub>=0.26.0
-# Runtime HF mirrors:
-# - `luh0502/near-wheels` stores prebuilt binary wheels referenced directly below.
-# - `luh0502/near-assets` stores torch.hub-compatible auxiliary assets (for example a mirrored DINOv2 repo)
-#   resolved at runtime via `huggingface_hub`, not via `requirements.txt`.
 gradio[oauth,mcp]==6.9.0
 spaces
 websockets>=10.4
 simple_ocio
---find-links https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.8.0_cu128.html
 kaolin
 # NeAR demo / inference (see NeAR setup.sh --basic --demo)
@@ -39,12 +32,7 @@ pymeshfix
 igraph
 transformers==4.57.6
 pyexr
-# gsplat: PyPI publishes only `py3-none-any` (no CUDA). HF Gradio builder has no nvcc, so JIT cannot build `_C`.
-# Official prebuilt Linux cp310 wheels (torch/CUDA-tagged) are listed at https://docs.gsplat.studio/whl/gsplat/
-# Below: pt2.4+cu124 wheel — CUDA runtime is usually compatible with cu128 drivers; PyTorch ABI vs 2.8 may still break
-# (then build a cp310 manylinux wheel against torch 2.8/cu128 and host on `near-wheels`, install via NEAR_GSPLAT_WHEEL_URL).
-gsplat @ https://github.com/nerfstudio-project/gsplat/releases/download/v1.5.3/gsplat-1.5.3%2Bpt24cu124-cp310-cp310-linux_x86_64.whl ; python_version == "3.10" and sys_platform == "linux" and platform_machine == "x86_64"
-gsplat==1.5.3 ; python_version != "3.10" or sys_platform != "linux" or platform_machine != "x86_64"
 pyyaml
 # hy3dshape (vendored Hunyuan3D-2.1)
@@ -69,4 +57,15 @@ spconv-cu120
 git+https://github.com/EasternJournalist/utils3d.git@9a4eb15e4021b67b12c460c7057d642626897ec8
 # nvdiffrast: custom wheel (torch 2.8/cu128 ABI) — pip needs /resolve/, not /blob/
-https://huggingface.co/luh0502/near-wheels/resolve/main/nvdiffrast-0.4.0-cp310-cp310-linux_x86_64.whl

+--extra-index-url https://download.pytorch.org/whl/cu124
+torch==2.4.0
+torchvision==0.19.0
+torchaudio==2.4.0
+xformers==0.0.27.post2
 huggingface_hub>=0.26.0
 gradio[oauth,mcp]==6.9.0
 spaces
 websockets>=10.4
 simple_ocio
+--find-links https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.4.0_cu124.html
 kaolin
 # NeAR demo / inference (see NeAR setup.sh --basic --demo)
 igraph
 transformers==4.57.6
 pyexr
+# gsplat
 pyyaml
 # hy3dshape (vendored Hunyuan3D-2.1)
 git+https://github.com/EasternJournalist/utils3d.git@9a4eb15e4021b67b12c460c7057d642626897ec8
 # nvdiffrast: custom wheel (torch 2.8/cu128 ABI) — pip needs /resolve/, not /blob/
+# https://huggingface.co/luh0502/near-wheels/resolve/main/nvdiffrast-0.4.0-cp310-cp310-linux_x86_64.whl
+https://huggingface.co/spaces/JeffreyXiang/TRELLIS/resolve/main/wheels/nvdiffrast-0.3.3-cp310-cp310-linux_x86_64.whl?download=true
+# gsplat==1.5.3
+# https://huggingface.co/luh0502/near-wheels/resolve/main/gsplat-1.5.3-cp310-cp310-linux_x86_64.whl
+https://huggingface.co/luh0502/near-wheels/resolve/main/gsplat-1.5.3+pt24cu124-cp310-cp310-linux_x86_64.whl
+# https://docs.gsplat.studio/whl/gsplat/gsplat-1.5.3+pt24cu124-cp310-cp310-linux_x86_64.whl
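
Note on the pins above: the removed comments warned that a `pt24cu124` gsplat wheel might not match a torch 2.8 ABI, and this change instead pins `torch==2.4.0` from the `cu124` index to match the wheel. A minimal sketch of checking that a wheel's build tag agrees with the torch pin follows. The `+ptXYcuNNN` naming scheme is inferred from the filenames in this diff (with a single-digit major version assumed), and `wheel_build_tag` and `matches_pins` are hypothetical helpers, not part of this repo or of pip.

```python
import re
from typing import Optional, Tuple


def wheel_build_tag(filename: str) -> Optional[Tuple[str, str]]:
    """Parse a '+ptXYcuNNN' local version tag from a wheel filename.

    Returns (torch_major_minor, cuda_tag), e.g. ("2.4", "cu124"),
    or None when the filename carries no such tag.
    Assumption: the torch major version is a single digit.
    """
    match = re.search(r"\+pt(\d)(\d+)cu(\d+)", filename)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}", f"cu{match.group(3)}"


def matches_pins(filename: str, torch_version: str, index_url: str) -> bool:
    """True when the wheel tag agrees with the torch pin and the CUDA index URL."""
    tag = wheel_build_tag(filename)
    if tag is None:
        return False  # untagged wheel: nothing to compare against
    torch_mm, cuda = tag
    pinned_mm = ".".join(torch_version.split(".")[:2])
    return torch_mm == pinned_mm and index_url.rstrip("/").endswith("/" + cuda)


WHEEL = "gsplat-1.5.3+pt24cu124-cp310-cp310-linux_x86_64.whl"
INDEX = "https://download.pytorch.org/whl/cu124"
print(matches_pins(WHEEL, "2.4.0", INDEX))
```

A check like this only compares names; it does not prove binary compatibility, which still has to be confirmed by importing the package on the target runtime.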

tests/test_app_architecture.py CHANGED

@@ -48,14 +48,20 @@ class AppArchitectureTests(unittest.TestCase):
         generate_mesh = _get_function(_load_tree(), "generate_mesh")
         called = _called_names(generate_mesh)
-        self.assertIn("ensure_geometry_pipeline", called)
-        self.assertNotIn("ensure_near_pipeline", called)
-    def test_render_preview_warms_up_gsplat_only_on_render_path(self) -> None:
         render_preview = _get_function(_load_tree(), "render_preview")
         called = _called_names(render_preview)
-        self.assertIn("ensure_gsplat_ready", called)
 if __name__ == "__main__":

         generate_mesh = _get_function(_load_tree(), "generate_mesh")
         called = _called_names(generate_mesh)
+        self.assertIn("ensure_geometry_on_cuda", called)
+        self.assertNotIn("ensure_near_on_cuda", called)
+    def test_render_preview_calls_pipeline_render_view(self) -> None:
         render_preview = _get_function(_load_tree(), "render_preview")
         called = _called_names(render_preview)
+        self.assertIn("render_view", called)
+    def test_generate_slat_uses_near_cuda_loader(self) -> None:
+        generate_slat = _get_function(_load_tree(), "generate_slat")
+        called = _called_names(generate_slat)
+        self.assertIn("ensure_near_on_cuda", called)
 if __name__ == "__main__":

tests/test_app_gsplat_architecture.py CHANGED

@@ -49,17 +49,17 @@ class AppGsplatArchitectureTests(unittest.TestCase):
             )
     def test_raster_helper_calls_gsplat_rasterization(self) -> None:
-        raster = _get_function(_load_tree(), "_raster_rgb_ed_once")
         called = _called_names(raster)
-        self.assertIn("_gsplat_rasterization", called)
     def test_run_probe_is_gpu_decorated_and_logs_entry(self) -> None:
         source = APP_PATH.read_text(encoding="utf-8")
         self.assertIn("@GPU", source)
-        self.assertIn("def run_gsplat_probe", source)
-        self.assertIn("[GsplatProbe] run_gsplat_probe entered", source)
 if __name__ == "__main__":

             )
     def test_raster_helper_calls_gsplat_rasterization(self) -> None:
+        raster = _get_function(_load_tree(), "_raster_rgb_ed")
         called = _called_names(raster)
+        self.assertIn("rasterize", called)
     def test_run_probe_is_gpu_decorated_and_logs_entry(self) -> None:
         source = APP_PATH.read_text(encoding="utf-8")
         self.assertIn("@GPU", source)
+        self.assertIn("def run_once", source)
+        self.assertIn("torch.cuda.is_available()", source)
 if __name__ == "__main__":
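
Note on the test style: both test files assert on the structure of the app source (which names a callback calls) rather than executing it, so they can run without a GPU or the model weights. The diff does not show the bodies of `_load_tree`, `_get_function`, or `_called_names`, so the following is only a sketch of how such helpers could be built on the standard `ast` module. The function names here drop the leading underscore to mark them as hypothetical, and `SOURCE` is an invented stand-in for `app.py`, not its real contents.

```python
import ast
from typing import Set

# Invented stand-in for the app source; the real callbacks are not shown in this diff.
SOURCE = '''
def generate_mesh(image):
    pipe = ensure_geometry_on_cuda()
    return pipe.run(image)

def render_preview(state):
    pipe = ensure_near_on_cuda()
    return pipe.render_view(state)
'''


def load_tree(source: str) -> ast.Module:
    """Parse source text into an AST without importing or running it."""
    return ast.parse(source)


def get_function(tree: ast.AST, name: str) -> ast.AST:
    """Return the first (async) function definition with the given name."""
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            return node
    raise KeyError(name)


def called_names(func: ast.AST) -> Set[str]:
    """Collect names of everything called inside func: f() -> 'f', obj.m() -> 'm'."""
    names: Set[str] = set()
    for node in ast.walk(func):
        if isinstance(node, ast.Call):
            target = node.func
            if isinstance(target, ast.Name):
                names.add(target.id)
            elif isinstance(target, ast.Attribute):
                names.add(target.attr)
    return names


tree = load_tree(SOURCE)
print(sorted(called_names(get_function(tree, "generate_mesh"))))
```

Because attribute calls are recorded by their final attribute name, an assertion such as `assertIn("render_view", called)` would pass for `pipe.render_view(...)` under this sketch. The trade-off is that such tests only see direct calls in the function body, not calls made through helpers.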