luh1124 committed on
Commit
c98c836
·
1 Parent(s): 1dcc01e

app: CPU preload Hunyuan+NeAR, cuda move in GPU callbacks; drop gsplat warmup


- Background thread loads geometry and NeAR on CPU (NEAR_MODEL_CPU_PRELOAD_AT_START).
- ensure_geometry_on_cuda / ensure_near_on_cuda for first GPU use.
- Remove _warmup_gsplat_rasterization and ensure_gsplat_ready.
- Simplify app_gsplat; gsplat wheel pip uses --no-deps; update docs and tests.

Made-with: Cursor

DEPLOY_HF_SPACE.md CHANGED
@@ -60,10 +60,7 @@ If you maintain a separate template tree (e.g. `NeAR_space`), copy changes **int
60
  - **`import spaces`** (optional `try/except` for local runs without the package).
61
  - Decorate **every Gradio callback that uses CUDA** with **`@spaces.GPU`** (same as [E-RayZer](https://huggingface.co/spaces/qitaoz/E-RayZer): no `duration=` in app code; platform defaults apply). This repo aliases it as **`GPU`** in `app.py` and uses **`@GPU`**; locally, without the `spaces` package, it is a no-op. The decorator is effectively a no-op off ZeroGPU per HF docs.
62
  - Keep **page-load defaults and HDRI preview off the heavy model path**. This repo now uses a lightweight CPU image-preprocess path and a CPU-only HDRI preview path, so first page load no longer triggers full model initialization.
63
- - **Lazy-load** large models **inside** GPU callbacks. This repo now splits loading by responsibility:
64
- - **`ensure_geometry_pipeline()`** for Hunyuan3D mesh generation
65
- - **`ensure_near_pipeline()`** for NeAR SLaT/render/export
66
- - **`ensure_gsplat_ready()`** only before the first real render/export path
67
  - **Space Variables**: at the top of `app.py` (before `import spaces`), **`NEAR_ZEROGPU_MAX_SECONDS`** / **`NEAR_ZEROGPU_DURATION_CAP`** are **rewritten in `os.environ`** if they exceed **`NEAR_ZEROGPU_HF_CEILING_S`** (default **90**, max **120**) so values like `300` cannot break the Hub runtime. This does not set per-callback `duration` in Python; it only clamps env vars HF may read.
68
 
69
  ### 2b1. Recommended runtime variable matrix
@@ -75,7 +72,6 @@ If you maintain a separate template tree (e.g. `NeAR_space`), copy changes **int
75
  | `NEAR_DINO_REPO_SUBDIR` | `dinov2` | `dinov2` |
76
  | `NEAR_DINO_MODEL_ID` | leave unset unless your mirror renames hubconf entries | leave unset unless your mirror renames hubconf entries |
77
  | `NEAR_DINO_FILENAME` | optional validation file inside the mirror repo | optional validation file inside the mirror repo |
78
- | `NEAR_GSPLAT_WARMUP` | `0` | `1` |
79
  | `NEAR_GSPLAT_SOURCE_SPEC` | unset unless you have a proven build path | optional if you want build-time source compile |
80
  | `NEAR_ZEROGPU_HF_CEILING_S` | `90` | tune to your tier |
81
  | `NEAR_HYSHAPE_GEOMETRY_CPU_PRELOAD_AT_START` | `1` when Space entry is **`app_hyshape.py`** (default: background thread runs `from_pretrained(..., device="cpu")` at startup, with **no** `@spaces.GPU`) | `0` to defer CPU load until the first **Generate Mesh** click (inside the GPU callback; longer first click) |
@@ -104,7 +100,6 @@ Similar to [E-RayZer](https://huggingface.co/spaces/qitaoz/E-RayZer), the first
104
  |----------|---------|
105
  | **`NEAR_GSPLAT_WHEEL_URL`** | If set, `app.py` / `app_gsplat.py` runs `pip install --force-reinstall` on this URL **before** importing gsplat/trellis. Use when you host a **cp310 manylinux** wheel on `near-wheels` built against **your exact** PyTorch/CUDA pin (official gsplat wheels top out at **pt2.4+cu124**; see `requirements.txt`). |
106
  | **`NEAR_GSPLAT_SOURCE_SPEC`** | If set, `app.py` runs `pip install --no-build-isolation` on this spec **before** importing `trellis` (e.g. `./third_party/gsplat` after vendoring, or `git+https://github.com/nerfstudio-project/gsplat.git@<tag>`). Needs **nvcc**, which is usually absent on the default Gradio builder. |
107
- | **`NEAR_GSPLAT_WARMUP`** | Default **on** (`1`). After models load on CUDA, runs one **tiny** `RGB+ED` raster pass so the first user preview/video is less likely to hit JIT alone. Set to **`0`** if the extra time risks **ZeroGPU** timeout on the **first** GPU callback. |
108
 
109
  Alternatively, pin a **VCS** gsplat line in `requirements.txt` (e.g. `gsplat @ git+https://...`) so the **Space build** step compiles once; no `NEAR_GSPLAT_SOURCE_SPEC` needed.
110
 
@@ -116,7 +111,7 @@ Alternatively, pin a **VCS** gsplat line in `requirements.txt` (e.g. `gsplat @ g
116
  **Trade-offs vs fixed GPU + Docker**
117
 
118
  - **ZeroGPU + Gradio builder**: the image may **not** include a full CUDA toolkit (`nvcc`, `CUDA_HOME`). **`git+…nvdiffrast`** (source install) often fails here. This repo uses a **prebuilt `nvdiffrast` wheel** URL in `requirements.txt` (see **§5**) so the builder only downloads a wheel. If that wheel is ABI-incompatible with our PyTorch pin, build your own wheel or add a **`Dockerfile`** with `nvidia/cuda:*-devel` and fixed GPU.
119
- - **Quota**: visitors consume **daily GPU seconds**. This repo keeps plain `@spaces.GPU` in Python and tunes runtime behavior through environment variables such as `NEAR_GSPLAT_WARMUP` and `NEAR_ZEROGPU_HF_CEILING_S`, rather than setting `duration=` in app code.
120
 
121
  ### 2c. Example gallery empty on the Space (`assets/` “not deployed”)
122
 
 
60
  - **`import spaces`** (optional `try/except` for local runs without the package).
61
  - Decorate **every Gradio callback that uses CUDA** with **`@spaces.GPU`** (same as [E-RayZer](https://huggingface.co/spaces/qitaoz/E-RayZer): no `duration=` in app code; platform defaults apply). This repo aliases it as **`GPU`** in `app.py` and uses **`@GPU`**; locally, without the `spaces` package, it is a no-op. The decorator is effectively a no-op off ZeroGPU per HF docs.
62
  - Keep **page-load defaults and HDRI preview off the heavy model path**. This repo now uses a lightweight CPU image-preprocess path and a CPU-only HDRI preview path, so first page load no longer triggers full model initialization.
63
+ - **Model init**: a background thread (when **`NEAR_MODEL_CPU_PRELOAD_AT_START=1`**) loads **Hunyuan + NeAR** on **CPU** at process start (no GPU lease). **`@spaces.GPU`** callbacks call **`ensure_geometry_on_cuda()`** / **`ensure_near_on_cuda()`** to move weights to CUDA once, then run inference. **gsplat** is exercised only when the pipeline renders (the first call may still trigger JIT compilation if no prebuilt wheel is installed).
 
 
 
64
  - **Space Variables**: at the top of `app.py` (before `import spaces`), **`NEAR_ZEROGPU_MAX_SECONDS`** / **`NEAR_ZEROGPU_DURATION_CAP`** are **rewritten in `os.environ`** if they exceed **`NEAR_ZEROGPU_HF_CEILING_S`** (default **90**, max **120**) so values like `300` cannot break the Hub runtime. This does not set per-callback `duration` in Python; it only clamps env vars HF may read.
65
 
66
  ### 2b1. Recommended runtime variable matrix
 
72
  | `NEAR_DINO_REPO_SUBDIR` | `dinov2` | `dinov2` |
73
  | `NEAR_DINO_MODEL_ID` | leave unset unless your mirror renames hubconf entries | leave unset unless your mirror renames hubconf entries |
74
  | `NEAR_DINO_FILENAME` | optional validation file inside the mirror repo | optional validation file inside the mirror repo |
 
75
  | `NEAR_GSPLAT_SOURCE_SPEC` | unset unless you have a proven build path | optional if you want build-time source compile |
76
  | `NEAR_ZEROGPU_HF_CEILING_S` | `90` | tune to your tier |
77
  | `NEAR_HYSHAPE_GEOMETRY_CPU_PRELOAD_AT_START` | `1` when Space entry is **`app_hyshape.py`** (default: background thread runs `from_pretrained(..., device="cpu")` at startup, with **no** `@spaces.GPU`) | `0` to defer CPU load until the first **Generate Mesh** click (inside the GPU callback; longer first click) |
 
100
  |----------|---------|
101
  | **`NEAR_GSPLAT_WHEEL_URL`** | If set, `app.py` / `app_gsplat.py` runs `pip install --force-reinstall` on this URL **before** importing gsplat/trellis. Use when you host a **cp310 manylinux** wheel on `near-wheels` built against **your exact** PyTorch/CUDA pin (official gsplat wheels top out at **pt2.4+cu124**; see `requirements.txt`). |
102
  | **`NEAR_GSPLAT_SOURCE_SPEC`** | If set, `app.py` runs `pip install --no-build-isolation` on this spec **before** importing `trellis` (e.g. `./third_party/gsplat` after vendoring, or `git+https://github.com/nerfstudio-project/gsplat.git@<tag>`). Needs **nvcc**, which is usually absent on the default Gradio builder. |
 
103
 
104
  Alternatively, pin a **VCS** gsplat line in `requirements.txt` (e.g. `gsplat @ git+https://...`) so the **Space build** step compiles once; no `NEAR_GSPLAT_SOURCE_SPEC` needed.
105
 
 
111
  **Trade-offs vs fixed GPU + Docker**
112
 
113
  - **ZeroGPU + Gradio builder**: the image may **not** include a full CUDA toolkit (`nvcc`, `CUDA_HOME`). **`git+…nvdiffrast`** (source install) often fails here. This repo uses a **prebuilt `nvdiffrast` wheel** URL in `requirements.txt` (see **§5**) so the builder only downloads a wheel. If that wheel is ABI-incompatible with our PyTorch pin, build your own wheel or add a **`Dockerfile`** with `nvidia/cuda:*-devel` and fixed GPU.
114
+ - **Quota**: visitors consume **daily GPU seconds**. This repo keeps plain `@spaces.GPU` in Python and clamps **`NEAR_ZEROGPU_*`** via **`NEAR_ZEROGPU_HF_CEILING_S`** rather than setting `duration=` in app code.
115
 
116
  ### 2c. Example gallery empty on the Space (`assets/` “not deployed”)
117
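
The Space Variables clamp described in the deploy notes above can be sketched as below. This is an illustrative version that works on any mapping, not the module-level code in `app.py`: the function name `clamp_zerogpu_env` is hypothetical, and the lower bound of 1 second is an assumption that the notes do not state.

```python
from typing import MutableMapping


def clamp_zerogpu_env(
    env: MutableMapping[str, str],
    default_cap: int = 90,
    hard_max: int = 120,
) -> int:
    """Clamp the NEAR_ZEROGPU_* duration variables in place and return the cap."""
    try:
        cap = int(env.get("NEAR_ZEROGPU_HF_CEILING_S", str(default_cap)))
    except ValueError:
        cap = default_cap
    # Assumption: keep the cap within [1, hard_max].
    cap = max(1, min(cap, hard_max))

    for key in ("NEAR_ZEROGPU_MAX_SECONDS", "NEAR_ZEROGPU_DURATION_CAP"):
        raw = env.get(key)
        if raw is None:
            continue
        try:
            if int(raw) > cap:
                env[key] = str(cap)  # rewrite only values above the cap
        except ValueError:
            pass  # leave non-numeric values untouched
    return cap
```

In the app this would be called with `os.environ` before `import spaces`, so that an oversized value such as `300` is already rewritten by the time the platform reads it.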
 
README.md CHANGED
@@ -51,7 +51,7 @@ This repository combines:
51
  - The Space is temporarily pointed at **`app_gsplat.py`** (gsplat **RGB+ED** raster only) to test JIT / ZeroGPU without NeAR or Hunyuan. Switch **`app_file`** to **`app_hyshape.py`**, **`app.py`**, etc. in the YAML header above as needed.
52
  - **`app_hyshape.py`** (when used as entry): defaults to **`NEAR_HYSHAPE_GEOMETRY_CPU_PRELOAD_AT_START=1`** (background **CPU** Hunyuan load at start); **Generate Mesh** pays **GPU move + inference** in `@spaces.GPU`.
53
  - The full `app.py` Space keeps **page-load image defaults** and **HDRI preview** on lightweight CPU paths so the first page visit does not spend the first ZeroGPU allocation on model initialization.
54
- - Runtime loading is split by responsibility: **Hunyuan3D geometry** is loaded only for mesh generation, **NeAR relighting** is loaded only for SLaT/render/export, and **gsplat warmup** is delayed until the first real render.
55
  - Binary wheels and mirrored auxiliary assets are stored separately:
56
  - **`luh0502/near-wheels`**: prebuilt wheels such as `nvdiffrast` and optional future `gsplat` wheels
57
  - **`luh0502/near-assets`**: torch.hub-compatible mirrored auxiliary assets such as the DINOv2 repo used by NeAR/TRELLIS image-conditioning
 
51
  - The Space is temporarily pointed at **`app_gsplat.py`** (gsplat **RGB+ED** raster only) to test JIT / ZeroGPU without NeAR or Hunyuan. Switch **`app_file`** to **`app_hyshape.py`**, **`app.py`**, etc. in the YAML header above as needed.
52
  - **`app_hyshape.py`** (when used as entry): defaults to **`NEAR_HYSHAPE_GEOMETRY_CPU_PRELOAD_AT_START=1`** (background **CPU** Hunyuan load at start); **Generate Mesh** pays **GPU move + inference** in `@spaces.GPU`.
53
  - The full `app.py` Space keeps **page-load image defaults** and **HDRI preview** on lightweight CPU paths so the first page visit does not spend the first ZeroGPU allocation on model initialization.
54
+ - **`app.py`**: optional background **CPU** preload of Hunyuan + NeAR (`NEAR_MODEL_CPU_PRELOAD_AT_START`); **`@spaces.GPU`** callbacks move each pipeline to CUDA once, then run inference. **gsplat** is used when the pipeline renders (no separate app-level warmup pass).
55
  - Binary wheels and mirrored auxiliary assets are stored separately:
56
  - **`luh0502/near-wheels`**: prebuilt wheels such as `nvdiffrast` and optional future `gsplat` wheels
57
  - **`luh0502/near-assets`**: torch.hub-compatible mirrored auxiliary assets such as the DINOv2 repo used by NeAR/TRELLIS image-conditioning
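
For the `@spaces.GPU` callbacks mentioned in the README notes above, the optional-import fallback can be sketched like this. The alias line mirrors the one visible in the `app.py` hunk; the `try/except` wrapper and the `double` callback are illustrative assumptions.

```python
try:
    import spaces  # provided on Hugging Face Spaces
except ImportError:
    spaces = None  # local run without the package

# Bare decorator when available; identity function otherwise.
GPU = spaces.GPU if spaces is not None else (lambda f: f)


@GPU
def double(x: int) -> int:
    """Hypothetical callback; a real one would run CUDA work here."""
    return 2 * x
```

Because the fallback returns the function unchanged, the same callbacks run locally without any ZeroGPU scheduling.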
app.py CHANGED
@@ -1,20 +1,12 @@
1
  import os
2
  import sys
3
 
4
- # transformers/huggingface_hub authenticate gated repos via HF_TOKEN (or HUGGING_FACE_HUB_TOKEN).
5
- # Space secrets become env vars named exactly like the secret: a secret named "near" sets "near",
6
- # not HF_TOKEN, so Hub calls stay anonymous until we mirror it here.
7
  if not os.environ.get("HF_TOKEN") and not os.environ.get("HUGGING_FACE_HUB_TOKEN"):
8
  _hub_tok = (os.environ.get("near") or os.environ.get("NEAR") or "").strip()
9
  if _hub_tok:
10
  os.environ["HF_TOKEN"] = _hub_tok
11
- print(
12
- "[NeAR] HF_TOKEN unset; using Space secret 'near' as HF_TOKEN. "
13
- "Prefer renaming that secret to HF_TOKEN (standard Hub env name).",
14
- flush=True,
15
- )
16
 
17
- # ZeroGPU: must run before `import spaces`. Space Variables often leave NEAR_* at 300/1800; HF still rejects those.
18
  try:
19
  _raw_zerogpu_cap = int(os.environ.get("NEAR_ZEROGPU_HF_CEILING_S", "90"))
20
  except ValueError:
@@ -27,12 +19,7 @@ for _ek in ("NEAR_ZEROGPU_MAX_SECONDS", "NEAR_ZEROGPU_DURATION_CAP"):
27
  os.environ[_ek] = str(_ZEROGPU_ENV_CAP_S)
28
  except ValueError:
29
  pass
30
- print(
31
- f"[NeAR] ZeroGPU: NEAR_ZEROGPU_MAX_SECONDS / NEAR_ZEROGPU_DURATION_CAP clamped to cap {_ZEROGPU_ENV_CAP_S}s "
32
- f"(adjust NEAR_ZEROGPU_HF_CEILING_S up to 120 if your tier allows). "
33
- f"Gradio callbacks use plain spaces.GPU (platform default duration).",
34
- flush=True,
35
- )
36
 
37
  import shutil
38
  import subprocess
@@ -71,72 +58,12 @@ except Exception:
71
  sys.path.insert(0, "./hy3dshape")
72
  os.environ.setdefault("ATTN_BACKEND", "xformers")
73
  os.environ.setdefault("SPCONV_ALGO", "native")
74
- # Cloud GPUs (T4, A10G, L4, …) vs H100: override with TORCH_CUDA_ARCH_LIST if cutlass/spconv complains.
75
  os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "7.5;8.0;8.6;8.9;9.0")
76
 
77
 
78
- def _maybe_reinstall_gsplat_wheel() -> None:
79
- """Optional: force-install a prebuilt gsplat wheel (e.g. from ``near-wheels``) before trellis import."""
80
-
81
- url = (os.environ.get("NEAR_GSPLAT_WHEEL_URL") or "").strip()
82
- if not url:
83
- return
84
- cmd = [
85
- sys.executable,
86
- "-m",
87
- "pip",
88
- "install",
89
- "--no-cache-dir",
90
- "--force-reinstall",
91
- url,
92
- ]
93
- print(f"[NeAR] NEAR_GSPLAT_WHEEL_URL set; pip install: {url}", flush=True)
94
- r = subprocess.run(cmd, check=False)
95
- if r.returncode != 0:
96
- print(
97
- f"[NeAR] WARNING: gsplat wheel install failed (exit {r.returncode}).",
98
- flush=True,
99
- )
100
-
101
-
102
- def _maybe_reinstall_gsplat_from_source() -> None:
103
- """Optional pip install before importing trellis (E-RayZer-style source build).
104
-
105
- Set ``NEAR_GSPLAT_SOURCE_SPEC`` to a pip requirement, e.g.:
106
- ``./third_party/gsplat`` or ``git+https://github.com/nerfstudio-project/gsplat.git@v1.5.3``
107
-
108
- Compiles CUDA extensions at container start instead of on first rasterization.
109
- """
110
-
111
- spec = (os.environ.get("NEAR_GSPLAT_SOURCE_SPEC") or "").strip()
112
- if not spec:
113
- return
114
- cmd = [
115
- sys.executable,
116
- "-m",
117
- "pip",
118
- "install",
119
- "--no-build-isolation",
120
- "--no-cache-dir",
121
- spec,
122
- ]
123
- print(f"[NeAR] NEAR_GSPLAT_SOURCE_SPEC set; building gsplat via: {' '.join(cmd)}", flush=True)
124
- r = subprocess.run(cmd, check=False)
125
- if r.returncode != 0:
126
- print(
127
- f"[NeAR] WARNING: gsplat install from NEAR_GSPLAT_SOURCE_SPEC failed (exit {r.returncode}).",
128
- flush=True,
129
- )
130
-
131
-
132
- _maybe_reinstall_gsplat_wheel()
133
- _maybe_reinstall_gsplat_from_source()
134
-
135
  from trellis.pipelines import NeARImageToRelightable3DPipeline
136
  from hy3dshape.pipelines import Hunyuan3DDiTFlowMatchingPipeline # pyright: ignore[reportMissingImports]
137
 
138
-
139
- # Hugging Face ZeroGPU: same style as E-RayZer, i.e. bare ``spaces.GPU`` (no custom duration in app code).
140
  GPU = spaces.GPU if spaces is not None else (lambda f: f)
141
 
142
  APP_DIR = Path(__file__).resolve().parent
@@ -145,8 +72,6 @@ CACHE_DIR.mkdir(exist_ok=True)
145
 
146
 
147
  def _path_is_git_lfs_pointer(p: Path) -> bool:
148
- """True if this path is a tiny Git LFS pointer file (real media was never smudged / pushed)."""
149
-
150
  try:
151
  if not p.is_file():
152
  return False
@@ -159,8 +84,6 @@ def _path_is_git_lfs_pointer(p: Path) -> bool:
159
 
160
 
161
  def _warn_example_assets() -> None:
162
- """Log once if bundled examples are missing or still LFS pointer stubs (common Space deploy issue)."""
163
-
164
  img_dir = APP_DIR / "assets/example_image"
165
  if not img_dir.is_dir():
166
  print(
@@ -186,10 +109,6 @@ DEFAULT_PORT = 7860
186
  MAX_SEED = np.iinfo(np.int32).max
187
 
188
 
189
- # ---------------------------------------------------------------------------
190
- # Session helpers
191
- # ---------------------------------------------------------------------------
192
-
193
  _SESSION_LAST_TOUCH: Dict[str, float] = {}
194
  _SESSION_TOUCH_LOCK = threading.Lock()
195
 
@@ -212,7 +131,6 @@ def ensure_session_dir(req: Optional[gr.Request]) -> Path:
212
  return d
213
 
214
 
215
- @GPU
216
  def clear_session_dir(req: Optional[gr.Request]) -> str:
217
  d = ensure_session_dir(req)
218
  shutil.rmtree(d, ignore_errors=True)
@@ -230,8 +148,6 @@ def end_session(req: gr.Request):
230
 
231
 
232
  def _session_dir_latest_mtime(path: Path) -> float:
233
- """Latest mtime among path and all nested files (best-effort for 'last activity')."""
234
-
235
  try:
236
  latest = path.stat().st_mtime
237
  except OSError:
@@ -321,21 +237,30 @@ def get_file_path(file_obj: Any) -> Optional[str]:
321
  return None
322
 
323
 
324
- # ---------------------------------------------------------------------------
325
- # Model loading (lazy: ZeroGPU may have no CUDA until @spaces.GPU runs)
326
- # ---------------------------------------------------------------------------
327
-
328
  _model_lock = threading.Lock()
329
  PIPELINE: Optional[NeARImageToRelightable3DPipeline] = None
330
  GEOMETRY_PIPELINE: Optional[Hunyuan3DDiTFlowMatchingPipeline] = None
331
  _light_preprocess_lock = threading.Lock()
332
  _light_preprocessor: Any | None = None
333
- _gsplat_warmup_done = False
 
334
 
335
- # Dropdown defaults before lazy load; use allow_custom_value for full OCIO view names.
336
  _FALLBACK_TONE_MAPPER_CHOICES = ["AgX", "False", "Khronos neutrals", "Filmic", "Khronos glTF PBR"]
337
 
338
 
 
 
 
 
 
 
 
 
 
 
 
 
 
339
  def _default_tone_mapper_choices() -> list[str]:
340
  try:
341
  views = getattr(ToneMapper(), "available_views", None)
@@ -349,75 +274,6 @@ def _default_tone_mapper_choices() -> list[str]:
349
  TONE_MAPPER_CHOICES = _default_tone_mapper_choices()
350
 
351
 
352
- def _warmup_gsplat_rasterization(device: str) -> None:
353
- """One tiny RGB+ED raster pass so first user render does not pay JIT alone.
354
-
355
- Matches channel layout used in ``trellis/renderers/gaussian_render.py``.
356
- Disable with ``NEAR_GSPLAT_WARMUP=0`` if ZeroGPU budget is too tight.
357
- """
358
-
359
- if device != "cuda":
360
- return
361
- if os.environ.get("NEAR_GSPLAT_WARMUP", "1").strip().lower() in ("0", "false", "no", "off"):
362
- return
363
- try:
364
- from gsplat.rendering import rasterization as _gsplat_rasterization
365
- except Exception as exc:
366
- print(f"[NeAR] gsplat warmup skipped (import): {exc}", flush=True)
367
- return
368
-
369
- dev = torch.device("cuda")
370
- t_w = time.time()
371
- n, h, w = 1, 64, 64
372
- try:
373
- means = torch.zeros(n, 3, device=dev, dtype=torch.float32)
374
- quats = torch.tensor([[1.0, 0.0, 0.0, 0.0]], device=dev, dtype=torch.float32)
375
- scales = torch.ones(n, 3, device=dev, dtype=torch.float32) * 0.02
376
- opacities = torch.ones(n, device=dev, dtype=torch.float32)
377
- colors = torch.ones(n, 8, device=dev, dtype=torch.float32)
378
- viewmat = torch.eye(4, device=dev, dtype=torch.float32)
379
- k = torch.tensor(
380
- [[80.0, 0.0, w * 0.5], [0.0, 80.0, h * 0.5], [0.0, 0.0, 1.0]],
381
- device=dev,
382
- dtype=torch.float32,
383
- )
384
- backgrounds = torch.zeros(1, 9, device=dev, dtype=torch.float32)
385
- _gsplat_rasterization(
386
- means=means,
387
- quats=quats,
388
- scales=scales,
389
- opacities=opacities,
390
- colors=colors,
391
- viewmats=viewmat[None],
392
- Ks=k[None],
393
- backgrounds=backgrounds,
394
- width=w,
395
- height=h,
396
- near_plane=0.01,
397
- far_plane=100.0,
398
- distributed=False,
399
- render_mode="RGB+ED",
400
- rasterize_mode="antialiased",
401
- packed=False,
402
- )
403
- torch.cuda.synchronize()
404
- print(f"[NeAR] gsplat warmup (RGB+ED {w}x{h}) done in {time.time() - t_w:.1f}s", flush=True)
405
- except Exception as exc:
406
- print(f"[NeAR] gsplat warmup skipped (rasterization): {exc}", flush=True)
407
-
408
-
409
- def _runtime_device() -> str:
410
- return "cuda" if torch.cuda.is_available() else "cpu"
411
-
412
-
413
- def _log_timing(stage: str, started_at: float, **extra: object) -> float:
414
- elapsed = time.time() - started_at
415
- details = ", ".join(f"{k}={v}" for k, v in extra.items() if v is not None)
416
- suffix = f" ({details})" if details else ""
417
- print(f"[NeAR] timing {stage}: {elapsed:.1f}s{suffix}", flush=True)
418
- return elapsed
419
-
420
-
421
  def _get_light_image_preprocessor():
422
  global _light_preprocessor
423
  if _light_preprocessor is not None:
@@ -427,12 +283,11 @@ def _get_light_image_preprocessor():
427
  from hy3dshape.rembg import BackgroundRemover # pyright: ignore[reportMissingImports]
428
 
429
  _light_preprocessor = BackgroundRemover()
430
- print("[NeAR] Background remover ready for lightweight image preprocessing.", flush=True)
431
  return _light_preprocessor
432
 
433
 
434
  def _preprocess_image_rgba_light(input_image: Image.Image) -> Image.Image:
435
- """Background-remove, crop, and resize without loading the NeAR pipeline."""
436
  image = _ensure_rgba(input_image)
437
  has_alpha = False
438
  if image.mode == "RGBA":
@@ -488,69 +343,79 @@ def _update_tone_mapper_choices(tone_mapper: Any) -> None:
488
  TONE_MAPPER_CHOICES = [str(v) for v in views]
489
 
490
 
491
- def ensure_geometry_pipeline() -> Hunyuan3DDiTFlowMatchingPipeline:
492
  global GEOMETRY_PIPELINE
493
  if GEOMETRY_PIPELINE is not None:
494
- return GEOMETRY_PIPELINE
495
- with _model_lock:
496
- if GEOMETRY_PIPELINE is not None:
497
- return GEOMETRY_PIPELINE
498
- device = _runtime_device()
499
- hy_id = os.environ.get("NEAR_HUNYUAN_PRETRAINED", "tencent/Hunyuan3D-2.1")
500
- t0 = time.time()
501
- print("[NeAR] Loading Hunyuan3D geometry pipeline...", flush=True)
502
- gp = Hunyuan3DDiTFlowMatchingPipeline.from_pretrained(hy_id, device="cpu")
503
- print(f"[NeAR] Hunyuan3D from_pretrained (cpu): {time.time() - t0:.1f}s", flush=True)
504
- t_move = time.time()
505
- gp.to(device)
506
- print(f"[NeAR] Hunyuan3D moved to {device} in {time.time() - t_move:.1f}s", flush=True)
507
- GEOMETRY_PIPELINE = gp
508
- print(f"[NeAR] Geometry pipeline ready on {device} ({time.time() - t0:.1f}s total).", flush=True)
509
- return GEOMETRY_PIPELINE
510
 
511
 
512
- def ensure_near_pipeline() -> NeARImageToRelightable3DPipeline:
513
  global PIPELINE
514
  if PIPELINE is not None:
515
- return PIPELINE
 
 
 
 
 
 
 
 
 
 
 
 
516
  with _model_lock:
517
- if PIPELINE is not None:
518
- return PIPELINE
519
- device = _runtime_device()
520
- # briaai/RMBG-2.0 is gated: accept the license on the model card, then add HF_TOKEN
521
- # (read) in Space Settings -> Secrets. Never commit tokens into git.
522
- near_id = os.environ.get("NEAR_PRETRAINED", "luh0502/NeAR")
523
- t0 = time.time()
524
- print(f"[NeAR] Loading NeAR relighting pipeline from {near_id!r} on target {device}...", flush=True)
525
- t_stage = time.time()
526
- pipeline = NeARImageToRelightable3DPipeline.from_pretrained(near_id)
527
- _log_timing("ensure_near_pipeline.from_pretrained", t_stage, repo=near_id)
528
- t_move = time.time()
529
- pipeline.to(device)
530
- _log_timing("ensure_near_pipeline.to_device", t_move, device=device)
531
- PIPELINE = pipeline
532
- _update_tone_mapper_choices(pipeline.tone_mapper)
533
- _log_timing("ensure_near_pipeline.total", t0, device=device)
534
- return PIPELINE
535
 
536
 
537
- def ensure_gsplat_ready() -> None:
538
- global _gsplat_warmup_done
539
- if _gsplat_warmup_done:
540
- return
541
  with _model_lock:
542
- if _gsplat_warmup_done:
543
- return
544
- device = _runtime_device()
 
 
 
 
 
 
 
 
 
545
  t0 = time.time()
546
- print(f"[NeAR] Preparing gsplat on {device}...", flush=True)
547
- _warmup_gsplat_rasterization(device)
548
- _gsplat_warmup_done = True
549
- _log_timing("ensure_gsplat_ready.total", t0, device=device)
 
 
 
 
 
 
 
 
 
 
 
550
 
551
 
552
  def set_tone_mapper(view_name: str):
553
- pipeline = ensure_near_pipeline()
554
  if view_name:
555
  pipeline.setup_tone_mapper(view_name)
556
  return pipeline
@@ -577,7 +442,6 @@ def switch_asset_source(mode: str):
577
 
578
 
579
  def _ensure_rgba(img: Image.Image) -> Image.Image:
580
- """Normalize to RGBA so alpha is preserved for mesh (white matte) vs SLaT (black matte)."""
581
  if img.mode == "RGBA":
582
  return img
583
  if img.mode == "RGB":
@@ -602,10 +466,6 @@ def save_slat_npz(slat, save_path: Path):
602
  )
603
 
604
 
605
- # ---------------------------------------------------------------------------
606
- # Core pipeline functions
607
- # ---------------------------------------------------------------------------
608
-
609
  @GPU
610
  @torch.inference_mode()
611
  def generate_mesh(
@@ -613,10 +473,7 @@ def generate_mesh(
613
  req: gr.Request,
614
  progress=gr.Progress(track_tqdm=True),
615
  ):
616
- """Step ①: generate Hunyuan3D geometry from an already preprocessed image.
617
- Returns: (state, mesh_glb_path, status)
618
- """
619
- geometry_pipeline = ensure_geometry_pipeline()
620
  session_dir = ensure_session_dir(req)
621
 
622
  if image_input is None:
@@ -657,7 +514,7 @@ def generate_slat(
657
  req: gr.Request,
658
  progress=gr.Progress(track_tqdm=True),
659
  ):
660
- pipeline = ensure_near_pipeline()
661
  session_dir = ensure_session_dir(req)
662
 
663
  if not asset_state or not asset_state.get("mesh_path"):
@@ -749,30 +606,17 @@ def render_preview(
749
  t0 = time.time()
750
  session_dir = ensure_session_dir(req)
751
  progress(0.1, desc="Loading SLaT and HDRI")
752
- t_load = time.time()
753
  pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
754
- load_elapsed = _log_timing("render_preview.load_asset_and_hdri", t_load)
755
- t_gsplat = time.time()
756
- ensure_gsplat_ready()
757
- gsplat_elapsed = _log_timing("render_preview.ensure_gsplat_ready", t_gsplat)
758
 
759
  progress(0.5, desc="Rendering")
760
- t_render = time.time()
761
  views = pipeline.render_view(
762
  slat, hdri_np,
763
  yaw_deg=yaw, pitch_deg=pitch, fov=fov, radius=radius,
764
  hdri_rot_deg=hdri_rot, resolution=int(resolution),
765
  )
766
- render_elapsed = _log_timing("render_preview.render_view", t_render, resolution=int(resolution))
767
  for key, image in views.items():
768
  image.save(session_dir / f"preview_{key}.png")
769
- _log_timing(
770
- "render_preview.total",
771
- t0,
772
- load_asset_and_hdri=f"{load_elapsed:.1f}s",
773
- ensure_gsplat_ready=f"{gsplat_elapsed:.1f}s",
774
- render_view=f"{render_elapsed:.1f}s",
775
- )
776
 
777
  msg = (
778
  f"**Preview done** - "
@@ -808,32 +652,18 @@ def render_camera_video(
808
  t0 = time.time()
809
  session_dir = ensure_session_dir(req)
810
  progress(0.1, desc="Loading SLaT and HDRI")
811
- t_load = time.time()
812
  pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
813
- load_elapsed = _log_timing("render_camera_video.load_asset_and_hdri", t_load)
814
- t_gsplat = time.time()
815
- ensure_gsplat_ready()
816
- gsplat_elapsed = _log_timing("render_camera_video.ensure_gsplat_ready", t_gsplat)
817
 
818
  progress(0.4, desc="Rendering camera path")
819
- t_render = time.time()
820
  frames = pipeline.render_camera_path_video(
821
  slat, hdri_np,
822
  num_views=int(num_views), fov=fov, radius=radius,
823
  hdri_rot_deg=hdri_rot, full_video=full_video, shadow_video=shadow_video,
824
  bg_color=(1, 1, 1), verbose=True,
825
  )
826
- render_elapsed = _log_timing("render_camera_video.render_path", t_render, num_views=int(num_views))
827
  video_path = session_dir / ("camera_path_full.mp4" if full_video else "camera_path.mp4")
828
  imageio.mimsave(video_path, frames, fps=int(fps))
829
- _log_timing(
830
- "render_camera_video.total",
831
- t0,
832
- load_asset_and_hdri=f"{load_elapsed:.1f}s",
833
- ensure_gsplat_ready=f"{gsplat_elapsed:.1f}s",
834
- render_path=f"{render_elapsed:.1f}s",
835
- fps=int(fps),
836
- )
837
  return str(video_path), f"**Camera path video saved**"
838
 
839
 
@@ -857,34 +687,20 @@ def render_hdri_video(
857
  t0 = time.time()
858
  session_dir = ensure_session_dir(req)
859
  progress(0.1, desc="Loading SLaT and HDRI")
860
- t_load = time.time()
861
  pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
862
- load_elapsed = _log_timing("render_hdri_video.load_asset_and_hdri", t_load)
863
- t_gsplat = time.time()
864
- ensure_gsplat_ready()
865
- gsplat_elapsed = _log_timing("render_hdri_video.ensure_gsplat_ready", t_gsplat)
866
 
867
  progress(0.4, desc="Rendering HDRI rotation")
868
- t_render = time.time()
869
  hdri_roll_frames, render_frames = pipeline.render_hdri_rotation_video(
870
  slat, hdri_np,
871
  num_frames=int(num_frames), yaw_deg=yaw, pitch_deg=pitch,
872
  fov=fov, radius=radius, full_video=full_video, shadow_video=shadow_video,
873
  bg_color=(1, 1, 1), verbose=True,
874
  )
875
- render_elapsed = _log_timing("render_hdri_video.render_rotation", t_render, num_frames=int(num_frames))
876
  hdri_roll_path = session_dir / "hdri_roll.mp4"
877
  render_path = session_dir / ("hdri_rotation_full.mp4" if full_video else "hdri_rotation.mp4")
878
  imageio.mimsave(hdri_roll_path, hdri_roll_frames, fps=int(fps))
879
  imageio.mimsave(render_path, render_frames, fps=int(fps))
880
- _log_timing(
881
- "render_hdri_video.total",
882
- t0,
883
- load_asset_and_hdri=f"{load_elapsed:.1f}s",
884
- ensure_gsplat_ready=f"{gsplat_elapsed:.1f}s",
885
- render_rotation=f"{render_elapsed:.1f}s",
886
- fps=int(fps),
887
- )
888
  return str(hdri_roll_path), str(render_path), "**HDRI rotation video saved**"
889
 
890
 
@@ -899,46 +715,24 @@ def export_glb(
899
  req: gr.Request,
900
  progress=gr.Progress(track_tqdm=True),
901
  ):
902
- """Returns: (glb_path, status)"""
903
  t0 = time.time()
904
  session_dir = ensure_session_dir(req)
905
  progress(0.1, desc="Loading SLaT and HDRI")
906
- t_load = time.time()
907
  pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
908
- load_elapsed = _log_timing("export_glb.load_asset_and_hdri", t_load)
909
- t_gsplat = time.time()
910
- ensure_gsplat_ready()
911
- gsplat_elapsed = _log_timing("export_glb.ensure_gsplat_ready", t_gsplat)
912
 
913
  progress(0.6, desc="Baking PBR textures")
914
- t_export = time.time()
915
  glb = pipeline.export_glb_from_slat(
916
  slat, hdri_np,
917
  hdri_rot_deg=hdri_rot, base_mesh=None,
918
  simplify=simplify, texture_size=int(texture_size), fill_holes=True,
919
  )
920
- export_elapsed = _log_timing(
921
- "export_glb.export_glb_from_slat",
922
- t_export,
923
- texture_size=int(texture_size),
924
- )
925
  glb_path = session_dir / "near_pbr.glb"
926
  glb.export(glb_path)
927
- _log_timing(
928
- "export_glb.total",
929
- t0,
930
- load_asset_and_hdri=f"{load_elapsed:.1f}s",
931
- ensure_gsplat_ready=f"{gsplat_elapsed:.1f}s",
932
- export_glb_from_slat=f"{export_elapsed:.1f}s",
933
- )
934
  return str(glb_path), f"PBR GLB exported: **{glb_path.name}**"
935
 
936
 
937
- # ---------------------------------------------------------------------------
938
- # CSS
939
- # ---------------------------------------------------------------------------
940
  CUSTOM_CSS = """
941
- /* Use full browser width (was max-width:1600px leaving empty margin on the right) */
942
  .gradio-container { max-width: 100% !important; width: 100% !important; }
943
  main.gradio-container { max-width: 100% !important; }
944
  .gradio-wrap { max-width: 100% !important; }
@@ -1163,13 +957,9 @@ NEAR_GRADIO_THEME = gr.themes.Base(
1163
  )
1164
 
1165
 
1166
- # ---------------------------------------------------------------------------
1167
- # UI
1168
- # ---------------------------------------------------------------------------
1169
  def build_app() -> gr.Blocks:
1170
  with gr.Blocks(
1171
  title="NeAR",
1172
- # (600, 600) deletes Gradio /tmp/gradio uploads in ~10m while the UI may still reference paths.
1173
  delete_cache=None,
1174
  fill_width=True,
1175
  ) as demo:
@@ -1208,9 +998,6 @@ def build_app() -> gr.Blocks:
1208
 
1209
  with gr.Row(equal_height=False):
1210
 
1211
- # ════════════════════════════════════════════════════════════════
1212
- # LEFT - controls only (TRELLIS-style narrow column)
1213
- # ════════════════════════════════════════════════════════════════
1214
  with gr.Column(scale=1, min_width=360):
1215
 
1216
  with gr.Group():
@@ -1242,10 +1029,6 @@ def build_app() -> gr.Blocks:
1242
  slat_button = gr.Button(
1243
  "② Generate / Load SLaT", variant="primary", min_width=100,
1244
  )
1245
- # gr.HTML(
1246
- # "<div style='font-size:0.78rem;color:#9ca3af;margin-top:0.2rem;'>"
1247
- # "Image mode: run ① then ②. SLaT mode: ② loads file directly.</div>"
1248
- # )
1249
 
1250
  with gr.Group():
1251
  gr.HTML('<p class="section-kicker">HDRI</p>')
@@ -1277,9 +1060,6 @@ def build_app() -> gr.Blocks:
1277
  with gr.Row():
1278
  clear_button = gr.Button("Clear Cache", variant="secondary", min_width=100)
1279
 
1280
- # ════════════════════════════════════════════════════════════════
1281
- # CENTER - status at top, then Camera & HDRI, then tabs
1282
- # ════════════════════════════════════════════════════════════════
1283
  with gr.Column(scale=10, min_width=560):
1284
 
1285
  status_md = gr.Markdown(
@@ -1363,10 +1143,6 @@ def build_app() -> gr.Blocks:
1363
  label="HDRI Roll", autoplay=True, loop=True, height=180,
1364
  )
1365
 
1366
-
1367
- # ════════════════════════════════════════════════════════════════
1368
- # RIGHT - examples sidebar (TRELLIS-style narrow column)
1369
- # ════════════════════════════════════════════════════════════════
1370
  with gr.Column(scale=1, min_width=172):
1371
  with gr.Column(visible=True, elem_classes=["sidebar-examples", "img-gallery"]) as col_img_examples:
1372
  if _img_ex:
@@ -1403,7 +1179,6 @@ def build_app() -> gr.Blocks:
1403
  else:
1404
  gr.Markdown("*No `.exr` examples in `assets/hdris`*")
1405
 
1406
- # ── Event wiring ─────────────────────────────────────────────────────
1407
  demo.unload(end_session)
1408
 
1409
  source_mode.change(switch_asset_source, inputs=[source_mode], outputs=[source_tabs])
@@ -1423,7 +1198,6 @@ def build_app() -> gr.Blocks:
1423
  outputs=[hdri_preview, status_md],
1424
  )
1425
 
1426
- # Same as TRELLIS.2 app.py: only on upload, which avoids an infinite preprocess loop.
1427
  image_input.upload(
1428
  preprocess_image_only,
1429
  inputs=[image_input],
@@ -1514,11 +1288,11 @@ def _near_launch(*args: Any, **kwargs: Any):
1514
 
1515
  demo.launch = _near_launch # type: ignore[method-assign]
1516
 
 
 
 
1517
  start_tmp_gradio_pruner()
1518
 
1519
- # ---------------------------------------------------------------------------
1520
- # Entry point
1521
- # ---------------------------------------------------------------------------
1522
  if __name__ == "__main__":
1523
  import argparse
1524
 
 
1
  import os
2
  import sys
3
 
 
 
 
4
  if not os.environ.get("HF_TOKEN") and not os.environ.get("HUGGING_FACE_HUB_TOKEN"):
5
  _hub_tok = (os.environ.get("near") or os.environ.get("NEAR") or "").strip()
6
  if _hub_tok:
7
  os.environ["HF_TOKEN"] = _hub_tok
8
+ print("[NeAR] HF_TOKEN from Space secret 'near'.", flush=True)
 
 
 
 
9
 
 
10
  try:
11
  _raw_zerogpu_cap = int(os.environ.get("NEAR_ZEROGPU_HF_CEILING_S", "90"))
12
  except ValueError:
 
19
  os.environ[_ek] = str(_ZEROGPU_ENV_CAP_S)
20
  except ValueError:
21
  pass
22
+ print(f"[NeAR] ZeroGPU cap {_ZEROGPU_ENV_CAP_S}s (NEAR_ZEROGPU_HF_CEILING_S).", flush=True)
 
 
 
 
 
23
 
24
  import shutil
25
  import subprocess
 
58
  sys.path.insert(0, "./hy3dshape")
59
  os.environ.setdefault("ATTN_BACKEND", "xformers")
60
  os.environ.setdefault("SPCONV_ALGO", "native")
 
61
  os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "7.5;8.0;8.6;8.9;9.0")
62
 
63
 
64
  from trellis.pipelines import NeARImageToRelightable3DPipeline
65
  from hy3dshape.pipelines import Hunyuan3DDiTFlowMatchingPipeline # pyright: ignore[reportMissingImports]
66
 
 
 
67
  GPU = spaces.GPU if spaces is not None else (lambda f: f)
68
 
69
  APP_DIR = Path(__file__).resolve().parent
 
72
 
73
 
74
  def _path_is_git_lfs_pointer(p: Path) -> bool:
 
 
75
  try:
76
  if not p.is_file():
77
  return False
 
84
 
85
 
86
  def _warn_example_assets() -> None:
 
 
87
  img_dir = APP_DIR / "assets/example_image"
88
  if not img_dir.is_dir():
89
  print(
 
109
  MAX_SEED = np.iinfo(np.int32).max
110
 
111
 
 
 
 
 
112
  _SESSION_LAST_TOUCH: Dict[str, float] = {}
113
  _SESSION_TOUCH_LOCK = threading.Lock()
114
 
 
131
  return d
132
 
133
 
 
134
  def clear_session_dir(req: Optional[gr.Request]) -> str:
135
  d = ensure_session_dir(req)
136
  shutil.rmtree(d, ignore_errors=True)
 
148
 
149
 
150
  def _session_dir_latest_mtime(path: Path) -> float:
 
 
151
  try:
152
  latest = path.stat().st_mtime
153
  except OSError:
 
237
  return None
238
 
239
 
 
 
 
 
240
  _model_lock = threading.Lock()
241
  PIPELINE: Optional[NeARImageToRelightable3DPipeline] = None
242
  GEOMETRY_PIPELINE: Optional[Hunyuan3DDiTFlowMatchingPipeline] = None
243
  _light_preprocess_lock = threading.Lock()
244
  _light_preprocessor: Any | None = None
245
+ _geometry_on_cuda = False
246
+ _near_on_cuda = False
247
 
 
248
  _FALLBACK_TONE_MAPPER_CHOICES = ["AgX", "False", "Khronos neutrals", "Filmic", "Khronos glTF PBR"]
249
 
250
 
251
+ def _truthy_env(name: str, default: str) -> bool:
252
+ v = (os.environ.get(name) if name in os.environ else default).strip().lower()
253
+ return v in ("1", "true", "yes", "on")
254
+
255
+
256
+ _CPU_PRELOAD_AT_START = _truthy_env("NEAR_MODEL_CPU_PRELOAD_AT_START", "1")
257
+ print(
258
+ f"[NeAR] NEAR_MODEL_CPU_PRELOAD_AT_START={'1' if _CPU_PRELOAD_AT_START else '0'} "
259
+ "(Hunyuan + NeAR weights on CPU at process start; GPU callbacks only .to(cuda) + infer).",
260
+ flush=True,
261
+ )
262
+
263
+
264
  def _default_tone_mapper_choices() -> list[str]:
265
  try:
266
  views = getattr(ToneMapper(), "available_views", None)
 
274
  TONE_MAPPER_CHOICES = _default_tone_mapper_choices()
275
 
276
 
277
  def _get_light_image_preprocessor():
278
  global _light_preprocessor
279
  if _light_preprocessor is not None:
 
283
  from hy3dshape.rembg import BackgroundRemover # pyright: ignore[reportMissingImports]
284
 
285
  _light_preprocessor = BackgroundRemover()
286
+ print("[NeAR] BackgroundRemover ready.", flush=True)
287
  return _light_preprocessor
288
 
289
 
290
  def _preprocess_image_rgba_light(input_image: Image.Image) -> Image.Image:
 
291
  image = _ensure_rgba(input_image)
292
  has_alpha = False
293
  if image.mode == "RGBA":
 
343
  TONE_MAPPER_CHOICES = [str(v) for v in views]
344
 
345
 
346
+ def _ensure_geometry_cpu_locked() -> None:
347
  global GEOMETRY_PIPELINE
348
  if GEOMETRY_PIPELINE is not None:
349
+ return
350
+ hy_id = os.environ.get("NEAR_HUNYUAN_PRETRAINED", "tencent/Hunyuan3D-2.1")
351
+ t0 = time.time()
352
+ print(f"[NeAR] Hunyuan geometry on CPU from {hy_id!r}...", flush=True)
353
+ GEOMETRY_PIPELINE = Hunyuan3DDiTFlowMatchingPipeline.from_pretrained(hy_id, device="cpu")
354
+ print(f"[NeAR] Hunyuan CPU load {time.time() - t0:.1f}s", flush=True)
 
 
 
 
 
 
 
 
 
 
355
 
356
 
357
+ def _ensure_near_cpu_locked() -> None:
358
  global PIPELINE
359
  if PIPELINE is not None:
360
+ return
361
+ near_id = os.environ.get("NEAR_PRETRAINED", "luh0502/NeAR")
362
+ t0 = time.time()
363
+ print(f"[NeAR] NeAR on CPU from {near_id!r}...", flush=True)
364
+ p = NeARImageToRelightable3DPipeline.from_pretrained(near_id)
365
+ p.to("cpu")
366
+ _update_tone_mapper_choices(p.tone_mapper)
367
+ PIPELINE = p
368
+ print(f"[NeAR] NeAR CPU load {time.time() - t0:.1f}s", flush=True)
369
+
370
+
371
+ def ensure_geometry_on_cuda() -> Hunyuan3DDiTFlowMatchingPipeline:
372
+ global _geometry_on_cuda
373
  with _model_lock:
374
+ _ensure_geometry_cpu_locked()
375
+ assert GEOMETRY_PIPELINE is not None
376
+ if torch.cuda.is_available() and not _geometry_on_cuda:
377
+ t0 = time.time()
378
+ GEOMETRY_PIPELINE.to("cuda")
379
+ _geometry_on_cuda = True
380
+ print(f"[NeAR] Hunyuan -> cuda {time.time() - t0:.1f}s", flush=True)
381
+ return GEOMETRY_PIPELINE
 
 
 
 
 
 
 
 
 
 
382
 
383
 
384
+ def ensure_near_on_cuda() -> NeARImageToRelightable3DPipeline:
385
+ global _near_on_cuda
 
 
386
  with _model_lock:
387
+ _ensure_near_cpu_locked()
388
+ assert PIPELINE is not None
389
+ if torch.cuda.is_available() and not _near_on_cuda:
390
+ t0 = time.time()
391
+ PIPELINE.to("cuda")
392
+ _near_on_cuda = True
393
+ print(f"[NeAR] NeAR -> cuda {time.time() - t0:.1f}s", flush=True)
394
+ return PIPELINE
395
+
396
+
397
+ def _preload_models_cpu_worker() -> None:
398
+ try:
399
  t0 = time.time()
400
+ print("[NeAR] background CPU preload start", flush=True)
401
+ with _model_lock:
402
+ _ensure_geometry_cpu_locked()
403
+ _ensure_near_cpu_locked()
404
+ print(f"[NeAR] background CPU preload done {time.time() - t0:.1f}s", flush=True)
405
+ except Exception as exc:
406
+ print(f"[NeAR] background CPU preload failed: {exc}", flush=True)
407
+
408
+
409
+ def start_model_cpu_preload_thread() -> None:
410
+ threading.Thread(
411
+ target=_preload_models_cpu_worker,
412
+ daemon=True,
413
+ name="near-model-cpu-preload",
414
+ ).start()
415
 
416
 
417
  def set_tone_mapper(view_name: str):
418
+ pipeline = ensure_near_on_cuda()
419
  if view_name:
420
  pipeline.setup_tone_mapper(view_name)
421
  return pipeline
 
442
 
443
 
444
  def _ensure_rgba(img: Image.Image) -> Image.Image:
 
445
  if img.mode == "RGBA":
446
  return img
447
  if img.mode == "RGB":
 
466
  )
467
 
468
 
 
 
 
 
469
  @GPU
470
  @torch.inference_mode()
471
  def generate_mesh(
 
473
  req: gr.Request,
474
  progress=gr.Progress(track_tqdm=True),
475
  ):
476
+ geometry_pipeline = ensure_geometry_on_cuda()
 
 
 
477
  session_dir = ensure_session_dir(req)
478
 
479
  if image_input is None:
 
514
  req: gr.Request,
515
  progress=gr.Progress(track_tqdm=True),
516
  ):
517
+ pipeline = ensure_near_on_cuda()
518
  session_dir = ensure_session_dir(req)
519
 
520
  if not asset_state or not asset_state.get("mesh_path"):
 
606
  t0 = time.time()
607
  session_dir = ensure_session_dir(req)
608
  progress(0.1, desc="Loading SLaT and HDRI")
 
609
  pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
 
 
 
 
610
 
611
  progress(0.5, desc="Rendering")
 
612
  views = pipeline.render_view(
613
  slat, hdri_np,
614
  yaw_deg=yaw, pitch_deg=pitch, fov=fov, radius=radius,
615
  hdri_rot_deg=hdri_rot, resolution=int(resolution),
616
  )
 
617
  for key, image in views.items():
618
  image.save(session_dir / f"preview_{key}.png")
619
+ print(f"[NeAR] render_preview {time.time() - t0:.1f}s", flush=True)
 
 
 
 
 
 
620
 
621
  msg = (
622
  f"**Preview done** - "
 
652
  t0 = time.time()
653
  session_dir = ensure_session_dir(req)
654
  progress(0.1, desc="Loading SLaT and HDRI")
 
655
  pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
 
 
 
 
656
 
657
  progress(0.4, desc="Rendering camera path")
 
658
  frames = pipeline.render_camera_path_video(
659
  slat, hdri_np,
660
  num_views=int(num_views), fov=fov, radius=radius,
661
  hdri_rot_deg=hdri_rot, full_video=full_video, shadow_video=shadow_video,
662
  bg_color=(1, 1, 1), verbose=True,
663
  )
 
664
  video_path = session_dir / ("camera_path_full.mp4" if full_video else "camera_path.mp4")
665
  imageio.mimsave(video_path, frames, fps=int(fps))
666
+ print(f"[NeAR] render_camera_video {time.time() - t0:.1f}s", flush=True)
 
 
 
 
 
 
 
667
  return str(video_path), f"**Camera path video saved**"
668
 
669
 
 
687
  t0 = time.time()
688
  session_dir = ensure_session_dir(req)
689
  progress(0.1, desc="Loading SLaT and HDRI")
 
690
  pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
 
 
 
 
691
 
692
  progress(0.4, desc="Rendering HDRI rotation")
 
693
  hdri_roll_frames, render_frames = pipeline.render_hdri_rotation_video(
694
  slat, hdri_np,
695
  num_frames=int(num_frames), yaw_deg=yaw, pitch_deg=pitch,
696
  fov=fov, radius=radius, full_video=full_video, shadow_video=shadow_video,
697
  bg_color=(1, 1, 1), verbose=True,
698
  )
 
699
  hdri_roll_path = session_dir / "hdri_roll.mp4"
700
  render_path = session_dir / ("hdri_rotation_full.mp4" if full_video else "hdri_rotation.mp4")
701
  imageio.mimsave(hdri_roll_path, hdri_roll_frames, fps=int(fps))
702
  imageio.mimsave(render_path, render_frames, fps=int(fps))
703
+ print(f"[NeAR] render_hdri_video {time.time() - t0:.1f}s", flush=True)
 
 
 
 
 
 
 
704
  return str(hdri_roll_path), str(render_path), "**HDRI rotation video saved**"
705
 
706
 
 
715
  req: gr.Request,
716
  progress=gr.Progress(track_tqdm=True),
717
  ):
 
718
  t0 = time.time()
719
  session_dir = ensure_session_dir(req)
720
  progress(0.1, desc="Loading SLaT and HDRI")
 
721
  pipeline, slat, hdri_np = load_asset_and_hdri(asset_state, hdri_file_obj, tone_mapper_name)
 
 
 
 
722
 
723
  progress(0.6, desc="Baking PBR textures")
 
724
  glb = pipeline.export_glb_from_slat(
725
  slat, hdri_np,
726
  hdri_rot_deg=hdri_rot, base_mesh=None,
727
  simplify=simplify, texture_size=int(texture_size), fill_holes=True,
728
  )
 
 
 
 
 
729
  glb_path = session_dir / "near_pbr.glb"
730
  glb.export(glb_path)
731
+ print(f"[NeAR] export_glb {time.time() - t0:.1f}s", flush=True)
 
 
 
 
 
 
732
  return str(glb_path), f"PBR GLB exported: **{glb_path.name}**"
733
 
734
 
 
 
 
735
  CUSTOM_CSS = """
 
736
  .gradio-container { max-width: 100% !important; width: 100% !important; }
737
  main.gradio-container { max-width: 100% !important; }
738
  .gradio-wrap { max-width: 100% !important; }
 
957
  )
958
 
959
 
 
 
 
960
  def build_app() -> gr.Blocks:
961
  with gr.Blocks(
962
  title="NeAR",
 
963
  delete_cache=None,
964
  fill_width=True,
965
  ) as demo:
 
998
 
999
  with gr.Row(equal_height=False):
1000
 
 
 
 
1001
  with gr.Column(scale=1, min_width=360):
1002
 
1003
  with gr.Group():
 
1029
  slat_button = gr.Button(
1030
  "② Generate / Load SLaT", variant="primary", min_width=100,
1031
  )
 
 
 
 
1032
 
1033
  with gr.Group():
1034
  gr.HTML('<p class="section-kicker">HDRI</p>')
 
1060
  with gr.Row():
1061
  clear_button = gr.Button("Clear Cache", variant="secondary", min_width=100)
1062
 
 
 
 
1063
  with gr.Column(scale=10, min_width=560):
1064
 
1065
  status_md = gr.Markdown(
 
1143
  label="HDRI Roll", autoplay=True, loop=True, height=180,
1144
  )
1145
 
 
 
 
 
1146
  with gr.Column(scale=1, min_width=172):
1147
  with gr.Column(visible=True, elem_classes=["sidebar-examples", "img-gallery"]) as col_img_examples:
1148
  if _img_ex:
 
1179
  else:
1180
  gr.Markdown("*No `.exr` examples in `assets/hdris`*")
1181
 
 
1182
  demo.unload(end_session)
1183
 
1184
  source_mode.change(switch_asset_source, inputs=[source_mode], outputs=[source_tabs])
 
1198
  outputs=[hdri_preview, status_md],
1199
  )
1200
 
 
1201
  image_input.upload(
1202
  preprocess_image_only,
1203
  inputs=[image_input],
 
1288
 
1289
  demo.launch = _near_launch # type: ignore[method-assign]
1290
 
1291
+ if _CPU_PRELOAD_AT_START:
1292
+ start_model_cpu_preload_thread()
1293
+
1294
  start_tmp_gradio_pruner()
1295
 
 
 
 
1296
  if __name__ == "__main__":
1297
  import argparse
1298
 
app_gsplat.py CHANGED
@@ -1,10 +1,12 @@
1
  """
2
- Minimal Hugging Face / ZeroGPU probe for gsplat CUDA rasterization only.
3
 
4
- Switch Space entry in README front matter: ``app_file: app_gsplat.py``.
5
- Does not import NeAR, trellis, or hy3dshape.
6
  """
7
 
 
 
8
  import os
9
  import subprocess
10
  import sys
@@ -17,121 +19,53 @@ import numpy as np
17
  import torch
18
 
19
  if not os.environ.get("HF_TOKEN") and not os.environ.get("HUGGING_FACE_HUB_TOKEN"):
20
- _hub_tok = (os.environ.get("near") or os.environ.get("NEAR") or "").strip()
21
- if _hub_tok:
22
- os.environ["HF_TOKEN"] = _hub_tok
23
- print("[GsplatProbe] HF_TOKEN unset; using Space secret 'near' as HF_TOKEN.", flush=True)
24
-
25
- try:
26
- _raw_zerogpu_cap = int(os.environ.get("NEAR_ZEROGPU_HF_CEILING_S", "90"))
27
- except ValueError:
28
- _raw_zerogpu_cap = 90
29
- _ZEROGPU_ENV_CAP_S = min(max(15, _raw_zerogpu_cap), 120)
30
- for _ek in ("NEAR_ZEROGPU_MAX_SECONDS", "NEAR_ZEROGPU_DURATION_CAP"):
31
- if _ek in os.environ:
32
- try:
33
- if int(os.environ[_ek]) > _ZEROGPU_ENV_CAP_S:
34
- os.environ[_ek] = str(_ZEROGPU_ENV_CAP_S)
35
- except ValueError:
36
- pass
37
- print(
38
- f"[GsplatProbe] ZeroGPU cap set to {_ZEROGPU_ENV_CAP_S}s. Callbacks use plain spaces.GPU.",
39
- flush=True,
40
- )
41
 
42
  try:
43
  import spaces # pyright: ignore[reportMissingImports]
44
  except ImportError:
45
  spaces = None
46
 
47
- os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "7.5;8.0;8.6;8.9;9.0")
48
-
49
  GPU = spaces.GPU if spaces is not None else (lambda f: f)
50
 
51
- DEFAULT_PORT = 7860
52
-
53
- _LAST_PROBE_ELAPSED_S: Optional[float] = None
54
-
55
-
56
- def _maybe_reinstall_gsplat_wheel() -> None:
57
- """Force-install a prebuilt wheel (e.g. from ``near-wheels``) before importing gsplat."""
58
- url = (os.environ.get("NEAR_GSPLAT_WHEEL_URL") or "").strip()
59
- if not url:
60
- return
61
- cmd = [
62
- sys.executable,
63
- "-m",
64
- "pip",
65
- "install",
66
- "--no-cache-dir",
67
- "--force-reinstall",
68
- url,
69
- ]
70
- print(f"[GsplatProbe] NEAR_GSPLAT_WHEEL_URL set; pip install: {url}", flush=True)
71
- r = subprocess.run(cmd, check=False)
72
- if r.returncode != 0:
73
- print(
74
- f"[GsplatProbe] WARNING: gsplat wheel install failed (exit {r.returncode}).",
75
- flush=True,
76
- )
77
 
78
 
79
- def _maybe_reinstall_gsplat_from_source() -> None:
80
- spec = (os.environ.get("NEAR_GSPLAT_SOURCE_SPEC") or "").strip()
81
- if not spec:
82
- return
83
- cmd = [
84
- sys.executable,
85
- "-m",
86
- "pip",
87
- "install",
88
- "--no-build-isolation",
89
- "--no-cache-dir",
90
- spec,
91
- ]
92
- print(f"[GsplatProbe] NEAR_GSPLAT_SOURCE_SPEC set; building gsplat via: {' '.join(cmd)}", flush=True)
93
- r = subprocess.run(cmd, check=False)
94
- if r.returncode != 0:
95
- print(
96
- f"[GsplatProbe] WARNING: gsplat install failed (exit {r.returncode}).",
97
- flush=True,
98
- )
99
-
100
-
101
- _maybe_reinstall_gsplat_wheel()
102
- _maybe_reinstall_gsplat_from_source()
103
-
104
-
105
- def _log_gsplat_cuda_backend_status() -> None:
106
- try:
107
- import gsplat.cuda._backend as _gsplat_be # pyright: ignore[reportMissingImports]
108
-
109
- c_mod = getattr(_gsplat_be, "_C", None)
110
- if c_mod is None:
111
- print(
112
- "[GsplatProbe] gsplat CUDA backend _C is None: no usable prebuilt extension and no nvcc. "
113
- "Use a prebuilt wheel (see requirements.txt / NEAR_GSPLAT_WHEEL_URL).",
114
- flush=True,
115
- )
116
- else:
117
- print("[GsplatProbe] gsplat CUDA backend _C loaded.", flush=True)
118
- except Exception as exc:
119
- print(f"[GsplatProbe] gsplat backend probe failed: {exc}", flush=True)
120
 
121
 
122
- _log_gsplat_cuda_backend_status()
123
 
124
 
125
- def _placeholder_preview() -> np.ndarray:
126
  return np.full((64, 64, 3), 48, dtype=np.uint8)
127
 
128
 
129
- def _raster_rgb_ed_once(width: int, height: int) -> tuple[Any, float]:
130
- """One RGB+ED pass matching ``app.py`` / NeAR renderer layout. Returns (render_colors, elapsed_s)."""
131
- from gsplat.rendering import rasterization as _gsplat_rasterization # pyright: ignore[reportMissingImports]
132
 
133
  dev = torch.device("cuda")
134
- t_w = time.time()
135
  n = 1
136
  means = torch.zeros(n, 3, device=dev, dtype=torch.float32)
137
  quats = torch.tensor([[1.0, 0.0, 0.0, 0.0]], device=dev, dtype=torch.float32)
@@ -145,7 +79,7 @@ def _raster_rgb_ed_once(width: int, height: int) -> tuple[Any, float]:
145
  dtype=torch.float32,
146
  )
147
  backgrounds = torch.zeros(1, 9, device=dev, dtype=torch.float32)
148
- render_colors, _render_alphas, _meta = _gsplat_rasterization(
149
  means=means,
150
  quats=quats,
151
  scales=scales,
@@ -164,125 +98,75 @@ def _raster_rgb_ed_once(width: int, height: int) -> tuple[Any, float]:
164
  packed=False,
165
  )
166
  torch.cuda.synchronize()
167
- elapsed = time.time() - t_w
168
- return render_colors, elapsed
169
 
170
 
171
- def _tensor_to_preview_rgb_uint8(render_colors: Any) -> np.ndarray:
172
- """First camera / batch plane as HWC RGB uint8 (aligned with ``gaussian_render.py``)."""
173
  t = render_colors
174
  if t.dim() == 4:
175
- img_hwc = t[0, :, :, :3]
176
  elif t.dim() == 3:
177
- img_hwc = t[:, :, :3]
178
  else:
179
- raise ValueError(f"Unexpected render_colors rank/shape: {tuple(t.shape)}")
180
- arr = img_hwc.detach().float().cpu().clamp(0.0, 1.0).numpy()
181
  return (arr * 255.0).astype(np.uint8)
182
 
183
 
184
  @GPU
185
  @torch.inference_mode()
186
- def run_gsplat_probe(resolution: int):
187
- """Single ZeroGPU task: import path + one RGB+ED raster (JIT on first success)."""
188
- global _LAST_PROBE_ELAPSED_S
189
-
190
- print(
191
- "[GsplatProbe] run_gsplat_probe entered "
192
- f"(cuda_available={torch.cuda.is_available()})",
193
- flush=True,
194
- )
195
  try:
196
- if not torch.cuda.is_available():
197
- return _placeholder_preview(), "CUDA is not available in this callback."
198
-
199
- side = int(max(16, min(512, resolution)))
200
- prev = _LAST_PROBE_ELAPSED_S
201
-
202
- try:
203
- render_colors, elapsed = _raster_rgb_ed_once(side, side)
204
- except Exception:
205
- tb = traceback.format_exc()
206
- print(tb, flush=True)
207
- return _placeholder_preview(), (
208
- "**Rasterization failed.** Full traceback (check Space logs too):\n\n"
209
- f"```\n{tb}\n```"
210
- )
211
 
212
- _LAST_PROBE_ELAPSED_S = elapsed
213
- try:
214
- preview = _tensor_to_preview_rgb_uint8(render_colors)
215
- except Exception:
216
- tb = traceback.format_exc()
217
- print(tb, flush=True)
218
- return _placeholder_preview(), (
219
- f"Raster OK in **{elapsed:.2f}s** but preview unpack failed:\n\n```\n{tb}\n```"
220
- )
221
 
222
- msg = (
223
- f"**RGB+ED** raster **{side}x{side}** finished in **{elapsed:.2f}s** "
224
- f"(cuda synchronized). Same layout as `app.py` warmup."
225
- )
226
- if prev is not None:
227
- msg += f" Previous run: **{prev:.2f}s**."
228
- print(f"[GsplatProbe] probe done in {elapsed:.2f}s", flush=True)
229
- return preview, msg
230
- except BaseException:
231
- tb = traceback.format_exc()
232
- print(tb, flush=True)
233
- return _placeholder_preview(), f"**Unhandled error:**\n\n```\n{tb}\n```"
234
 
235
 
236
  def build_app() -> gr.Blocks:
237
- with gr.Blocks(title="gsplat ZeroGPU Probe", delete_cache=None) as demo:
238
  gr.Markdown(
239
- """
240
- ## gsplat ZeroGPU probe
241
- Isolated **`gsplat.rendering.rasterization`** in **RGB+ED** mode (same tensor layout as NeAR `app.py` warmup).
242
-
243
- - One **Generate** click = one `@spaces.GPU` task (import + JIT + raster).
244
- - Second click usually faster if JIT already compiled.
245
- - **`NEAR_GSPLAT_WHEEL_URL`**: force `pip install --force-reinstall` of a prebuilt wheel before import (e.g. your own `near-wheels` build for torch 2.8).
246
- - **`NEAR_GSPLAT_SOURCE_SPEC`**: optional source build at start (needs nvcc, which is usually not available on the HF builder).
247
- """
248
  )
249
- resolution = gr.Slider(32, 256, value=64, step=16, label="Square resolution (px)")
250
- btn = gr.Button("Run gsplat RGB+ED pass", variant="primary")
251
- out_img = gr.Image(label="RGB preview (first 3 channels)", interactive=False, height=320)
252
- out_md = gr.Markdown("Click **Run** to start.")
253
-
254
- btn.click(
255
- run_gsplat_probe,
256
- inputs=[resolution],
257
- outputs=[out_img, out_md],
258
- )
259
-
260
  return demo
261
 
262
 
263
  demo = build_app()
264
  demo.queue(max_size=4)
265
 
266
-
267
  if __name__ == "__main__":
268
  import argparse
269
 
270
- parser = argparse.ArgumentParser()
271
- parser.add_argument(
272
- "--host",
273
- type=str,
274
- default=os.environ.get("GRADIO_SERVER_NAME", "0.0.0.0"),
275
- )
276
- parser.add_argument(
277
  "--port",
278
  type=int,
279
  default=int(os.environ.get("PORT", os.environ.get("GRADIO_SERVER_PORT", str(DEFAULT_PORT)))),
280
  )
281
- parser.add_argument("--share", action="store_true")
282
- args = parser.parse_args()
283
-
284
- demo.launch(
285
- server_name=args.host,
286
- server_port=args.port,
287
- share=args.share,
288
- )
 
1
  """
2
+ Minimal Gradio app: one gsplat RGB+ED raster pass on CUDA.
3
 
4
+ HF Space: set ``app_file: app_gsplat.py`` in README front matter.
5
+ No NeAR / trellis / hy3dshape imports.
6
  """
7
 
8
+ from __future__ import annotations
9
+
10
  import os
11
  import subprocess
12
  import sys
 
19
  import torch
20
 
21
  if not os.environ.get("HF_TOKEN") and not os.environ.get("HUGGING_FACE_HUB_TOKEN"):
22
+ _t = (os.environ.get("near") or os.environ.get("NEAR") or "").strip()
23
+ if _t:
24
+ os.environ["HF_TOKEN"] = _t
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
25
 
26
  try:
27
  import spaces # pyright: ignore[reportMissingImports]
28
  except ImportError:
29
  spaces = None
30
 
 
 
31
  GPU = spaces.GPU if spaces is not None else (lambda f: f)
32
 
33
+ DEFAULT_PORT = 7863
34
+ _LAST_ELAPSED_S: Optional[float] = None
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
35
 
36
 
37
+ # def _maybe_pip_gsplat_wheel() -> None:
38
+ # url = (os.environ.get("NEAR_GSPLAT_WHEEL_URL") or "").strip()
39
+ # if not url:
40
+ # return
41
+ # cmd = [
42
+ # sys.executable,
43
+ # "-m",
44
+ # "pip",
45
+ # "install",
46
+ # "--no-cache-dir",
47
+ # "--no-deps",
48
+ # "--force-reinstall",
49
+ # url,
50
+ # ]
51
+ # print(f"[gsplat-app] pip install wheel: {url}", flush=True)
52
+ # r = subprocess.run(cmd, check=False)
53
+ # if r.returncode != 0:
54
+ # print(f"[gsplat-app] wheel install failed (exit {r.returncode})", flush=True)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
55
 
56
 
57
+ # _maybe_pip_gsplat_wheel()
58
 
59
 
60
+ def _gray_preview() -> np.ndarray:
61
  return np.full((64, 64, 3), 48, dtype=np.uint8)
62
 
63
 
64
+ def _raster_rgb_ed(width: int, height: int) -> tuple[Any, float]:
65
+ from gsplat.rendering import rasterization as rasterize # pyright: ignore[reportMissingImports]
 
66
 
67
  dev = torch.device("cuda")
68
+ t0 = time.time()
69
  n = 1
70
  means = torch.zeros(n, 3, device=dev, dtype=torch.float32)
71
  quats = torch.tensor([[1.0, 0.0, 0.0, 0.0]], device=dev, dtype=torch.float32)
 
79
  dtype=torch.float32,
80
  )
81
  backgrounds = torch.zeros(1, 9, device=dev, dtype=torch.float32)
82
+ render_colors, _, _ = rasterize(
83
  means=means,
84
  quats=quats,
85
  scales=scales,
 
98
  packed=False,
99
  )
100
  torch.cuda.synchronize()
101
+ return render_colors, time.time() - t0
 
102
 
103
 
104
+ def _to_rgb_u8(render_colors: Any) -> np.ndarray:
 
105
  t = render_colors
106
  if t.dim() == 4:
107
+ img = t[0, :, :, :3]
108
  elif t.dim() == 3:
109
+ img = t[:, :, :3]
110
  else:
111
+ raise ValueError(f"bad render_colors shape: {tuple(t.shape)}")
112
+ arr = img.detach().float().cpu().clamp(0.0, 1.0).numpy()
113
  return (arr * 255.0).astype(np.uint8)
114
 
115
 
116
  @GPU
117
  @torch.inference_mode()
118
+ def run_once(resolution: int):
119
+ """Single GPU task: gsplat RGB+ED raster (matches NeAR warmup layout)."""
120
+ global _LAST_ELAPSED_S
121
+ if not torch.cuda.is_available():
122
+ return _gray_preview(), "CUDA not available."
123
+
124
+ side = int(max(16, min(512, resolution)))
125
+ prev = _LAST_ELAPSED_S
 
126
  try:
127
+ render_colors, elapsed = _raster_rgb_ed(side, side)
128
+ except Exception:
129
+ return _gray_preview(), f"Raster failed:\n```\n{traceback.format_exc()}\n```"
 
 
 
 
 
 
 
 
 
 
 
 
130
 
131
+ _LAST_ELAPSED_S = elapsed
132
+ try:
133
+ preview = _to_rgb_u8(render_colors)
134
+ except Exception:
135
+ return _gray_preview(), f"Raster {elapsed:.2f}s but preview failed:\n```\n{traceback.format_exc()}\n```"
 
 
 
 
136
 
137
+ msg = f"**{side}x{side}** in **{elapsed:.2f}s** (RGB+ED, cuda sync)."
138
+ if prev is not None:
139
+ msg += f" Previous: **{prev:.2f}s**."
140
+ return preview, msg
 
 
 
 
 
 
 
 
141
 
142
 
143
  def build_app() -> gr.Blocks:
144
+ with gr.Blocks(title="gsplat probe") as demo:
145
  gr.Markdown(
146
+ "One **Run** = one `gsplat.rendering.rasterization` pass (RGB+ED). "
147
+ "Optional env: **`NEAR_GSPLAT_WHEEL_URL`** = `/resolve/...whl` (installed with `--no-deps`)."
 
 
 
 
 
 
 
148
  )
149
+ res = gr.Slider(32, 256, value=64, step=16, label="Size (px)")
150
+ go = gr.Button("Run", variant="primary")
151
+ img = gr.Image(label="RGB", interactive=False, height=320)
152
+ md = gr.Markdown("β€”")
153
+ go.click(run_once, [res], [img, md])
 
 
 
 
 
 
154
  return demo
155
 
156
 
157
  demo = build_app()
158
  demo.queue(max_size=4)
159
 
 
160
  if __name__ == "__main__":
161
  import argparse
162
 
163
+ p = argparse.ArgumentParser()
164
+ p.add_argument("--host", default=os.environ.get("GRADIO_SERVER_NAME", "0.0.0.0"))
165
+ p.add_argument(
 
 
 
 
166
  "--port",
167
  type=int,
168
  default=int(os.environ.get("PORT", os.environ.get("GRADIO_SERVER_PORT", str(DEFAULT_PORT)))),
169
  )
170
+ p.add_argument("--share", action="store_true")
171
+ a = p.parse_args()
172
+ demo.launch(server_name=a.host, server_port=a.port, share=a.share)
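The commented-out `_maybe_pip_gsplat_wheel` helper above builds a pip command from `NEAR_GSPLAT_WHEEL_URL`. As a hedged sketch, the argv construction can be separated from the subprocess call so it is easy to check; `build_wheel_install_cmd` is a hypothetical name, and the note on `--no-deps` is an assumption about intent (with `--force-reinstall`, pip would otherwise reinstall the wheel's dependencies as well).

```python
import sys
from typing import List, Optional


def build_wheel_install_cmd(url: Optional[str]) -> Optional[List[str]]:
    """Return the pip argv for a wheel URL, or None when no URL is configured."""
    url = (url or "").strip()
    if not url:
        return None
    return [
        sys.executable, "-m", "pip", "install",
        "--no-cache-dir",
        "--no-deps",          # assumption: keep the already-pinned torch untouched
        "--force-reinstall",
        url,
    ]
```

A caller would pass `os.environ.get("NEAR_GSPLAT_WHEEL_URL")` and hand a non-`None` result to `subprocess.run(cmd, check=False)`, as the commented-out code does.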
 
 
 
 
 
requirements.txt CHANGED
@@ -1,23 +1,16 @@
1
- # PyTorch + CUDA 12.8 (aligned with NeAR setup.sh: torch 2.8 + cu128 + matching xformers).
2
- # nvdiffrast: custom cp310 Linux wheel (no nvcc on HF builder), ABI-matched to torch 2.8 + cu128 (see wheel URL below).
3
- # pip must install via https://.../resolve/.../file.whl, not /blob/; see DEPLOY_HF_SPACE.md.
4
- --extra-index-url https://download.pytorch.org/whl/cu128
5
- torch==2.8.0
6
- torchvision==0.23.0
7
- torchaudio==2.8.0
8
- xformers==0.0.32.post2
9
 
10
  huggingface_hub>=0.26.0
11
- # Runtime HF mirrors:
12
- # - `luh0502/near-wheels` stores prebuilt binary wheels referenced directly below.
13
- # - `luh0502/near-assets` stores torch.hub-compatible auxiliary assets (for example a mirrored DINOv2 repo)
14
- # resolved at runtime via `huggingface_hub`, not via `requirements.txt`.
15
  gradio[oauth,mcp]==6.9.0
16
  spaces
17
  websockets>=10.4
18
  simple_ocio
19
 
20
- --find-links https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.8.0_cu128.html
21
  kaolin
22
 
23
  # NeAR demo / inference (see NeAR setup.sh --basic --demo)
@@ -39,12 +32,7 @@ pymeshfix
39
  igraph
40
  transformers==4.57.6
41
  pyexr
42
- # gsplat: PyPI publishes only `py3-none-any` (no CUDA). HF Gradio builder has no nvcc, so JIT cannot build `_C`.
43
- # Official prebuilt Linux cp310 wheels (torch/CUDA-tagged) are listed at https://docs.gsplat.studio/whl/gsplat/
44
- # Below: pt2.4+cu124 wheel. CUDA runtime is usually compatible with cu128 drivers; PyTorch ABI vs 2.8 may still break
45
- # (then build a cp310 manylinux wheel against torch 2.8/cu128 and host on `near-wheels`, install via NEAR_GSPLAT_WHEEL_URL).
46
- gsplat @ https://github.com/nerfstudio-project/gsplat/releases/download/v1.5.3/gsplat-1.5.3%2Bpt24cu124-cp310-cp310-linux_x86_64.whl ; python_version == "3.10" and sys_platform == "linux" and platform_machine == "x86_64"
47
- gsplat==1.5.3 ; python_version != "3.10" or sys_platform != "linux" or platform_machine != "x86_64"
48
  pyyaml
49
 
50
  # hy3dshape (vendored Hunyuan3D-2.1)
@@ -69,4 +57,15 @@ spconv-cu120
69
  git+https://github.com/EasternJournalist/utils3d.git@9a4eb15e4021b67b12c460c7057d642626897ec8
70
 
71
  # nvdiffrast: custom wheel (torch 2.8/cu128 ABI) β€” pip needs /resolve/, not /blob/
72
- https://huggingface.co/luh0502/near-wheels/resolve/main/nvdiffrast-0.4.0-cp310-cp310-linux_x86_64.whl
 
 
 
 
 
 
 
 
 
 
 
 
1
+ --extra-index-url https://download.pytorch.org/whl/cu124
2
+ torch==2.4.0
3
+ torchvision==0.19.0
4
+ torchaudio==2.4.0
5
+ xformers==0.0.27.post2
 
 
 
6
 
7
  huggingface_hub>=0.26.0
 
 
 
 
8
  gradio[oauth,mcp]==6.9.0
9
  spaces
10
  websockets>=10.4
11
  simple_ocio
12
 
13
+ --find-links https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.4.0_cu124.html
14
  kaolin
15
 
16
  # NeAR demo / inference (see NeAR setup.sh --basic --demo)
 
32
  igraph
33
  transformers==4.57.6
34
  pyexr
35
+ # gsplat
 
 
 
 
 
36
  pyyaml
37
 
38
  # hy3dshape (vendored Hunyuan3D-2.1)
 
57
  git+https://github.com/EasternJournalist/utils3d.git@9a4eb15e4021b67b12c460c7057d642626897ec8
58
 
59
  # nvdiffrast: custom wheel (torch 2.8/cu128 ABI) β€” pip needs /resolve/, not /blob/
60
+ # https://huggingface.co/luh0502/near-wheels/resolve/main/nvdiffrast-0.4.0-cp310-cp310-linux_x86_64.whl
61
+ https://huggingface.co/spaces/JeffreyXiang/TRELLIS/resolve/main/wheels/nvdiffrast-0.3.3-cp310-cp310-linux_x86_64.whl?download=true
62
+
63
+ # gsplat==1.5.3
64
+ # https://huggingface.co/luh0502/near-wheels/resolve/main/gsplat-1.5.3-cp310-cp310-linux_x86_64.whl
65
+ https://huggingface.co/luh0502/near-wheels/resolve/main/gsplat-1.5.3+pt24cu124-cp310-cp310-linux_x86_64.whl
66
+ # https://docs.gsplat.studio/whl/gsplat/gsplat-1.5.3+pt24cu124-cp310-cp310-linux_x86_64.whl
67
+
68
+
69
+
70
+
71
+
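The requirements change pins torch 2.4.0 on the cu124 index and installs a gsplat wheel whose filename carries a `+pt24cu124` local version tag. A small sketch of how such a tag could be checked against the pins is shown below. The helper names are hypothetical, and the parsing assumes the `+pt<major><minor>cu<major><minor>` convention seen in the gsplat filename, with a single-digit torch major version and a single-digit CUDA minor version; wheels without such a tag (the nvdiffrast wheel, for instance) simply return `None`.

```python
import re


def wheel_torch_cuda_tag(wheel_name):
    """Return (torch_version, cuda_version) from a '+ptXYcuZZZ' tag, or None if absent."""
    name = wheel_name.replace("%2B", "+")  # URLs may percent-encode the '+'
    m = re.search(r"\+pt(\d)(\d+)cu(\d+)(\d)(?!\d)", name)
    if m is None:
        return None
    return f"{m.group(1)}.{m.group(2)}", f"{m.group(3)}.{m.group(4)}"


def matches_pins(wheel_name, torch_pin, cuda_index):
    """Check a wheel tag against a torch pin like '2.4.0' and an index name like 'cu124'."""
    tag = wheel_torch_cuda_tag(wheel_name)
    if tag is None or not cuda_index.startswith("cu"):
        return False
    torch_major_minor = ".".join(torch_pin.split(".")[:2])
    digits = cuda_index[2:]
    cuda_version = f"{digits[:-1]}.{digits[-1]}"
    return tag == (torch_major_minor, cuda_version)


GSPLAT_WHEEL = "gsplat-1.5.3+pt24cu124-cp310-cp310-linux_x86_64.whl"

if __name__ == "__main__":
    print(wheel_torch_cuda_tag(GSPLAT_WHEEL))
    print(matches_pins(GSPLAT_WHEEL, "2.4.0", "cu124"))
```

A check like this only compares filename tags. It says nothing about whether the binary actually loads against the installed torch build.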
tests/test_app_architecture.py CHANGED
@@ -48,14 +48,20 @@ class AppArchitectureTests(unittest.TestCase):
          generate_mesh = _get_function(_load_tree(), "generate_mesh")
          called = _called_names(generate_mesh)

-         self.assertIn("ensure_geometry_pipeline", called)
-         self.assertNotIn("ensure_near_pipeline", called)
+         self.assertIn("ensure_geometry_on_cuda", called)
+         self.assertNotIn("ensure_near_on_cuda", called)

-     def test_render_preview_warms_up_gsplat_only_on_render_path(self) -> None:
+     def test_render_preview_calls_pipeline_render_view(self) -> None:
          render_preview = _get_function(_load_tree(), "render_preview")
          called = _called_names(render_preview)

-         self.assertIn("ensure_gsplat_ready", called)
+         self.assertIn("render_view", called)
+
+     def test_generate_slat_uses_near_cuda_loader(self) -> None:
+         generate_slat = _get_function(_load_tree(), "generate_slat")
+         called = _called_names(generate_slat)
+
+         self.assertIn("ensure_near_on_cuda", called)


  if __name__ == "__main__":
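These tests rely on `_load_tree`, `_get_function` and `_called_names`, which are defined outside the hunk. A rough sketch of how such helpers might work with the standard `ast` module is given below. It is an assumption about their behaviour, not the repository's code: the sample source is invented, the helpers are renamed without the leading underscore, and attribute calls such as `pipe.run(...)` are recorded by attribute name, which would be consistent with the tests asserting on `render_view`.

```python
import ast

# Invented sample standing in for app.py; the real file is not shown in the diff.
SAMPLE = '''
def generate_mesh(image):
    pipe = ensure_geometry_on_cuda()
    return pipe.run(image)
'''


def get_function(tree, name):
    """Find a (possibly nested) function definition by name."""
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            return node
    raise LookupError(name)


def called_names(func):
    """Collect the names of everything called inside a function body."""
    names = set()
    for node in ast.walk(func):
        if isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name):         # plain call: f(...)
                names.add(node.func.id)
            elif isinstance(node.func, ast.Attribute):  # method call: obj.f(...)
                names.add(node.func.attr)
    return names


called = called_names(get_function(ast.parse(SAMPLE), "generate_mesh"))
```

Because the check is purely syntactic, it never imports the app, so it can run on a machine without CUDA or the heavy model dependencies.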
tests/test_app_gsplat_architecture.py CHANGED
@@ -49,17 +49,17 @@ class AppGsplatArchitectureTests(unittest.TestCase):
          )

      def test_raster_helper_calls_gsplat_rasterization(self) -> None:
-         raster = _get_function(_load_tree(), "_raster_rgb_ed_once")
+         raster = _get_function(_load_tree(), "_raster_rgb_ed")
          called = _called_names(raster)

-         self.assertIn("_gsplat_rasterization", called)
+         self.assertIn("rasterize", called)

      def test_run_probe_is_gpu_decorated_and_logs_entry(self) -> None:
          source = APP_PATH.read_text(encoding="utf-8")

          self.assertIn("@GPU", source)
-         self.assertIn("def run_gsplat_probe", source)
-         self.assertIn("[GsplatProbe] run_gsplat_probe entered", source)
+         self.assertIn("def run_once", source)
+         self.assertIn("torch.cuda.is_available()", source)


  if __name__ == "__main__":
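The probe test above checks for `@GPU` and `def run_once` as independent substrings, so it would still pass if the decorator sat on a different function. One possible tightening, sketched here as a suggestion rather than as what the repository does, is to read the decorator list from the AST. The sample source and the `is_decorated` helper are hypothetical.

```python
import ast

# Invented sample; in the real test the source would come from APP_PATH.read_text().
SAMPLE = '''
@GPU
def run_once():
    return torch.cuda.is_available()

def helper():
    return 1
'''


def decorator_names(func):
    """Names of decorators on a function, covering @name, @mod.name and @name(...)."""
    names = set()
    for dec in func.decorator_list:
        target = dec.func if isinstance(dec, ast.Call) else dec
        if isinstance(target, ast.Name):
            names.add(target.id)
        elif isinstance(target, ast.Attribute):
            names.add(target.attr)
    return names


def is_decorated(source, func_name, decorator):
    """True if `func_name` exists in `source` and carries `decorator`."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == func_name:
            return decorator in decorator_names(node)
    return False
```

In a test this could read `self.assertTrue(is_decorated(source, "run_once", "GPU"))`, which ties the decorator to the specific callback.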