seriffic and Claude Opus 4.7 (1M context) committed
Commit 89c4f83 (parent: d43cf2b)

feat: enable EO specialists via remote ML on HF Space

Prithvi-NYC-Pluvial, TerraMind LULC, TerraMind Buildings, and the
eo_chip fetch were all returning 'deps unavailable on this
deployment' on the HF Space because their deps gates required
terratorch (which doesn't fit the HF build sandbox per the four
prior failed attempts documented in the Dockerfile). But these
specialists already have remote-inference paths to the AMD MI300X
via app/inference.py; terratorch is only needed for *local*
inference, not for the chip-fetch + remote-call path that the HF
Space actually uses.

Refactor:

- Add planetary-computer, pystac-client, rioxarray, xarray, einops
to requirements.txt. These are pure-Python or thin numpy/rasterio
wrappers (~10 MB combined), small enough to fit the HF build
budget that terratorch's 250 MB transitive dep cone overflows.

- prithvi_live._has_required_deps splits into two tiers: chip-fetch
deps (Tier 1, always required) and local-inference deps (Tier 2,
only required when remote inference is unavailable). When remote
is configured (RIPRAP_ML_BACKEND=remote + RIPRAP_ML_BASE_URL set,
which is the default on HF), Tier 2 is skipped and the specialist
runs via the existing app.inference.prithvi_pluvial() path.
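
  A standalone sketch of that two-tier gate (illustrative only: `remote_enabled` here is a stand-in that reads the env vars named above, not the real app.inference helper):

  ```python
  import importlib
  import os


  def _has_module(name: str) -> bool:
      try:
          importlib.import_module(name)
          return True
      except ImportError:
          return False


  def remote_enabled() -> bool:
      # Stand-in for app.inference.remote_enabled(); assumes the env-var
      # convention above (RIPRAP_ML_BACKEND=remote + RIPRAP_ML_BASE_URL set).
      return (os.environ.get("RIPRAP_ML_BACKEND") == "remote"
              and bool(os.environ.get("RIPRAP_ML_BASE_URL")))


  def has_required_deps(chip_deps, local_deps):
      # Tier 1: chip-fetch deps are always required.
      missing = [n for n in chip_deps if not _has_module(n)]
      if missing:
          return False, ", ".join(missing)
      # Tier 2: local-inference deps only matter when remote is off.
      if remote_enabled():
          return True, None
      missing = [n for n in local_deps if not _has_module(n)]
      if missing:
          return False, ", ".join(missing) + " (local inference)"
      return True, None
  ```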

terramind_lulc and terramind_buildings inherit the fix transparently:
they consume the upstream eo_chip, and terramind_nyc._run already
tries remote first before checking local deps.
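
The remote-first shape those specialists share can be sketched as follows; the function and parameter names are hypothetical, and only the try-remote-then-gate-on-local-deps ordering comes from this commit:

```python
def run_specialist(chip, remote_call, local_deps_ok, local_call):
    # Try remote inference first; only gate on local deps if remote fails.
    try:
        return remote_call(chip)
    except Exception:
        pass
    if not local_deps_ok():
        return {"status": "skipped", "reason": "deps unavailable"}
    return local_call(chip)
```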

terramind_synthesis is left unchanged β€” it has no remote path
(DEM-driven LULC synthesis is not in services/riprap-models/main.py)
and continues to skip cleanly on HF.

Verified locally with RIPRAP_ML_BACKEND=remote: prithvi_live and
eo_chip_cache report _DEPS_OK=True.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

app/context/terramind_synthesis.py CHANGED

@@ -75,15 +75,14 @@ LULC_CLASSES = [
 
 
 def _has_required_deps() -> tuple[bool, str | None]:
-    """Probe deps. Distinguishes a *truly missing* package
-    (ModuleNotFoundError) from a *transient race* (other ImportError —
-    typically sklearn's "partially initialized module" from concurrent
-    imports inside the parallel-fanout block).
-
-    Truly missing returns (False, names). Transient race returns
-    (True, None) — let the caller try again, the import will resolve
-    on the next attempt once the racing thread finishes.
-    """
+    """Probe deps. terramind_synthesis runs only locally (no remote path
+    in app/inference.py for DEM-driven synthesis), so it always needs
+    terratorch. On the HF Space terratorch isn't installed, so this
+    specialist returns a clean `skipped: deps unavailable` outcome.
+
+    Distinguishes a *truly missing* package (ModuleNotFoundError) from
+    a *transient race* (other ImportError — typically sklearn's
+    "partially initialized module" from concurrent imports)."""
     missing = []
     for name in ("terratorch", "rasterio"):
         try:
@@ -91,8 +90,6 @@ def _has_required_deps() -> tuple[bool, str | None]:
         except ModuleNotFoundError:
             missing.append(name)
         except ImportError:
-            # sklearn-style partial-init race; treat as available and
-            # let _ensure_model retry. Logged but not surfaced as missing.
            log.debug("terramind: import race on %s, will retry on demand", name)
     return (not missing, ", ".join(missing) if missing else None)
 
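The except-clause ordering this gate depends on works because ModuleNotFoundError is a subclass of ImportError, so the narrower handler must come first. A minimal illustration:

```python
def probe(name: str) -> str:
    """Classify an import the way the deps gate does: a truly missing
    package raises ModuleNotFoundError, while any other ImportError
    (e.g. a partially initialized module during a concurrent import)
    is treated as transient and retried later."""
    try:
        __import__(name)
        return "ok"
    except ModuleNotFoundError:
        return "missing"   # surfaced in the `skipped` outcome
    except ImportError:
        return "race"      # treated as available; retried on demand
```
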
app/flood_layers/prithvi_live.py CHANGED

@@ -63,27 +63,48 @@ _INIT_LOCK = threading.Lock()  # serializes lazy load if multiple threads
 
 
 def _has_required_deps() -> tuple[bool, str | None]:
-    """Heavy-EO deps (terratorch / planetary_computer / rioxarray /
-    pystac-client / xarray / einops) live in requirements-experiments.txt
-    only — they don't fit Riprap's HF Spaces' Py3.10 dep cone alongside
-    transformers<5 / hf_hub<1 / granite-tsfm<0.3.4 / mellea<0.4.
-
-    Probe each importable name once at module load. If any are missing,
-    fetch() returns a clean `skipped: deps_unavailable` outcome instead
-    of crashing with a noisy ModuleNotFoundError in the trace. Local
-    dev + AMD path have these installed and the specialist runs."""
-    missing = []
-    for name in ("terratorch", "planetary_computer", "pystac_client",
-                 "rioxarray", "xarray", "einops"):
-        try:
-            __import__(name)
-        except ImportError:
-            missing.append(name)
+    """Probe deps in two tiers.
+
+    Tier 1 — chip fetching (planetary_computer / pystac_client / rioxarray
+    / xarray / einops) is always required: prithvi_live always pulls a
+    Sentinel-2 chip from Microsoft Planetary Computer regardless of where
+    inference runs.
+
+    Tier 2 — local inference (terratorch) is only required when remote
+    inference is unavailable. On the HF Space we have remote inference
+    on the AMD MI300X via app/inference.py, so terratorch is not needed
+    even though chip-fetch is.
+
+    Returns (False, missing) if any required dep is missing. Splitting
+    the gate this way lets the HF Space deployment fetch chips and run
+    remote inference even though it doesn't fit terratorch's transitive
+    dep cone (~250 MB) in the HF build sandbox."""
+    chip_deps = ("planetary_computer", "pystac_client",
+                 "rioxarray", "xarray", "einops")
+    missing = [n for n in chip_deps
+               if not _has_module(n)]
     if missing:
         return False, ", ".join(missing)
+    # Tier 2: only need terratorch if we'd run inference locally.
+    try:
+        from app import inference as _inf
+        if _inf.remote_enabled():
+            return True, None
+    except Exception:
+        pass
+    if not _has_module("terratorch"):
+        return False, "terratorch (local inference)"
     return True, None
 
 
+def _has_module(name: str) -> bool:
+    try:
+        __import__(name)
+        return True
+    except ImportError:
+        return False
+
+
 _DEPS_OK, _DEPS_MISSING = _has_required_deps()
 
requirements.txt CHANGED

@@ -74,10 +74,21 @@ gliner>=0.2.13
 # 30-second pip budget.
 #
 # On HF Spaces the lazy-import path returns clean `skipped: deps
-# unavailable on this deployment` for both prithvi_live and
-# terramind_synthesis steps; the other 14 specialists run normally.
-#   - terratorch / torchgeo / pystac-client / planetary-computer
-#   - rioxarray / xarray / einops
+# unavailable on this deployment` for terramind_synthesis (which has
+# no remote-inference path); the other EO specialists (prithvi_live,
+# terramind_lulc, terramind_buildings) work via app/inference.py
+# routing to the AMD MI300X droplet, provided we have the chip-fetch
+# deps below — they're small (pure-Python or thin wrappers around
+# numpy/rasterio which we already have) and don't pull terratorch or
+# torchvision binaries.
+#   - planetary-computer / pystac-client: STAC search at Microsoft PC
+#   - rioxarray / xarray: COG band reads
+#   - einops: tensor reshape used by prithvi_live._build_chip
+planetary-computer>=1.0
+pystac-client>=0.7
+rioxarray>=0.15
+xarray>=2024.1
+einops>=0.7
 
 # Burr FSM
 burr>=0.40