Stones C3: add TerraMind-NYC adapter wrapper (lulc + buildings)
Adds app/context/terramind_nyc.py wrapping the Apache-2.0 LoRA family
published at msradam/TerraMind-NYC-Adapters. Exposes two specialist
entry points consumable by the FSM:
lulc(s2l2a, s1rtc, dem) -> 5-class macro NYC land-cover
buildings(s2l2a, s1rtc, dem) -> binary NYC building-footprint
Per-call shape:
{ok, n_pixels, shape, class_fractions, dominant_class, dominant_pct,
adapter, repo, elapsed_s} (lulc)
{ok, n_pixels, shape, pct_buildings, n_building_components,
class_labels, adapter, repo, elapsed_s} (buildings)
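For downstream consumers, a minimal sketch of a successful lulc() result and how an FSM step might render it. The field values are illustrative only, and briefing_line is a hypothetical helper, not part of the module:

```python
# Illustrative lulc() result — values are made up; only the field
# shape matches the contract documented above.
lulc_result = {
    "ok": True,
    "n_pixels": 200704,
    "shape": [448, 448],
    "class_fractions": {"Built / impervious": 61.4,
                        "Trees / vegetation": 22.1,
                        "Water": 16.5},
    "dominant_class": "Built / impervious",
    "dominant_pct": 61.4,
    "adapter": "lulc",
    "repo": "msradam/TerraMind-NYC-Adapters",
    "elapsed_s": 4.2,
}

def briefing_line(res: dict) -> str:
    """Turn a specialist result into one line of briefing text,
    falling back to the skip/error reason when ok is False."""
    if not res.get("ok"):
        return f"lulc: unavailable ({res.get('skipped') or res.get('err')})"
    return (f"lulc: {res['dominant_class']} dominates "
            f"({res['dominant_pct']:.1f}% of {res['n_pixels']} px)")

print(briefing_line(lulc_result))
# -> lulc: Built / impervious dominates (61.4% of 200704 px)
```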
Lazy load — adapter weights pull from HF Hub on first call and cache
to ~/.cache/huggingface/hub. Base TerraMind 1.0 weights are downloaded
by terratorch's EncoderDecoderFactory, not redistributed by us.
CHIP-SIZE TRAP. TerraMind's positional embeddings don't generalise off
224x224. Inference goes through terratorch.tasks.tiled_inference with
a 224x224 / 128 stride window so chip sizes != 224 are handled
correctly — calling task.model({...}) directly on a larger chip
returns silent garbage. The dict-of-modalities input form requires
terratorch at or above the version pinned in this repo.
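To see why a 224x224 window with stride 128 covers arbitrary chip sizes, here is the window-origin arithmetic in isolation. This is a sketch of the sliding-window idea only; terratorch's tiled_inference additionally handles padding and averaging of overlapping logits:

```python
def window_origins(size: int, crop: int = 224, stride: int = 128) -> list[int]:
    """Top-left offsets of the sliding crops along one axis, with the
    final window clamped so it still ends inside the chip."""
    if size <= crop:
        return [0]
    origins = list(range(0, size - crop + 1, stride))
    if origins[-1] != size - crop:
        origins.append(size - crop)  # clamp a final window to the trailing edge
    return origins

# A 448x448 chip is covered by a 3x3 grid of overlapping 224x224 windows.
print(window_origins(448))  # -> [0, 128, 224]
```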
Gated by RIPRAP_TERRAMIND_NYC_ENABLE (default 1). Deployments without
terratorch / peft / safetensors / huggingface_hub installed silently
no-op via the same skipped-result shape every other heavy specialist
emits — no noisy ModuleNotFoundError in the FSM trace.
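The gate can be illustrated in miniature. Module names below are stand-ins (the real check probes terratorch / peft / safetensors / huggingface_hub), but the skipped-result shape matches the contract described above:

```python
def _probe(names: tuple[str, ...]) -> list[str]:
    """Return the subset of modules that fail to import."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing

def gate(env_flag: str = "1", deps: tuple[str, ...] = ("json", "math")) -> dict:
    """Skipped-result shape shared by the heavy specialists: the FSM
    sees {ok: False, skipped: reason} instead of an exception."""
    if env_flag.lower() not in ("1", "true", "yes"):
        return {"ok": False, "skipped": "RIPRAP_TERRAMIND_NYC_ENABLE=0"}
    missing = _probe(deps)
    if missing:
        return {"ok": False,
                "skipped": f"deps unavailable on this deployment: {', '.join(missing)}"}
    return {"ok": True}
```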
C4 wires this into the FSM as step_terramind_lulc and
step_terramind_buildings; the chip-cache (so a single S2L2A+S1RTC+DEM
fetch per query feeds both calls) lives in C4.
tests/test_terramind_nyc.py exercises the gate paths, public API,
and result-dict shapes without downloading any weights.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- app/context/terramind_nyc.py +360 -0
- tests/test_terramind_nyc.py +112 -0
@@ -0,0 +1,360 @@
"""TerraMind-NYC adapters — LULC and Buildings inference for NYC chips.

Wraps the Apache-2.0 [`msradam/TerraMind-NYC-Adapters`](https://huggingface.co/msradam/TerraMind-NYC-Adapters)
LoRA family fine-tuned on NYC EO chips (Sentinel-2 L2A + Sentinel-1 RTC
+ Copernicus DEM, temporal stack of 4) on AMD MI300X via AMD Developer
Cloud. Exposes two specialist entry points:

    lulc(s2l2a, s1rtc, dem)      -> 5-class macro NYC LULC mask
    buildings(s2l2a, s1rtc, dem) -> binary NYC building footprint mask

The base TerraMind 1.0 weights are downloaded by terratorch on first
call; the LoRA adapter + UNet decoder weights come from the HF repo and
are cached to `~/.cache/huggingface/hub`.

CHIP-SIZE TRAP. TerraMind's positional embeddings don't generalise off
its training resolution (224×224). Calling `task.model({...})` on a
chip ≠ 224×224 produces silent garbage. We therefore wrap inference
with `terratorch.tasks.tiled_inference.tiled_inference`, which slides
a 224×224 crop window across the chip and stitches per-window logits.
This matches the patch in
`experiments/18_terramind_nyc_lora/shared/inference_ensemble.py` that
the plan flags as required for production.

Gated by RIPRAP_TERRAMIND_NYC_ENABLE — deployments without the deps
installed (HF Spaces' Py3.10 cone, plain Ollama dev VMs) silently no-op
through the same skipped-result shape every other heavy specialist
emits.

This module does NOT fetch its own S2/S1/DEM chips. C4 wires it into
the FSM with a shared chip cache so the LULC and Buildings calls
don't each refetch ~150 MB of imagery.
"""
from __future__ import annotations

import logging
import os
import threading
import time
from typing import Any

log = logging.getLogger("riprap.terramind_nyc")

ENABLE = os.environ.get("RIPRAP_TERRAMIND_NYC_ENABLE", "1").lower() in ("1", "true", "yes")
DEVICE = os.environ.get("RIPRAP_TERRAMIND_NYC_DEVICE", "cpu")
ADAPTERS_REPO = "msradam/TerraMind-NYC-Adapters"

# Per-task config knobs the HF README's quick-start fixes for these
# adapters. Mirrored from experiments/18_terramind_nyc_lora/adapters/*/
# config.yaml so a single source of truth lives next to the inference
# code rather than being scraped from YAML at runtime.
ADAPTER_SPECS: dict[str, dict[str, Any]] = {
    "lulc": {
        "subdir": "lulc_nyc",
        "num_classes": 5,
        "class_labels": [
            "Trees / vegetation",
            "Cropland",
            "Built / impervious",
            "Bare ground",
            "Water",
        ],
    },
    "buildings": {
        "subdir": "buildings_nyc",
        "num_classes": 2,
        # The decoder emits class 0 = background, class 1 = building.
        "class_labels": ["Background", "Building footprint"],
    },
}

# Tile-window size — TerraMind's training resolution. Stride < window
# yields overlap (smooths seams from window-boundary classification
# noise); 96 px overlap matches the experiments/18 ensemble.
TILE_SIZE = 224
TILE_STRIDE = 128

# One-shot lazy-init guards. The base TerraMind weights are heavy
# (~1.6 GB) and we want to load them once across LULC and Buildings.
_INIT_LOCK = threading.Lock()
_BASE_LOADED = False
_ADAPTERS: dict[str, Any] = {}  # name -> built terratorch task on DEVICE


def _has_required_deps() -> tuple[bool, str | None]:
    """Probe the heavy-EO deps. Same shape as prithvi_live's check —
    a missing dep (terratorch / peft / safetensors / hf_hub) returns a
    clean `skipped: deps_unavailable` outcome instead of a noisy
    ModuleNotFoundError in the trace."""
    missing: list[str] = []
    for name in ("terratorch", "peft", "safetensors", "huggingface_hub",
                 "torch", "yaml"):
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    if missing:
        return False, ", ".join(missing)
    return True, None


_DEPS_OK, _DEPS_MISSING = _has_required_deps()


def _ensure_adapter(adapter_name: str):
    """Build the terratorch SemanticSegmentationTask, inject the LoRA
    scaffold, load the published Δ + decoder weights, return the task.

    Per-task tasks share the TerraMind base inside terratorch's model
    factory — calling SemanticSegmentationTask twice loads the base
    twice in fp32 (~3.3 GB resident on CPU). For a two-task family this
    is acceptable; we don't need the cross-task weight sharing the
    experiments/18 ensemble does. If memory becomes a problem, swap
    this for a single-task / hot-swap-adapter implementation.
    """
    if adapter_name not in ADAPTER_SPECS:
        raise KeyError(f"unknown adapter {adapter_name!r}; "
                       f"expected one of {list(ADAPTER_SPECS)}")
    if adapter_name in _ADAPTERS:
        return _ADAPTERS[adapter_name]

    with _INIT_LOCK:
        if adapter_name in _ADAPTERS:
            return _ADAPTERS[adapter_name]

        spec = ADAPTER_SPECS[adapter_name]
        log.info("terramind_nyc: building task for %s", adapter_name)

        from huggingface_hub import snapshot_download
        from peft import LoraConfig, inject_adapter_in_model
        from safetensors.torch import load_file
        from terratorch.tasks import SemanticSegmentationTask

        # 1. Pull the requested adapter subtree from the HF repo.
        adapter_root = snapshot_download(
            ADAPTERS_REPO,
            allow_patterns=[f"{spec['subdir']}/*"],
        )

        # 2. Build the standard terratorch task with the same model_args
        #    the published HF_README quick-start uses.
        task = SemanticSegmentationTask(
            model_factory="EncoderDecoderFactory",
            model_args=dict(
                backbone="terramind_v1_base",
                backbone_pretrained=True,
                backbone_modalities=["S2L2A", "S1RTC", "DEM"],
                backbone_use_temporal=True,
                backbone_temporal_pooling="concat",
                backbone_temporal_n_timestamps=4,
                necks=[
                    {"name": "SelectIndices", "indices": [2, 5, 8, 11]},
                    {"name": "ReshapeTokensToImage", "remove_cls_token": False},
                    {"name": "LearnedInterpolateToPyramidal"},
                ],
                decoder="UNetDecoder",
                decoder_channels=[512, 256, 128, 64],
                head_dropout=0.1,
                num_classes=spec["num_classes"],
            ),
            loss="ce", lr=1e-4, freeze_backbone=False, freeze_decoder=False,
        )

        # 3. Inject the LoRA scaffold the adapter weights were trained
        #    against. Same hyperparameters every adapter in this family
        #    used (see experiments/18 adapters/_template/config.yaml).
        inject_adapter_in_model(LoraConfig(
            r=16, lora_alpha=32, lora_dropout=0.05,
            target_modules=["attn.qkv", "attn.proj"], bias="none",
        ), task.model.encoder)

        # 4. Restore Δ matrices (encoder LoRA) and the decoder/neck/head
        #    weights from the safetensors bundle. The encoder.* prefix
        #    is stripped because the encoder state-dict is rooted at
        #    the encoder module, not the task.
        adapter_dir = f"{adapter_root}/{spec['subdir']}"
        lora_state = load_file(f"{adapter_dir}/adapter_model.safetensors")
        head_state = load_file(f"{adapter_dir}/decoder_head.safetensors")
        encoder_state = {
            k.removeprefix("encoder."): v
            for k, v in lora_state.items() if k.startswith("encoder.")
        }
        task.model.encoder.load_state_dict(encoder_state, strict=False)
        for sub in ("decoder", "neck", "head", "aux_heads"):
            sub_state = {
                k[len(sub) + 1:]: v
                for k, v in head_state.items() if k.startswith(sub + ".")
            }
            if sub_state and hasattr(task.model, sub):
                getattr(task.model, sub).load_state_dict(sub_state,
                                                         strict=False)

        # 5. Move to the configured device. CUDA only if the caller
        #    asked AND a CUDA device is actually available — silently
        #    fall back to CPU otherwise.
        target_device = DEVICE
        if target_device == "cuda":
            import torch
            if not torch.cuda.is_available():
                log.warning("terramind_nyc: CUDA unavailable, falling back to CPU")
                target_device = "cpu"
        task = task.to(target_device).eval()

        _ADAPTERS[adapter_name] = task
        log.info("terramind_nyc: %s ready on %s", adapter_name, target_device)
        return task


def _tiled_predict(task, modality_chips: dict, num_classes: int):
    """Run the task's encoder-decoder forward in 224×224 tiles, returning
    a (1, num_classes, H, W) logits tensor stitched from the windows.

    TerraMind's positional embeddings are tied to the 224×224 training
    resolution. terratorch's tiled_inference helper slides a window
    across the input modalities (it accepts a dict of per-modality
    tensors as long as all modalities share H×W), runs the model on
    each crop, and averages overlapping logits. Without it, larger
    chips return silent garbage; smaller chips error on the encoder
    ViT.
    """
    import torch
    from terratorch.tasks.tiled_inference import tiled_inference

    # tiled_inference invokes `model_forward(patch)` per tile. The task
    # model returns a ModelOutput-like with .output OR a plain tensor;
    # coerce to tensor either way.
    def _forward(x, **_extra):
        out = task.model(x)
        return out.output if hasattr(out, "output") else out

    with torch.no_grad():
        logits = tiled_inference(
            _forward,
            modality_chips,
            out_channels=num_classes,
            h_crop=TILE_SIZE,
            w_crop=TILE_SIZE,
            h_stride=TILE_STRIDE,
            w_stride=TILE_STRIDE,
            average_patches=True,
            blend_overlaps=True,
            padding="reflect",
        )
    return logits


def _summarize_lulc(pred, class_labels: list[str]) -> dict[str, Any]:
    """Per-class pixel fraction + dominant class from an integer mask."""
    import numpy as np
    pred_np = pred.detach().cpu().numpy() if hasattr(pred, "detach") else np.asarray(pred)
    flat = pred_np.reshape(-1)
    n = max(int(flat.size), 1)
    fractions: dict[str, float] = {}
    for idx, label in enumerate(class_labels):
        pct = 100.0 * float((flat == idx).sum()) / n
        if pct > 0:
            fractions[label] = round(pct, 2)
    dominant_idx = int(max(range(len(class_labels)),
                           key=lambda i: int((flat == i).sum())))
    return {
        "ok": True,
        "n_pixels": int(flat.size),
        "shape": list(pred_np.shape),
        "class_fractions": fractions,
        "dominant_class": class_labels[dominant_idx],
        "dominant_pct": fractions.get(class_labels[dominant_idx], 0.0),
    }


def _summarize_buildings(pred, class_labels: list[str]) -> dict[str, Any]:
    """Building-pixel coverage + simple connected-component count."""
    import numpy as np
    pred_np = pred.detach().cpu().numpy() if hasattr(pred, "detach") else np.asarray(pred)
    mask = (pred_np == 1).astype("uint8")
    n_total = max(int(mask.size), 1)
    pct_built = 100.0 * float(mask.sum()) / n_total
    # Connected-component count is a cheap signal of "how many distinct
    # buildings does this chip cover" — useful for the briefing without
    # paying for full polygonisation.
    n_components: int | None = None
    try:
        from scipy.ndimage import label
        _, n_components = label(mask)
    except Exception:  # scipy is optional in some HF Spaces build cones
        log.debug("terramind_nyc: scipy.ndimage unavailable; "
                  "skipping component count")
    return {
        "ok": True,
        "n_pixels": int(mask.size),
        "shape": list(mask.shape),
        "pct_buildings": round(pct_built, 2),
        "n_building_components": n_components,
        "class_labels": class_labels,
    }


def _run(adapter_name: str, modality_chips: dict, summarizer):
    """Common boilerplate: gate, time, load, tiled predict, summarize."""
    if not ENABLE:
        return {"ok": False,
                "skipped": "RIPRAP_TERRAMIND_NYC_ENABLE=0"}
    if not _DEPS_OK:
        return {"ok": False,
                "skipped": f"deps unavailable on this deployment: "
                           f"{_DEPS_MISSING}"}
    if not modality_chips:
        return {"ok": False, "err": "no modality chips supplied"}
    t0 = time.time()
    try:
        task = _ensure_adapter(adapter_name)
        spec = ADAPTER_SPECS[adapter_name]
        logits = _tiled_predict(task, modality_chips, spec["num_classes"])
        # logits: (B, C, H, W). Argmax to per-pixel class id.
        pred = logits.argmax(dim=1).squeeze(0)
        result = summarizer(pred, spec["class_labels"])
        result["elapsed_s"] = round(time.time() - t0, 2)
        result["adapter"] = adapter_name
        result["repo"] = ADAPTERS_REPO
        return result
    except Exception as e:
        log.exception("terramind_nyc.%s failed", adapter_name)
        return {"ok": False, "err": f"{type(e).__name__}: {e}",
                "elapsed_s": round(time.time() - t0, 2)}


def lulc(s2l2a, s1rtc=None, dem=None) -> dict[str, Any]:
    """5-class NYC macro land-cover.

    Inputs are torch tensors. The temporal models we trained expect
    [C, T, H, W] (preferred) or [C, H, W] (will be expanded to T=1).
    Pass S1 and DEM if you have them — the published adapter was
    trained on the full triplet and accuracy degrades when modalities
    are dropped.
    """
    chips = {"S2L2A": s2l2a}
    if s1rtc is not None:
        chips["S1RTC"] = s1rtc
    if dem is not None:
        chips["DEM"] = dem
    return _run("lulc", chips, _summarize_lulc)


def buildings(s2l2a, s1rtc=None, dem=None) -> dict[str, Any]:
    """Binary NYC building-footprint mask. Same input contract as lulc()."""
    chips = {"S2L2A": s2l2a}
    if s1rtc is not None:
        chips["S1RTC"] = s1rtc
    if dem is not None:
        chips["DEM"] = dem
    return _run("buildings", chips, _summarize_buildings)


def warm():
    """Optional pre-load — amortizes the first-query model build cost."""
    if not ENABLE or not _DEPS_OK:
        return
    try:
        for name in ADAPTER_SPECS:
            _ensure_adapter(name)
    except Exception:
        log.exception("terramind_nyc: warm() failed; specialists will no-op")
@@ -0,0 +1,112 @@
"""Unit tests for the TerraMind-NYC adapter wrapper.

These tests don't actually load weights or run inference — they verify
the gate paths (ENABLE=0, missing deps), the public API surface, and
the result-dict shape so the FSM specialist can consume the output
without surprises. Real end-to-end smoke testing happens in commit 4
once the FSM action wires this in and a chip cache is available.
"""
from __future__ import annotations

import importlib
import os

import pytest


def _reload_with_env(**env):
    """Reimport the module with mutated environment so module-level
    constants (ENABLE, DEVICE) re-evaluate."""
    for k, v in env.items():
        if v is None:
            os.environ.pop(k, None)
        else:
            os.environ[k] = v
    import app.context.terramind_nyc as m
    return importlib.reload(m)


def test_module_imports_without_loading_weights():
    """Importing the module must not download or build the base model."""
    m = _reload_with_env()
    # Adapter cache empty by default.
    assert m._ADAPTERS == {}
    assert {"lulc", "buildings"} <= set(m.ADAPTER_SPECS)
    assert m.ADAPTERS_REPO == "msradam/TerraMind-NYC-Adapters"


def test_disabled_returns_skipped_outcome():
    m = _reload_with_env(RIPRAP_TERRAMIND_NYC_ENABLE="0")
    assert m.ENABLE is False
    out = m.lulc(None)
    assert out == {"ok": False, "skipped": "RIPRAP_TERRAMIND_NYC_ENABLE=0"}
    out = m.buildings(None, s1rtc=None, dem=None)
    assert out == {"ok": False, "skipped": "RIPRAP_TERRAMIND_NYC_ENABLE=0"}
    # Restore default for the rest of the suite.
    _reload_with_env(RIPRAP_TERRAMIND_NYC_ENABLE="1")


def test_unknown_adapter_raises_keyerror():
    m = _reload_with_env(RIPRAP_TERRAMIND_NYC_ENABLE="1")
    with pytest.raises(KeyError):
        m._ensure_adapter("nonsense")


def test_summarize_lulc_shape():
    """_summarize_lulc emits the dict shape the FSM doc-builder will
    consume — class fractions, dominant class, dominant pct, n_pixels."""
    import numpy as np
    m = _reload_with_env()
    pred = np.array([[0, 0, 0],
                     [2, 2, 2],
                     [4, 4, 4]])
    labels = ["Trees", "Cropland", "Built", "Bare", "Water"]
    out = m._summarize_lulc(pred, labels)
    assert out["ok"] is True
    assert out["n_pixels"] == 9
    assert out["shape"] == [3, 3]
    # Three classes appeared, equally; dominant is the FIRST in argmax tie.
    assert set(out["class_fractions"]) == {"Trees", "Built", "Water"}
    for v in out["class_fractions"].values():
        assert v == pytest.approx(33.33, abs=0.1)
    assert out["dominant_class"] in {"Trees", "Built", "Water"}
    assert out["dominant_pct"] > 0


def test_summarize_buildings_shape():
    import numpy as np
    m = _reload_with_env()
    pred = np.array([[0, 0, 1],
                     [0, 1, 1],
                     [0, 0, 0]])
    labels = ["Background", "Building footprint"]
    out = m._summarize_buildings(pred, labels)
    assert out["ok"] is True
    assert out["n_pixels"] == 9
    assert out["pct_buildings"] == pytest.approx(33.33, abs=0.1)
    assert out["class_labels"] == labels
    # scipy.ndimage may or may not be installed; the helper degrades
    # rather than raising. If it's installed, the three adjacent
    # building pixels should land in one connected component.
    assert out["n_building_components"] in {None, 1}


def test_public_api_signatures():
    m = _reload_with_env()
    import inspect
    for fn in (m.lulc, m.buildings):
        sig = inspect.signature(fn)
        params = list(sig.parameters)
        # Caller may pass S2 only OR S2+S1+DEM.
        assert params[0] == "s2l2a"
        assert "s1rtc" in params
        assert "dem" in params


def test_warm_is_no_op_when_disabled():
    """warm() must not download anything when ENABLE=0 or deps missing."""
    m = _reload_with_env(RIPRAP_TERRAMIND_NYC_ENABLE="0")
    # No exceptions, no side effects.
    m.warm()
    assert m._ADAPTERS == {}
    _reload_with_env(RIPRAP_TERRAMIND_NYC_ENABLE="1")