seriffic Claude Opus 4.7 (1M context) committed on
Commit
400a77a
·
1 Parent(s): 599cc5c

Stones C4: wire TerraMind-NYC LULC + Buildings into FSM


Adds three new Burr actions in app/fsm.py (gated behind
RIPRAP_HEAVY_SPECIALISTS):

step_eo_chip             -> writes eo_chip (S2L2A + S1RTC + DEM)
step_terramind_lulc      -> writes terramind_lulc
step_terramind_buildings -> writes terramind_buildings
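The three actions above are gated behind the RIPRAP_HEAVY_SPECIALISTS flag. The flag's exact parsing lives in app/fsm.py and isn't shown in this diff; a minimal sketch of what such a gate might look like, assuming it follows the same truthy-string convention as the RIPRAP_EO_CHIP_ENABLE flag in this commit:

```python
import os

def heavy_specialists_enabled() -> bool:
    """Hypothetical sketch of the RIPRAP_HEAVY_SPECIALISTS gate.

    Assumes the same truthy values as RIPRAP_EO_CHIP_ENABLE in this
    commit ("1", "true", "yes"); the real parsing may differ.
    """
    return (os.environ.get("RIPRAP_HEAVY_SPECIALISTS", "0").lower()
            in ("1", "true", "yes"))

os.environ["RIPRAP_HEAVY_SPECIALISTS"] = "true"
print(heavy_specialists_enabled())  # → True
```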

eo_chip is a per-query cache: one ~150 MB Sentinel-2/1/DEM fetch from
Microsoft Planetary Computer, shared by both TerraMind specialists so
they don't each refetch. The cache lives in app/context/eo_chip_cache.py
and gracefully no-ops when terratorch / planetary_computer / pystac_client
/ rioxarray aren't installed.

Both TerraMind specialists call into app/context/terramind_nyc.py (added
in C3), which routes inference through terratorch tiled_inference at
224x224 and ships the per-class fraction summary the reconciler quotes.
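app/context/terramind_nyc.py itself landed in C3 and isn't part of this diff, so here is a small sketch of how a per-class fraction summary can be derived from a segmentation's argmax class-index mask; the function name and the five class labels are illustrative, not the C3 API:

```python
import numpy as np

def class_fractions(mask: np.ndarray, labels: list[str]) -> dict[str, float]:
    """Per-class area fractions (percent) from an argmax class-index mask.

    Illustrative only — terramind_nyc.py's real summary code is in C3.
    Classes absent from the chip are omitted from the summary.
    """
    counts = np.bincount(mask.ravel(), minlength=len(labels))
    total = mask.size
    return {labels[i]: round(100.0 * counts[i] / total, 2)
            for i in range(len(labels)) if counts[i]}

# Toy 2x3 mask with hypothetical macro-class labels.
mask = np.array([[0, 0, 1], [1, 1, 4]])
print(class_fractions(mask, ["built", "vegetation", "water", "bare", "other"]))
# → {'built': 33.33, 'vegetation': 50.0, 'other': 16.67}
```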

Reconciler:
- app/reconcile.py:build_documents() now emits tm_lulc (Touchstone,
after prithvi_live) and tm_buildings (Keystone, after the legacy
terramind_synthetic prior).
- trim_docs_to_plan PREFIXES_BY_SPECIALIST now maps the planner names
terramind_lulc / terramind_buildings to their tm_* doc-id prefixes.
- step_reconcile reads the new state keys and threads them into the
snap dict.
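The prefix-trimming idea can be sketched in isolation: each planner-visible specialist owns a tuple of doc-id prefixes, and trimming keeps only docs claimed by a planned specialist. This is a simplified stand-in (the real trim_docs_to_plan in app/reconcile.py operates on message dicts, not bare ids):

```python
# Subset of the real PREFIXES_BY_SPECIALIST map from this commit.
PREFIXES_BY_SPECIALIST = {
    "terramind": ("terramind", "syn_"),
    "terramind_lulc": ("tm_lulc",),
    "terramind_buildings": ("tm_buildings",),
    "rag": ("rag_",),
}

def trim_docs_to_plan(doc_ids: list[str], planned: list[str]) -> list[str]:
    """Keep docs whose id starts with a prefix owned by a planned
    specialist. Simplified: real code trims message dicts, not ids."""
    prefixes = tuple(p for s in planned
                     for p in PREFIXES_BY_SPECIALIST.get(s, ()))
    # str.startswith(()) is False, so an empty plan keeps nothing.
    return [d for d in doc_ids if d.startswith(prefixes)]

docs = ["tm_lulc", "tm_buildings", "terramind_synthetic", "rag_0"]
print(trim_docs_to_plan(docs, ["terramind_lulc"]))  # → ['tm_lulc']
```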

Frontend:
- web/static/agent.js STEP_LABELS, SOURCE_LABELS, SOURCE_URLS, and
SOURCE_VINTAGES gain entries for eo_chip_fetch / terramind_lulc /
terramind_buildings / tm_lulc / tm_buildings.

Doc-order snapshot stays disjoint (no doc_id collisions) and the new
emissions land in the expected Stone group:
Keystone: ... terramind_synthetic, tm_buildings
Touchstone: ... prithvi_live, tm_lulc
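The disjointness claim — no doc_id collisions — reduces to checking that no two specialists own prefixes that could claim the same doc id. A quick check over the prefix map (subset shown; pairwise startswith in either direction catches nested prefixes):

```python
from itertools import combinations

# Subset of the commit's PREFIXES_BY_SPECIALIST map.
PREFIXES_BY_SPECIALIST = {
    "terramind": ("terramind", "syn_"),
    "terramind_lulc": ("tm_lulc",),
    "terramind_buildings": ("tm_buildings",),
    "rag": ("rag_",),
}

def overlapping(prefix_map: dict[str, tuple[str, ...]]) -> list[tuple[str, str]]:
    """Pairs of specialists whose prefixes could claim the same doc id."""
    clashes = []
    for (a, pa), (b, pb) in combinations(prefix_map.items(), 2):
        if any(x.startswith(y) or y.startswith(x) for x in pa for y in pb):
            clashes.append((a, b))
    return clashes

print(overlapping(PREFIXES_BY_SPECIALIST))  # → []
```

With tm_lulc and tm_buildings sharing only the tm_ stem (neither is a prefix of the other), the new specialists cannot steal each other's docs.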

Tests for app/stones and app/context/terramind_nyc pass unchanged
(adapter weights are not loaded — gating + result-shape coverage only).
End-to-end probe-mellea validation runs once the local server is up
with the heavy-specialist deps installed; that's a separate runbook
step the user owns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Files changed (4)
  1. app/context/eo_chip_cache.py +293 -0
  2. app/fsm.py +140 -0
  3. app/reconcile.py +46 -0
  4. web/static/agent.js +9 -0
app/context/eo_chip_cache.py ADDED
@@ -0,0 +1,293 @@
+"""Per-query EO chip cache — Sentinel-2 L2A, Sentinel-1 RTC, DEM.
+
+Fetches a co-registered (S2L2A, S1RTC, DEM) chip centered on (lat, lon)
+and returns a dict of torch tensors ready for TerraMind-NYC inference.
+The TerraMind base was trained with `temporal_n_timestamps=4`, so this
+helper expands a single S2/S1 acquisition to T=4 by repetition along
+the temporal axis. Single-timestep nowcasting trades some training-
+distribution match for a much simpler runtime — the published LoRA
+adapters still produce sensible argmax masks at T=1 / tiled.
+
+Failure semantics mirror prithvi_live: every dependency or network
+failure is converted to a clean `{ok: False, skipped: <reason>}`
+result, never a raised exception. Callers (FSM specialists) that
+chain off the chip can short-circuit on `ok=False` and skip the
+specialist instead of surfacing a noisy error.
+"""
+from __future__ import annotations
+
+import logging
+import os
+import threading
+import time
+from typing import Any
+
+log = logging.getLogger("riprap.eo_chip_cache")
+
+ENABLE = os.environ.get("RIPRAP_EO_CHIP_ENABLE", "1").lower() in ("1", "true", "yes")
+SEARCH_DAYS = int(os.environ.get("RIPRAP_EO_CHIP_SEARCH_DAYS", "120"))
+MAX_CLOUD_PCT = float(os.environ.get("RIPRAP_EO_CHIP_MAX_CLOUD", "30"))
+CHIP_PX = int(os.environ.get("RIPRAP_EO_CHIP_PX", "224"))
+PIXEL_M = 10
+N_TIMESTEPS = 4
+
+# 12-band S2 L2A in TerraMind's expected order.
+S2_BANDS = ["B01", "B02", "B03", "B04", "B05", "B06", "B07",
+            "B08", "B8A", "B09", "B11", "B12"]
+
+# Sentinel-1 RTC on Planetary Computer publishes vv/vh polarisations.
+S1_BANDS = ["vv", "vh"]
+
+
+def _has_required_deps() -> tuple[bool, str | None]:
+    missing: list[str] = []
+    # pyproj backs the CRS transform in _fetch_modalities.
+    for name in ("planetary_computer", "pystac_client", "pyproj",
+                 "rioxarray", "xarray", "torch", "numpy"):
+        try:
+            __import__(name)
+        except ImportError:
+            missing.append(name)
+    if missing:
+        return False, ", ".join(missing)
+    return True, None
+
+
+_DEPS_OK, _DEPS_MISSING = _has_required_deps()
+_FETCH_LOCK = threading.Lock()
+
+
+def _search_s2(lat: float, lon: float):
+    """Return (item, cloud_cover) for the most recent low-cloud S2L2A
+    acquisition near (lat, lon), or (None, None) if no scene exists."""
+    import datetime as dt
+
+    import planetary_computer as pc
+    from pystac_client import Client
+    end = dt.datetime.utcnow().date()
+    start = end - dt.timedelta(days=SEARCH_DAYS)
+    client = Client.open(
+        "https://planetarycomputer.microsoft.com/api/stac/v1",
+        modifier=pc.sign_inplace,
+    )
+    delta = 0.02
+    search = client.search(
+        collections=["sentinel-2-l2a"],
+        bbox=[lon - delta, lat - delta, lon + delta, lat + delta],
+        datetime=f"{start}/{end}",
+        query={"eo:cloud_cover": {"lt": MAX_CLOUD_PCT}},
+        max_items=20,
+    )
+    items = sorted(
+        search.items(),
+        key=lambda it: (it.properties.get("eo:cloud_cover", 100),
+                        -(it.datetime.timestamp() if it.datetime else 0)),
+    )
+    if not items:
+        return None, None
+    item = items[0]
+    cc = float(item.properties.get("eo:cloud_cover", -1))
+    return item, cc
+
+
+def _search_s1(item_dt, lat: float, lon: float):
+    """Return the closest Sentinel-1 RTC acquisition to the given S2
+    datetime, or None if Planetary Computer has nothing nearby."""
+    import datetime as dt
+
+    import planetary_computer as pc
+    from pystac_client import Client
+    win = dt.timedelta(days=10)
+    start = item_dt - win
+    end = item_dt + win
+    client = Client.open(
+        "https://planetarycomputer.microsoft.com/api/stac/v1",
+        modifier=pc.sign_inplace,
+    )
+    delta = 0.02
+    search = client.search(
+        collections=["sentinel-1-rtc"],
+        bbox=[lon - delta, lat - delta, lon + delta, lat + delta],
+        datetime=f"{start.isoformat()}/{end.isoformat()}",
+        max_items=10,
+    )
+    items = list(search.items())
+    if not items:
+        return None
+    items.sort(key=lambda it:
+               abs((it.datetime - item_dt).total_seconds())
+               if it.datetime else 1e18)
+    return items[0]
+
+
+def _read_band(href, bbox_xy_meters, epsg):
+    """Read a single COG band, clipped to the bbox, and resampled to
+    CHIP_PX × CHIP_PX. Returns a numpy array (CHIP_PX, CHIP_PX) float32.
+    """
+    import numpy as np
+    import rioxarray
+    da = rioxarray.open_rasterio(href, masked=False).squeeze(drop=True)
+    da = da.rio.clip_box(minx=bbox_xy_meters[0], miny=bbox_xy_meters[1],
+                         maxx=bbox_xy_meters[2], maxy=bbox_xy_meters[3])
+    if da.shape[-2] != CHIP_PX or da.shape[-1] != CHIP_PX:
+        # Resample (nearest is fine for the 10/20/60 m S2 mix; S1 is 10 m,
+        # DEM is 30 m and benefits from bilinear; we keep nearest for
+        # simplicity — the TerraMind LoRA was trained against terratorch's
+        # default resampler, which is also nearest).
+        da = da.rio.reproject(
+            f"EPSG:{epsg}", shape=(CHIP_PX, CHIP_PX), resampling=0
+        )
+    arr = da.values.astype("float32")
+    return np.nan_to_num(arr)
+
+
+def _fetch_modalities(lat: float, lon: float, timeout_s: float = 60.0) -> dict[str, Any]:
+    """Fetch S2L2A + S1RTC + DEM as numpy arrays, resampled to a common
+    CHIP_PX × CHIP_PX grid centered on (lat, lon).
+    """
+    import numpy as np
+    from pyproj import Transformer
+
+    t0 = time.time()
+    item, cc = _search_s2(lat, lon)
+    if item is None:
+        return {"ok": False,
+                "skipped": f"no <{MAX_CLOUD_PCT}% cloud S2 in last "
+                           f"{SEARCH_DAYS}d"}
+    if "proj:epsg" in item.properties:
+        epsg = int(item.properties["proj:epsg"])
+    else:
+        code = item.properties.get("proj:code", "")
+        if not code.startswith("EPSG:"):
+            return {"ok": False,
+                    "skipped": "STAC item missing proj:epsg / proj:code"}
+        epsg = int(code.split(":", 1)[1])
+
+    fwd = Transformer.from_crs("EPSG:4326", f"EPSG:{epsg}", always_xy=True)
+    cx, cy = fwd.transform(lon, lat)
+    half_m = CHIP_PX / 2 * PIXEL_M
+    bbox = (cx - half_m, cy - half_m, cx + half_m, cy + half_m)
+
+    if time.time() - t0 > timeout_s:
+        return {"ok": False, "skipped": "STAC search exceeded budget"}
+
+    # ---- S2L2A: 12 bands ------------------------------------------------
+    s2_arrs = []
+    try:
+        for b in S2_BANDS:
+            href = item.assets[b].href
+            s2_arrs.append(_read_band(href, bbox, epsg))
+    except Exception as e:
+        log.warning("eo_chip: S2 band fetch failed (%s); aborting", e)
+        return {"ok": False, "err": f"S2 fetch failed: {type(e).__name__}: {e}"}
+    s2 = np.stack(s2_arrs)  # (12, H, W)
+    if s2.mean() > 1.0:
+        s2 = s2 / 10000.0  # scale L2A reflectance from int16 to ~[0, 1]
+
+    # ---- S1RTC: 2 polarisations (best effort) ---------------------------
+    s1: np.ndarray | None = None
+    s1_meta: dict[str, Any] = {}
+    if time.time() - t0 < timeout_s:
+        try:
+            s1_item = _search_s1(item.datetime, lat, lon)
+            if s1_item is not None:
+                s1_arrs = []
+                for b in S1_BANDS:
+                    href = s1_item.assets[b].href
+                    s1_arrs.append(_read_band(href, bbox, epsg))
+                s1 = np.stack(s1_arrs)
+                s1_meta = {
+                    "scene_id": s1_item.id,
+                    "datetime": (s1_item.datetime.isoformat()
+                                 if s1_item.datetime else None),
+                }
+        except Exception as e:
+            log.warning("eo_chip: S1 fetch best-effort failed: %s", e)
+
+    # ---- DEM: Copernicus 30 m via planetary_computer (best effort) ------
+    dem: np.ndarray | None = None
+    if time.time() - t0 < timeout_s:
+        try:
+            import planetary_computer as pc
+            from pystac_client import Client
+            client = Client.open(
+                "https://planetarycomputer.microsoft.com/api/stac/v1",
+                modifier=pc.sign_inplace,
+            )
+            dem_search = client.search(
+                collections=["cop-dem-glo-30"],
+                bbox=[lon - 0.02, lat - 0.02, lon + 0.02, lat + 0.02],
+                max_items=1,
+            )
+            dem_items = list(dem_search.items())
+            if dem_items:
+                href = dem_items[0].assets["data"].href
+                dem = _read_band(href, bbox, epsg)
+                dem = dem[None, :, :]  # add channel dim
+        except Exception as e:
+            log.warning("eo_chip: DEM fetch best-effort failed: %s", e)
+
+    return {
+        "ok": True,
+        "lat": lat, "lon": lon,
+        "epsg": epsg, "chip_px": CHIP_PX, "pixel_m": PIXEL_M,
+        "s2": s2, "s1": s1, "dem": dem,
+        "s2_meta": {
+            "scene_id": item.id,
+            "datetime": (item.datetime.isoformat() if item.datetime else None),
+            "cloud_cover": cc,
+        },
+        "s1_meta": s1_meta,
+        "elapsed_s": round(time.time() - t0, 2),
+    }
+
+
+def _to_terramind_tensors(modalities: dict[str, Any]) -> dict[str, Any]:
+    """Shape numpy modality arrays into the (B, C, T, H, W) tensors
+    TerraMind expects with `temporal_n_timestamps=4`. Single-timestep
+    fetches get tiled to T=4 — same observation in every slot.
+    """
+    import torch
+    s2 = modalities["s2"]  # (12, H, W)
+    s2_t = torch.from_numpy(s2).float().unsqueeze(1)       # (12, 1, H, W)
+    s2_t = s2_t.repeat(1, N_TIMESTEPS, 1, 1).unsqueeze(0)  # (1, 12, T, H, W)
+    chips = {"S2L2A": s2_t}
+    if modalities.get("s1") is not None:
+        s1 = modalities["s1"]  # (2, H, W)
+        s1_t = torch.from_numpy(s1).float().unsqueeze(1)
+        s1_t = s1_t.repeat(1, N_TIMESTEPS, 1, 1).unsqueeze(0)
+        chips["S1RTC"] = s1_t
+    if modalities.get("dem") is not None:
+        dem = modalities["dem"]  # (1, H, W)
+        dem_t = torch.from_numpy(dem).float().unsqueeze(1)
+        dem_t = dem_t.repeat(1, N_TIMESTEPS, 1, 1).unsqueeze(0)
+        chips["DEM"] = dem_t
+    return chips
+
+
+def fetch(lat: float, lon: float, timeout_s: float = 60.0) -> dict[str, Any]:
+    """Run the chip pipeline. Always returns a dict with at minimum
+    `{ok, skipped|err, ...}`; on success the dict carries the
+    co-registered numpy arrays plus `tensors` (the TerraMind-shaped
+    torch dict).
+    """
+    if not ENABLE:
+        return {"ok": False, "skipped": "RIPRAP_EO_CHIP_ENABLE=0"}
+    if not _DEPS_OK:
+        return {"ok": False,
+                "skipped": f"deps unavailable on this deployment: "
+                           f"{_DEPS_MISSING}"}
+    with _FETCH_LOCK:
+        try:
+            modalities = _fetch_modalities(lat, lon, timeout_s=timeout_s)
+        except Exception as e:
+            log.exception("eo_chip: fetch failed")
+            return {"ok": False, "err": f"{type(e).__name__}: {e}"}
+        if not modalities.get("ok"):
+            return modalities
+        try:
+            modalities["tensors"] = _to_terramind_tensors(modalities)
+        except Exception as e:
+            log.exception("eo_chip: tensor build failed")
+            return {"ok": False,
+                    "err": f"tensor build failed: {type(e).__name__}: {e}"}
+        return modalities
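The `_to_terramind_tensors` shaping above is the subtle part of the cache: a single (C, H, W) acquisition becomes (1, C, T, H, W) with the same observation in all T=4 slots. The same shaping shown standalone with numpy (standing in for torch so the shapes are easy to verify):

```python
import numpy as np

N_TIMESTEPS = 4  # matches TerraMind's temporal_n_timestamps

def tile_to_temporal(chip: np.ndarray) -> np.ndarray:
    """(C, H, W) single acquisition -> (1, C, T, H, W), repeating the
    one observation in every temporal slot, as _to_terramind_tensors
    does with torch.repeat. numpy stands in for torch here."""
    expanded = chip[:, None, :, :]                    # (C, 1, H, W)
    tiled = np.repeat(expanded, N_TIMESTEPS, axis=1)  # (C, T, H, W)
    return tiled[None, ...]                           # (1, C, T, H, W)

s2 = np.zeros((12, 224, 224), dtype=np.float32)
print(tile_to_temporal(s2).shape)  # → (1, 12, 4, 224, 224)
```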
app/fsm.py CHANGED
@@ -682,6 +682,131 @@ def step_microtopo(state: State) -> State:
 
 
 
+@action(reads=["lat", "lon"], writes=["eo_chip", "trace"])
+def step_eo_chip(state: State) -> State:
+    """Fetch one S2L2A + S1RTC + DEM chip per query and stash it in
+    state for the TerraMind-NYC specialists.
+
+    Centralised so step_terramind_lulc and step_terramind_buildings
+    don't each re-fetch ~150 MB of imagery. Best-effort by design —
+    a deps-missing or no-scene outcome writes `{ok: False, skipped: ...}`
+    and the downstream TerraMind specialists silently no-op."""
+    rec, trace = _step(state, "eo_chip_fetch")
+    try:
+        if state.get("lat") is None:
+            rec["ok"] = False; rec["err"] = "no coords"
+            return state.update(eo_chip=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(eo_chip=None, trace=trace)
+        from app.context import eo_chip_cache
+        chip = eo_chip_cache.fetch(state["lat"], state["lon"])
+        rec["ok"] = bool(chip.get("ok"))
+        if not rec["ok"]:
+            rec["err"] = chip.get("skipped") or chip.get("err") or "unavailable"
+        else:
+            rec["result"] = {
+                "scene_id": (chip.get("s2_meta") or {}).get("scene_id"),
+                "scene_date": ((chip.get("s2_meta") or {}).get("datetime") or "")[:10],
+                "cloud_cover": (chip.get("s2_meta") or {}).get("cloud_cover"),
+                "has_s1": chip.get("s1") is not None,
+                "has_dem": chip.get("dem") is not None,
+            }
+        return state.update(eo_chip=chip, trace=trace)
+    except Exception as e:
+        rec["ok"] = False; rec["err"] = str(e)
+        log.exception("eo_chip failed")
+        return state.update(eo_chip=None, trace=trace)
+    finally:
+        rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
+
+
+@action(reads=["lat", "lon", "eo_chip"], writes=["terramind_lulc", "trace"])
+def step_terramind_lulc(state: State) -> State:
+    """5-class macro NYC LULC via msradam/TerraMind-NYC-Adapters.
+
+    Consumes the shared chip from step_eo_chip; if that didn't fire
+    cleanly this no-ops. Adapter loading (~1.6 GB base + ~325 MB LoRA)
+    is lazy on first call and cached across queries."""
+    rec, trace = _step(state, "terramind_lulc")
+    try:
+        if state.get("lat") is None:
+            rec["ok"] = False; rec["err"] = "no coords"
+            return state.update(terramind_lulc=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(terramind_lulc=None, trace=trace)
+        chip = state.get("eo_chip") or {}
+        if not chip.get("ok"):
+            rec["ok"] = False
+            rec["err"] = chip.get("skipped") or chip.get("err") or "no chip"
+            return state.update(terramind_lulc=None, trace=trace)
+        from app.context import terramind_nyc
+        tensors = chip.get("tensors") or {}
+        out = terramind_nyc.lulc(
+            tensors.get("S2L2A"),
+            s1rtc=tensors.get("S1RTC"),
+            dem=tensors.get("DEM"),
+        )
+        rec["ok"] = bool(out.get("ok"))
+        if not rec["ok"]:
+            rec["err"] = out.get("skipped") or out.get("err") or "unavailable"
+        else:
+            rec["result"] = {
+                "dominant_class": out.get("dominant_class"),
+                "dominant_pct": out.get("dominant_pct"),
+                "n_classes_observed": len(out.get("class_fractions") or {}),
+            }
+        return state.update(terramind_lulc=out, trace=trace)
+    except Exception as e:
+        rec["ok"] = False; rec["err"] = str(e)
+        log.exception("terramind_lulc failed")
+        return state.update(terramind_lulc=None, trace=trace)
+    finally:
+        rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
+
+
+@action(reads=["lat", "lon", "eo_chip"],
+        writes=["terramind_buildings", "trace"])
+def step_terramind_buildings(state: State) -> State:
+    """Binary NYC building-footprint mask via msradam/TerraMind-NYC-Adapters."""
+    rec, trace = _step(state, "terramind_buildings")
+    try:
+        if state.get("lat") is None:
+            rec["ok"] = False; rec["err"] = "no coords"
+            return state.update(terramind_buildings=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(terramind_buildings=None, trace=trace)
+        chip = state.get("eo_chip") or {}
+        if not chip.get("ok"):
+            rec["ok"] = False
+            rec["err"] = chip.get("skipped") or chip.get("err") or "no chip"
+            return state.update(terramind_buildings=None, trace=trace)
+        from app.context import terramind_nyc
+        tensors = chip.get("tensors") or {}
+        out = terramind_nyc.buildings(
+            tensors.get("S2L2A"),
+            s1rtc=tensors.get("S1RTC"),
+            dem=tensors.get("DEM"),
+        )
+        rec["ok"] = bool(out.get("ok"))
+        if not rec["ok"]:
+            rec["err"] = out.get("skipped") or out.get("err") or "unavailable"
+        else:
+            rec["result"] = {
+                "pct_buildings": out.get("pct_buildings"),
+                "n_building_components": out.get("n_building_components"),
+            }
+        return state.update(terramind_buildings=out, trace=trace)
+    except Exception as e:
+        rec["ok"] = False; rec["err"] = str(e)
+        log.exception("terramind_buildings failed")
+        return state.update(terramind_buildings=None, trace=trace)
+    finally:
+        rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
+
+
 @action(reads=["geocode", "sandy", "dep", "floodnet", "nyc311", "microtopo",
                "ida_hwm", "prithvi_water", "noaa_tides", "nws_alerts", "nws_obs",
                "ttm_forecast"],
@@ -766,6 +891,7 @@ def _label_counts(gliner_out: dict[str, dict]) -> dict[str, int]:
 
 @action(reads=["geocode", "sandy", "dep", "floodnet", "nyc311", "microtopo",
                "ida_hwm", "prithvi_water", "prithvi_live", "terramind",
+               "terramind_lulc", "terramind_buildings",
                "noaa_tides", "nws_alerts", "nws_obs", "ttm_forecast",
                "ttm_311_forecast", "floodnet_forecast", "mta_entrances",
                "nycha_developments", "doe_schools", "doh_hospitals",
@@ -795,6 +921,8 @@ def step_reconcile(state: State) -> State:
         "gliner": state.get("gliner"),
         "prithvi_live": state.get("prithvi_live"),
         "terramind": state.get("terramind"),
+        "terramind_lulc": state.get("terramind_lulc"),
+        "terramind_buildings": state.get("terramind_buildings"),
         "mta_entrances": state.get("mta_entrances"),
         "nycha_developments": state.get("nycha_developments"),
         "doe_schools": state.get("doe_schools"),
@@ -910,6 +1038,13 @@ def build_app(query: str):
    actions["doh_hospitals"] = step_doh_hospitals
    actions["prithvi_live"] = step_prithvi_live
    actions["terramind"] = step_terramind
+    # New TerraMind-NYC LoRA family — one chip fetch feeds two
+    # specialists. Keep eo_chip directly before the two consumers
+    # so the chip stays warm in memory and isn't garbage-collected
+    # by anything in between.
+    actions["eo_chip"] = step_eo_chip
+    actions["terramind_lulc"] = step_terramind_lulc
+    actions["terramind_buildings"] = step_terramind_buildings
    actions["rag"] = step_rag
    actions["gliner"] = step_gliner
    actions["reconcile"] = step_reconcile
@@ -947,6 +1082,9 @@ def run(query: str) -> dict[str, Any]:
         "ida_hwm": final_state.get("ida_hwm"),
         "prithvi_water": final_state.get("prithvi_water"),
         "terramind": final_state.get("terramind"),
+        "terramind_lulc": final_state.get("terramind_lulc"),
+        "terramind_buildings": final_state.get("terramind_buildings"),
+        "eo_chip": final_state.get("eo_chip"),
         "noaa_tides": final_state.get("noaa_tides"),
         "nws_alerts": final_state.get("nws_alerts"),
         "nws_obs": final_state.get("nws_obs"),
@@ -1068,6 +1206,8 @@ def iter_steps(query: str):
         "prithvi_water": state.get("prithvi_water"),
         "prithvi_live": state.get("prithvi_live"),
         "terramind": state.get("terramind"),
+        "terramind_lulc": state.get("terramind_lulc"),
+        "terramind_buildings": state.get("terramind_buildings"),
         "noaa_tides": state.get("noaa_tides"),
         "nws_alerts": state.get("nws_alerts"),
         "nws_obs": state.get("nws_obs"),
app/reconcile.py CHANGED
@@ -271,6 +271,8 @@ def trim_docs_to_plan(doc_msgs: list[dict],
     "ttm_311_forecast": ("ttm_311_forecast",),
     "floodnet_forecast": ("floodnet_forecast",),
     "terramind": ("terramind", "syn_"),
+    "terramind_lulc": ("tm_lulc",),
+    "terramind_buildings": ("tm_buildings",),
     "rag": ("rag_",),
     "rag_mta": ("rag_",),
     "nta_resolve": ("nta_resolve", "nta_"),
@@ -709,6 +711,32 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
     ])
     docs.append(_doc_message("terramind_synthetic", body))
 
+    # TerraMind-NYC Buildings adapter (msradam/TerraMind-NYC-Adapters,
+    # Apache-2.0, fine-tuned on NYC building footprints on AMD MI300X).
+    # Distinct from the synthetic-prior block above — this is a real
+    # segmentation against the per-query Sentinel-2/1/DEM chip and
+    # reports an empirical building-footprint area fraction.
+    tmb = state.get("terramind_buildings")
+    if not out_of_nyc and tmb and tmb.get("ok"):
+        body = [
+            "Source: msradam/TerraMind-NYC-Adapters (Apache-2.0) — NYC "
+            "Buildings LoRA on TerraMind 1.0 base, fine-tuned on AMD "
+            "Instinct MI300X. Test mIoU 0.5511 on held-out NYC chips.",
+            f"Adapter: {tmb.get('adapter')}.",
+            f"Predicted building-footprint coverage in chip: "
+            f"{tmb.get('pct_buildings')}%.",
+        ]
+        if tmb.get("n_building_components") is not None:
+            body.append(
+                f"Distinct building connected components: "
+                f"{tmb.get('n_building_components')}."
+            )
+        body.append(
+            "Class labels: " + ", ".join(tmb.get("class_labels") or [])
+            + "."
+        )
+        docs.append(_doc_message("tm_buildings", body))
+
     # ---- Touchstone — The Live Observer --------------------------------
     # Live sensors and per-query EO that change minute to minute:
     # FloodNet ultrasonic depth, NYC 311 flood complaints, NWS hourly
@@ -824,6 +852,24 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         ]
         docs.append(_doc_message("prithvi_live", body))
 
+    # TerraMind-NYC LULC adapter — current 5-class macro land-cover from
+    # the per-query Sentinel-2/1/DEM chip. Empirical observation, not the
+    # synthetic prior emitted by the legacy `terramind_synthetic` doc.
+    tml = state.get("terramind_lulc")
+    if not out_of_nyc and tml and tml.get("ok"):
+        body = [
+            "Source: msradam/TerraMind-NYC-Adapters (Apache-2.0) — NYC "
+            "LULC LoRA on TerraMind 1.0 base, fine-tuned on AMD "
+            "Instinct MI300X. Test mIoU 0.5866 on held-out NYC chips.",
+            f"Adapter: {tml.get('adapter')}.",
+            f"Dominant land-cover class in chip: "
+            f"{tml.get('dominant_class')} at {tml.get('dominant_pct')}%.",
+            "Per-class fractions:",
+        ]
+        for label, pct in (tml.get("class_fractions") or {}).items():
+            body.append(f"  - {label}: {pct}%")
+        docs.append(_doc_message("tm_lulc", body))
+
     # ---- Lodestone — The Projector -------------------------------------
     # Forward-looking signals: NWS public flood alerts, Granite TTM r2
     # zero-shot Battery surge residual, per-address NYC 311 weekly rate,
web/static/agent.js CHANGED
@@ -25,6 +25,9 @@ const STEP_LABELS = {
   prithvi_eo_v2: ["Prithvi-EO 2.0 (NASA/IBM)", "Sen1Floods11 satellite segmentation"],
   prithvi_eo_live: ["Prithvi-EO 2.0 — live segmentation", "fresh Sentinel-2 water mask at this address"],
   terramind_synthesis: ["TerraMind 1.0 base — synthetic LULC", "DEM → ESRI Land Cover, any-to-any generative synthesis (IBM/ESA)"],
+  eo_chip_fetch: ["EO chip fetch (S2L2A + S1RTC + DEM)", "single-chip cache for the TerraMind-NYC LoRA family"],
+  terramind_lulc: ["TerraMind-NYC — LULC (live)", "5-class macro land-cover LoRA (msradam/TerraMind-NYC-Adapters)"],
+  terramind_buildings: ["TerraMind-NYC — Buildings (live)", "binary building-footprint LoRA (msradam/TerraMind-NYC-Adapters)"],
   rag_granite_embedding: ["Granite Embedding 278M (RAG)", "policy corpus retrieval (+ Granite Reranker R2 if enabled)"],
   gliner_extract: ["GLiNER typed extraction", "agencies, dollar amounts, projects, locations"],
   reconcile_granite41: ["Granite 4.1 reconcile (local)", "document-grounded synthesis"],
@@ -66,6 +69,8 @@ const SOURCE_LABELS = {
   prithvi_water: "Prithvi-EO 2.0 — Hurricane Ida 2021 polygons",
   prithvi_live: "Prithvi-EO 2.0 — live Sentinel-2 water segmentation",
   terramind_synthetic: "TerraMind 1.0 base — synthetic LULC (DEM→ESRI Land Cover)",
+  tm_lulc: "TerraMind-NYC LULC LoRA (msradam/TerraMind-NYC-Adapters)",
+  tm_buildings: "TerraMind-NYC Buildings LoRA (msradam/TerraMind-NYC-Adapters)",
   gliner_comptroller: "GLiNER over Comptroller report",
   gliner_dep_2013: "GLiNER over DEP wastewater plan",
   gliner_nycha: "GLiNER over NYCHA Lessons Learned",
@@ -115,6 +120,8 @@ const SOURCE_URLS = {
   prithvi_water: "https://huggingface.co/ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11",
   prithvi_live: "https://huggingface.co/ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11",
   terramind_synthetic: "https://huggingface.co/ibm-esa-geospatial/TerraMind-1.0-base",
+  tm_lulc: "https://huggingface.co/msradam/TerraMind-NYC-Adapters",
+  tm_buildings: "https://huggingface.co/msradam/TerraMind-NYC-Adapters",
   gliner_comptroller: "https://huggingface.co/urchade/gliner_medium-v2.1",
   gliner_dep_2013: "https://huggingface.co/urchade/gliner_medium-v2.1",
   gliner_nycha: "https://huggingface.co/urchade/gliner_medium-v2.1",
@@ -161,6 +168,8 @@ const SOURCE_VINTAGES = {
   prithvi_water: "Prithvi-EO 2.0 satellite segmentation, scenes 2021-08-25 (pre) & 2021-09-02 (post Ida)",
   prithvi_live: "live Sentinel-2 L2A scene from Microsoft Planetary Computer (acquisition timestamp in payload)",
   terramind_synthetic: "synthetic prior — TerraMind 1.0 base generated a plausible categorical land-cover map from the LiDAR terrain at this point (deterministic seed, 10 diffusion steps; class fractions cite-able; not a measurement)",
+  tm_lulc: "live empirical observation — TerraMind-NYC LULC LoRA (msradam/TerraMind-NYC-Adapters, fine-tuned on NYC chips on AMD MI300X) over the per-query Sentinel-2/1/DEM chip; 5-class macro land cover with class fractions cite-able",
+  tm_buildings: "live empirical observation — TerraMind-NYC Buildings LoRA (msradam/TerraMind-NYC-Adapters, fine-tuned on NYC chips on AMD MI300X) over the per-query Sentinel-2/1/DEM chip; binary building-footprint mask + connected-component count",
   gliner_comptroller: "GLiNER typed extraction over the Comptroller PDF (per-paragraph)",
   gliner_dep_2013: "GLiNER typed extraction over the DEP wastewater plan",
   gliner_nycha: "GLiNER typed extraction over the NYCHA Lessons Learned PDF",