seriffic Claude Opus 4.7 (1M context) committed on
Commit
400a77a
·
1 Parent(s): 599cc5c

Stones C4: wire TerraMind-NYC LULC + Buildings into FSM


Adds three new Burr actions in app/fsm.py (gated behind
RIPRAP_HEAVY_SPECIALISTS):

step_eo_chip             -> writes eo_chip (S2L2A + S1RTC + DEM)
step_terramind_lulc      -> writes terramind_lulc
step_terramind_buildings -> writes terramind_buildings
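The three actions above are gated behind the RIPRAP_HEAVY_SPECIALISTS flag. The flag's exact parsing lives in app/fsm.py and isn't shown in this diff; a minimal sketch of what such a gate might look like, assuming it follows the same truthy-string convention as the RIPRAP_EO_CHIP_ENABLE flag in this commit:

```python
import os

def heavy_specialists_enabled() -> bool:
    """Hypothetical sketch of the RIPRAP_HEAVY_SPECIALISTS gate.

    Assumes the same truthy values as RIPRAP_EO_CHIP_ENABLE in this
    commit ("1", "true", "yes"); the real parsing may differ.
    """
    return (os.environ.get("RIPRAP_HEAVY_SPECIALISTS", "0").lower()
            in ("1", "true", "yes"))

os.environ["RIPRAP_HEAVY_SPECIALISTS"] = "true"
print(heavy_specialists_enabled())  # → True
```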

eo_chip is a per-query cache: one ~150 MB Sentinel-2/1/DEM fetch from
Microsoft Planetary Computer, shared by both TerraMind specialists so
they don't each refetch. The cache lives in app/context/eo_chip_cache.py
and gracefully no-ops when terratorch / planetary_computer / pystac_client
/ rioxarray aren't installed.

Both TerraMind specialists call into app/context/terramind_nyc.py (added
in C3), which routes inference through terratorch tiled_inference at
224x224 and ships the per-class fraction summary the reconciler quotes.
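app/context/terramind_nyc.py itself landed in C3 and isn't part of this diff, so here is a small sketch of how a per-class fraction summary can be derived from a segmentation's argmax class-index mask; the function name and the five class labels are illustrative, not the C3 API:

```python
import numpy as np

def class_fractions(mask: np.ndarray, labels: list[str]) -> dict[str, float]:
    """Per-class area fractions (percent) from an argmax class-index mask.

    Illustrative only — terramind_nyc.py's real summary code is in C3.
    Classes absent from the chip are omitted from the summary.
    """
    counts = np.bincount(mask.ravel(), minlength=len(labels))
    total = mask.size
    return {labels[i]: round(100.0 * counts[i] / total, 2)
            for i in range(len(labels)) if counts[i]}

# Toy 2x3 mask with hypothetical macro-class labels.
mask = np.array([[0, 0, 1], [1, 1, 4]])
print(class_fractions(mask, ["built", "vegetation", "water", "bare", "other"]))
# → {'built': 33.33, 'vegetation': 50.0, 'other': 16.67}
```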

Reconciler:
- app/reconcile.py:build_documents() now emits tm_lulc (Touchstone,
after prithvi_live) and tm_buildings (Keystone, after the legacy
terramind_synthetic prior).
- trim_docs_to_plan PREFIXES_BY_SPECIALIST now maps the planner names
terramind_lulc / terramind_buildings to their tm_* doc-id prefixes.
- step_reconcile reads the new state keys and threads them into the
snap dict.
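The prefix-trimming idea can be sketched in isolation: each planner-visible specialist owns a tuple of doc-id prefixes, and trimming keeps only docs claimed by a planned specialist. This is a simplified stand-in (the real trim_docs_to_plan in app/reconcile.py operates on message dicts, not bare ids):

```python
# Subset of the real PREFIXES_BY_SPECIALIST map from this commit.
PREFIXES_BY_SPECIALIST = {
    "terramind": ("terramind", "syn_"),
    "terramind_lulc": ("tm_lulc",),
    "terramind_buildings": ("tm_buildings",),
    "rag": ("rag_",),
}

def trim_docs_to_plan(doc_ids: list[str], planned: list[str]) -> list[str]:
    """Keep docs whose id starts with a prefix owned by a planned
    specialist. Simplified: real code trims message dicts, not ids."""
    prefixes = tuple(p for s in planned
                     for p in PREFIXES_BY_SPECIALIST.get(s, ()))
    # str.startswith(()) is False, so an empty plan keeps nothing.
    return [d for d in doc_ids if d.startswith(prefixes)]

docs = ["tm_lulc", "tm_buildings", "terramind_synthetic", "rag_0"]
print(trim_docs_to_plan(docs, ["terramind_lulc"]))  # → ['tm_lulc']
```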

Frontend:
- web/static/agent.js STEP_LABELS, SOURCE_LABELS, SOURCE_URLS, and
SOURCE_VINTAGES gain entries for eo_chip_fetch / terramind_lulc /
terramind_buildings / tm_lulc / tm_buildings.

Doc-order snapshot stays disjoint (no doc_id collisions) and the new
emissions land in the expected Stone group:
Keystone: ... terramind_synthetic, tm_buildings
Touchstone: ... prithvi_live, tm_lulc
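The disjointness claim — no doc_id collisions — reduces to checking that no two specialists own prefixes that could claim the same doc id. A quick check over the prefix map (subset shown; pairwise startswith in either direction catches nested prefixes):

```python
from itertools import combinations

# Subset of the commit's PREFIXES_BY_SPECIALIST map.
PREFIXES_BY_SPECIALIST = {
    "terramind": ("terramind", "syn_"),
    "terramind_lulc": ("tm_lulc",),
    "terramind_buildings": ("tm_buildings",),
    "rag": ("rag_",),
}

def overlapping(prefix_map: dict[str, tuple[str, ...]]) -> list[tuple[str, str]]:
    """Pairs of specialists whose prefixes could claim the same doc id."""
    clashes = []
    for (a, pa), (b, pb) in combinations(prefix_map.items(), 2):
        if any(x.startswith(y) or y.startswith(x) for x in pa for y in pb):
            clashes.append((a, b))
    return clashes

print(overlapping(PREFIXES_BY_SPECIALIST))  # → []
```

With tm_lulc and tm_buildings sharing only the tm_ stem (neither is a prefix of the other), the new specialists cannot steal each other's docs.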

Tests for app/stones and app/context/terramind_nyc pass unchanged
(adapter weights are not loaded — gating + result-shape coverage only).
End-to-end probe-mellea validation runs once the local server is up
with the heavy-specialist deps installed; that's a separate runbook
step the user owns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Files changed (4)
  1. app/context/eo_chip_cache.py +293 -0
  2. app/fsm.py +140 -0
  3. app/reconcile.py +46 -0
  4. web/static/agent.js +9 -0
app/context/eo_chip_cache.py ADDED
@@ -0,0 +1,293 @@
+"""Per-query EO chip cache — Sentinel-2 L2A, Sentinel-1 RTC, DEM.
+
+Fetches a co-registered (S2L2A, S1RTC, DEM) chip centered on (lat, lon)
+and returns a dict of torch tensors ready for TerraMind-NYC inference.
+The TerraMind base was trained with `temporal_n_timestamps=4`, so this
+helper expands a single S2/S1 acquisition to T=4 by repetition along
+the temporal axis. Single-timestep nowcasting trades some training-
+distribution match for a much simpler runtime — the published LoRA
+adapters still produce sensible argmax masks at T=1 / tiled.
+
+Failure semantics mirror prithvi_live: every dependency or network
+failure is converted to a clean `{ok: False, skipped: <reason>}`
+result, never a raised exception. Callers (FSM specialists) that
+chain off the chip can short-circuit on `ok=False` and skip the
+specialist instead of surfacing a noisy error.
+"""
+from __future__ import annotations
+
+import logging
+import os
+import threading
+import time
+from typing import Any
+
+log = logging.getLogger("riprap.eo_chip_cache")
+
+ENABLE = os.environ.get("RIPRAP_EO_CHIP_ENABLE", "1").lower() in ("1", "true", "yes")
+SEARCH_DAYS = int(os.environ.get("RIPRAP_EO_CHIP_SEARCH_DAYS", "120"))
+MAX_CLOUD_PCT = float(os.environ.get("RIPRAP_EO_CHIP_MAX_CLOUD", "30"))
+CHIP_PX = int(os.environ.get("RIPRAP_EO_CHIP_PX", "224"))
+PIXEL_M = 10
+N_TIMESTEPS = 4
+
+# 12-band S2 L2A in TerraMind's expected order.
+S2_BANDS = ["B01", "B02", "B03", "B04", "B05", "B06", "B07",
+            "B08", "B8A", "B09", "B11", "B12"]
+
+# Sentinel-1 RTC on Planetary Computer publishes vv/vh polarisations.
+S1_BANDS = ["vv", "vh"]
+
+
+def _has_required_deps() -> tuple[bool, str | None]:
+    missing: list[str] = []
+    # pyproj backs the CRS transform in _fetch_modalities.
+    for name in ("planetary_computer", "pystac_client", "pyproj",
+                 "rioxarray", "xarray", "torch", "numpy"):
+        try:
+            __import__(name)
+        except ImportError:
+            missing.append(name)
+    if missing:
+        return False, ", ".join(missing)
+    return True, None
+
+
+_DEPS_OK, _DEPS_MISSING = _has_required_deps()
+_FETCH_LOCK = threading.Lock()
+
+
+def _search_s2(lat: float, lon: float):
+    """Return (item, cloud_cover) for the most recent low-cloud S2L2A
+    acquisition near (lat, lon), or (None, None) if no scene exists."""
+    import datetime as dt
+
+    import planetary_computer as pc
+    from pystac_client import Client
+    end = dt.datetime.utcnow().date()
+    start = end - dt.timedelta(days=SEARCH_DAYS)
+    client = Client.open(
+        "https://planetarycomputer.microsoft.com/api/stac/v1",
+        modifier=pc.sign_inplace,
+    )
+    delta = 0.02
+    search = client.search(
+        collections=["sentinel-2-l2a"],
+        bbox=[lon - delta, lat - delta, lon + delta, lat + delta],
+        datetime=f"{start}/{end}",
+        query={"eo:cloud_cover": {"lt": MAX_CLOUD_PCT}},
+        max_items=20,
+    )
+    items = sorted(
+        search.items(),
+        key=lambda it: (it.properties.get("eo:cloud_cover", 100),
+                        -(it.datetime.timestamp() if it.datetime else 0)),
+    )
+    if not items:
+        return None, None
+    item = items[0]
+    cc = float(item.properties.get("eo:cloud_cover", -1))
+    return item, cc
+
+
+def _search_s1(item_dt, lat: float, lon: float):
+    """Return the closest Sentinel-1 RTC acquisition to the given S2
+    datetime, or None if Planetary Computer has nothing nearby."""
+    import datetime as dt
+
+    import planetary_computer as pc
+    from pystac_client import Client
+    win = dt.timedelta(days=10)
+    start = item_dt - win
+    end = item_dt + win
+    client = Client.open(
+        "https://planetarycomputer.microsoft.com/api/stac/v1",
+        modifier=pc.sign_inplace,
+    )
+    delta = 0.02
+    search = client.search(
+        collections=["sentinel-1-rtc"],
+        bbox=[lon - delta, lat - delta, lon + delta, lat + delta],
+        datetime=f"{start.isoformat()}/{end.isoformat()}",
+        max_items=10,
+    )
+    items = list(search.items())
+    if not items:
+        return None
+    items.sort(key=lambda it:
+               abs((it.datetime - item_dt).total_seconds())
+               if it.datetime else 1e18)
+    return items[0]
+
+
+def _read_band(href, bbox_xy_meters, epsg):
+    """Read a single COG band, clipped to the bbox, and resampled to
+    CHIP_PX × CHIP_PX. Returns a numpy array (CHIP_PX, CHIP_PX) float32.
+    """
+    import numpy as np
+    import rioxarray
+    da = rioxarray.open_rasterio(href, masked=False).squeeze(drop=True)
+    da = da.rio.clip_box(minx=bbox_xy_meters[0], miny=bbox_xy_meters[1],
+                         maxx=bbox_xy_meters[2], maxy=bbox_xy_meters[3])
+    if da.shape[-2] != CHIP_PX or da.shape[-1] != CHIP_PX:
+        # Resample (nearest is fine for the 10/20/60 m S2 mix; S1 is 10 m,
+        # DEM is 30 m and benefits from bilinear; we keep nearest for
+        # simplicity — the TerraMind LoRA was trained against terratorch's
+        # default resampler, which is also nearest).
+        da = da.rio.reproject(
+            f"EPSG:{epsg}", shape=(CHIP_PX, CHIP_PX), resampling=0
+        )
+    arr = da.values.astype("float32")
+    return np.nan_to_num(arr)
+
+
+def _fetch_modalities(lat: float, lon: float, timeout_s: float = 60.0) -> dict[str, Any]:
+    """Fetch S2L2A + S1RTC + DEM as numpy arrays, resampled to a common
+    CHIP_PX × CHIP_PX grid centered on (lat, lon).
+    """
+    import numpy as np
+    from pyproj import Transformer
+
+    t0 = time.time()
+    item, cc = _search_s2(lat, lon)
+    if item is None:
+        return {"ok": False,
+                "skipped": f"no <{MAX_CLOUD_PCT}% cloud S2 in last "
+                           f"{SEARCH_DAYS}d"}
+    if "proj:epsg" in item.properties:
+        epsg = int(item.properties["proj:epsg"])
+    else:
+        code = item.properties.get("proj:code", "")
+        if not code.startswith("EPSG:"):
+            return {"ok": False,
+                    "skipped": "STAC item missing proj:epsg / proj:code"}
+        epsg = int(code.split(":", 1)[1])
+
+    fwd = Transformer.from_crs("EPSG:4326", f"EPSG:{epsg}", always_xy=True)
+    cx, cy = fwd.transform(lon, lat)
+    half_m = CHIP_PX / 2 * PIXEL_M
+    bbox = (cx - half_m, cy - half_m, cx + half_m, cy + half_m)
+
+    if time.time() - t0 > timeout_s:
+        return {"ok": False, "skipped": "STAC search exceeded budget"}
+
+    # ---- S2L2A: 12 bands ------------------------------------------------
+    s2_arrs = []
+    try:
+        for b in S2_BANDS:
+            href = item.assets[b].href
+            s2_arrs.append(_read_band(href, bbox, epsg))
+    except Exception as e:
+        log.warning("eo_chip: S2 band fetch failed (%s); aborting", e)
+        return {"ok": False, "err": f"S2 fetch failed: {type(e).__name__}: {e}"}
+    s2 = np.stack(s2_arrs)  # (12, H, W)
+    if s2.mean() > 1.0:
+        s2 = s2 / 10000.0  # scale L2A reflectance from int16 to ~[0, 1]
+
+    # ---- S1RTC: 2 polarisations (best effort) ---------------------------
+    s1: np.ndarray | None = None
+    s1_meta: dict[str, Any] = {}
+    if time.time() - t0 < timeout_s:
+        try:
+            s1_item = _search_s1(item.datetime, lat, lon)
+            if s1_item is not None:
+                s1_arrs = []
+                for b in S1_BANDS:
+                    href = s1_item.assets[b].href
+                    s1_arrs.append(_read_band(href, bbox, epsg))
+                s1 = np.stack(s1_arrs)
+                s1_meta = {
+                    "scene_id": s1_item.id,
+                    "datetime": (s1_item.datetime.isoformat()
+                                 if s1_item.datetime else None),
+                }
+        except Exception as e:
+            log.warning("eo_chip: S1 fetch best-effort failed: %s", e)
+
+    # ---- DEM: Copernicus 30 m via planetary_computer (best effort) ------
+    dem: np.ndarray | None = None
+    if time.time() - t0 < timeout_s:
+        try:
+            import planetary_computer as pc
+            from pystac_client import Client
+            client = Client.open(
+                "https://planetarycomputer.microsoft.com/api/stac/v1",
+                modifier=pc.sign_inplace,
+            )
+            dem_search = client.search(
+                collections=["cop-dem-glo-30"],
+                bbox=[lon - 0.02, lat - 0.02, lon + 0.02, lat + 0.02],
+                max_items=1,
+            )
+            dem_items = list(dem_search.items())
+            if dem_items:
+                href = dem_items[0].assets["data"].href
+                dem = _read_band(href, bbox, epsg)
+                dem = dem[None, :, :]  # add channel dim
+        except Exception as e:
+            log.warning("eo_chip: DEM fetch best-effort failed: %s", e)
+
+    return {
+        "ok": True,
+        "lat": lat, "lon": lon,
+        "epsg": epsg, "chip_px": CHIP_PX, "pixel_m": PIXEL_M,
+        "s2": s2, "s1": s1, "dem": dem,
+        "s2_meta": {
+            "scene_id": item.id,
+            "datetime": (item.datetime.isoformat() if item.datetime else None),
+            "cloud_cover": cc,
+        },
+        "s1_meta": s1_meta,
+        "elapsed_s": round(time.time() - t0, 2),
+    }
+
+
+def _to_terramind_tensors(modalities: dict[str, Any]) -> dict[str, Any]:
+    """Shape numpy modality arrays into the (B, C, T, H, W) tensors
+    TerraMind expects with `temporal_n_timestamps=4`. Single-timestep
+    fetches get tiled to T=4 — same observation in every slot.
+    """
+    import torch
+    s2 = modalities["s2"]  # (12, H, W)
+    s2_t = torch.from_numpy(s2).float().unsqueeze(1)       # (12, 1, H, W)
+    s2_t = s2_t.repeat(1, N_TIMESTEPS, 1, 1).unsqueeze(0)  # (1, 12, T, H, W)
+    chips = {"S2L2A": s2_t}
+    if modalities.get("s1") is not None:
+        s1 = modalities["s1"]  # (2, H, W)
+        s1_t = torch.from_numpy(s1).float().unsqueeze(1)
+        s1_t = s1_t.repeat(1, N_TIMESTEPS, 1, 1).unsqueeze(0)
+        chips["S1RTC"] = s1_t
+    if modalities.get("dem") is not None:
+        dem = modalities["dem"]  # (1, H, W)
+        dem_t = torch.from_numpy(dem).float().unsqueeze(1)
+        dem_t = dem_t.repeat(1, N_TIMESTEPS, 1, 1).unsqueeze(0)
+        chips["DEM"] = dem_t
+    return chips
+
+
+def fetch(lat: float, lon: float, timeout_s: float = 60.0) -> dict[str, Any]:
+    """Run the chip pipeline. Always returns a dict with at minimum
+    `{ok, skipped|err, ...}`; on success the dict carries the
+    co-registered numpy arrays plus `tensors` (the TerraMind-shaped
+    torch dict).
+    """
+    if not ENABLE:
+        return {"ok": False, "skipped": "RIPRAP_EO_CHIP_ENABLE=0"}
+    if not _DEPS_OK:
+        return {"ok": False,
+                "skipped": f"deps unavailable on this deployment: "
+                           f"{_DEPS_MISSING}"}
+    with _FETCH_LOCK:
+        try:
+            modalities = _fetch_modalities(lat, lon, timeout_s=timeout_s)
+        except Exception as e:
+            log.exception("eo_chip: fetch failed")
+            return {"ok": False, "err": f"{type(e).__name__}: {e}"}
+        if not modalities.get("ok"):
+            return modalities
+        try:
+            modalities["tensors"] = _to_terramind_tensors(modalities)
+        except Exception as e:
+            log.exception("eo_chip: tensor build failed")
+            return {"ok": False,
+                    "err": f"tensor build failed: {type(e).__name__}: {e}"}
+        return modalities
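The `_to_terramind_tensors` shaping above is the subtle part of the cache: a single (C, H, W) acquisition becomes (1, C, T, H, W) with the same observation in all T=4 slots. The same shaping shown standalone with numpy (standing in for torch so the shapes are easy to verify):

```python
import numpy as np

N_TIMESTEPS = 4  # matches TerraMind's temporal_n_timestamps

def tile_to_temporal(chip: np.ndarray) -> np.ndarray:
    """(C, H, W) single acquisition -> (1, C, T, H, W), repeating the
    one observation in every temporal slot, as _to_terramind_tensors
    does with torch.repeat. numpy stands in for torch here."""
    expanded = chip[:, None, :, :]                    # (C, 1, H, W)
    tiled = np.repeat(expanded, N_TIMESTEPS, axis=1)  # (C, T, H, W)
    return tiled[None, ...]                           # (1, C, T, H, W)

s2 = np.zeros((12, 224, 224), dtype=np.float32)
print(tile_to_temporal(s2).shape)  # → (1, 12, 4, 224, 224)
```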
app/fsm.py CHANGED
@@ -682,6 +682,131 @@ def step_microtopo(state: State) -> State:
 
 
 
+@action(reads=["lat", "lon"], writes=["eo_chip", "trace"])
+def step_eo_chip(state: State) -> State:
+    """Fetch one S2L2A + S1RTC + DEM chip per query and stash it in
+    state for the TerraMind-NYC specialists.
+
+    Centralised so step_terramind_lulc and step_terramind_buildings
+    don't each re-fetch ~150 MB of imagery. Best-effort by design —
+    a deps-missing or no-scene outcome writes `{ok: False, skipped: ...}`
+    and the downstream TerraMind specialists silently no-op."""
+    rec, trace = _step(state, "eo_chip_fetch")
+    try:
+        if state.get("lat") is None:
+            rec["ok"] = False; rec["err"] = "no coords"
+            return state.update(eo_chip=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(eo_chip=None, trace=trace)
+        from app.context import eo_chip_cache
+        chip = eo_chip_cache.fetch(state["lat"], state["lon"])
+        rec["ok"] = bool(chip.get("ok"))
+        if not rec["ok"]:
+            rec["err"] = chip.get("skipped") or chip.get("err") or "unavailable"
+        else:
+            rec["result"] = {
+                "scene_id": (chip.get("s2_meta") or {}).get("scene_id"),
+                "scene_date": ((chip.get("s2_meta") or {}).get("datetime") or "")[:10],
+                "cloud_cover": (chip.get("s2_meta") or {}).get("cloud_cover"),
+                "has_s1": chip.get("s1") is not None,
+                "has_dem": chip.get("dem") is not None,
+            }
+        return state.update(eo_chip=chip, trace=trace)
+    except Exception as e:
+        rec["ok"] = False; rec["err"] = str(e)
+        log.exception("eo_chip failed")
+        return state.update(eo_chip=None, trace=trace)
+    finally:
+        rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
+
+
+@action(reads=["lat", "lon", "eo_chip"], writes=["terramind_lulc", "trace"])
+def step_terramind_lulc(state: State) -> State:
+    """5-class macro NYC LULC via msradam/TerraMind-NYC-Adapters.
+
+    Consumes the shared chip from step_eo_chip; if that didn't fire
+    cleanly this no-ops. Adapter loading (~1.6 GB base + ~325 MB LoRA)
+    is lazy on first call and cached across queries."""
+    rec, trace = _step(state, "terramind_lulc")
+    try:
+        if state.get("lat") is None:
+            rec["ok"] = False; rec["err"] = "no coords"
+            return state.update(terramind_lulc=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(terramind_lulc=None, trace=trace)
+        chip = state.get("eo_chip") or {}
+        if not chip.get("ok"):
+            rec["ok"] = False
+            rec["err"] = chip.get("skipped") or chip.get("err") or "no chip"
+            return state.update(terramind_lulc=None, trace=trace)
+        from app.context import terramind_nyc
+        tensors = chip.get("tensors") or {}
+        out = terramind_nyc.lulc(
+            tensors.get("S2L2A"),
+            s1rtc=tensors.get("S1RTC"),
+            dem=tensors.get("DEM"),
+        )
+        rec["ok"] = bool(out.get("ok"))
+        if not rec["ok"]:
+            rec["err"] = out.get("skipped") or out.get("err") or "unavailable"
+        else:
+            rec["result"] = {
+                "dominant_class": out.get("dominant_class"),
+                "dominant_pct": out.get("dominant_pct"),
+                "n_classes_observed": len(out.get("class_fractions") or {}),
+            }
+        return state.update(terramind_lulc=out, trace=trace)
+    except Exception as e:
+        rec["ok"] = False; rec["err"] = str(e)
+        log.exception("terramind_lulc failed")
+        return state.update(terramind_lulc=None, trace=trace)
+    finally:
+        rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
+
+
+@action(reads=["lat", "lon", "eo_chip"],
+        writes=["terramind_buildings", "trace"])
+def step_terramind_buildings(state: State) -> State:
+    """Binary NYC building-footprint mask via msradam/TerraMind-NYC-Adapters."""
+    rec, trace = _step(state, "terramind_buildings")
+    try:
+        if state.get("lat") is None:
+            rec["ok"] = False; rec["err"] = "no coords"
+            return state.update(terramind_buildings=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(terramind_buildings=None, trace=trace)
+        chip = state.get("eo_chip") or {}
+        if not chip.get("ok"):
+            rec["ok"] = False
+            rec["err"] = chip.get("skipped") or chip.get("err") or "no chip"
+            return state.update(terramind_buildings=None, trace=trace)
+        from app.context import terramind_nyc
+        tensors = chip.get("tensors") or {}
+        out = terramind_nyc.buildings(
+            tensors.get("S2L2A"),
+            s1rtc=tensors.get("S1RTC"),
+            dem=tensors.get("DEM"),
+        )
+        rec["ok"] = bool(out.get("ok"))
+        if not rec["ok"]:
+            rec["err"] = out.get("skipped") or out.get("err") or "unavailable"
+        else:
+            rec["result"] = {
+                "pct_buildings": out.get("pct_buildings"),
+                "n_building_components": out.get("n_building_components"),
+            }
+        return state.update(terramind_buildings=out, trace=trace)
+    except Exception as e:
+        rec["ok"] = False; rec["err"] = str(e)
+        log.exception("terramind_buildings failed")
+        return state.update(terramind_buildings=None, trace=trace)
+    finally:
+        rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
+
+
 @action(reads=["geocode", "sandy", "dep", "floodnet", "nyc311", "microtopo",
                "ida_hwm", "prithvi_water", "noaa_tides", "nws_alerts", "nws_obs",
                "ttm_forecast"],
@@ -766,6 +891,7 @@ def _label_counts(gliner_out: dict[str, dict]) -> dict[str, int]:
 
 @action(reads=["geocode", "sandy", "dep", "floodnet", "nyc311", "microtopo",
                "ida_hwm", "prithvi_water", "prithvi_live", "terramind",
+               "terramind_lulc", "terramind_buildings",
                "noaa_tides", "nws_alerts", "nws_obs", "ttm_forecast",
                "ttm_311_forecast", "floodnet_forecast", "mta_entrances",
                "nycha_developments", "doe_schools", "doh_hospitals",
@@ -795,6 +921,8 @@ def step_reconcile(state: State) -> State:
         "gliner": state.get("gliner"),
         "prithvi_live": state.get("prithvi_live"),
         "terramind": state.get("terramind"),
+        "terramind_lulc": state.get("terramind_lulc"),
+        "terramind_buildings": state.get("terramind_buildings"),
         "mta_entrances": state.get("mta_entrances"),
         "nycha_developments": state.get("nycha_developments"),
         "doe_schools": state.get("doe_schools"),
@@ -910,6 +1038,13 @@ def build_app(query: str):
    actions["doh_hospitals"] = step_doh_hospitals
    actions["prithvi_live"] = step_prithvi_live
    actions["terramind"] = step_terramind
+    # New TerraMind-NYC LoRA family — one chip fetch feeds two
+    # specialists. Keep eo_chip directly before the two consumers
+    # so the chip stays warm in memory and isn't garbage-collected
+    # by anything in between.
+    actions["eo_chip"] = step_eo_chip
+    actions["terramind_lulc"] = step_terramind_lulc
+    actions["terramind_buildings"] = step_terramind_buildings
    actions["rag"] = step_rag
    actions["gliner"] = step_gliner
    actions["reconcile"] = step_reconcile
@@ -947,6 +1082,9 @@ def run(query: str) -> dict[str, Any]:
         "ida_hwm": final_state.get("ida_hwm"),
         "prithvi_water": final_state.get("prithvi_water"),
         "terramind": final_state.get("terramind"),
+        "terramind_lulc": final_state.get("terramind_lulc"),
+        "terramind_buildings": final_state.get("terramind_buildings"),
+        "eo_chip": final_state.get("eo_chip"),
         "noaa_tides": final_state.get("noaa_tides"),
         "nws_alerts": final_state.get("nws_alerts"),
         "nws_obs": final_state.get("nws_obs"),
@@ -1068,6 +1206,8 @@ def iter_steps(query: str):
         "prithvi_water": state.get("prithvi_water"),
         "prithvi_live": state.get("prithvi_live"),
         "terramind": state.get("terramind"),
+        "terramind_lulc": state.get("terramind_lulc"),
+        "terramind_buildings": state.get("terramind_buildings"),
         "noaa_tides": state.get("noaa_tides"),
         "nws_alerts": state.get("nws_alerts"),
         "nws_obs": state.get("nws_obs"),
app/reconcile.py CHANGED
@@ -271,6 +271,8 @@ def trim_docs_to_plan(doc_msgs: list[dict],
     "ttm_311_forecast": ("ttm_311_forecast",),
     "floodnet_forecast": ("floodnet_forecast",),
     "terramind": ("terramind", "syn_"),
+    "terramind_lulc": ("tm_lulc",),
+    "terramind_buildings": ("tm_buildings",),
     "rag": ("rag_",),
     "rag_mta": ("rag_",),
     "nta_resolve": ("nta_resolve", "nta_"),
@@ -709,6 +711,32 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
     ])
     docs.append(_doc_message("terramind_synthetic", body))
 
+    # TerraMind-NYC Buildings adapter (msradam/TerraMind-NYC-Adapters,
+    # Apache-2.0, fine-tuned on NYC building footprints on AMD MI300X).
+    # Distinct from the synthetic-prior block above — this is a real
+    # segmentation against the per-query Sentinel-2/1/DEM chip and
+    # reports an empirical building-footprint area fraction.
+    tmb = state.get("terramind_buildings")
+    if not out_of_nyc and tmb and tmb.get("ok"):
+        body = [
+            "Source: msradam/TerraMind-NYC-Adapters (Apache-2.0) — NYC "
+            "Buildings LoRA on TerraMind 1.0 base, fine-tuned on AMD "
+            "Instinct MI300X. Test mIoU 0.5511 on held-out NYC chips.",
+            f"Adapter: {tmb.get('adapter')}.",
+            f"Predicted building-footprint coverage in chip: "
+            f"{tmb.get('pct_buildings')}%.",
+        ]
+        if tmb.get("n_building_components") is not None:
+            body.append(
+                f"Distinct building connected components: "
+                f"{tmb.get('n_building_components')}."
+            )
+        body.append(
+            "Class labels: " + ", ".join(tmb.get("class_labels") or [])
+            + "."
+        )
+        docs.append(_doc_message("tm_buildings", body))
+
     # ---- Touchstone — The Live Observer --------------------------------
     # Live sensors and per-query EO that change minute to minute:
     # FloodNet ultrasonic depth, NYC 311 flood complaints, NWS hourly
@@ -824,6 +852,24 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         ]
         docs.append(_doc_message("prithvi_live", body))
 
+    # TerraMind-NYC LULC adapter — current 5-class macro land-cover from
+    # the per-query Sentinel-2/1/DEM chip. Empirical observation, not the
+    # synthetic prior emitted by the legacy `terramind_synthetic` doc.
+    tml = state.get("terramind_lulc")
+    if not out_of_nyc and tml and tml.get("ok"):
+        body = [
+            "Source: msradam/TerraMind-NYC-Adapters (Apache-2.0) — NYC "
+            "LULC LoRA on TerraMind 1.0 base, fine-tuned on AMD "
+            "Instinct MI300X. Test mIoU 0.5866 on held-out NYC chips.",
+            f"Adapter: {tml.get('adapter')}.",
+            f"Dominant land-cover class in chip: "
+            f"{tml.get('dominant_class')} at {tml.get('dominant_pct')}%.",
+            "Per-class fractions:",
+        ]
+        for label, pct in (tml.get("class_fractions") or {}).items():
+            body.append(f"  - {label}: {pct}%")
+        docs.append(_doc_message("tm_lulc", body))
+
     # ---- Lodestone — The Projector -------------------------------------
     # Forward-looking signals: NWS public flood alerts, Granite TTM r2
     # zero-shot Battery surge residual, per-address NYC 311 weekly rate,
web/static/agent.js CHANGED
@@ -25,6 +25,9 @@ const STEP_LABELS = {
   prithvi_eo_v2: ["Prithvi-EO 2.0 (NASA/IBM)", "Sen1Floods11 satellite segmentation"],
   prithvi_eo_live: ["Prithvi-EO 2.0 — live segmentation", "fresh Sentinel-2 water mask at this address"],
   terramind_synthesis: ["TerraMind 1.0 base — synthetic LULC", "DEM → ESRI Land Cover, any-to-any generative synthesis (IBM/ESA)"],
+  eo_chip_fetch: ["EO chip fetch (S2L2A + S1RTC + DEM)", "single-chip cache for the TerraMind-NYC LoRA family"],
+  terramind_lulc: ["TerraMind-NYC — LULC (live)", "5-class macro land-cover LoRA (msradam/TerraMind-NYC-Adapters)"],
+  terramind_buildings: ["TerraMind-NYC — Buildings (live)", "binary building-footprint LoRA (msradam/TerraMind-NYC-Adapters)"],
   rag_granite_embedding: ["Granite Embedding 278M (RAG)", "policy corpus retrieval (+ Granite Reranker R2 if enabled)"],
   gliner_extract: ["GLiNER typed extraction", "agencies, dollar amounts, projects, locations"],
   reconcile_granite41: ["Granite 4.1 reconcile (local)", "document-grounded synthesis"],
@@ -66,6 +69,8 @@ const SOURCE_LABELS = {
   prithvi_water: "Prithvi-EO 2.0 — Hurricane Ida 2021 polygons",
   prithvi_live: "Prithvi-EO 2.0 — live Sentinel-2 water segmentation",
   terramind_synthetic: "TerraMind 1.0 base — synthetic LULC (DEM→ESRI Land Cover)",
+  tm_lulc: "TerraMind-NYC LULC LoRA (msradam/TerraMind-NYC-Adapters)",
+  tm_buildings: "TerraMind-NYC Buildings LoRA (msradam/TerraMind-NYC-Adapters)",
   gliner_comptroller: "GLiNER over Comptroller report",
   gliner_dep_2013: "GLiNER over DEP wastewater plan",
   gliner_nycha: "GLiNER over NYCHA Lessons Learned",
@@ -115,6 +120,8 @@ const SOURCE_URLS = {
   prithvi_water: "https://huggingface.co/ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11",
   prithvi_live: "https://huggingface.co/ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11",
   terramind_synthetic: "https://huggingface.co/ibm-esa-geospatial/TerraMind-1.0-base",
+  tm_lulc: "https://huggingface.co/msradam/TerraMind-NYC-Adapters",
+  tm_buildings: "https://huggingface.co/msradam/TerraMind-NYC-Adapters",
   gliner_comptroller: "https://huggingface.co/urchade/gliner_medium-v2.1",
   gliner_dep_2013: "https://huggingface.co/urchade/gliner_medium-v2.1",
   gliner_nycha: "https://huggingface.co/urchade/gliner_medium-v2.1",
@@ -161,6 +168,8 @@ const SOURCE_VINTAGES = {
   prithvi_water: "Prithvi-EO 2.0 satellite segmentation, scenes 2021-08-25 (pre) & 2021-09-02 (post Ida)",
   prithvi_live: "live Sentinel-2 L2A scene from Microsoft Planetary Computer (acquisition timestamp in payload)",
   terramind_synthetic: "synthetic prior — TerraMind 1.0 base generated a plausible categorical land-cover map from the LiDAR terrain at this point (deterministic seed, 10 diffusion steps; class fractions cite-able; not a measurement)",
+  tm_lulc: "live empirical observation — TerraMind-NYC LULC LoRA (msradam/TerraMind-NYC-Adapters, fine-tuned on NYC chips on AMD MI300X) over the per-query Sentinel-2/1/DEM chip; 5-class macro land cover with class fractions cite-able",
+  tm_buildings: "live empirical observation — TerraMind-NYC Buildings LoRA (msradam/TerraMind-NYC-Adapters, fine-tuned on NYC chips on AMD MI300X) over the per-query Sentinel-2/1/DEM chip; binary building-footprint mask + connected-component count",
   gliner_comptroller: "GLiNER typed extraction over the Comptroller PDF (per-paragraph)",
   gliner_dep_2013: "GLiNER typed extraction over the DEP wastewater plan",
   gliner_nycha: "GLiNER typed extraction over the NYCHA Lessons Learned PDF",