seriffic committed on
Commit 6a82282 · 1 parent: 9e70214

Backend evolution: Phases 1-10 specialists + agentic FSM + Mellea + LiteLLM router


Telescoped from ~50 spine commits across May 1-4 into one coherent
backend snapshot — what was built between v0.1's 9-specialist
sequential pass and the post-overhaul demo state.

Major additions:
- Planner-routed FSM with 5 intents (live_now, single_address,
neighborhood, development_check, compare). Granite 4.1:3b picks
the intent + targets; Granite 4.1:8b reconciles.
- Mellea-validated reconciliation (rejection sampling with 4
grounding checks: numerics_grounded, no_placeholder_tokens,
citations_dense sentence-scoped, citations_resolve).
- LiteLLM Router for backend abstraction — Ollama default, vLLM
on AMD MI300X via env vars, automatic fallback.
- Phase 1: Prithvi-EO live water segmentation (lazy import).
- Phase 2: GLiNER structured extraction.
- Phase 3: Granite Embedding Reranker R2.
- Phase 4: TerraMind synthetic SAR (parked, lazy import).
- Phase 5: TerraMind micro-finetune (AMD MI300X experiment).
- Phase 6: Refusal classifier (Guardian, later removed).
- Phases 7-10: Register specialists — MTA entrances, NYCHA,
NYC DOE schools, NYS DOH hospitals — with buffered footprint
overlap.
- Live: NOAA tides, NWS alerts/observations, FloodNet TTM forecast.
- Areas: NTA polygons for neighborhood-mode.
- Context: DOB permits scraper for development_check.
- Granite TimeSeries TTM r2 nowcast for surge.

Plus: tests/, experiments/ source + RESULTS, audit/, pitch/cold_open,
ARCHITECTURE.md / METHODOLOGY.md / MONDAY.md / CLAUDE.md docs,
pyproject.toml + requirements-experiments.txt + riprap.py CLI.

Granular spine history is preserved in /Users/amsrahman/riprap-nyc;
this commit is the demo-week consolidated landing.

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitignore +19 -0
  2. ARCHITECTURE.md +696 -0
  3. CLAUDE.md +546 -0
  4. METHODOLOGY.md +264 -0
  5. MONDAY.md +224 -0
  6. agent.py +1 -1
  7. app/areas/__init__.py +0 -0
  8. app/areas/nta.py +224 -0
  9. app/assets/mta_entrances.py +0 -1
  10. app/context/dob_permits.py +258 -0
  11. app/context/gliner_extract.py +125 -0
  12. app/context/microtopo.py +70 -4
  13. app/context/noaa_tides.py +110 -0
  14. app/context/nws_alerts.py +71 -0
  15. app/context/nws_obs.py +108 -0
  16. app/context/nyc311.py +80 -4
  17. app/context/terramind_synthesis.py +357 -0
  18. app/flood_layers/dep_stormwater.py +39 -1
  19. app/flood_layers/ida_hwm.py +13 -0
  20. app/flood_layers/prithvi_live.py +299 -0
  21. app/flood_layers/sandy_inundation.py +29 -1
  22. app/fsm.py +762 -69
  23. app/geocode.py +102 -10
  24. app/intents/__init__.py +3 -0
  25. app/intents/development_check.py +324 -0
  26. app/intents/live_now.py +231 -0
  27. app/intents/neighborhood.py +492 -0
  28. app/intents/single_address.py +73 -0
  29. app/live/__init__.py +0 -0
  30. app/live/floodnet_forecast.py +184 -0
  31. app/live/ttm_forecast.py +363 -0
  32. app/llm.py +296 -0
  33. app/mellea_validator.py +501 -0
  34. app/planner.py +250 -0
  35. app/rag.py +80 -9
  36. app/reconcile.py +749 -82
  37. app/register_builder.py +4 -3
  38. app/registers/__init__.py +0 -0
  39. app/registers/_footprint.py +84 -0
  40. app/registers/doe_schools.py +200 -0
  41. app/registers/doh_hospitals.py +194 -0
  42. app/registers/mta_entrances.py +244 -0
  43. app/registers/nycha.py +270 -0
  44. app/score.py +326 -28
  45. audit/2026-05-03-evening-audit.md +183 -0
  46. audit/2026-05-04-morning-handoff.md +165 -0
  47. data/hospitals.geojson +3 -0
  48. data/nyc_ntas_2020.geojson +3 -0
  49. experiments/00_endpoints/RESULTS.md +51 -0
  50. experiments/00_endpoints/_runner.py +51 -0
.gitignore CHANGED
@@ -10,7 +10,26 @@ node_modules/
 .ruff_cache/
 .pytest_cache/
 
+# Session-local Claude Code context (per-machine, not for the public repo)
+CLAUDE.local.md
+.claude/
+
 # legacy / intermediate Prithvi artifacts (not shipped)
 data/hls_stack_*.tif
 data/prithvi_runs/
 data/*.legacy_*
+web/svelte/node_modules/
+web/sveltekit/node_modules/
+web/sveltekit/.svelte-kit/
+
+# Experiments — cached HF model downloads, training artifacts, intermediate
+# fixtures. RESULTS.md, NOTES.md, and source code stay tracked.
+experiments/**/.cache/
+experiments/**/restore/
+experiments/**/publish/
+experiments/**/*.tif
+experiments/**/*.png
+experiments/**/*.jpg
+experiments/**/*.parquet
+experiments/**/*.npy
+pitch/screenshots-*/
ARCHITECTURE.md ADDED
@@ -0,0 +1,696 @@
# Riprap — Architecture

> **What it is.** A web tool that takes any NYC address and produces a
> short, citation-grounded **flood-exposure briefing** — a tier (1–4)
> with a paragraph of evidence, where every numeric claim links back to
> the specific dataset, agency report, or model output it came from.
>
> **Who it's for.** Urban planners, journalists on deadline, NYCEM
> grant writers filing FEMA BRIC sub-applications, agency capital
> planners, researchers under FOIL/IRB constraints — *not* consumers
> shopping for flood insurance.
>
> **Why local foundation models.** A newsroom with FOIL'd documents
> can't paste them into a vendor LLM. We run Granite 4.1 (3 B-param
> chat model), Granite Embedding 278M (RAG), Prithvi-EO 2.0 (300 M-param
> Earth-observation model, offline pre-compute) and Granite TimeSeries
> TTM r2 (1.5 M-param zero-shot forecaster) inside one container. No
> vendor LLM is contacted at runtime.

---

## 1. A 60-second primer on NYC flooding

Skip if you already know this. Most architecture docs assume you do —
this one doesn't.

### 1.1 Three kinds of flood

NYC gets hit by three flood mechanisms that look completely different
on a map and are caused by different physics:

- **Coastal / surge flooding** — The ocean rises into the city.
  Driven by storm surge (wind pushing water against the coast),
  astronomical high tide, and wave run-up. Affects the **shoreline:**
  Brighton Beach, Coney Island, Red Hook, Lower Manhattan, the
  Rockaways, Staten Island east shore. **Hurricane Sandy 2012** is
  the canonical event — water came over the seawall and flooded
  subway tunnels, hospitals, and electrical substations — buildings
  that were dry that morning.
- **Pluvial / stormwater flooding** — Rain falls faster than the
  drainage system can carry it away. Affects **inland low points,
  basement apartments, and chronically under-sewered neighborhoods**:
  Hollis (Queens), Carroll Gardens (Brooklyn), Jamaica. **Hurricane
  Ida 2021** is the canonical event for NYC — most of the deaths
  were in basement apartments far from any coast. Optical satellites
  largely *can't see* this kind of flooding because the water drains
  fast and is often sub-surface.
- **Compound flooding** — Coastal + pluvial happening at the same
  time, with groundwater rising too. Currently the active research
  frontier (NPCC4 Ch. 3 calls it out explicitly). Most agencies model
  these mechanisms separately; reality combines them.

A good civic flood tool has to cover all three and be honest about
what each signal can and cannot see. Riprap surfaces evidence for all
three but **doesn't predict damage** — see scope below.

### 1.2 Empirical vs modeled vs proxy

Each piece of flood evidence falls into one of three classes, and the
distinction matters for how much weight to give it:

- **Empirical** — Something flooded a place and was measured. USGS
  high-water marks (people went out after Hurricane Ida and surveyed
  where water reached on building walls). The 2012 Sandy Inundation
  Zone (mapped by the city after the storm). FloodNet ultrasonic
  sensors that recorded an actual depth. **Highest-confidence**: this
  flood happened here.
- **Modeled scenarios** — Hydraulic models simulate "what if" cases.
  FEMA's regulatory floodplains (1 % and 0.2 % annual chance). NYC
  DEP's Stormwater Maps (modeled water depth under three rainfall
  scenarios with varying sea-level-rise assumptions). **Useful but
  scenario-bounded**: this could happen here under those conditions.
- **Proxy signals** — Indirect indicators of flooding. NYC 311
  complaints ("street flooding", "sewer backup") clustering around an
  address. Topographic indices (HAND, TWI) suggesting water *would*
  pool here based on terrain. **Useful but biased**: 311 reflects
  civic engagement as well as flooding; terrain says nothing about
  drainage capacity.

Riprap surfaces all three classes. The score weights them in that
order (empirical > modeled > proxy), with empirical hits granted a
**floor rule** — see [§5](#5-the-scoring-rubric).

### 1.3 Hydrology indices used in this app

Two terrain-derived numbers come up repeatedly. They're cheap to
compute from a Digital Elevation Model (DEM) and they're the
hydrological literature's canonical exposure proxies:

- **HAND (Height Above Nearest Drainage)** — Vertical distance from
  the address up to the nearest river/drainage channel. **<1 m** = at
  drainage level (water *will* reach here in flood). **>10 m** =
  hillslope (very dry). Nobre et al. 2011.
- **TWI (Topographic Wetness Index)** — `ln(catchment_area / tan slope)`.
  **High TWI** = water tends to accumulate here (large
  contributing area, gentle slope). Beven & Kirkby 1979.

Neither is a flood prediction; both are exposure indicators that say
"water *would* pool here based on terrain alone."

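Both indices can be sketched directly from their definitions. This is a minimal illustration, not the app's actual DEM pipeline; the function names here are hypothetical:

```python
import math

def twi(catchment_area_m2: float, slope_rad: float) -> float:
    """Topographic Wetness Index (Beven & Kirkby 1979): ln(a / tan(slope))."""
    # Clamp near-flat cells so the ratio stays finite.
    tan_slope = max(math.tan(slope_rad), 1e-6)
    return math.log(catchment_area_m2 / tan_slope)

def hand_class(hand_m: float) -> str:
    """Bucket a HAND value (Nobre et al. 2011) using the thresholds quoted above."""
    if hand_m < 1.0:
        return "at drainage level"   # water *will* reach here in a flood
    if hand_m > 10.0:
        return "hillslope"           # very dry
    return "intermediate"
```

A large contributing area on a gentle slope drives TWI up; the Brighton Beach worked example later in this document (HAND 0.7 m, TWI 11.3) lands in the "at drainage level" bucket.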
---

## 2. What Riprap actually produces

For a given address (or any of three modes — see [§4](#4-three-user-modes)),
Riprap returns:

1. **A tier 1–4** computed by a deterministic, published rubric
   ([§5](#5-the-scoring-rubric)). Tier 1 = "high exposure"; Tier 4 =
   "limited exposure"; Tier 0 = "no flagged exposure."
2. **A 4-section briefing paragraph** synthesised by Granite 4.1 with
   `[doc_id]` citations after every numeric claim. Sections:
   *Status*, *Empirical evidence*, *Modeled scenarios*, *Policy
   context*. A section is omitted entirely if no specialist fired for
   it (silence-over-confabulation contract).
3. **Evidence cards** — one per fired specialist, with the raw values
   and a link to the source dataset.
4. **Map overlay** — the address pinned, with the empirical and
   modeled flood extents that overlap it.
5. **Live "right now" signals** — active NWS flood alerts, current
   tide residual at the Battery, recent precipitation at the nearest
   ASOS, and a Granite TTM short-horizon forecast of the surge
   residual. **These do not modify the tier** (per IPCC AR6 WG II's
   distinction between exposure and event occurrence).

The full output is a JSON blob with all specialist outputs preserved,
so a journalist or planner can audit every number that appears in the
prose.

---

## 3. The Burr FSM and how the specialists chain

Riprap is a **state machine** — a Burr FSM (DAGWorks) — that walks
through a fixed list of "specialist" functions in order. Each
specialist either produces a structured fact or stays silent. At the
end, the reconciler reads all the produced facts and writes the
paragraph.

The full chain, in execution order:

```
            ┌─────────────────────────────┐
query ────► │ 1. geocode (DCP Geosearch)  │  address text → lat/lon, BBL, borough
            └────────────┬────────────────┘

            ┌─────────────────────────────────────────────┐
            │ STATIC EMPIRICAL + REGULATORY LAYERS        │
            │ (snapshot of city-published flood layers)   │
            ├─────────────────────────────────────────────┤
            │ 2. sandy           in 2012 Sandy zone? Y/N  │  empirical
            │ 3. dep_stormwater  in 3 modeled scenarios?  │  modeled
            │ 4. floodnet        live sensor history      │  empirical
            │ 5. nyc311          flood complaints in 200m │  proxy
            └────────────┬────────────────────────────────┘

            ┌─────────────────────────────────────────────┐
            │ LIVE "RIGHT NOW" LAYER                      │
            │ (out of static score; reported separately)  │
            ├─────────────────────────────────────────────┤
            │ 6. noaa_tides    Battery / Kings Pt level   │  live, 6-min
            │ 7. nws_alerts    active flood-relevant      │  live
            │ 8. nws_obs       nearest ASOS recent precip │  live
            │ 9. ttm_forecast  9.6h surge-residual nowcast│  Granite TTM r2
            └────────────┬────────────────────────────────┘

            ┌─────────────────────────────────────────────┐
            │ TERRAIN + EVENT-LEVEL EMPIRICAL LAYERS      │
            ├─────────────────────────────────────────────┤
            │ 10. microtopo  DEM + TWI + HAND at point    │  proxy
            │ 11. ida_hwm    USGS Ida 2021 HWM proximity  │  empirical
            │ 12. prithvi    Prithvi-EO Ida flood polys   │  empirical (model-derived)
            └────────────┬────────────────────────────────┘

            ┌─────────────────────────────────────────────┐
            │ 13. rag (Granite Embedding 278M)            │  retrieves policy paragraphs
            │     query corpus of 5 NYC agency PDFs       │  relevant to this address
            └────────────┬────────────────────────────────┘

            ┌─────────────────────────────────────────────┐
            │ 14. reconcile (Granite 4.1 :3b on Ollama)   │  document-grounded synthesis
            │     reads all "documents" produced by 1-13  │  → 4-section cited paragraph
            │     drops sentences with ungrounded numbers │  → audit trail
            └────────────┬────────────────────────────────┘

                 cited briefing
                 + tier badge + evidence cards + map
```

Each step is implemented as an `@action` in `app/fsm.py`. The Burr
runtime handles the state-passing between actions and emits a trace
record per step (timing, ok/err, summary fields) which the front-end
shows live as the FSM runs.

### 3.1 What every specialist does, plain language

| # | Specialist | Plain-language description | Class |
|---|------------|----------------------------|-------|
| 1 | **geocode** | Resolve the user's text ("116-50 Sutphin Blvd, Queens") to a (lat, lon) and a NYC tax-lot ID (BBL). Uses NYC Planning's free Geosearch API. | n/a |
| 2 | **sandy** | Did the address get flooded by Hurricane Sandy in 2012? Point-in-polygon over the official NYC Sandy Inundation Zone. | empirical |
| 3 | **dep_stormwater** | Three modeled stormwater-flooding scenarios from NYC DEP: Moderate-2050, Extreme-2080, Tidal-2050. Each tells you depth (none / 0.4–0.8 ft / etc.) at this point. | modeled |
| 4 | **floodnet** | NYC's ultrasonic flood-sensor network. How many sensors are within 600 m, and have any of them registered a flood event in the last 3 years? | empirical |
| 5 | **nyc311** | The 311 service-request archive. How many flood-related complaints (street flooding, sewer backup, catch-basin clogged) within 200 m of the address over the last 5 years? | proxy |
| 6 | **noaa_tides** *(live)* | Current tide observation at the nearest of three NOAA gauges (Battery / Kings Pt / Sandy Hook). Reports observed water level, predicted astronomical tide, and the **residual** (≈ surge). | live |
| 7 | **nws_alerts** *(live)* | Are there active NWS flood-relevant alerts at this point right now? Flash Flood Warnings, Coastal Flood Advisories, etc. | live |
| 8 | **nws_obs** *(live)* | Recent precipitation from the nearest airport ASOS station (KNYC / KLGA / KJFK / KEWR / KFRG). | live |
| 9 | **ttm_forecast** *(live)* | Granite TTM r2 zero-shot forecast of the surge **residual** at the Battery for the next ~9.6 h. NOAA already publishes the astronomical tide; TTM forecasts the part NOAA doesn't. | live (model-derived) |
| 10 | **microtopo** | LiDAR-derived terrain features at the point: elevation, HAND, TWI, local relief percentile. | proxy |
| 11 | **ida_hwm** | USGS Hurricane Ida 2021 high-water marks — actual measured water heights surveyed in the days after the storm. | empirical |
| 12 | **prithvi** | NASA/IBM Prithvi-EO 2.0 segmentation of Sentinel-2 imagery for the Ida pre/post pair. Pre-computed offline; serves point-in-polygon queries against the resulting 166 polygons. | empirical (model-derived) |
| 13 | **rag** | Granite Embedding 278M retrieves the most-relevant paragraphs from 5 NYC policy PDFs (Comptroller, NPCC4, MTA, NYCHA, ConEd) given the address's borough + which scenarios fired. | policy |
| 14 | **reconcile** | Granite 4.1 :3b reads all the documents produced by steps 1–13 and writes the cited briefing paragraph. See [§6](#6-document-grounded-reconciliation). | LLM synthesis |

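The "produces a structured fact or stays silent" contract behind this table can be sketched in plain Python. This is an illustrative pattern only, not the actual `app/fsm.py` signatures; `run_chain` and the `(doc_id, text)` shape are hypothetical:

```python
from typing import Callable, Dict, List, Optional, Tuple

# A specialist maps shared context to one document, or to nothing at all.
Specialist = Callable[[dict], Optional[Tuple[str, str]]]

def sandy(ctx: dict) -> Optional[Tuple[str, str]]:
    # Fires only on a hit; otherwise stays silent, so the reconciler
    # never sees a 'sandy' document it could confabulate around.
    if ctx.get("in_sandy_zone"):
        return ("sandy", "Address is INSIDE the 2012 Sandy Inundation Zone.")
    return None

def run_chain(specialists: List[Specialist], ctx: dict) -> Dict[str, str]:
    docs: Dict[str, str] = {}
    for spec in specialists:
        out = spec(ctx)
        if out is not None:          # silence-over-confabulation: no doc, no entry
            doc_id, text = out
            docs[doc_id] = text
    return docs
```

The reconciler then receives only the documents that actually fired, which is what makes omitting a briefing section safe.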
### 3.2 Worked example: 2940 Brighton 3rd St, Brooklyn

To make the chain concrete, here's what fires for a Brighton Beach
address:

| Step | What it returns |
|------|-----------------|
| geocode | `(40.5780, -73.9617)`, BBL `3-08660-0001`, Brooklyn |
| sandy | **YES** — inside the 2012 Sandy Inundation Zone |
| dep_stormwater | `dep_moderate_2050`: depth 0.4-0.8 ft; `dep_extreme_2080`: depth 0.8-2.0 ft |
| floodnet | 2 sensors within 600 m; 1 trigger event in last 3 yr (peak 14 cm) |
| nyc311 | 11 flood-related complaints in 200 m, 5-yr window |
| noaa_tides | Sandy Hook gauge, +0.49 ft residual *(today's reading)* |
| nws_alerts | 0 active alerts |
| nws_obs | KJFK ASOS, no recent precipitation |
| ttm_forecast | Forecast peak residual +0.6 ft in 4.2 h *(today's run)* |
| microtopo | Elevation 2.36 m, HAND 0.7 m, TWI 11.3, percentile 8 (very low) |
| ida_hwm | 0 USGS HWMs within 800 m (Ida hit Queens hardest, not Brighton) |
| prithvi | Inside an Ida-attributable polygon? **NO** (Ida was pluvial-inland) |
| rag | Top hits: NPCC4 Ch.3 (coastal), MTA Resilience (Coney Island D-train), Comptroller |
| reconcile | (see below) |
| **Tier** | **1 (High exposure)** with empirical floor applied |

The reconciler then writes:

```
**Status.** This Brighton Beach address sits **inside the 2012 Sandy
Inundation Zone** [sandy], on relatively low ground with HAND of 0.7 m
[microtopo].

**Empirical evidence.** NYC 311 records show **11 flood-related
complaints** within 200 m over the last 5 years [nyc311]; 2 FloodNet
sensors are within 600 m and one logged a 14 cm event in the last 3
years [floodnet].

**Modeled scenarios.** The address sits inside **DEP Moderate-2050**
with depth class 0.4-0.8 ft and **DEP Extreme-2080** with depth class
0.8-2.0 ft [dep_moderate_2050][dep_extreme_2080].

**Policy context.** **NPCC4 Ch. 3** documents accelerating
coastal-flood frequency along this stretch [rag_npcc4].
```

Note what *didn't* fire: no Ida HWM doc (Ida didn't flood here), no
Prithvi doc (no Ida-attributable polygon), no NWS alerts (clear day),
no TTM doc (forecast residual under threshold). The reconciler never
saw those headers and didn't invent them.

---

## 4. Three user modes

| Path | Mode | What it does |
|------|------|--------------|
| `/` | **Single address** | Geocode → run the full FSM → cited paragraph + map. Live demo path. |
| `/compare` | **Compare** | Two addresses side by side; parallel FSM runs (`asyncio.to_thread`, `OLLAMA_NUM_PARALLEL=2`). Useful for "this site vs the alternative". |
| `/register/{schools,nycha,mta_entrances}` | **Register** | Pre-computed bulk runs over NYC public-asset registries — 126 schools, 45 NYCHA developments, ~1,900 MTA subway entrances. Loaded from `data/registers/*.json` at boot. |

Single-address is the live path. Registers are pre-computed because
running 1,900 reconciler calls at request time is a non-starter; the
registers job runs offline (see `scripts/build_*_register.py`) and
the result is served from cache.

---

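The `/compare` fan-out can be sketched with `asyncio.to_thread`, which runs a blocking function in a worker thread. A minimal sketch, in which `run_fsm` is a hypothetical stand-in for the real single-address pipeline:

```python
import asyncio

def run_fsm(address: str) -> dict:
    # Stand-in for the blocking Burr FSM run (geocode -> specialists -> reconcile).
    return {"address": address, "tier": 2}

async def compare(addr_a: str, addr_b: str) -> tuple:
    # Two blocking runs in parallel worker threads; with
    # OLLAMA_NUM_PARALLEL=2 the two reconciler calls can overlap as well.
    result_a, result_b = await asyncio.gather(
        asyncio.to_thread(run_fsm, addr_a),
        asyncio.to_thread(run_fsm, addr_b),
    )
    return result_a, result_b
```

Because each FSM run is synchronous and I/O-heavy, threads are the cheapest way to overlap them without rewriting the specialists as coroutines.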
## 5. The scoring rubric

This is the part of the system that produces the tier 1–4. It is
**deterministic, published, and not done by the language model**.
See `METHODOLOGY.md` for the full citation list; here's the
high-level structure.

### 5.1 Three thematic sub-indices

Following Cutter et al. 2003 (SoVI hazards-of-place) and Tate 2012
(uncertainty analysis), indicators are grouped into thematic
sub-indices, equal-weighted within each group, normalized to [0, 1]:

| Sub-index | What it captures | Top weights |
|-----------|------------------|-------------|
| **Regulatory** | Inside FEMA / DEP / NPCC4 modeled or regulated zones | FEMA 1 %; DEP-2050; DEP Tidal |
| **Hydrological** | Terrain-based exposure (HAND, TWI, percentile, relief) | HAND (Nobre 2011); TWI half-weighted (urban DEM noise) |
| **Empirical** | Did flooding actually happen here (Sandy, Ida HWMs, 311) | Sandy + HWM<100m → also trigger floor |

The **composite** is the sum of the three sub-indices (range 0–3).
Tier breakpoints: ≥1.5 → Tier 1, ≥1.0 → Tier 2, ≥0.5 → Tier 3, >0 →
Tier 4, 0 → Tier 0.

### 5.2 Max-empirical floor

If **Sandy 2012 inundation** OR **a USGS Ida HWM within 100 m** fired,
the tier can come out no milder than **2 (Elevated)**, regardless of
the additive composite.

This recovers the *important* multiplicative behaviour Balica 2012
argues for (empirical observations should not be cancelled by
terrain or modeled scenarios) without giving up additive transparency.
The 100 m radius is chosen because USGS HWM positional uncertainty is
typically 5–30 m — 100 m gives ~3σ headroom for a confident "this
address was inundated" signal.

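Read literally, §5.1's breakpoints plus §5.2's floor reduce to a few lines. This is a sketch of the rubric as described, not the actual `app/score.py` code:

```python
def assign_tier(regulatory: float, hydrological: float, empirical: float,
                sandy_hit: bool, hwm_within_100m: bool) -> int:
    """Additive composite (each sub-index in [0, 1]) plus the max-empirical floor."""
    composite = regulatory + hydrological + empirical   # range 0-3
    if composite >= 1.5:
        tier = 1
    elif composite >= 1.0:
        tier = 2
    elif composite >= 0.5:
        tier = 3
    elif composite > 0:
        tier = 4
    else:
        tier = 0                                        # no flagged exposure
    # Floor rule: an empirical hit means the rating is at least Elevated.
    if (sandy_hit or hwm_within_100m) and (tier == 0 or tier > 2):
        tier = 2
    return tier
```

Note the floor only lifts a mild rating; a composite that already lands in Tier 1 stays Tier 1.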
### 5.3 Live signals stay out

NWS alerts, NOAA tide residual, and NWS hourly precipitation are
**not** in the static tier. Per IPCC AR6 WG II glossary and NPCC4
Ch. 3, exposure is a quasi-stationary property of place; event
occurrence is time-varying. They appear separately as live evidence
cards.

---

## 6. Document-grounded reconciliation

`app/reconcile.py` builds a list of OpenAI-style chat messages where
each specialist's emission is its own message, with a stable `doc_id`
carried on the role. Granite 4.1's Ollama chat template recognises
any `role: "document <doc_id>"` message and lifts it into a
`<documents>` block, prepending IBM's official grounded-generation
system message ("Write the response by strictly aligning with the
facts in the provided documents").

Example packet for the Brighton Beach address (abbreviated):

```python
[
    {"role": "system", "content": "<citation-discipline + 4-section skeleton>"},
    {"role": "document sandy", "content": "Address is INSIDE the 2012 Sandy zone. ..."},
    {"role": "document dep_extreme_2080", "content": "Depth class 0.8-2.0 ft. ..."},
    {"role": "document floodnet", "content": "2 sensors; peak 14 cm. ..."},
    {"role": "document nyc311", "content": "11 flood complaints in 200 m. ..."},
    {"role": "document microtopo", "content": "Elev 2.36 m, HAND 0.7 m, TWI 11.3. ..."},
    {"role": "document rag_npcc4", "content": "<retrieved paragraph>"},
    {"role": "user", "content": "Write the cited briefing now."},
]
```

The four-section structure (`**Status.** / **Empirical evidence.** /
**Modeled scenarios.** / **Policy context.**`) is enforced by the
`EXTRA_SYSTEM_PROMPT`. Sections without supporting documents are
omitted entirely.

### 6.1 Two reconciler models

- **`granite4.1:3b`** runs the planner and `live_now` (short outputs,
  routing decisions). Always streamed.
- **`granite4.1:8b`** runs the synthesis path for `single_address`,
  `neighborhood`, and `development_check` (long outputs, dense
  citations). Pre-warmed into VRAM in `entrypoint.sh` so the first
  query doesn't pay the model-load tax. Both fit warm on the T4 with
  `OLLAMA_MAX_LOADED_MODELS=2` and `OLLAMA_KEEP_ALIVE=24h`.

### 6.2 Mellea-validated rejection sampling

`app/mellea_validator.py` wraps the Granite-via-Ollama call in IBM
Research's [Mellea](https://github.com/generative-computing/mellea)
framework — instruct, validate, repair. The synthesis intents call
`reconcile_strict_streaming(...)`, which:

1. **Streams** each generation attempt's tokens to the user (via the
   FSM threadlocal `set_token_callback` for `single_address` or a
   `progress_q` for the polygon intents).
2. After each attempt, runs **four deterministic checks** on the
   accumulated paragraph:
   - **`numerics_grounded`** — every non-trivial number in the output
     appears verbatim in a source document.
   - **`no_placeholder_tokens`** — output contains no leaked
     `[source]` / `<document>` template markup.
   - **`citations_dense`** — every non-trivial number has a
     `[doc_id]` citation **somewhere in the same sentence** (sentence
     boundaries: `. ` / `.\n` / end-of-text).
   - **`citations_resolve`** — cited `doc_id`s are a subset of the
     input doc_ids.
3. If any check fails, fires a `mellea_attempt` SSE event with the
   failed-requirement names, then **rerolls** with a feedback prompt
   that names the specific failing sentences (the model usually
   responds well to surgical corrections). Loop budget: 3 attempts.

The frontend renders an inline banner above the briefing — amber on
reroll (with the failed-req list), green on first-try pass. The final
reconcile step in the trace shows the `passed: N/4 · rerolls: M`
metadata for full audit transparency.

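Stripped of the streaming and SSE plumbing, the instruct, validate, repair loop has this shape. A schematic sketch only: `generate` stands in for the streamed Granite call, `checks` for the four requirements, and the names are hypothetical:

```python
from typing import Callable, List, Tuple

def reconcile_strict(generate: Callable[[str], str],
                     checks: List[Tuple[str, Callable[[str], bool]]],
                     max_attempts: int = 3) -> Tuple[str, int, List[str]]:
    """Rejection sampling: reroll with targeted feedback until all checks pass."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        draft = generate(feedback)
        failed = [name for name, check in checks if not check(draft)]
        if not failed:
            return draft, attempt, []        # pass -> green banner
        # Name the failing requirements so the reroll can be surgical.
        feedback = "Fix these failed requirements: " + ", ".join(failed)
    return draft, max_attempts, failed       # budget exhausted -> best effort
```

The returned attempt count and failure list map onto the `passed: N/4 · rerolls: M` metadata shown in the trace.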
### 6.3 Number recognition is identifier-aware

The numeric guardrail uses `\b-?\d[\d,]*(?:\.\d+)?\b` so that
identifier codes embedded in prose (`QN1206` NTA codes, `BBL
3-00589-0003` parcels, `BIN`, `B12` community boards) are *not*
treated as numeric claims demanding citation. This was the dominant
false positive in early probing; without it, almost every neighborhood
briefing failed `citations_dense` because the opening sentence
typically reads "*X (NTA QN1206) in Queens…*".

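The word-boundary behaviour is easy to see in isolation. A hedged sketch of the check, not the actual `app/mellea_validator.py` implementation:

```python
import re

NUM = re.compile(r"\b-?\d[\d,]*(?:\.\d+)?\b")   # the guardrail's number pattern
CITE = re.compile(r"\[([a-z0-9_]+)\]")          # [doc_id] citations

def citations_dense(paragraph: str) -> bool:
    """Every sentence containing a number must also carry a [doc_id] citation."""
    sentences = re.split(r"\.\s+|\.\n", paragraph)
    return all(CITE.search(s) for s in sentences if NUM.search(s))

# \b needs a non-word character before the digit, so the digits inside
# 'QN1206' never start a match:
assert NUM.search("X (NTA QN1206) in Queens") is None
assert NUM.search("11 flood complaints in 200 m") is not None
```

A sentence like the neighborhood opener therefore carries no numeric claim at all, and `citations_dense` no longer demands a citation for it.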
### 6.4 Why no native Granite 4.x inline citations

We investigated using Granite's native `<|start_of_cite|>{document_id:
X}fact<|end_of_cite|>` mode. **It's deprecated in 4.x.** Verified:

- The official Ollama chat template for `granite4.x` has no citation
  branch (the 3.3 / 4.0-preview templates did).
- `granite_common` ships only `granite3/granite32` and
  `granite3/granite33` subdirs — no 4.x equivalent.
- `granite-io` has only `granite_3_2/` and `granite_3_3/` processor
  dirs.

The base 4.1 weights still contain the cite tokens (training residue),
so the model emits them as real tokens when nudged — but only as an
end-of-response list, not inline in prose. IBM's published 4.x
grounding path is a separate **Citation Generation LoRA** (built on
`granite-4.0-micro`, not 4.1) requiring HF transformers + LoRA
loading. Mellea's `OllamaBackend` explicitly raises
`NotImplementedError` for activated LoRAs. So our hand-rolled
`[doc_id]` regex + reroll **is** the right pattern for our setup
(Granite 4.1 via Ollama, inline placement).

---

## 7. The four foundation models

| Model | Params | Runtime | Role |
|-------|--------|---------|------|
| **Granite 4.1 :3b** | 3 B | Ollama (GPU on T4) | Planner (intent + specialist routing) + `live_now` reconciler. |
| **Granite 4.1 :8b** | 8 B | Ollama (GPU on T4) | Synthesis reconciler for `single_address`, `neighborhood`, `development_check`. Validated by Mellea (4 grounding requirements + reroll). |
| **Granite Embedding 278M** | 278 M | sentence-transformers (CPU) | RAG retrieval over 5 policy PDFs at query time. |
| **Prithvi-EO 2.0** | 300 M | TerraTorch (offline pre-compute) | Sen1Floods11 fine-tune; segmented Hurricane Ida 2021 pre/post Sentinel-2 polygons baked into `data/`. |
| **Granite TimeSeries TTM r2** | 1.5 M | granite-tsfm (CPU) | Zero-shot forecast of the Battery surge residual, ~9.6 h horizon. |

**Granite 4.1 ≠ Granite TimeSeries.** Granite 4.1 is IBM's chat-LLM
family. Granite TimeSeries TTM is a separate IBM Research product
line (Ekambaram et al. 2024, NeurIPS). Both happen to share the
"Granite" brand but have different architectures, training data, and
authors.

### 7.1 Why Prithvi runs offline

Prithvi-EO 2.0 with TerraTorch needs a GPU and minutes per HLS tile.
We segmented Hurricane Ida 2021 once (pre: 2021-08-25, post:
2021-09-02, ~12 h after peak), then filtered the output (>30 000 sq ft
to drop noise, <1 km² to drop tidal artifacts) into **166 polygons**
baked into `data/prithvi_ida_2021.geojson`. The runtime FSM does a
point-in-polygon test, not fresh inference. This is honest about
where foundation models earn their keep: **once, to produce a
defensible event-level signal — not per request**.

### 7.2 Why TTM r2 runs live

TTM r2 is **1.5 M params** — vastly smaller than Prithvi or Granite
4.1. Inference is millisecond-scale even on CPU. It forecasts only
the residual (surge component) at the Battery, which complements the
462
+ NOAA snapshot specialist; it does **not** try to forecast the
463
+ astronomical tide (NOAA already publishes that exactly).
464
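The residual framing can be sketched with plain arithmetic (all
numbers below are invented; the real series comes from NOAA's
6-minute observations and predictions):

```python
# Sketch of the residual decomposition the nowcast relies on:
# observed water level = astronomical tide (published by NOAA) + surge residual.
observed = [1.42, 1.55, 1.61]   # ft, hypothetical 6-min water levels
predicted = [1.30, 1.38, 1.40]  # ft, NOAA astronomical tide prediction

residual = [o - p for o, p in zip(observed, predicted)]
# The TTM nowcast would forecast future residual values from this
# series; here we just carry the last residual forward.
forecast_residual = residual[-1]
next_tide = 1.41  # ft, next published astronomical value
forecast_level = next_tide + forecast_residual
print(round(forecast_level, 2))  # → 1.62
```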

---

## 8. Live signals separation

Live data (steps 6–9 in the FSM diagram) is fundamentally different
from static layers and is handled separately:

- **Surface**: in evidence cards and a "Right now" section in the UI.
- **Score**: explicitly excluded. Tier is reproducible across queries
  unless source data changed.
- **Cadence**: NOAA tides update every 6 min; NWS alerts on push;
  NWS obs ~hourly; TTM is computed per query (cheap).
- **Failure mode**: graceful — if NOAA times out, no `noaa_tides`
  doc is emitted; the reconciler simply doesn't see it.
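The failure mode can be sketched as follows. Function names and
payloads are illustrative, not the real `app/context/` API; the point
is that a timed-out specialist yields `None` and therefore never
becomes a document:

```python
# Illustrative sketch, not the real specialist API.
def fetch_noaa_tides(timeout_s=5.0):
    """Pretend the NOAA request timed out; return None instead of raising."""
    try:
        raise TimeoutError("NOAA gateway timed out")  # simulated network failure
    except TimeoutError:
        return None  # graceful: no doc, nothing bubbles up to the FSM

def build_docs(specialist_results):
    # Only specialists that actually returned data become document messages.
    return [
        {"role": f"document {doc_id}", "content": str(payload)}
        for doc_id, payload in specialist_results.items()
        if payload is not None
    ]

results = {"noaa_tides": fetch_noaa_tides(), "sandy_zone": {"zone": "AE"}}
docs = build_docs(results)
print([d["role"] for d in docs])  # → ['document sandy_zone']
```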

This mirrors how First Street separates Flood Factor (static, 30-yr)
from event-day Flood Lab products, and how Fathom separates Global
Flood Map from real-time intelligence.

---

## 9. Repository layout

```
riprap-nyc/
  ARCHITECTURE.md       this file
  METHODOLOGY.md        scoring methodology + full citations
  README.md             HF Spaces frontmatter + user-facing summary
  Dockerfile            nvidia/cuda:12.4 base + Ollama + Granite
  entrypoint.sh         Ollama daemon + uvicorn launcher
  requirements.txt      runtime deps (FastAPI, geopandas,
                        sentence-transformers, ollama, burr, granite-tsfm)
  pyproject.toml        ruff + vulture config
  riprap.py             CLI driver for register builds
  agent.py              single-address CLI

  app/
    fsm.py              Burr FSM (14 actions; Mellea hooks via threadlocal)
    planner.py          Granite 4.1:3b intent router (5 intents)
    geocode.py          NYC DCP Geosearch + borough-hint filter
    reconcile.py        Granite 4.1 grounded reconciler + numeric guardrail
    mellea_validator.py streaming rejection sampler + 4 grounding checks
    rag.py              Granite Embedding 278M retrieval
    score.py            deterministic exposure rubric (3 sub-indices, floor)
    spatial.py          geopandas join helpers
    energy.py           per-query inference Wh accounting
    register_builder.py bulk-mode runner (offline)

    intents/            per-intent orchestration on top of fsm.py
      live_now.py           shoreline tide + alerts (cheap, non-strict)
      single_address.py     drives the linear FSM with strict reconcile
      neighborhood.py       polygon-aggregated specialists
      development_check.py  DOB permit overlap with flood polygons
      compare.py            two-address side-by-side
    areas/
      nta.py            NYC NTA 2020 polygon resolver

    flood_layers/
      sandy_inundation.py   NYC OD 5xsi-dfpx
      dep_stormwater.py     9i7c-xyvv (3 scenarios)
      ida_hwm.py            USGS STN Event 312
      prithvi_water.py      Ida pre/post diff polygons (offline-built)

    context/
      microtopo.py      DEM + TWI + HAND raster sampling
      nyc311.py         erm2-nwe9 buffer aggregation
      floodnet.py       api.floodnet.nyc Hasura GraphQL
      noaa_tides.py     live water level + residual
      nws_alerts.py     live alerts at point
      nws_obs.py        nearest ASOS hourly METAR

    live/
      ttm_forecast.py   Granite TTM r2 surge-residual nowcast

    assets/
      schools.py        DCP FacDB
      nycha.py          phvi-damg
      mta_entrances.py  i9wp-a4ja

  web/
    main.py             FastAPI (5 pages, JSON endpoints, 2 SSE streams)
    static/
      index.html        classic single-address report (compatibility)
      agent.html        primary UI: planner + live trace + briefing
      agent.js          EventSource client; sets properties on
                        <r-briefing> / <r-trace> / <r-sources-footer>
      report.html / .js auditable PDF-formatted export view
      compare.html / .js  two-address side-by-side
      register.html / .js bulk register browser
      style.css         IBM Plex Sans, Planning Labs idiom
      dist/             Svelte 5 custom-element bundle (committed —
                        HF Spaces doesn't run a Node build).
                        Built from web/svelte/ via `npm run build`.

    web/svelte/         Svelte 5 source. Build → web/static/dist/.
      package.json      vite + @sveltejs/vite-plugin-svelte
      vite.config.js    lib mode; customElement: true globally
      src/main.js       registers <r-briefing>, <r-trace>,
                        <r-sources-footer>; re-exports stores
      src/lib/stores.js highlightedDocId, citeIndex (writable)
      src/lib/Briefing.svelte
      src/lib/Trace.svelte
      src/lib/SourcesFooter.svelte

  scripts/              offline pre-compute + diagnostic probes
    run_prithvi_ida.py
    compute_hydrology_indices.py
    fetch_nyc_dem.py
    fetch_ida_hwms.py
    build_schools_register.py
    build_nycha_register.py
    build_mta_entrances_register.py
    probe_mellea.py     drives the SSE stream N times, dumps
                        per-attempt pass/fail to CSV

  corpus/               5 LFS-tracked NYC policy PDFs
  data/                 LFS-tracked baked fixtures
    sandy_inundation.geojson
    prithvi_ida_2021.geojson   166 Hurricane Ida polygons
    ida_2021_hwms_ny.geojson
    nyc_dem_30m.tif, twi.tif, hand.tif
    schools.geojson, nycha.geojson, mta_entrances.geojson
    dep/                Esri FileGDBs (DEP scenarios)
  registers/            pre-computed register outputs
```


---

## 10. Honest scope (what Riprap does NOT do)

- **Not a damage probability.** Riprap is exposure triage. We have no
  labeled flood-damage outcomes (claim records, insurance loss data),
  so we cannot calibrate. The tier is a literature-grounded prior,
  not a prediction.
- **Not a flood insurance rating.** For that, see FEMA Risk Rating 2.0
  (claims-driven GLM over decades of labeled outcomes).
- **Not a vulnerability assessment.** Engineering fragility (foundation
  type, electrical hardening, drainage condition), social capacity,
  and financial absorption are out of scope.
- **No sub-surface flooding.** Optical satellites can't see basement
  apartments or subway entrances — the dominant Hurricane Ida damage
  mode in NYC. Prithvi correctly emits no polygons for Hollis or
  Carroll Gardens. That silence is a feature, not a bug.
- **Vintage-bounded.** FEMA NFHL is years stale; DEP Stormwater Maps
  are 2021; corpus PDFs are point-in-time. All vintages are cited in
  the methodology panel.
- **Public infrastructure only.** ConEd substations, water-supply
  components, and other adversarially-sensitive registers are not
  published. NYC OD has the same redaction posture; we follow it.

---

## 11. Why local foundation models

1. **Data governance.** A newsroom with FOIL'd documents, an agency
   capital planner with internal data, or a researcher under IRB
   constraints can't paste organization context into a vendor LLM.
   All four models run inside this container; the org boundary
   holds. Public NYC and USGS services receive resolved address
   coordinates only; no LLM vendor does.
2. **Inference energy.** Granite 4.1:3b draws roughly **0.03 Wh per
   query** vs an estimated **~0.3 Wh per query** for GPT-4o-class
   frontier models ([Epoch AI, 2025](https://epoch.ai/gradient-updates/how-much-energy-does-chatgpt-use)),
   an order of magnitude less per-query inference energy. The
   methodology panel reports a per-query Wh estimate so users can
   verify.
3. **Reproducibility.** Apache-2.0 stack end to end; no commercial
   licenses required to reproduce the system.

---

## 12. Deployment

### 12.1 Hugging Face Spaces (production)

Docker SDK, base `nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04`
(Python 3.10), hardware `nvidia-t4-small` (1× T4, 16 GB VRAM,
4 vCPU, 15 GB RAM). Ollama + **both** Granite 4.1 variants
(`:3b` for routing, `:8b` for synthesis) baked into the image at
build time (~10 GB total). Granite Embedding 278M and Granite TTM r2
download to `$HF_HOME` on first request (~280 MB and ~30 MB).

`entrypoint.sh` starts Ollama, then **pre-warms `granite4.1:8b`** with
a one-token generation so the first user reconcile doesn't pay the
~30 s VRAM-load tax. `OLLAMA_KEEP_ALIVE=24h` holds both models resident
through the demo. `OLLAMA_FLASH_ATTENTION=1` and
`OLLAMA_KV_CACHE_TYPE=q8_0` cut KV memory on the 8b path.

Cold-start (first query after container restart) takes ~60–90 s while
weights load and TTM downloads. Warm queries:

- `live_now` ~3–6 s
- `single_address` / `neighborhood` / `development_check` ~30–60 s
  with Mellea (one streamed attempt + post-validation; one reroll
  adds ~25 s)

The Svelte bundle in `web/static/dist/` is committed, so HF Spaces
runs no Node build step — only the Python deps + Ollama install.

### 12.2 Local development

```bash
uv venv --python 3.12
source .venv/bin/activate
uv pip install -r requirements.txt
ollama pull granite4.1:3b
ollama pull granite4.1:8b
uvicorn web.main:app --reload --port 8000

# Frontend (only when changing components)
cd web/svelte && npm install && npm run build
```

The fixtures in `data/` and the policy PDFs in `corpus/` are
LFS-tracked. Granite Embedding and TTM download on first query.

### 12.3 Diagnostic probes

```bash
# Drive the live stream N times, dump per-attempt Mellea outcomes:
.venv/bin/python scripts/probe_mellea.py --query "Hollis" --runs 5
# Output: outputs/probe_*.csv with per-attempt pass/fail, paragraph,
# elapsed time, reroll count.
```

---

## 13. License

Apache-2.0. All foundation models (Granite 4.1, Granite Embedding,
Prithvi-EO 2.0, Granite TimeSeries TTM r2) and all input datasets
(NYC OpenData, USGS, NOAA, NWS, FloodNet NYC, NASA/MS Planetary
Computer for HLS Sentinel-2) are public. Visual idiom adapted from
[NYC Planning Labs](https://planninglabs.nyc/).
CLAUDE.md ADDED
# Riprap — Claude Code orientation

Citation-grounded NYC flood-exposure briefings. Granite 4.1 via a
LiteLLM Router (Ollama for local/T4, vLLM-on-ROCm for the AMD MI300X
demo path), Mellea-validated reconciliation, vanilla JS + Svelte 5
custom elements, FastAPI on T4 (HF Spaces).
**AMD hackathon demo: May 4–10, 2026.**

`ARCHITECTURE.md` is the source of truth for *what the system does*.
This file is for *how to work on it*.

---

## Critical constraints

- **HF Spaces base image is Python 3.10.** This pins:
  - `mellea<0.4` (0.4+ requires 3.11+) — no `find_citations` /
    `flag_hallucinated_content` intrinsics in production.
  - `transformers>=4.55,<5` + `huggingface_hub>=0.34,<1` — coexistence
    with `granite-tsfm 0.3.x` (which calls `transformers.utils.download_url`,
    removed in transformers 5.x).
  - Don't bump these without testing the full HF rebuild end-to-end.
  - Local venv is Python 3.12 — Mellea 0.4.x is installed there but
    its RAG intrinsics need a HuggingFace transformers backend (LoRA
    loading); they don't work over Ollama. Don't accidentally rely on
    them.

- **All LLM calls go through `app/llm.py`.** Never `import ollama`
  in new code. The shim exposes `chat(model, messages, options,
  stream, format)` with the same return shape as `ollama.chat`, and
  routes through a LiteLLM Router. Two backends are wired:
  - `RIPRAP_LLM_PRIMARY=ollama` (default) — local + HF Space path.
    Quant override: `RIPRAP_OLLAMA_8B_TAG=granite4.1:8b-q3_K_M`
    saves ~1 GB resident vs the default Q4_K_M.
  - `RIPRAP_LLM_PRIMARY=vllm` + `RIPRAP_LLM_BASE_URL` +
    `RIPRAP_LLM_API_KEY` — AMD MI300X demo path. Auto-fails over to
    Ollama if vLLM is unreachable. The same env vars work for local
    dev, HF Space → AMD, or AMD droplet → AMD self-host.

  An mlx-lm-backed third backend was prototyped (Apple-Silicon-native
  via `mlx_lm.server` with speculative decoding) but reverted — the
  install bumped torch internals in a way that broke `terratorch`'s
  Prithvi backbone with a `meta vs cpu` device mismatch. Stick with
  Ollama on local; switch to vLLM for the AMD demo. mlx-lm can be
  revisited once the EO toolchain isolates its torch state.
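The failover shape can be sketched as below. This is illustrative
only; the real shim (`app/llm.py`) delegates routing and fallback to
a LiteLLM Router rather than hand-rolling the dispatch, and the
function names here are made up:

```python
# Illustrative sketch of primary-with-fallback dispatch; not app/llm.py.
import os

def chat_with_fallback(messages, call_vllm, call_ollama):
    primary = os.environ.get("RIPRAP_LLM_PRIMARY", "ollama")
    if primary == "vllm":
        try:
            return call_vllm(messages)
        except ConnectionError:
            pass  # vLLM unreachable → fall back to local Ollama
    return call_ollama(messages)

# Simulated backends: vLLM is "down", Ollama answers.
def vllm_down(msgs):
    raise ConnectionError("droplet unreachable")

def ollama_ok(msgs):
    return {"message": {"role": "assistant", "content": "ok"}}

os.environ["RIPRAP_LLM_PRIMARY"] = "vllm"
resp = chat_with_fallback([{"role": "user", "content": "hi"}], vllm_down, ollama_ok)
print(resp["message"]["content"])  # → ok
```

The load-bearing property is that callers never see the failover;
they always get an `ollama.chat`-shaped response back.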

- **Ollama and vLLM use different chat templates.** Ollama's
  Modelfile recognises `role: "document <doc_id>"` and bundles those
  into a `<documents>` block. The HF tokenizer chat template (used
  by vLLM) silently drops non-standard roles. `app/llm.py` papers
  over this: it extracts document-role messages into
  `extra_body.documents` / `chat_template_kwargs.documents` for vLLM,
  while leaving them in `messages` for the Ollama fallback. It also
  normalizes vLLM's `[doc_id=X]` emissions back to `[X]` so Mellea
  checks and frontend chips see the same format from both paths.
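That normalization step is essentially one regex. A sketch (the exact
pattern in `app/llm.py` may differ):

```python
import re

def normalize_citations(text: str) -> str:
    # vLLM's HF chat template emits [doc_id=sandy_zone]; Ollama emits
    # [sandy_zone]. Downstream consumers (Mellea checks, frontend
    # chips) expect the bare form, so rewrite the former to the latter.
    return re.sub(r"\[doc_id=([^\]]+)\]", r"[\1]", text)

print(normalize_citations("AE zone [doc_id=sandy_zone], 14 reports [nyc311]"))
# → AE zone [sandy_zone], 14 reports [nyc311]
```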

- **The vLLM deployment serves only the 8B.** One served name per
  vLLM process, and we don't run two containers. The planner alias
  (`granite-3b`) is mapped to the same served name as the reconciler
  (`granite-4.1-8b`) when primary=vllm. On Ollama, 3B and 8B are
  distinct. Override per-alias with `RIPRAP_LLM_VLLM_3B_NAME` /
  `RIPRAP_LLM_VLLM_8B_NAME` if you stand up a second vLLM.

- **No LoRA / aLoRA / Granite Citation LoRA in production.** Even
  with vLLM available, we don't load LoRAs at runtime — Mellea's
  Ollama backend raises `NotImplementedError` for activated LoRAs,
  and we deliberately keep the call path identical across backends.
  Hand-rolled `[doc_id]` regex + reroll is the citation-discipline
  mechanism. See §6 of ARCHITECTURE.md.

- **Two committed JS bundles, two source dirs.** HF Spaces does not
  run Node, so we ship pre-built artefacts:
  - `web/sveltekit/build/` — **the new design-system UI** (SvelteKit +
    adapter-static, IBM Plex, four-tier glyphs, MapLibre). Sources in
    `web/sveltekit/src/`. Rebuild with `cd web/sveltekit && npm run
    build`. FastAPI serves it at `/`, `/q/sample`, `/q/<query>`.
  - `web/static/dist/riprap.js` — legacy custom-element bundle. Sources
    in `web/svelte/src/`. Rebuild with `cd web/svelte && npm run
    build`. FastAPI serves it at `/legacy`, `/single`, `/compare`,
    `/register/*` while the new UI is being filled in.

  Commit both build outputs after editing the corresponding sources.

- **Models baked into the Docker image.** Both `granite4.1:3b` and
  `granite4.1:8b` are pulled at build time (~10 GB), so HF rebuilds
  take ~10 min. `entrypoint.sh` pre-warms the 8b into VRAM after
  Ollama is up so the first reconcile doesn't pay a cold-load.

---

## Run / build / test

```bash
# Local server (default: routes to local Ollama)
cd /Users/amsrahman/riprap-nyc
.venv/bin/uvicorn web.main:app --host 127.0.0.1 --port 7860 --log-level info
# → http://127.0.0.1:7860/ (primary UI; agent.html is the canonical home)

# Local server pointed at AMD MI300X (vLLM primary, Ollama fallback)
RIPRAP_LLM_PRIMARY=vllm \
RIPRAP_LLM_BASE_URL=http://<droplet-ip>:8000/v1 \
RIPRAP_LLM_API_KEY=<token> \
.venv/bin/uvicorn web.main:app --host 127.0.0.1 --port 7860 --log-level info
# Pill in the top-right shows "AMD MI300X · Granite 4.1 / vLLM" when
# the primary is reachable; flips amber on Ollama fallback, red if
# everything is down. Backed by GET /api/backend.

# Frontend rebuilds (only when sources change)
cd web/sveltekit && npm run build   # writes web/sveltekit/build/ (new UI)
cd web/svelte && npm run build      # writes web/static/dist/riprap.js (legacy)

# Static checks (all should be clean)
.venv/bin/ruff check app/ web/ scripts/
.venv/bin/vulture app/ web/main.py --min-confidence 90
.venv/bin/radon cc app/ web/main.py -s -n C   # complexity hotspots

# Programmatic Mellea probe (server must be running)
.venv/bin/python scripts/probe_mellea.py --query "Hollis" --runs 5
# Output: outputs/probe_*.csv with per-attempt pass/fail, paragraph,
# elapsed time, reroll count.

# Smoke-test the streaming endpoint directly
curl -sN "http://127.0.0.1:7860/api/agent/stream?q=Hollis" --max-time 120

# Local-tuning env knobs (independent of backend):
#   OLLAMA_KEEP_ALIVE=24h          keep granite4.1:8b resident across requests
#   OLLAMA_NUM_PARALLEL=1          stop Ollama loading a 2nd copy under contention
#   RIPRAP_MELLEA_MAX_ATTEMPTS=2   cap rejection-sampling rerolls (default 2 local, 3 remote)
#   RIPRAP_TRIM_DOCS=1             drop doc messages whose specialist isn't in plan (default on)
#   RIPRAP_OLLAMA_8B_TAG=granite4.1:8b-q3_K_M   ~1 GB lighter than default Q4_K_M
```

**Don't restart uvicorn while a model is mid-generation** — Ollama will
keep the request alive but the FastAPI handler dies, leaving the user
staring at a dead stream. Pre-flight: `pkill -f "uvicorn web.main:app"`.

---

## Deploy

Single command for both remotes:

```bash
git push && git push huggingface main
```

GitHub remote = `origin` (msradam/riprap-nyc). HF Space remote =
`huggingface` (msradam/riprap-nyc on huggingface.co).

HF rebuild status:

```bash
curl -sf "https://huggingface.co/api/spaces/msradam/riprap-nyc/runtime" \
  | python3 -m json.tool
# stage: BUILDING | RUNNING_APP_STARTING | RUNNING
# sha: should match the latest local commit when RUNNING
```

Live URL: <https://msradam-riprap-nyc.hf.space>

---

## Repo map (high-signal files)

```
app/
  llm.py              LiteLLM Router shim. chat(model, messages, options,
                      stream, format) — drop-in for ollama.chat. Routes
                      to vLLM (AMD MI300X) when RIPRAP_LLM_PRIMARY=vllm,
                      with Ollama fallback. Extracts role="document <id>"
                      into extra_body.documents for vLLM's HF chat
                      template; normalizes [doc_id=X] -> [X]. backend_info()
                      powers the UI pill via web/main.py:/api/backend.
  fsm.py              Burr FSM. Threadlocal hooks: set_strict_mode,
                      set_token_callback, set_mellea_attempt_callback.
                      step_reconcile() routes to reconcile_strict_streaming
                      when strict mode is on.
  reconcile.py        EXTRA_SYSTEM_PROMPT (the 4-section skeleton + citation
                      discipline). build_documents() is the doc_id ride-along.
                      verify_paragraph() is the legacy non-strict guardrail.
  mellea_validator.py reconcile_strict_streaming() — the streaming rejection
                      sampler with 4 grounding checks (numerics_grounded,
                      no_placeholder_tokens, citations_dense,
                      citations_resolve). Reroll feedback names the specific
                      failing sentences.
  planner.py          Granite 4.1:3b intent router → live_now / single_address
                      / neighborhood / development_check / compare.
  intents/            Per-intent orchestration. Each run() takes
                      (plan, query, progress_q, strict). Strict path uses
                      reconcile_strict_streaming via either threadlocal
                      (single_address, fsm-based) or direct call
                      (neighborhood, dev_check).
  rag.py              Granite Embedding 278M retrieval over corpus/*.pdf.
  flood_layers/       Sandy zone, DEP scenarios, Ida HWMs, Prithvi polygons.
  context/            Microtopo (HAND/TWI), 311, FloodNet, NOAA, NWS, DOB.
  live/ttm_forecast.py  Granite TTM r2 surge residual nowcast.

web/
  main.py             FastAPI; SSE stream at /api/agent/stream emits
                      plan_token, plan, step, token, mellea_attempt,
                      final, error, done events. Also GET /api/backend
                      (live LLM-backend descriptor + reachability ping
                      for the UI pill). All other LLM traffic goes
                      through app/llm.py — don't add ollama.chat calls
                      here.
  static/
    agent.html        Primary UI. Mounts <r-briefing>, <r-trace>,
                      <r-sources-footer> (Svelte custom elements).
    agent.js          EventSource client. setBriefingText() sets the
                      .text property on <r-briefing>; pushTraceStep()
                      calls .pushStep() on <r-trace>. Form binding is
                      BEFORE ensureMap() so a WebGL throw doesn't
                      strand the Ask button.
    dist/riprap.js    Built Svelte bundle (committed).
    components/       OLD Lit components — kept for reference but
                      not loaded by agent.html anymore.
  svelte/src/lib/     Svelte 5 sources. customElement: true globally
                      via vite.config.js.
    stores.js         highlightedDocId, citeIndex (writable). The
                      cross-component chip ↔ source-row highlight
                      reacts via these.

scripts/
  probe_mellea.py     Drives the SSE stream N times, dumps CSV.
  run_prithvi_ida.py  Offline Prithvi-EO 2.0 segmentation (one-shot).
  build_*_register.py Bulk-mode register builders (offline).

corpus/               5 LFS-tracked NYC policy PDFs (NPCC4 etc).
data/                 LFS-tracked baked fixtures (Sandy, DEP, Prithvi
                      polygons, DEM/HAND/TWI rasters, registers).
```


---

## Project conventions

### Document message convention

Specialists emit data as chat messages with `role="document <doc_id>"`.
Granite 4.1's Ollama template recognises this prefix and bundles them
into a `<documents>` block + auto-injects IBM's grounded-generation
system message. Don't reinvent — `app/reconcile.py:build_documents()`
already wires it. `app/llm.py` additionally extracts the same messages
into `chat_template_kwargs.documents` so vLLM's HF tokenizer template
sees them too — both backends honour the same grounding contract from
identical caller code.
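A minimal sketch of the convention (simplified from
`build_documents()`, which has per-specialist logic; the payloads
here are invented):

```python
# Sketch of the role="document <doc_id>" message convention.
def build_documents(specialist_payloads):
    msgs = []
    for doc_id, payload in specialist_payloads.items():
        # Granite's Ollama template bundles these into <documents>;
        # app/llm.py re-extracts them for vLLM's HF template.
        msgs.append({"role": f"document {doc_id}", "content": payload})
    return msgs

msgs = build_documents({
    "sandy_zone": "Parcel is inside the 2012 Sandy inundation zone.",
    "noaa_tides": "Battery water level 1.62 ft above MLLW at 14:36.",
})
print([m["role"] for m in msgs])
# → ['document sandy_zone', 'document noaa_tides']
```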

### The four Mellea grounding requirements

1. **`numerics_grounded`** — every non-trivial number in the output
   appears verbatim in a source document.
2. **`no_placeholder_tokens`** — output contains no leaked
   `[source]` / `<document>` template fragments.
3. **`citations_dense`** — every non-trivial number has a `[doc_id]`
   citation **somewhere in the same sentence**. Sentence scope, not a
   character window. Identifier codes (`QN1206`, BBL parcels, `B12`)
   are skipped via `\b` word-boundary regex so they don't get treated
   as numeric claims.
4. **`citations_resolve`** — cited `doc_id`s ⊆ input `doc_id`s.
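A simplified sketch of the sentence-scoped `citations_dense` check.
The real implementation in `app/mellea_validator.py` uses `_NUM_RE`,
`_TRIVIAL_NUMS`, and `_SENT_END` and is more careful; the regexes and
exemption set below are illustrative:

```python
import re

TRIVIAL = {"1", "2", "311", "911"}             # illustrative exemption set
NUM_RE = re.compile(r"\b\d+(?:\.\d+)?\b")      # \b skips QN1206-style ids
CITE_RE = re.compile(r"\[[a-z0-9_]+\]")

def citations_dense(paragraph: str) -> bool:
    # Every sentence containing a non-trivial number must also
    # contain at least one [doc_id] citation.
    for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
        nums = [n for n in NUM_RE.findall(sentence) if n not in TRIVIAL]
        if nums and not CITE_RE.search(sentence):
            return False
    return True

print(citations_dense("The gauge read 1.62 ft [noaa_tides]. Call 311 for help."))
# → True
print(citations_dense("The gauge read 1.62 ft. Sources: [noaa_tides]."))
# → False
```

Note the second example fails even though a citation exists: it is in
the wrong sentence, which is exactly the distinction sentence scoping
enforces.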

If you change the regex or sentence boundary, **re-run the probe**:

```bash
.venv/bin/python scripts/probe_mellea.py --query "Hollis" --runs 5
.venv/bin/python scripts/probe_mellea.py --query "100 Gold St Manhattan" --runs 3
.venv/bin/python scripts/probe_mellea.py --query "what are they building in Gowanus and is it risky" --runs 3
```

### Threadlocal hooks in `app/fsm.py`

The FSM is sync code called from a threadpool executor. To plumb
streaming callbacks without changing every action signature, we use
threadlocals:

- `set_strict_mode(bool)` → `_current_strict_mode()` decides whether
  `step_reconcile` routes to Mellea or the legacy reconciler.
- `set_token_callback(fn)` → `_current_token_callback()` for streaming
  tokens out of the reconciler.
- `set_mellea_attempt_callback(fn)` → fires after each Mellea attempt
  with `(attempt_idx, passed, failed)`.

**Always reset in a `finally:`.** `app/intents/single_address.py:run()`
is the canonical example.
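The reset-in-`finally` discipline, reduced to a sketch (plumbing
simplified; the real hooks live in `app/fsm.py`):

```python
# Sketch of the threadlocal hook + mandatory reset pattern.
import threading

_local = threading.local()

def set_strict_mode(value):
    _local.strict = value

def current_strict_mode():
    return getattr(_local, "strict", False)

def run_intent():
    set_strict_mode(True)
    try:
        # ... drive the FSM; step_reconcile consults current_strict_mode()
        return current_strict_mode()
    finally:
        # Always reset, even on exceptions: the worker thread is reused
        # by the threadpool and must not leak strict mode into the next query.
        set_strict_mode(False)

print(run_intent(), current_strict_mode())  # → True False
```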

### SSE event vocabulary (`/api/agent/stream`)

| event | payload | when |
|-------|---------|------|
| `hello` | `{query}` | connection open |
| `plan_token` | `{delta}` | each token of the planner JSON |
| `plan` | `{intent, targets, specialists, rationale}` | planner finished |
| `step` | `{step, ok, started_at, elapsed_s, result?, err?}` | each FSM action |
| `token` | `{delta, attempt?}` | each Granite reconcile token |
| `mellea_attempt` | `{attempt, passed, failed}` | end of each Mellea attempt |
| `final` | full result dict (`paragraph`, `mellea`, `audit`, `tier`, `score`, ...) | reconcile done |
| `error` | `{err}` | exception in the runner |
| `done` | `{}` | stream closing |

Frontend resets the briefing buffer when `token.attempt` changes
(handles reroll cleanly).
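For ad-hoc debugging it can help to replay a captured stream offline.
A minimal, stdlib-only sketch of the wire format above (not
production code; real clients use `EventSource` in `agent.js` or the
probe script):

```python
import json

def parse_sse(raw: str):
    """Minimal parser for the /api/agent/stream wire format."""
    events = []
    for block in raw.strip().split("\n\n"):
        event, data = "message", ""
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data += line[len("data:"):].strip()
        events.append((event, json.loads(data) if data else {}))
    return events

raw = (
    'event: plan\ndata: {"intent": "neighborhood"}\n\n'
    'event: token\ndata: {"delta": "Hollis ", "attempt": 1}\n\n'
    'event: done\ndata: {}\n\n'
)
print([e for e, _ in parse_sse(raw)])  # → ['plan', 'token', 'done']
```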

### Frontend property convention

Svelte custom elements take props via JS property setters:

```js
const el = document.getElementById("paragraph"); // <r-briefing>
await customElements.whenDefined("r-briefing");
el.sourceLabels = SOURCE_LABELS;
el.text = "...streaming markdown...";
```

`<r-trace>` exposes imperative methods on the host:

```js
el.pushStep({ step: "geocode", ok: true, elapsed_s: 0.3, result: {...} });
el.clear();
```

`<r-sources-footer>` reads `citeIndex` from the shared store; the
Briefing populates it whenever its `bodyHtml` is computed.

---

## Decisions worth remembering

These are paths we explored and either chose or ruled out. Don't
re-litigate them without new information.

- **Lit → Svelte (May 2026).** Three Lit components were live first
  (`web/static/components/`) but the user wanted a full Svelte
  rewrite. Migrated to Svelte 5 custom-element bundle (drop-in
  replacement — same tag names, same property API). The Lit files
  are still on disk for reference but not loaded.

- **Granite 4.x native inline citations are deprecated.** We
  investigated the `<|start_of_cite|>...<|end_of_cite|>` mode. The
  official Ollama template removed it for 4.x; `granite_common` ships
  no `granite4/` package; `granite-io` has no 4.x processor.
  4.1 emits citation tokens only in an end-of-response list, never
  inline. IBM's expected 4.x citation path is a separate LoRA on
  granite-4.0-micro that produces post-hoc JSON — needs HF
  transformers, not Ollama. **Hand-rolled `[doc_id]` regex + reroll
  is the right pattern for our setup.**

- **Mellea 0.4 RAG intrinsics aren't reachable.** `find_citations`,
  `flag_hallucinated_content`, `check_context_relevance` all route
  through `GraniteCommonAdapter` → activated LoRA on the HF
  transformers backend. `mellea/backends/ollama.py:357-359` literally
  raises `NotImplementedError` for activated LoRAs. To use them we'd
  swap the serving layer, eat ~5 GB more RAM, and lose Ollama's
  optimizations. Not worth it for the demo.

- **CARTO Voyager basemap (not Stadia).** Tried Stadia Alidade
  Smooth — looks great, but they 401 without an API key and
  domain allowlist. Voyager is auth-free, retina-tiled, more
  editorial than Positron.

- **Speculative streaming Mellea.** `reconcile_strict_streaming`
  streams every attempt's tokens to the user (visible at t≈30 s
  instead of after t≈95 s of validation silence). An inline banner
  shows reroll status. Felt latency drops dramatically even when
  total wall-clock is the same.

- **Sentence-scoped `citations_dense` + identifier-aware `\b` regex.**
  The combo killed the chronic 3/4 reroll loop on neighborhood
  queries. Hollis: was 3/4 with 2 rerolls every run; now 4/4 with
  ≤1 reroll. Don't tighten the regex back to a fixed-width window
  without re-running the probe across all three intent types.

- **LiteLLM Router for backend abstraction (May 2026).** Considered
  hand-rolling an OpenAI-vs-Ollama dispatch ourselves. LiteLLM's
  Router gives us model aliasing + fallback for free, and Mellea
  has a litellm backend if we ever need it. The shim is ~250 lines
  total (`app/llm.py`); the entire production code path stayed in
  the `ollama.chat`-shaped call signature. Don't replace this with
  the openai SDK directly — the failover behaviour is load-bearing.

- **Granite 4.1 is dense decoder-only.** Earlier confusion: the
  hybrid Mamba variants are in **Granite 4.0-H**, not 4.1. vLLM
  0.17 serves 4.1 as a vanilla LLaMA-style model — no architecture
  risk, no special flags. If a future bump introduces a hybrid 4.x,
  re-verify vLLM compatibility before deploying.

- **vLLM HF chat template emits `[doc_id=X]`, Ollama Modelfile emits
  `[X]`.** The rest of Riprap (Mellea regex, frontend chip parser,
  citations footer) was written against `[X]`. `app/llm.py` runs a
  one-line regex normalize on every response and stream chunk. Don't
  remove it without changing every other consumer.

- **HF Space → AMD GPU as primary, T4 Ollama as fallback.** Considered
  using the HF Space's bundled Ollama as a remote inference server
  (proxy `/v1/chat/completions` from FastAPI to localhost:11434) so
  that local dev could use the T4. Rejected: T4 is slower than
  MI300X, the surface area is bigger, and the AMD path already covers
  the "fast remote inference" use case. The proxy idea is recoverable
  in ~25 lines of FastAPI if we ever want it.

---

## Common tasks playbook

### Add a new specialist

1. Add a module under `app/context/` or `app/flood_layers/`.
2. Add an action in `app/fsm.py` (`step_yourname`) with `@action(reads=[...], writes=[...])`.
3. Wire it into the FSM graph in the `Application.with_actions(...)` chain.
4. Add a doc message builder in `app/reconcile.py:build_documents()`.
5. Update `STEP_LABELS` in `web/static/agent.js` for the trace label.
6. Update `SOURCE_LABELS` / `SOURCE_URLS` / `SOURCE_VINTAGES` in
   `web/static/agent.js` for the chip + footer rendering.
7. Double-gate the new specialist: run the SSE probe against both
   `RIPRAP_LLM_PRIMARY=ollama` and `=vllm` and confirm the briefing
   cites the new doc_id with no Mellea regressions.

### Prototype a new specialist (experimental)

For exploratory work that isn't yet ready to land in `app/`:

1. Scaffold `experiments/<NN>_<name>/` with its own `RESULTS.md`,
   smoke tests, and cached fixtures. Don't import from `app/` except
   `app.llm.chat` — keeps the experiment portable.
2. License-check the model: confirm Apache-2.0 / MIT / BSD on the
   actual `LICENSE` file in the model repo (not the HF metadata
   field — they sometimes disagree). Add a row to
   `experiments/shared/licenses.md`.
3. Validate against both `RIPRAP_LLM_PRIMARY=ollama` and
   `=vllm` before proposing integration. Specialist behaviour must
   be backend-independent — never branch on backend in specialist
   code.
4. Only after the experiment passes both gates and produces a
   demo-safe trace UI rendering, propose a PR-style summary for
   integration into `app/`.
+ ### Change the briefing markdown structure
434
+
435
+ 1. Edit `EXTRA_SYSTEM_PROMPT` in `app/reconcile.py`.
436
+ 2. Edit `renderMarkdownPure()` in `web/svelte/src/lib/Briefing.svelte`
437
+ if you add new block syntax.
438
+ 3. Rebuild Svelte: `cd web/svelte && npm run build`.
439
+ 4. Re-run the probe to confirm Mellea still passes.
440
+
441
+ ### Tune the Mellea checks
442
+
443
+ `app/mellea_validator.py`:
444
+ - `_NUM_RE` — number recognition. Use `\b` boundaries to skip
445
+ identifiers.
446
+ - `_TRIVIAL_NUMS` — set of numbers exempt from citation requirement
447
+ (small integers, NYC service line numbers like 311/911).
448
+ - `_check_every_claim_cited()` — sentence-scoped; uses `_SENT_END`
449
+ for boundaries.
450
+ - `_failing_sentences_for_citations()` — feeds the reroll feedback
451
+ prompt with surgical corrections.
452
+
453
+ After any change here: probe across 3 intent types (above).
454
+
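A minimal sketch of the sentence-scoped check, with simplified stand-ins for the internals named above (the shipped regexes in `app/mellea_validator.py` are the authoritative ones):

```python
import re

# Illustrative stand-ins, not the shipped patterns.
_NUM_RE = re.compile(r"\b\d+(?:\.\d+)?%?")
_CITE_RE = re.compile(r"\[[a-z0-9_]+\]")
_SENT_END = re.compile(r"(?<=[.!?])\s+")
_TRIVIAL_NUMS = {"0", "1", "2", "311", "911"}

def failing_sentences(text):
    """Sentences that contain a non-trivial number but no [doc_id] citation."""
    bad = []
    for sent in _SENT_END.split(text):
        nums = [n for n in _NUM_RE.findall(sent) if n not in _TRIVIAL_NUMS]
        if nums and not _CITE_RE.search(sent):
            bad.append(sent)
    return bad
```

The failing sentences are what get quoted back into the reroll feedback prompt, which is why the check is sentence-scoped rather than whole-response.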
455
+ ### Add a new Svelte component
456
+
457
+ 1. Create `web/svelte/src/lib/MyComponent.svelte` with
458
+ `<svelte:options customElement={{ tag: "r-mycomp", props: {...} }} />`.
459
+ 2. Side-effect import it in `web/svelte/src/main.js`.
460
+ 3. Mount `<r-mycomp>` in `agent.html`.
461
+ 4. `cd web/svelte && npm run build`.
462
+ 5. Commit `web/static/dist/riprap.js` and `riprap.js.map`.
463
+
464
+ ---
465
+
466
+ ## Known sharp edges
467
+
468
+ - **`build_documents` complexity (radon F=101).** It's a giant
469
+ `if`/`elif` per specialist. Don't refactor pre-demo; touching it
470
+ risks subtle regressions in doc message ordering, which Granite is
471
+ sensitive to.
472
+
473
+ - **Static assets cache hard.** When iterating on Svelte or `agent.js`,
474
+ the user must hard-reload (⌘⇧R). Cache-busting query strings are
475
+ not in place.
476
+
477
+ - **Ollama keeps stale models loaded across rebuilds locally.** If
478
+ you change a Modelfile or pull a new tag, restart `ollama serve`
479
+ to be sure.
480
+
481
+ - **Burr FSM `iter_steps` mutates global state.** Don't run two
482
+ concurrent `single_address` queries against the same uvicorn
483
+ worker — strict-mode threadlocal makes it safer than it was, but
484
+ there's no per-request isolation.
485
+
486
+ - **Mellea 0.3 vs 0.4 API differences.** Local venv has 0.4 (3.12),
487
+ HF has 0.3 (3.10). `start_session`, `RejectionSamplingStrategy`,
488
+ `MelleaSession.instruct(strategy, requirements,
489
+ return_sampling_results)` are stable across both. Don't import
490
+ anything from `mellea.stdlib.components.intrinsic.*` — that
491
+ package only exists in 0.4 and won't import on HF.
492
+
493
+ - **HF Space sleeps after idle.** Free tier; first request after
494
+ sleep is a 30–90 s cold start. Ping the space before a demo.
495
+
496
+ - **vLLM cold compile / first-call slowdown.** First few requests
497
+ against a fresh `vllm serve` container can log surprisingly low
498
+ throughput (single-digit tokens/s prompt, ~4 tokens/s gen on a
499
+ MI300X) while ROCm kernels JIT-compile and the prefix cache
500
+ warms. Subsequent requests are 30–50× faster. If a benchmark
501
+ reads "vLLM is slow" on the first run, run it three more times
502
+ before believing it.
503
+
504
+ - **Backend pill auto-detection.** `app/llm.py:_default_hardware_label`
505
+ picks `AMD MI300X` when `RIPRAP_LLM_PRIMARY=vllm`, `NVIDIA T4`
506
+ when `SPACE_ID` is set (HF Spaces injects this), `Local` otherwise.
507
+ Override with `RIPRAP_HARDWARE_LABEL` / `RIPRAP_ENGINE_LABEL`
508
+ if you bring up a different GPU.
509
+
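The detection order can be sketched as follows, assuming the explicit override wins (a stand-in for `_default_hardware_label`, not the shipped code):

```python
import os

def default_hardware_label(env=os.environ):
    """Pick the backend pill label from environment, override first."""
    if env.get("RIPRAP_HARDWARE_LABEL"):
        return env["RIPRAP_HARDWARE_LABEL"]
    if env.get("RIPRAP_LLM_PRIMARY") == "vllm":
        return "AMD MI300X"
    if env.get("SPACE_ID"):           # injected by HF Spaces
        return "NVIDIA T4"
    return "Local"
```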
510
+ ---
511
+
512
+ ## Useful one-liners
513
+
514
+ ```bash
515
+ # Tail the local server log
516
+ tail -f /tmp/riprap-local.log
517
+
518
+ # Inspect the live HF Space's deployed SHA + stage
519
+ curl -sf "https://huggingface.co/api/spaces/msradam/riprap-nyc/runtime" | python3 -m json.tool
520
+
521
+ # Confirm both remotes have the same HEAD
522
+ git log --oneline -1 && git ls-remote huggingface main | head -1
523
+
524
+ # Force-re-pull Granite weights locally if Ollama seems wrong
525
+ ollama rm granite4.1:8b && ollama pull granite4.1:8b
526
+
527
+ # What backend is the running server on? (live reachability + label)
528
+ curl -s http://127.0.0.1:7860/api/backend | python3 -m json.tool
529
+
530
+ # Bring up vLLM on a fresh AMD ROCm droplet (one-shot)
531
+ docker run -d --name vllm \
532
+ --device=/dev/kfd --device=/dev/dri --group-add=video \
533
+ --ipc=host --shm-size=16g -p 8000:8000 \
534
+ -v /root/hf-cache:/root/.cache/huggingface \
535
+ -e GLOO_SOCKET_IFNAME=eth0 -e VLLM_HOST_IP=127.0.0.1 \
536
+ vllm/vllm-openai-rocm:v0.17.1 \
537
+ --model ibm-granite/granite-4.1-8b \
538
+ --host 0.0.0.0 --port 8000 --api-key "$TOKEN" \
539
+ --max-model-len 8192 --served-model-name granite-4.1-8b
540
+ # Without GLOO_SOCKET_IFNAME, gloo fails to bind 0.0.0.0 and the
541
+ # engine core never initialises.
542
+
543
+ # Check what doc_ids the briefing should contain for an intent
544
+ .venv/bin/python -c "from app.reconcile import build_documents; \
545
+ print([m['role'] for m in build_documents({'sandy':{'inside':True}, 'nyc311':{'n':5}})])"
546
+ ```
METHODOLOGY.md ADDED
@@ -0,0 +1,264 @@
1
+ # Riprap — Scoring Methodology
2
+
3
+ > Riprap produces a **flood-exposure tier (1–4) per NYC address**, not
4
+ > a calibrated damage probability. The tier is a deterministic
5
+ > literature-grounded composite of public-data signals; the language
6
+ > model writes the citing prose around it but does not score.
7
+
8
+ ## 1. Why this design
9
+
10
+ Closed-methodology scores (First Street, Jupiter, Fathom) are useful
11
+ products but uncitable in civic work — a NYCEM grant writer can't quote
12
+ "0.73" in a FEMA BRIC sub-application without a defensible audit trail.
13
+ At the same time, an LLM-emitted score would be non-reproducible and
14
+ uncalibrated, with documented LLM-as-judge pathologies (Zheng et al.
15
+ 2023; Wang et al. 2024). The honest middle: **a deterministic rubric a
16
+ planner can argue with**.
17
+
18
+ The tier is computed in `app/score.py` and mirrored in `web/static/app.js`.
19
+ Both implementations are kept in sync; the Python side is authoritative
20
+ for register builds and CLI exports.
21
+
22
+ ## 2. Methodology pedigree
23
+
24
+ The composite construction follows a well-trodden path in the
25
+ multi-indicator vulnerability/exposure literature:
26
+
27
+ - **Cutter, Boruff & Shirley (2003)**, *Social Science Quarterly* 84(2):
28
+ 242–261 — the SoVI hazards-of-place pattern: group indicators
29
+ thematically; sum factors with equal weights because there is no
30
+ defensible theoretical basis for differential weighting.
31
+ - **Tate (2012)**, *Natural Hazards* 63: 325–347 — explicit Monte Carlo
32
+ sensitivity analysis showing that hierarchical equal-weighted
33
+ composites are the most rank-stable. This is why we use equal weights
34
+ *within* sub-indices.
35
+ - **Balica, Wright & van der Meulen (2012)**, *Natural Hazards* 64:
36
+ 73–105 — Coastal City Flood Vulnerability Index, multiplicative
37
+ (Exposure × Susceptibility / Resilience). We adopt only the
38
+ override-behavior of multiplicative form, as a "max-empirical floor"
39
+ (§4 below), because we have no resilience term.
40
+ - **Kim et al. (2019)**, *Scientific Reports* 9:18564 — additive vs
41
+ geometric aggregation; additive is more transparent and reproducible
42
+ *if* sub-indices are pre-grouped thematically. Done.
43
+
44
+ NPCC4 (2024) Ch. 3 (Rosenzweig et al., *Annals of the New York Academy
45
+ of Sciences* 1539) and the NYC Hazard Mitigation Plan 2024 supply the
46
+ NYC-specific tiering hierarchy that informs which scenarios get higher
47
+ weights inside the Regulatory sub-index.
48
+
49
+ ## 3. Sub-index structure
50
+
51
+ Three thematic sub-indices, each normalized to [0, 1] by dividing the
52
+ weighted sum by the maximum possible weight in the group. The composite
53
+ is the simple sum of the three sub-indices (range 0–3).
54
+
55
+ ### 3.1 Regulatory sub-index
56
+
57
+ Binary "inside zone" indicators with weights ordered by agency tiering:
58
+
59
+ | Indicator | Weight | Citation |
60
+ |---------------------------------|-------:|----------|
61
+ | FEMA NFHL 1% (SFHA) | 1.00 | FEMA NFHL — regulatory mandate threshold |
62
+ | FEMA NFHL 0.2% | 0.50 | FEMA NFHL — tail scenario |
63
+ | NYC DEP Moderate-2050 + 2.5 ft | 0.75 | NYC DEP Stormwater Maps 2021; NPCC4 Ch.3 |
64
+ | NYC DEP Extreme-2080 + SLR | 0.50 | NYC DEP Stormwater Maps 2021 — explicitly tail |
65
+ | NYC DEP Tidal-2050 | 0.75 | NPCC4 Ch.3 coastal projection |
66
+
67
+ Why DEP-2050 outranks DEP-2080: NPCC4 designates the 2080 extreme
68
+ scenario as a **tail** projection. Closer-horizon coastal/pluvial
69
+ maps — those a current planner can act on — get the higher weight.
70
+
71
+ ### 3.2 Hydrological sub-index
72
+
73
+ Continuous terrain measures, banded into 4 levels (1.0 / 0.66 / 0.33 / 0):
74
+
75
+ | Indicator | Weight | Bands | Citation |
76
+ |---|---:|---|---|
77
+ | HAND (m) | 1.00 | <1, 1–3, 3–10, ≥10 | Nobre et al., 2011, *J. Hydrology* 404: 13–29 |
78
+ | TWI quartile | 0.50 | ≥12, 10–12, 8–10, <8 | Beven & Kirkby, 1979; Sørensen et al., 2006, *HESS* 10 |
79
+ | Elev pct (200 m, inv) | 0.50 | <10, 10–25, 25–50, ≥50 | Standard geomorphometric proxy |
80
+ | Elev pct (750 m, inv) | 0.50 | <10, 10–25, 25–50, ≥50 | Standard geomorphometric proxy |
81
+ | Basin relief (m) | 0.25 | ≥8, 4–8, 2–4, <2 | Supporting variable, Nobre 2011 |
82
+
83
+ TWI is half-weighted relative to HAND because TWI is documented as
84
+ noisier in flat urban DEMs (Sørensen 2006 explicitly states TWI is
85
+ site-specific and best percentile-binned). HAND remains the canonical
86
+ hydrology indicator (Aristizabal et al. 2023, *WRR* 59, NOAA NWM).
87
+
88
+ ### 3.3 Empirical sub-index
89
+
90
+ Mix of binary observed-extent flags and banded count signals:
91
+
92
+ | Indicator | Weight | Citation |
93
+ |----------------------------|-------:|----------|
94
+ | Sandy 2012 inundation | 1.00 + **floor** | NYC OD `5xsi-dfpx`; NYC HMP 2024 |
95
+ | USGS Ida HWM within 100 m | 1.00 + **floor** | USGS STN Event 312 |
96
+ | USGS Ida HWM within 800 m | 0.50 | USGS STN Event 312 |
97
+ | Prithvi-EO 2.0 Ida polygon | 0.75 | Jakubik et al., 2025 (NASA/IBM Prithvi-EO 2.0); semi-empirical |
98
+ | 311 complaint count band | 0.75 | NYC OD `erm2-nwe9`; NYC 311-as-flood-proxy literature |
99
+ | FloodNet trigger (3 yr) | 0.75 | FloodNet NYC; NPCC4 Ch.3 references |
100
+
101
+ The 311 and FloodNet weights are capped at 0.75 because both signals
102
+ have documented coverage and reporting bias — 311 reflects civic
103
+ engagement as well as flooding, FloodNet has uneven spatial coverage.
104
+ Sandy and HWMs are 1.0 because they're engineered ground-truth
105
+ observations.
106
+
107
+ Bands for 311 count (200 m buffer, 5-year window):
108
+
109
+ | Count | Value |
110
+ |---------|------:|
111
+ | ≥10 | 1.00 |
112
+ | 3–9 | 0.66 |
113
+ | 1–2 | 0.33 |
114
+ | 0 | 0 |
115
+
116
+ ## 4. Max-empirical floor
117
+
118
+ If **Sandy 2012 inundation** OR **a USGS Ida HWM within 100 m** fired,
119
+ the tier is capped at **2 (Elevated)** — it cannot be worse, regardless
120
+ of the additive composite.
121
+
122
+ This recovers the *important* multiplicative behaviour Balica 2012
123
+ argues for: empirical, ground-truth observations should not be
124
+ cancelled out by terrain or modeled scenarios. We implement it as a
125
+ floor (a `min(tier, 2)` after composition) rather than a full
126
+ multiplicative form so the composite remains additive and auditable.
127
+
128
+ The 100 m radius is chosen because USGS HWM positional uncertainty is
129
+ typically 5–30 m horizontal — 100 m gives ~3σ headroom for a confident
130
+ "this address was inundated" signal.
131
+
132
+ ## 5. Composite → tier mapping
133
+
134
+ The composite is the sum of the three normalized sub-indices (range 0–3):
135
+
136
+ | Composite | Tier | Label |
137
+ |-----------|-----:|----------------------|
138
+ | ≥ 1.50 | 1 | High exposure |
139
+ | ≥ 1.00 | 2 | Elevated exposure |
140
+ | ≥ 0.50 | 3 | Moderate exposure |
141
+ | > 0 | 4 | Limited exposure |
142
+ | 0 | 0 | No flagged exposure |
143
+
144
+ Then floor: `Sandy or HWM<100m → tier ≤ 2`.
145
+
146
+ ## 6. Live signals are NOT in the score
147
+
148
+ NWS active alerts, NOAA tide residual (surge), and NWS hourly precip
149
+ are **not** part of the static tier. Per **IPCC AR6 WG II** glossary
150
+ and **NPCC4** Ch. 3, exposure is a quasi-stationary property of place;
151
+ event occurrence is time-varying. Mixing the two would produce a tier
152
+ that flickers every six minutes and that residents could interpret as
153
+ neither "is my building exposed?" nor "is it flooding right now?".
154
+
155
+ Live signals are surfaced separately in the UI as a **"Current
156
+ conditions"** badge, with their own provenance (NOAA station ID, NWS
157
+ alert URL, ASOS station code), and they expire on their own cadence.
158
+ Static tier is unaffected.
159
+
160
+ This mirrors how First Street separates Flood Factor (static, 30-yr
161
+ horizon) from event-day Flood Lab products, and how Fathom separates
162
+ Global Flood Map from real-time intelligence.
163
+
164
+ ## 7. Honest scope
165
+
166
+ Riprap's tier is **not**:
167
+
168
+ - A flood-damage probability or expected loss.
169
+ - A flood-insurance rating. For that, see **FEMA Risk Rating 2.0**
170
+ (FEMA 2021), which uses claims-driven GLMs over decades of labeled
171
+ outcome data we do not have.
172
+ - A vulnerability assessment. Engineering fragility (foundation type,
173
+ electrical hardening, drainage), social capacity, and financial
174
+ absorption are out of scope.
175
+ - A prediction. Future-scenario layers (DEP 2050/2080, FEMA 0.2%) are
176
+ bounding scenarios, not forecasts.
177
+
178
+ It **is**:
179
+
180
+ - An exposure prior — a literature-grounded, deterministic, reproducible
181
+ index of how many publicly-documented flood signals overlap this
182
+ address.
183
+ - Auditable end-to-end: every term has a published source; every weight
184
+ has a rationale; the floor rule has a stated motivation; the tier
185
+ breakpoints are documented above.
186
+ - Forkable: a researcher who disagrees with any weight can edit
187
+ `app/score.py` and rerun. The UI methodology panel makes this
188
+ invitation explicit.
189
+
190
+ ## 8. Caveats foregrounded in UI copy
191
+
192
+ These appear next to the tier badge and in the methodology disclosure:
193
+
194
+ > **Riprap tiers are not flood-damage probabilities.** They reflect
195
+ > publicly-documented exposure indicators only.
196
+
197
+ > **311 counts are influenced by neighborhood reporting habits** and
198
+ > may under-represent flooding in lower-engagement areas
199
+ > (Agonafir et al. and the broader 311-as-civic-engagement literature).
200
+
201
+ > **DEP 2050/2080 and FEMA 0.2% are bounding scenarios, not forecasts.**
202
+ > The tier reads them as "if this scenario materialized, this address
203
+ > would be inside its footprint" — not "this is the expected future."
204
+
205
+ > **Compound flooding is not separately modeled.** Concurrence of rain
206
+ > + tide + groundwater is the residual research frontier (NPCC4 Ch. 3).
207
+
208
+ ## 9. Sensitivity / future work
209
+
210
+ - **Tate-style Monte Carlo perturbation** of weights to characterize
211
+ how sensitive each tier assignment is to weight choice. Not yet
212
+ implemented; would be a natural next research output.
213
+ - **Calibration exercise** if a labeled dataset emerges (FEMA assistance
214
+ records, building-level damage from Sandy/Ida insurance claims). Until
215
+ then, "calibrated" is a word we do not use.
216
+ - **Block- or NTA-level aggregation** for neighborhood-grade scoring,
217
+ with each indicator computed as an areal aggregate rather than a
218
+ point sample.
219
+
220
+ ## References
221
+
222
+ Aristizabal, F. et al. (2023). "Improving Continental Hydrologic
223
+ Modeling Using Height Above Nearest Drainage." *Water Resources
224
+ Research* 59.
225
+
226
+ Balica, S., Wright, N., & van der Meulen, F. (2012). "A Flood
227
+ Vulnerability Index for Coastal Cities and Its Use in Assessing
228
+ Climate Change Impacts." *Natural Hazards* 64: 73–105.
229
+
230
+ Beven, K. J., & Kirkby, M. J. (1979). "A Physically Based, Variable
231
+ Contributing Area Model of Basin Hydrology." *Hydrological Sciences
232
+ Bulletin* 24(1): 43–69.
233
+
234
+ Cutter, S. L., Boruff, B. J., & Shirley, W. L. (2003). "Social
235
+ Vulnerability to Environmental Hazards." *Social Science Quarterly*
236
+ 84(2): 242–261.
237
+
238
+ FEMA (2021). *NFIP Risk Rating 2.0 Methodology and Data Sources.*
239
+
240
+ Jakubik, J. et al. (2025). "Prithvi-EO 2.0: A Versatile Multi-Temporal
241
+ Foundation Model for Earth Observation Applications." NASA/IBM.
242
+
243
+ Kim, S. et al. (2019). "Assessment of Aggregation Frameworks for
244
+ Composite Indicators in Measuring Flood Vulnerability to Climate
245
+ Change." *Scientific Reports* 9:18564.
246
+
247
+ Nobre, A. D. et al. (2011). "Height Above the Nearest Drainage — A
248
+ Hydrologically Relevant New Terrain Model." *Journal of Hydrology*
249
+ 404(1–2): 13–29.
250
+
251
+ NYC HMP (2024). *NYC Hazard Mitigation Plan 2024.* NYC Emergency
252
+ Management.
253
+
254
+ NYC NPCC4 (2024). *4th NYC Climate Assessment.* New York City Panel
255
+ on Climate Change. Including Rosenzweig et al., Ch. 3, *Annals NYAS*
256
+ 1539.
257
+
258
+ Sørensen, R., Zinko, U., & Seibert, J. (2006). "On the Calculation of
259
+ the Topographic Wetness Index." *Hydrology and Earth System Sciences*
260
+ 10: 101–112.
261
+
262
+ Tate, E. (2012). "Social Vulnerability Indices: A Comparative
263
+ Assessment Using Uncertainty and Sensitivity Analysis." *Natural
264
+ Hazards* 63: 325–347.
MONDAY.md ADDED
@@ -0,0 +1,224 @@
1
+ # Monday handoff (May 4, 2026)
2
+
3
+ State of the repo at end of Sunday May 3 / overnight into May 4.
4
+ Demo is **Sunday May 10**.
5
+
6
+ ## Overnight pass (Sunday evening → Monday)
7
+
8
+ Eight priorities closed against `audit/2026-05-03-evening-audit.md`:
9
+
10
+ 1. `pitch/cold_open.md` restored (was accidentally deleted in 1cb5ee6).
11
+ 2. Granite Guardian / refusal-classification leftovers removed —
12
+ Mellea is the sole grounding mechanism, period.
13
+ 3. **Trace UI is now clickable.** Click any specialist row to reveal
14
+ its raw structured output (formatted JSON, copy button,
15
+ max-height + scroll). This is the auditability contract: every
16
+ claim in the briefing is traceable to the specialist that produced
17
+ it directly inside the UI, not just the citation appendix.
18
+ 4. Buffered-footprint overlap for the three Point-geometry register
19
+ specialists. NYU Langone / Stuyvesant HS / P.S. 89 now correctly
20
+ register `inside_sandy_2012=true`. Each output records its
21
+ `footprint_buffer_m`.
22
+ 5. Map renders register-asset pins (subway / school / hospital /
23
+ NYCHA-centroid) coloured by Sandy exposure with click popups
24
+ showing name + `[doc_id]`. NYCHA polygon-fill is queued for when
25
+ `geometry_geojson` lands in the dataclass.
26
+ 6. **`floodnet_forecast` specialist** — TTM r2 forecast on the
27
+ nearest FloodNet sensor's flood-event recurrence. Reuses the
28
+ (512, 96) singleton already loaded for `ttm_311_forecast` —
29
+ *no new model class loaded into memory*. The strongest single
30
+ TTM win for the NYU CUSP audience.
31
+ 7. Trace UI groups TTM specialists under one parent node
32
+ `forecasting.granite-timeseries-ttm-r2 [N instances]` so the
33
+ "one foundation model, multiple data streams" architectural story
34
+ is legible without reading per-row metadata.
35
+ 8. `experiments/` cleanup: dropped two empty dirs (`05_sam2_promptable`,
36
+ `06_chronos_bolt_forecast`), renamed `05_terramind_finetune` →
37
+ `05a_terramind_finetune_micro` to dedupe with the active NYC
38
+ fine-tune dir, removed `Riprap.zip` from repo root.
39
+
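The buffered-footprint check in item 4 amounts to: asset point inside the Sandy polygon, or within `footprint_buffer_m` of its boundary. The shipped code uses shapely (`point.buffer(r).intersects(poly)`); this dependency-free stand-in just shows the geometry, with degrees crudely scaled to metres:

```python
import math

M_PER_DEG = 111_320.0  # rough metres per degree near NYC (approximation)

def _seg_dist_m(p, a, b):
    """Approx. distance in metres from point p to segment a-b (lon/lat)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        t = 0.0
    else:
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    cx, cy = ax + t * dx, ay + t * dy
    return math.hypot(px - cx, py - cy) * M_PER_DEG

def _inside(p, ring):
    """Ray-casting point-in-polygon on a closed lon/lat ring."""
    px, py = p
    hit = False
    for (ax, ay), (bx, by) in zip(ring, ring[1:]):
        if (ay > py) != (by > py) and px < (bx - ax) * (py - ay) / (by - ay) + ax:
            hit = not hit
    return hit

def overlaps(point, ring, footprint_buffer_m):
    """True if point is in the polygon or within the buffer of its edge."""
    return _inside(point, ring) or min(
        _seg_dist_m(point, a, b) for a, b in zip(ring, ring[1:])
    ) <= footprint_buffer_m
```

This is why the buffer rescues the centroid-edge false negatives: a campus centroid just outside the polygon still lands within its `footprint_buffer_m` of the boundary.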
40
+ Commit chain: `a2143fc` … through `ed6ae9d`. Morning handoff doc
41
+ at `audit/2026-05-04-morning-handoff.md` summarises what to verify
42
+ and what's queued next.
43
+
44
+ ## Where Sunday ended
45
+
46
+ All four keep-list items resolved + 4 register specialists shipped + AMD
47
+ fine-tune prep green.
48
+
49
+ | Item | Status | Path |
50
+ |---|---|---|
51
+ | Pitch cold-open locked | ✓ | `pitch/cold_open.md` |
52
+ | TerraMind-NYC fine-tune eval spec | ✓ | `experiments/05_terramind_nyc_finetune/eval/eval_spec.md` |
53
+ | 200-query adversarial set + refusal eval | ✓ (planner pivot) | `experiments/06_granite_guardian/` |
54
+ | Subway-entrance specialist (Sheepshead Bay) | ✓ | `experiments/07_mta_entrances/` |
55
+ | NYCHA-developments specialist (Red Hook) | ✓ | `experiments/08_nycha_developments/` |
56
+ | DOE-schools specialist (Coney Island) | ✓ | `experiments/09_doe_schools/` |
57
+ | DOH-hospitals specialist (Coney Island) | ✓ | `experiments/10_doh_hospitals/` |
58
+ | FSM integration of all 4 register specialists | ✓ | `app/registers/`, `app/fsm.py`, `app/reconcile.py`, `web/static/agent.js` |
59
+ | AMD droplet TerraMind smoke + STAC manifest | ✓ | `129.212.182.52:/root/terramind_nyc/` |
60
+
61
+ End-to-end smoke on "Coney Island Brooklyn" produced citations
62
+ `[mta_entrance_56]`, `[nycha_dev_239]`, `[nycha_dev_166]` alongside
63
+ `[rag_mta]` and `[nyc311]` — family-prefix chip routing works.
64
+
65
+ Last commit: `86861be` (FSM integration of 4 register specialists).
66
+
67
+ ## Decisions locked
68
+
69
+ - **Refusal classification dropped entirely.** Planner-level
70
+ classifier hit FN=0% but FP=7% (gate was <5%). Granite Guardian
71
+ itself was already abandoned (laptop-infeasible). After the audit
72
+ surfaced that the planner shim was documented-but-never-wired,
73
+ the decision is now Option C: drop refusal handling. Cold-start
74
+ framing scopes the audience; Mellea rejection sampling enforces
75
+ grounding integrity; the four-tier glyph margin carries the
76
+ epistemic-honesty signal. The `GuardianRefusal.svelte` component
77
+ is deleted (was only ever rendered on a documentation page).
78
+ Demo's integrity beat is the **Mellea grounding-failure reroll on
79
+ the curated Hollis 0.19% → 19% case**. `experiments/06_granite_guardian/`
80
+ is preserved as a "considered and rejected" artifact for the
81
+ methodology paper.
82
+ - **AMD path: `129.212.182.52` is production**, not `165.245.134.44`.
83
+ CLAUDE.md says the latter; **fix CLAUDE.md to match reality**.
84
+ Production vLLM is on `.52`. The TerraMind container shares the
85
+ GPU with vLLM; both fit on one MI300X.
86
+ - **TerraMind manifest is 1028 paired chips**, 2021-05 → 2026-04,
87
+ NYC 5-borough hull +5 km, S2-cloud <30%, ≤3-day pair window. One
88
+ year (2022-05 → 2023-04) returned 0 due to PC API intermittency —
89
+ acceptable for the micro-fine-tune.
90
+
91
+ ## First thing Monday morning
92
+
93
+ 1. **Refresh Microsoft Planetary Computer signed URLs.** They have
94
+ ~1 hr TTL; the manifest from Sunday evening is stale by morning.
95
+ On the droplet:
96
+ ```bash
97
+ ssh root@129.212.182.52
98
+ docker exec -it terramind bash
99
+ cd /root/terramind_nyc
100
+ python build_manifest.py --refresh-only manifest_train.jsonl
101
+ python build_manifest.py --refresh-only manifest_holdout.jsonl
102
+ ```
103
+ (Recipe is in `/root/terramind_nyc/NOTES.md` on the droplet.)
104
+
105
+ 2. **Kick off TerraMind-NYC fine-tune.** Spec at
106
+ `experiments/05_terramind_nyc_finetune/eval/eval_spec.md`. Budget
107
+ is 30 GPU-hours; alarm at 25 (set on the droplet). Predicted
108
+ actual: ~0.16 GPU-hours at bs=8 / 3 epochs. Don't run anything
109
+ experimental until eval-spec gates pass on the held-out set.
110
+
111
+ 3. **Decide bucket** (A ship-in-demo / B publish-only / C revert):
112
+ - A: ship the fine-tuned checkpoint as a Riprap specialist.
113
+ - B: publish to HF as `msradam/TerraMind-1.0-NYC` with model card,
114
+ don't ship in demo. **Bucket B is fully acceptable** per the
115
+ spec — civic-tech publication discipline is the durable goal.
116
+ - C: discard checkpoint, no public artefact.
117
+
118
+ ## Working on Monday
119
+
120
+ - TerraMind-NYC fine-tune (above).
121
+ - **Mellea grounding-failure demo prep.** The pitch demo is the
122
+ Hollis 0.19% → 19% case where Granite emits a number with the
123
+ wrong order of magnitude and Mellea catches it. Demo script
124
+ needs to:
125
+ - Show the failed first attempt (banner: "Mellea reroll: numerics
126
+ grounding failed").
127
+ - Show the second attempt with the corrected number.
128
+ - Show the audit panel with the pass/fail per-requirement.
129
+ - Show wall-clock for the reroll (target: under 30 s end-to-end).
130
+ - Currently reproducible via `scripts/probe_mellea.py --query
131
+ "Hollis" --runs 5`. The demo script is the *visual* version.
132
+ - **MTA Sandy-recovery citation layer.** Parse the MTA "Hurricane
133
+ Sandy: Three Years Later" report into per-station-id facts so
134
+ the subway-entrance specialist can emit
135
+ `[mta_recovery_<station_id>]` doc messages alongside the
136
+ exposure ones.
137
+ - **NYCHA polygon-fill on the map.** Overnight session shipped
138
+ NYCHA developments as centroid pins on the map (graded by
139
+ `pct_inside_sandy ≥ 50%`). The next tightening is to add a
140
+ `geometry_geojson` field to `app/registers/nycha.py`'s
141
+ `DevelopmentFinding` dataclass and route through SSE so
142
+ `register-polygons` actually renders graded fills (the layer +
143
+ source are already present in `RipMap.svelte`).
144
+ - **PLUTO/Building-Footprints join** for Stuyvesant Town etc.
145
+ Overnight pass shipped buffered-point overlap (NYU Langone,
146
+ Stuyvesant HS, P.S. 89 now correctly flip to
147
+ `inside_sandy_2012=true`). The 100m hospital buffer / 50m school
148
+ buffer is honest but coarse; PLUTO + actual building footprints
149
+ is the next step for the very-large-campus assets.
150
+
151
+ ## Outstanding through Friday
152
+
153
+ In rough priority order:
154
+
155
+ 1. **More specialists**:
156
+ - FEMA OpenFEMA NFIP claims tract-aggregated (pending).
157
+ - NWS NWPS reach-level forecast + USGS NWIS Bronx / Saw Mill /
158
+ Hutchinson rivers.
159
+ - NYC DEP CSO outfalls + Bluebelt + Green Infrastructure
160
+ specialist (CSS-vs-MS4 distinction for ASCE).
161
+ - Three more TTM r2 specialists (USGS streamgage stage, NWS
162
+ rainfall accumulation, NYC 311 sewer-backup citywide rate).
163
+ **FloodNet forecast already shipped in the overnight pass.**
164
+ 2. **Visual identity refresh**: Carto Positron, IBM Plex, four-tier
165
+ epistemic palette, WeasyPrint PDF export, trace UI as `<details>`
166
+ tree.
167
+ 3. **WCAG 2.2 AA pass.**
168
+ 4. **Methodology paper draft** (6–8 page PDF). Goal: Saturday May 9.
169
+ 5. **Historical-event mode** — vintage-cutoff queries. Saturday.
170
+ 6. **Five Build-in-Public posts** through the week.
171
+ 7. **5-minute hackathon pitch + 3 demo queries.** Friday rehearsal.
172
+ 8. **ASCE talk materials** — May 13 (post-hackathon).
173
+
174
+ ## Sharp edges to remember
175
+
176
+ - **Static assets cache hard.** When iterating on Svelte or
177
+ agent.js, hard-reload (⌘⇧R). No cache-busting in place.
178
+ - **HF Space sleeps after idle.** Free tier; first request after
179
+ sleep is a 30–90 s cold start. Ping the space before any demo.
180
+ - **vLLM cold compile.** First few requests against a fresh
181
+ `vllm serve` log surprisingly low throughput while ROCm kernels
182
+ JIT. Run benchmarks 3+ times before believing them.
183
+ - **Sandy GeoJSON has self-intersection issues** that blow up
184
+ `unary_union`. Use `buffer(0)` (caught and fixed for NYCHA;
185
+ may surface again for any new polygon-overlap specialist).
186
+ - **DEP column is `Flooding_Category` (int16)**, not `depth_class`.
187
+ Documented in NYCHA RESULTS.md.
188
+ - **Centroid-edge join false-negatives** on NYU Langone / Stuyvesant
189
+ / P.S. 89 because their centroid points lie just outside the OEM
190
+ Sandy polygon despite real 2012 basement flooding. PLUTO
191
+ footprint join is the queued fix.
192
+ - **Don't restart uvicorn while a model is mid-generation.** Ollama
193
+ keeps the request alive but the FastAPI handler dies, leaving
194
+ the user staring at a dead stream.
195
+
196
+ ## Files to read in order on Monday morning
197
+
198
+ 1. This file.
199
+ 2. `experiments/05_terramind_nyc_finetune/eval/eval_spec.md` — the
200
+ contract for what training output triggers ship/publish/revert.
201
+ 3. `experiments/06_granite_guardian/RESULTS.md` — the Guardian →
202
+ planner pivot decision record (so you know why Guardian is in
203
+ the repo but not on the demo path).
204
+ 4. `experiments/07_mta_entrances/RESULTS.md` — the canonical
205
+ register-specialist pattern (the other three follow it).
206
+ 5. `CLAUDE.md` — fix the AMD droplet IP (165.245.134.44 →
207
+ 129.212.182.52) at the same time as the first edit of the day.
208
+
209
+ ## Status as of 2026-05-03 ~12:50 ET
210
+
211
+ - Both git remotes (origin + huggingface) up-to-date through
212
+ `86861be`.
213
+ - HF Space rebuild was *not* triggered on the FSM-integration
214
+ commit; do `git push huggingface main` when you want to deploy.
215
+ (You may want to wait until Monday afternoon so a broken HF
216
+ rebuild doesn't eat morning time.)
217
+ - Local Ollama has both `granite4.1:3b` and `granite4.1:8b` warm.
218
+ - AMD droplet `129.212.182.52` has the `terramind` container
219
+ running with TerraTorch 1.2.7 + pystac-client + planetary-
220
+ computer installed in system Python; HF cache populated.
221
+ - 200-query adversarial set + planner-pivot eval results
222
+ reproducible from `experiments/06_granite_guardian/` in ~3 min.
223
+ - Mellea probe still works: `scripts/probe_mellea.py --query
224
+ "Hollis" --runs 5`.
agent.py CHANGED
@@ -1,4 +1,4 @@
1
- """HeliOS-NYC agent CLI: address -> cited paragraph via the Burr FSM.
2
 
3
  Usage:
4
  python agent.py "180 Beach 35 St, Queens"
 
1
+ """Riprap agent CLI: address -> cited briefing via the Burr FSM.
2
 
3
  Usage:
4
  python agent.py "180 Beach 35 St, Queens"
app/areas/__init__.py ADDED
File without changes
app/areas/nta.py ADDED
@@ -0,0 +1,224 @@
+ """NYC Neighborhood Tabulation Area (NTA 2020) resolver.
+ 
+ NTAs are NYC Department of City Planning's official neighborhood unit:
+ ~262 polygons covering all 5 boroughs, including some park / airport
+ slivers. They are the canonical "neighborhood" unit for NYC civic data.
+ 
+ This module provides:
+ - load() → GeoDataFrame with all NTAs (cached)
+ - resolve(name) → list of matching NTAs by fuzzy name match, or by borough
+ - by_code(code) → exact lookup
+ - polygon_for(code) → shapely Polygon in EPSG:4326
+ """
+ from __future__ import annotations
+ 
+ import re
+ from functools import lru_cache
+ from pathlib import Path
+ from typing import Any
+ 
+ import geopandas as gpd
+ from shapely.geometry import Polygon
+ 
+ DATA_PATH = Path(__file__).resolve().parents[2] / "data" / "nyc_ntas_2020.geojson"
+ 
+ # Common alias map: user-typed strings → canonical NTA names. We don't need to
+ # be exhaustive here; the fuzzy matcher catches most cases. This handles the
+ # few hard ones where the official NTA name differs from local usage.
+ ALIASES = {
+     "the rockaways": "Rockaway Beach-Arverne-Edgemere",
+     "rockaway": "Rockaway Beach-Arverne-Edgemere",
+     "brighton": "Brighton Beach",
+     "lower east side": "Lower East Side",
+     "les": "Lower East Side",
+     "soho": "SoHo-Little Italy-Hudson Square",
+     "tribeca": "Tribeca-Civic Center",
+     "fidi": "Financial District-Battery Park City",
+     "downtown brooklyn": "Downtown Brooklyn-DUMBO-Boerum Hill",
+     "dumbo": "Downtown Brooklyn-DUMBO-Boerum Hill",
+     "park slope": "Park Slope",
+     "carroll gardens": "Carroll Gardens-Cobble Hill-Gowanus-Red Hook",
+     "red hook": "Carroll Gardens-Cobble Hill-Gowanus-Red Hook",
+     "gowanus": "Carroll Gardens-Cobble Hill-Gowanus-Red Hook",
+     "hollis": "Queens Village-Hollis-Bellerose",
+     "long island city": "Hunters Point-Sunnyside-West Maspeth",
+     "lic": "Hunters Point-Sunnyside-West Maspeth",
+     "astoria": "Astoria (Central)",
+     "flushing": "Flushing-Willets Point",
+     "harlem": "Central Harlem (North)",
+     "east harlem": "East Harlem (North)",
+     "washington heights": "Washington Heights (North)",
+     "midtown": "Midtown South-Flatiron-Union Square",
+     "upper east side": "Upper East Side-Carnegie Hill",
+     "ues": "Upper East Side-Carnegie Hill",
+     "upper west side": "Upper West Side-Lincoln Square",
+     "uws": "Upper West Side-Lincoln Square",
+     "coney island": "Coney Island-Sea Gate",
+ }
+ 
+ BOROUGH_NORMALIZE = {
+     "manhattan": "Manhattan", "mn": "Manhattan",
+     "brooklyn": "Brooklyn", "bk": "Brooklyn", "kings": "Brooklyn",
+     "queens": "Queens", "qn": "Queens",
+     "bronx": "Bronx", "the bronx": "Bronx", "bx": "Bronx",
+     "staten island": "Staten Island", "si": "Staten Island", "richmond": "Staten Island",
+ }
+ 
+ 
+ def _normalize(s: str) -> str:
+     return re.sub(r"[^a-z]+", "", (s or "").lower())
+ 
+ 
+ @lru_cache(maxsize=1)
+ def load() -> gpd.GeoDataFrame:
+     """Load the NTA 2020 GeoJSON; coerce CRS to EPSG:4326. Cached."""
+     g = gpd.read_file(DATA_PATH)
+     if g.crs is None:
+         g = g.set_crs("EPSG:4326")
+     elif g.crs.to_string() != "EPSG:4326":
+         g = g.to_crs("EPSG:4326")
+     return g
+ 
+ 
+ def by_code(code: str) -> dict | None:
+     g = load()
+     hit = g[g["nta2020"] == code]
+     if hit.empty:
+         return None
+     return _row_to_dict(hit.iloc[0])
+ 
+ 
+ def _row_to_dict(row) -> dict:
+     return {
+         "nta_code": row["nta2020"],
+         "nta_name": row["ntaname"],
+         "borough": row["boroname"],
+         "cdta": row.get("cdtaname"),
+         "geometry": row["geometry"],
+     }
+ 
+ 
+ def borough_match(query: str) -> str | None:
+     """If query matches a borough name (or common abbreviation), return the
+     canonical name. Otherwise return None."""
+     q = query.strip().lower()
+     return BOROUGH_NORMALIZE.get(q)
+ 
+ 
+ def resolve(query: str) -> list[dict[str, Any]]:
+     """Resolve a free-text query to NTA(s).
+ 
+     Strategy (in priority order):
+     1. Borough match → all NTAs in borough.
+     2. Alias map → exact NTA name match.
+     3. Case-insensitive EXACT name match (so 'Kew Gardens' wins over
+        'Kew Gardens Hills' when both exist).
+     4. Substring match on normalized NTA name. When multiple match,
+        prefer the one whose normalized name length is closest to the
+        query — avoids 'Kew Gardens' resolving to 'Kew Gardens Hills'.
+     5. CDTA-name substring fallback.
+     """
+     g = load()
+     q = (query or "").strip()
+     if not q:
+         return []
+     boro = borough_match(q)
+     if boro:
+         hits = g[g["boroname"] == boro]
+         return [_row_to_dict(r) for _, r in hits.iterrows()]
+ 
+     alias = ALIASES.get(q.lower())
+     if alias:
+         hits = g[g["ntaname"] == alias]
+         if not hits.empty:
+             return [_row_to_dict(r) for _, r in hits.iterrows()]
+ 
+     # Exact (case-insensitive) — preferred over substring
+     name_lower = g["ntaname"].fillna("").str.lower()
+     exact = g[name_lower == q.lower()]
+     if not exact.empty:
+         return [_row_to_dict(r) for _, r in exact.iterrows()]
+ 
+     qn = _normalize(q)
+     if not qn:
+         return []
+     name_norm = g["ntaname"].fillna("").map(_normalize)
+     contains = g[name_norm.str.contains(qn, na=False)].copy()
+     if not contains.empty:
+         contains["_diff"] = contains["ntaname"].fillna("").map(
+             lambda s: abs(len(_normalize(s)) - len(qn))
+         )
+         contains = contains.sort_values("_diff")
+         return [_row_to_dict(r) for _, r in contains.iterrows()]
+ 
+     cdta_norm = g["cdtaname"].fillna("").map(_normalize)
+     contains = g[cdta_norm.str.contains(qn, na=False)]
+     if not contains.empty:
+         return [_row_to_dict(r) for _, r in contains.iterrows()]
+ 
+     return []
+ 
+ 
+ def polygon_for(code: str) -> Polygon | None:
+     hit = by_code(code)
+     return hit["geometry"] if hit else None
+ 
+ 
+ def resolve_from_text(text: str) -> list[dict[str, Any]]:
+     """Scan free-text (e.g. a full natural-language query) for any known NTA
+     name, alias, or borough. Returns the first match. This is the fallback
+     when the planner failed to extract a clean target.
+ 
+     Strategy: walk ALIASES first (cheap), then iterate NTA names and look
+     for the longest match contained in the text. We prefer the longest
+     match so 'Carroll Gardens' wins over 'Gardens'.
+     """
+     t = (text or "").lower()
+     if not t:
+         return []
+     # Boroughs first (whole-word-ish — avoid false hits inside "queensland"
+     # etc.; padding the text covers start-of-string and end-of-string too).
+     for boro_key, canon in BOROUGH_NORMALIZE.items():
+         if f" {boro_key} " in f" {t} ":
+             hits = resolve(canon)
+             if hits:
+                 return hits
+     # Alias keys, longest first
+     for key in sorted(ALIASES.keys(), key=len, reverse=True):
+         if key in t:
+             hits = resolve(key)
+             if hits:
+                 return hits
+     # NTA names, longest first, with a word-boundary check: multi-word
+     # names match before shorter fragments, and a query containing only
+     # "Kew Gardens" can't accidentally resolve to "Kew Gardens Hills" —
+     # the longer name fires only when it appears whole, bounded by
+     # whitespace or punctuation, in the text.
+     g = load()
+     names = sorted(set(g["ntaname"].dropna().str.lower().tolist()), key=len, reverse=True)
+     matches = []
+     for name in names:
+         if not name or len(name) < 4:
+             continue
+         # Word-boundary-ish check: the name must appear bounded by
+         # whitespace or trailing punctuation within the padded text.
+         padded_t = f" {t} "
+         if f" {name} " in padded_t or f" {name}." in padded_t or f" {name}," in padded_t or f" {name}?" in padded_t:
+             matches.append(name)
+     if matches:
+         # Prefer the longest word-boundary match — most specific.
+         best = sorted(matches, key=len, reverse=True)[0]
+         hits = resolve(best)
+         if hits:
+             return hits
+     # Fallback: any substring (no boundary). Less precise, but catches
+     # casual queries like "show me red hook" where "red hook" is a
+     # neighborhood-name fragment within a longer NTA name.
+     for name in names:
+         if not name or len(name) < 4:
+             continue
+         if name in t:
+             hits = resolve(name)
+             if hits:
+                 return hits
+     return []
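
The closest-length ranking in `resolve()` step 4 can be illustrated standalone. This is a toy re-implementation (the names list and `rank_candidates` helper are illustrative, not part of the module), showing why normalizing and sorting by length difference keeps "Kew Gardens" from resolving to "Kew Gardens Hills":

```python
import re

def normalize(s: str) -> str:
    # Collapse to bare lowercase letters, as the resolver's _normalize does.
    return re.sub(r"[^a-z]+", "", (s or "").lower())

def rank_candidates(query: str, names: list[str]) -> list[str]:
    # Substring match on normalized names, then sort so the name whose
    # normalized length is closest to the query's comes first.
    qn = normalize(query)
    hits = [n for n in names if qn in normalize(n)]
    return sorted(hits, key=lambda n: abs(len(normalize(n)) - len(qn)))

names = ["Kew Gardens", "Kew Gardens Hills", "Forest Hills"]
print(rank_candidates("Kew Gardens", names))
# ['Kew Gardens', 'Kew Gardens Hills']
```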
app/assets/mta_entrances.py CHANGED
@@ -8,7 +8,6 @@ register is built for.
  """
  from __future__ import annotations
 
- import json
  from pathlib import Path
 
  import geopandas as gpd
app/context/dob_permits.py ADDED
@@ -0,0 +1,258 @@
+ """NYC DOB construction-permit specialist — "what are they building".
+ 
+ Pulls active NYC DOB Permit Issuance records (Socrata `ipu4-2q9a`)
+ inside a polygon, filtered to recent New Building (NB), major
+ Alteration (A1), and Demolition (DM) jobs. Each project is then
+ cross-referenced against the static flood layers (Sandy 2012, DEP
+ Stormwater scenarios) so the reconciler can write things like:
+ 
+     "12 active major construction projects in Gowanus. Of these,
+     8 sit inside the DEP Extreme-2080 stormwater scenario."
+ 
+ The dataset uses separate gis_latitude / gis_longitude columns rather
+ than a Socrata Point, so we bbox-filter via SoQL then do exact
+ point-in-polygon containment client-side with shapely.
+ """
+ from __future__ import annotations
+ 
+ import logging
+ from collections import Counter
+ from dataclasses import asdict, dataclass
+ from datetime import date, datetime, timedelta, timezone
+ from typing import Any
+ 
+ import geopandas as gpd
+ import httpx
+ from shapely.geometry import Point
+ 
+ log = logging.getLogger("riprap.dob_permits")
+ 
+ URL = "https://data.cityofnewyork.us/resource/ipu4-2q9a.json"
+ DOC_ID = "dob_permits"
+ CITATION = ("NYC DOB Permit Issuance (NYC OpenData ipu4-2q9a) — "
+             "issued/in-progress construction permits")
+ 
+ JOB_TYPE_LABELS = {
+     "NB": "new building",
+     "A1": "major alteration (use/occupancy)",
+     "A2": "minor alteration",
+     "A3": "minor work / interior",
+     "DM": "demolition",
+     "SG": "sign",
+     "PL": "plumbing",
+     "EQ": "equipment",
+ }
+ 
+ # Default filter: focus on "what are they building" — new construction,
+ # major alterations, demolitions. Skip minor mechanical permits.
+ DEFAULT_JOB_TYPES = ("NB", "A1", "DM")
+ 
+ 
+ @dataclass
+ class Permit:
+     job_id: str
+     job_type: str
+     job_type_label: str
+     permit_status: str
+     issuance_date: str
+     expiration_date: str | None
+     address: str
+     borough: str
+     bbl: str | None
+     lat: float
+     lon: float
+     owner_business: str | None
+     permittee_business: str | None
+     nta_name: str | None
+ 
+ 
+ def permits_in_bbox(min_lat: float, min_lon: float,
+                     max_lat: float, max_lon: float,
+                     job_types: tuple[str, ...] = DEFAULT_JOB_TYPES,
+                     since: date | None = None,
+                     limit: int = 5000) -> list[Permit]:
+     """Pull DOB permits intersecting a bounding box, recently issued, with
+     matching job types. We expand from polygon to bbox and rely on the
+     caller to do exact point-in-polygon filtering."""
+     if since is None:
+         since = date.today() - timedelta(days=540)  # ~18 months
+     # gis_latitude/gis_longitude are stored as text in this dataset; cast
+     # to number for the bbox compare. issuance_date is a floating timestamp
+     # surfaced as 'MM/DD/YYYY' string — cast explicitly to floating_timestamp
+     # so the comparator parses ISO dates correctly. BETWEEN is picky on text
+     # columns, so use explicit >= / <= operators.
+     where = (
+         f"job_type IN ({','.join(repr(t) for t in job_types)})"
+         f" AND issuance_date::floating_timestamp >= '{since.isoformat()}'"
+         f" AND gis_latitude::number >= {min_lat}"
+         f" AND gis_latitude::number <= {max_lat}"
+         f" AND gis_longitude::number >= {min_lon}"
+         f" AND gis_longitude::number <= {max_lon}"
+     )
+     r = httpx.get(URL, params={
+         "$select": ",".join([
+             "job__", "job_type", "permit_status", "issuance_date",
+             "expiration_date", "house__", "street_name", "borough",
+             "block", "lot",
+             "gis_latitude", "gis_longitude", "owner_s_business_name",
+             "permittee_s_business_name", "gis_nta_name",
+         ]),
+         "$where": where,
+         "$order": "issuance_date desc",
+         "$limit": str(limit),
+     }, timeout=60)
+     r.raise_for_status()
+     out: list[Permit] = []
+     for row in r.json():
+         try:
+             lat = float(row["gis_latitude"])
+             lon = float(row["gis_longitude"])
+         except (KeyError, ValueError, TypeError):
+             continue
+         addr = " ".join(filter(None, [
+             row.get("house__"),
+             (row.get("street_name") or "").title(),
+         ])).strip()
+         # DOB has no `bbl` column; compose from borough + block + lot.
+         # Borough codes: MAN=1, BX=2, BK=3, QN=4, SI=5.
+         boro_code = {"MANHATTAN": "1", "BRONX": "2", "BROOKLYN": "3",
+                      "QUEENS": "4", "STATEN ISLAND": "5"}.get(
+             (row.get("borough") or "").upper())
+         block = (row.get("block") or "").lstrip("0")
+         lot = (row.get("lot") or "").lstrip("0")
+         bbl = (f"{boro_code}-{block.zfill(5)}-{lot.zfill(4)}"
+                if boro_code and block and lot else None)
+         out.append(Permit(
+             job_id=row.get("job__", ""),
+             job_type=row.get("job_type", ""),
+             job_type_label=JOB_TYPE_LABELS.get(row.get("job_type", ""), row.get("job_type", "")),
+             permit_status=row.get("permit_status", ""),
+             issuance_date=(row.get("issuance_date") or "")[:10],
+             expiration_date=(row.get("expiration_date") or "")[:10] or None,
+             address=addr,
+             borough=(row.get("borough") or "").title(),
+             bbl=bbl,
+             lat=lat,
+             lon=lon,
+             owner_business=row.get("owner_s_business_name"),
+             permittee_business=row.get("permittee_s_business_name"),
+             nta_name=row.get("gis_nta_name"),
+         ))
+     return out
+ 
+ 
+ def permits_in_polygon(polygon, polygon_crs: str = "EPSG:4326",
+                        job_types: tuple[str, ...] = DEFAULT_JOB_TYPES,
+                        since: date | None = None) -> list[Permit]:
+     """Permits inside a polygon. Uses bbox prefilter + exact shapely
+     point-in-polygon."""
+     g = gpd.GeoDataFrame(geometry=[polygon], crs=polygon_crs).to_crs("EPSG:4326")
+     geom = g.iloc[0].geometry
+     minx, miny, maxx, maxy = geom.bounds
+     raw = permits_in_bbox(miny, minx, maxy, maxx, job_types=job_types, since=since)
+     out: list[Permit] = []
+     for p in raw:
+         pt = Point(p.lon, p.lat)
+         if geom.intersects(pt):  # intersects ⊇ contains; keeps boundary points
+             out.append(p)
+     # Dedupe by job_id (one job can have multiple permits as work proceeds)
+     seen: dict[str, Permit] = {}
+     for p in out:
+         # Keep the most-recently-issued permit per job
+         cur = seen.get(p.job_id)
+         if cur is None or (p.issuance_date or "") > (cur.issuance_date or ""):
+             seen[p.job_id] = p
+     return list(seen.values())
+ 
+ 
+ def cross_reference_flood(permits: list[Permit]) -> list[dict[str, Any]]:
+     """Tag each permit with which flood layers cover its point.
+     Adds: in_sandy (bool), dep_max_class (highest depth class hit across
+     DEP scenarios), dep_scenarios (list of scenario ids that fired)."""
+     if not permits:
+         return []
+     from app.flood_layers import dep_stormwater, sandy_inundation
+     pts = gpd.GeoDataFrame(
+         geometry=[Point(p.lon, p.lat) for p in permits],
+         crs="EPSG:4326",
+     ).to_crs("EPSG:2263")
+     pts["_pid"] = list(range(len(pts)))
+ 
+     sandy_flags = sandy_inundation.join(pts).reset_index(drop=True).tolist()
+ 
+     dep_hits = {scen: dep_stormwater.join(pts, scen)["depth_class"].astype(int).tolist()
+                 for scen in ("dep_extreme_2080", "dep_moderate_2050", "dep_moderate_current")}
+ 
+     out = []
+     for i, p in enumerate(permits):
+         scen_hits = {s: dep_hits[s][i] for s in dep_hits}
+         max_class = max(scen_hits.values(), default=0)
+         active_scens = [s for s, c in scen_hits.items() if c > 0]
+         out.append({
+             **asdict(p),
+             "in_sandy": bool(sandy_flags[i]),
+             "dep_max_class": max_class,
+             "dep_scenarios": active_scens,
+             "any_flood_layer_hit": bool(sandy_flags[i] or max_class > 0),
+         })
+     return out
+ 
+ 
+ def summary_for_polygon(polygon, polygon_crs: str = "EPSG:4326",
+                         since_days: int = 540,
+                         top_n: int = 8) -> dict:
+     """Full polygon-mode summary: list active permits, cross-reference each
+     with flood layers, return aggregate counts + a top-N projects-of-concern
+     list (those that hit at least one flood layer, ranked by max DEP class
+     + Sandy hit)."""
+     since = date.today() - timedelta(days=since_days)
+     permits = permits_in_polygon(polygon, polygon_crs=polygon_crs, since=since)
+     enriched = cross_reference_flood(permits)
+ 
+     by_type: Counter = Counter(e["job_type_label"] for e in enriched)
+     by_status: Counter = Counter(e["permit_status"] for e in enriched)
+     n_total = len(enriched)
+     n_sandy = sum(1 for e in enriched if e["in_sandy"])
+     n_dep_any = sum(1 for e in enriched if e["dep_max_class"] > 0)
+     n_dep_severe = sum(1 for e in enriched if e["dep_max_class"] >= 2)
+     n_any_flood = sum(1 for e in enriched if e["any_flood_layer_hit"])
+ 
+     # Rank: severity = (in_sandy * 3) + dep_max_class
+     def severity(e):
+         return (3 if e["in_sandy"] else 0) + e["dep_max_class"]
+     flagged = sorted(
+         [e for e in enriched if e["any_flood_layer_hit"]],
+         key=severity, reverse=True,
+     )[:top_n]
+ 
+     # Light projection of every permit for map pinning (no need to ship the
+     # full permit record for the not-flagged ones — the map only needs lat,
+     # lon, address, job_type_label, and the flood-flag fields).
+     all_pins = [
+         {
+             "lat": e["lat"],
+             "lon": e["lon"],
+             "address": e["address"],
+             "job_type": e["job_type"],
+             "in_sandy": e["in_sandy"],
+             "dep_max_class": e["dep_max_class"],
+             "any_flood": e["any_flood_layer_hit"],
+         }
+         for e in enriched
+     ]
+     return {
+         "since": since.isoformat(),
+         "n_total": n_total,
+         "n_in_sandy": n_sandy,
+         "n_in_dep_any": n_dep_any,
+         "n_in_dep_severe": n_dep_severe,
+         "n_any_flood": n_any_flood,
+         "by_job_type": dict(by_type.most_common()),
+         "by_permit_status": dict(by_status.most_common()),
+         "flagged_top": flagged,
+         "all_pins": all_pins,
+         "all_count": n_total,
+     }
+ 
+ 
+ def now_iso() -> str:
+     return datetime.now(timezone.utc).date().isoformat()
app/context/gliner_extract.py ADDED
@@ -0,0 +1,125 @@
+ """GLiNER (urchade/gliner_medium-v2.1) typed-entity extraction over the
+ RAG retriever's top paragraphs.
+ 
+ Adds structured fields to the reconciler's grounding context. For each
+ RAG chunk the specialist emits, GLiNER produces a list of typed spans
+ with one of five labels:
+ 
+     nyc_location            (e.g. "Coney Island")
+     dollar_amount           (e.g. "$5.6 million")
+     date_range              (e.g. "fiscal year 2025-2027")
+     agency                  (e.g. "NYC DEP")
+     infrastructure_project  (e.g. "Bluebelt expansion")
+ 
+ The doc_id for emission is `gliner_<source>` where `<source>` is the
+ RAG chunk's doc_id stripped of its `rag_` prefix. So `rag_comptroller`
+ becomes `gliner_comptroller`. The reconciler can then cite typed
+ fields with `[gliner_comptroller]`.
+ 
+ License: Apache-2.0 — `urchade/gliner_medium-v2.1` (NOT the
+ `gliner_base` variant, which is CC-BY-NC-4.0). See
+ experiments/shared/licenses.md.
+ """
+ 
+ from __future__ import annotations
+ 
+ import logging
+ import os
+ from dataclasses import dataclass
+ 
+ log = logging.getLogger("riprap.gliner")
+ 
+ ENTITY_LABELS = [
+     "nyc_location",
+     "dollar_amount",
+     "date_range",
+     "agency",
+     "infrastructure_project",
+ ]
+ 
+ DEFAULT_THRESHOLD = float(os.environ.get("RIPRAP_GLINER_THRESHOLD", "0.45"))
+ MODEL_NAME = os.environ.get("RIPRAP_GLINER_MODEL", "urchade/gliner_medium-v2.1")
+ ENABLE = os.environ.get("RIPRAP_GLINER_ENABLE", "1").lower() in ("1", "true", "yes")
+ 
+ _MODEL = None  # lazy
+ 
+ 
+ @dataclass
+ class Extraction:
+     label: str
+     text: str
+     score: float
+ 
+ 
+ def _ensure_model():
+     """Lazy GLiNER load. Returns None if disabled or load fails so
+     callers can silently fall back to no-op."""
+     global _MODEL
+     if not ENABLE:
+         return None
+     if _MODEL is not None:
+         return _MODEL
+     try:
+         from gliner import GLiNER
+         log.info("gliner: loading %s", MODEL_NAME)
+         _MODEL = GLiNER.from_pretrained(MODEL_NAME)
+     except Exception:
+         log.exception("gliner: load failed; specialist will no-op")
+         _MODEL = False  # sentinel
+     return _MODEL or None
+ 
+ 
+ def warm():
+     _ensure_model()
+ 
+ 
+ def _source_short(rag_doc_id: str) -> str:
+     """`rag_comptroller` -> `comptroller`. Anything not prefixed `rag_`
+     passes through unchanged."""
+     return rag_doc_id[4:] if rag_doc_id.startswith("rag_") else rag_doc_id
+ 
+ 
+ def extract_for_chunk(text: str, threshold: float = DEFAULT_THRESHOLD) -> list[Extraction]:
+     model = _ensure_model()
+     if model is None or not text:
+         return []
+     raw = model.predict_entities(text, ENTITY_LABELS, threshold=threshold)
+     return [Extraction(label=r["label"], text=r["text"],
+                        score=float(r["score"])) for r in raw]
+ 
+ 
+ def extract_for_rag_hits(hits: list[dict],
+                          threshold: float = DEFAULT_THRESHOLD,
+                          max_hits: int = 3) -> dict[str, dict]:
+     """Run GLiNER on the top-`max_hits` RAG hits. Returns a dict keyed by
+     short source id (e.g. "comptroller") with the structured payload
+     that the FSM stores into state["gliner"] and that
+     reconcile.build_documents() consumes."""
+     out: dict[str, dict] = {}
+     if not hits:
+         return out
+     for h in hits[:max_hits]:
+         source = _source_short(h.get("doc_id", "rag_unknown"))
+         ents = extract_for_chunk(h.get("text", ""), threshold=threshold)
+         if not ents:
+             continue
+         # Dedup verbatim repeats (common in agency PDFs that repeat
+         # "DEP" 13 times in a methodology section).
+         seen = set()
+         deduped: list[Extraction] = []
+         for e in ents:
+             key = (e.label, e.text.lower())
+             if key in seen:
+                 continue
+             seen.add(key)
+             deduped.append(e)
+         out[source] = {
+             "rag_doc_id": h.get("doc_id"),
+             "title": h.get("title"),
+             "paragraph_excerpt": h.get("text", "")[:240]
+                 + ("…" if len(h.get("text", "")) > 240 else ""),
+             "n_entities": len(deduped),
+             "entities": [{"label": e.label, "text": e.text,
+                           "score": round(e.score, 3)} for e in deduped],
+         }
+     return out
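
The dedup step inside `extract_for_rag_hits` keys on `(label, lowercased text)`, so "DEP" and "dep" under the same label collapse to one entry while the same text under a different label survives. A standalone sketch (`dedupe_entities` and the sample list are illustrative):

```python
def dedupe_entities(ents: list[dict]) -> list[dict]:
    # Keep the first occurrence of each (label, lowercased text) pair,
    # as the specialist does for repeated agency mentions.
    seen, out = set(), []
    for e in ents:
        key = (e["label"], e["text"].lower())
        if key in seen:
            continue
        seen.add(key)
        out.append(e)
    return out

ents = [{"label": "agency", "text": "DEP"},
        {"label": "agency", "text": "dep"},            # case-insensitive repeat
        {"label": "dollar_amount", "text": "$5.6 million"}]
print(len(dedupe_entities(ents)))  # 2
```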
app/context/microtopo.py CHANGED
@@ -123,10 +123,9 @@ def _row_col(transform, lat: float, lon: float) -> tuple[int, int]:
      """Inverse-affine: WGS84 (lon,lat) -> raster (row, col).
      Mirrors rasterio.transform.rowcol but without holding a dataset handle.
      """
-     # affine: x = a*col + b*row + c ; y = d*col + e*row + f
-     # invert: col = (a_inv * (x - c)) approx — we have a diagonal affine
-     a, b, c, d, e, f = transform.a, transform.b, transform.c, transform.d, transform.e, transform.f
-     # diagonal case (b=d=0, common for north-up rasters):
+     # Diagonal affine (north-up raster): x = a*col + c, y = e*row + f.
+     a, c = transform.a, transform.c
+     e, f = transform.e, transform.f
      col = int(round((lon - c) / a))
      row = int(round((lat - f) / e))
      return row, col
@@ -206,3 +205,70 @@ def microtopo_at(lat: float, lon: float, radius_m: int = 750) -> Microtopo | None:
          twi=twi_v,
          hand_m=hand_v,
      )
+ 
+ 
+ def microtopo_for_polygon(polygon, polygon_crs: str = "EPSG:4326") -> dict | None:
+     """Polygon-mode aggregation: distributional summary of the DEM/HAND/TWI
+     rasters clipped to the polygon. Returns medians + fraction of cells
+     in flood-prone bands. Used for neighborhood-mode queries."""
+     state = _load_dem()
+     if state is None:
+         return None
+     try:
+         import rasterio
+         from rasterio.mask import mask as rio_mask
+     except Exception:
+         return None
+     import geopandas as gpd
+ 
+     poly = gpd.GeoDataFrame(geometry=[polygon], crs=polygon_crs).to_crs("EPSG:4326")
+     geom = [poly.iloc[0].geometry.__geo_interface__]
+ 
+     def _stats(path: Path) -> dict | None:
+         if not path.exists():
+             return None
+         try:
+             with rasterio.open(path) as src:
+                 clipped, _ = rio_mask(src, geom, crop=True, filled=False)
+             arr = clipped[0]
+             vals = arr.compressed() if hasattr(arr, "compressed") else arr.flatten()
+             vals = vals[np.isfinite(vals)]
+             if vals.size == 0:
+                 return None
+             return {
+                 "n_cells": int(vals.size),
+                 "min": float(np.min(vals)),
+                 "median": float(np.median(vals)),
+                 "p10": float(np.percentile(vals, 10)),
+                 "p90": float(np.percentile(vals, 90)),
+                 "max": float(np.max(vals)),
+                 "raw": vals,
+             }
+         except Exception as e:
+             log.warning("polygon raster mask failed for %s: %r", path.name, e)
+             return None
+ 
+     elev = _stats(DEM_PATH)
+     hand = _stats(HAND_PATH)
+     twi = _stats(TWI_PATH)
+     if elev is None:
+         return None
+ 
+     # Fraction of polygon cells in canonical flood-prone bands
+     frac_hand_lt1 = (
+         round(float((hand["raw"] < 1.0).mean()), 4) if hand else None
+     )
+     frac_twi_gt10 = (
+         round(float((twi["raw"] > 10.0).mean()), 4) if twi else None
+     )
+     return {
+         "n_cells": elev["n_cells"],
+         "elev_min_m": round(elev["min"], 2),
+         "elev_median_m": round(elev["median"], 2),
+         "elev_p10_m": round(elev["p10"], 2),
+         "elev_max_m": round(elev["max"], 2),
+         "hand_median_m": round(hand["median"], 2) if hand else None,
+         "twi_median": round(twi["median"], 2) if twi else None,
+         "frac_hand_lt1": frac_hand_lt1,
+         "frac_twi_gt10": frac_twi_gt10,
+     }
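
The diagonal-affine inversion in `_row_col` reduces to two scalar divisions: with `lon = a*col + c` and `lat = e*row + f` (where `e` is negative because rows grow southward), inverting gives `col = (lon - c) / a` and `row = (lat - f) / e`. A standalone worked example — the grid parameters here are made up for illustration, not taken from the project's DEM:

```python
def row_col(a, c, e, f, lat, lon):
    # Diagonal (north-up) affine: lon = a*col + c, lat = e*row + f,
    # so the inverse is a pair of scalar divisions plus rounding to
    # the nearest cell index.
    col = int(round((lon - c) / a))
    row = int(round((lat - f) / e))
    return row, col

# Hypothetical grid: 0.0003° cells, top-left corner at (40.9 N, -74.3 E);
# e is negative since row 0 is the northernmost row.
print(row_col(a=0.0003, c=-74.3, e=-0.0003, f=40.9,
              lat=40.7006, lon=-74.0142))  # (665, 953)
```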
app/context/noaa_tides.py ADDED
@@ -0,0 +1,110 @@
1
+ """NOAA CO-OPS Tides & Currents — live coastal water level.
2
+
3
+ api.tidesandcurrents.noaa.gov, no auth, 6-min cadence.
4
+
5
+ We pick the nearest of three NYC-region stations to the queried address:
6
+ - 8518750 The Battery, NY
7
+ - 8516945 Kings Point, NY (Long Island Sound entrance)
8
+ - 8531680 Sandy Hook, NJ (NY Harbor approach)
9
+
10
+ The verified-water-level API returns instantaneous water elevation
11
+ relative to MLLW (Mean Lower Low Water — the local tidal datum). To
12
+ distinguish "high tide" from "storm surge" we also fetch the published
13
+ predicted tide and report the residual.
14
+ """
15
+ from __future__ import annotations
16
+
17
+ from dataclasses import dataclass
18
+ from math import asin, cos, radians, sin, sqrt
19
+
20
+ import httpx
21
+
22
+ DOC_ID = "noaa_tides"
23
+ CITATION = "NOAA CO-OPS Tides & Currents (api.tidesandcurrents.noaa.gov)"
24
+ URL = "https://api.tidesandcurrents.noaa.gov/api/prod/datagetter"
25
+
26
+ STATIONS = [
27
+ # (id, name, lat, lon)
28
+ # NYC harbor + Long Island Sound
29
+ ("8518750", "The Battery, NY", 40.7006, -74.0142),
30
+ ("8516945", "Kings Point, NY", 40.8103, -73.7649),
31
+ ("8531680", "Sandy Hook, NJ", 40.4669, -74.0094),
32
+ # Hudson tidal corridor (head-of-tide is Troy / Albany; Hudson is tidal
33
+ # all the way up to the Federal Lock at Troy)
34
+ ("8518995", "Albany, NY (Hudson)", 42.6469, -73.7464),
35
+ ("8518962", "Turkey Point Hudson, NY", 41.7569, -73.9433),
36
+ ("8519483", "West Point, NY", 41.3845, -73.9536),
37
+ ]
38
+
39
+
40
+ @dataclass
41
+ class TideReading:
42
+ station_id: str
43
+ station_name: str
44
+ distance_km: float
45
+ observed_ft: float | None # current water level above MLLW
46
+    predicted_ft: float | None   # astronomical prediction at same instant
+    residual_ft: float | None    # observed - predicted (≈ storm surge)
+    obs_time: str | None
+    error: str | None = None
+
+
+def _haversine_km(lat1, lon1, lat2, lon2) -> float:
+    R = 6371.0
+    p1, p2 = radians(lat1), radians(lat2)
+    dp = radians(lat2 - lat1); dl = radians(lon2 - lon1)
+    a = sin(dp/2)**2 + cos(p1)*cos(p2)*sin(dl/2)**2
+    return 2 * R * asin(sqrt(a))
+
+
+def _nearest_station(lat: float, lon: float):
+    return min(STATIONS, key=lambda s: _haversine_km(lat, lon, s[2], s[3]))
+
+
+def _fetch(station_id: str, product: str) -> dict:
+    r = httpx.get(URL, params={
+        "date": "latest", "station": station_id, "product": product,
+        "datum": "MLLW", "units": "english", "time_zone": "lst_ldt",
+        "format": "json",
+    }, timeout=8.0)
+    r.raise_for_status()
+    return r.json()
+
+
+def reading_at(lat: float, lon: float) -> TideReading:
+    sid, name, slat, slon = _nearest_station(lat, lon)
+    dist_km = round(_haversine_km(lat, lon, slat, slon), 1)
+    out = TideReading(station_id=sid, station_name=name, distance_km=dist_km,
+                      observed_ft=None, predicted_ft=None, residual_ft=None,
+                      obs_time=None)
+    try:
+        obs = _fetch(sid, "water_level").get("data") or []
+        pred = _fetch(sid, "predictions").get("predictions") or []
+        if obs:
+            out.observed_ft = round(float(obs[0]["v"]), 2)
+            out.obs_time = obs[0].get("t")
+        if pred:
+            out.predicted_ft = round(float(pred[0]["v"]), 2)
+        if out.observed_ft is not None and out.predicted_ft is not None:
+            out.residual_ft = round(out.observed_ft - out.predicted_ft, 2)
+    except Exception as e:
+        out.error = str(e)
+    return out
+
+
+def summary_for_point(lat: float, lon: float) -> dict:
+    r = reading_at(lat, lon)
+    # Look up station coords for the map marker.
+    sta = next((s for s in STATIONS if s[0] == r.station_id), None)
+    return {
+        "station_id": r.station_id,
+        "station_name": r.station_name,
+        "station_lat": sta[2] if sta else None,
+        "station_lon": sta[3] if sta else None,
+        "distance_km": r.distance_km,
+        "observed_ft_mllw": r.observed_ft,
+        "predicted_ft_mllw": r.predicted_ft,
+        "residual_ft": r.residual_ft,
+        "obs_time": r.obs_time,
+        "error": r.error,
+    }
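The nearest-station pick above is a plain haversine minimization over a small static list. A standalone sketch of the same pattern; the two station entries and their coordinates are approximate, for illustration only:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance on a 6371 km mean-radius sphere.
    R = 6371.0
    p1, p2 = radians(lat1), radians(lat2)
    dp = radians(lat2 - lat1)
    dl = radians(lon2 - lon1)
    a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
    return 2 * R * asin(sqrt(a))

# Two NOAA CO-OPS stations, (id, name, lat, lon) — coordinates approximate.
STATIONS = [
    ("8518750", "The Battery, NY", 40.7006, -74.0142),
    ("8516945", "Kings Point, NY", 40.8103, -73.7649),
]

def nearest_station(lat, lon):
    return min(STATIONS, key=lambda s: haversine_km(lat, lon, s[2], s[3]))

# A point in Lower Manhattan resolves to The Battery.
sid, name, *_ = nearest_station(40.703, -74.012)
```

Linear scan is fine here: with under a dozen stations per picker there is nothing to gain from a spatial index.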
app/context/nws_alerts.py ADDED
@@ -0,0 +1,71 @@
+"""NWS API — active alerts at a point.
+
+api.weather.gov/alerts/active?point={lat},{lon}, no auth, JSON.
+A User-Agent header is required (NWS rate-limits anonymous traffic).
+
+We surface only flood-relevant categories so the doc the reconciler
+sees is short and on-topic.
+"""
+from __future__ import annotations
+
+from typing import Any
+
+import httpx
+
+DOC_ID = "nws_alerts"
+CITATION = "NWS public alert API (api.weather.gov/alerts)"
+
+USER_AGENT = "Riprap-NYC/0.1 (civic-flood-tool; +https://huggingface.co/spaces/msradam/riprap-nyc)"
+
+_FLOOD_EVENT_KEYWORDS = (
+    "flood", "flash flood", "coastal flood", "high surf", "storm surge",
+    "hurricane", "tropical storm", "tornado warning",  # high-impact context
+    "rip current",
+)
+
+
+def _is_flood_relevant(event_name: str) -> bool:
+    e = (event_name or "").lower()
+    return any(k in e for k in _FLOOD_EVENT_KEYWORDS)
+
+
+def alerts_at(lat: float, lon: float) -> list[dict[str, Any]]:
+    r = httpx.get(
+        "https://api.weather.gov/alerts/active",
+        params={"point": f"{lat:.4f},{lon:.4f}"},
+        headers={"User-Agent": USER_AGENT, "Accept": "application/geo+json"},
+        timeout=8.0,
+    )
+    r.raise_for_status()
+    out = []
+    for f in r.json().get("features", []):
+        p = f.get("properties", {}) or {}
+        event = p.get("event") or ""
+        if not _is_flood_relevant(event):
+            continue
+        out.append({
+            "id": p.get("id"),
+            "event": event,
+            "severity": p.get("severity"),
+            "urgency": p.get("urgency"),
+            "certainty": p.get("certainty"),
+            "headline": p.get("headline"),
+            "sent": p.get("sent"),
+            "effective": p.get("effective"),
+            "expires": p.get("expires"),
+            "sender_name": p.get("senderName"),
+            "areaDesc": p.get("areaDesc"),
+        })
+    return out
+
+
+def summary_for_point(lat: float, lon: float) -> dict:
+    try:
+        active = alerts_at(lat, lon)
+    except Exception as e:
+        return {"n_active": 0, "alerts": [], "error": str(e)}
+    return {
+        "n_active": len(active),
+        "alerts": active,
+        "error": None,
+    }
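The relevance filter is case-insensitive substring matching, so an NWS event name like "Coastal Flood Advisory" matches several keywords at once (the bare "flood" entry already subsumes the compound flood keywords; they are kept for readability). A minimal standalone sketch of the filter:

```python
FLOOD_EVENT_KEYWORDS = (
    "flood", "flash flood", "coastal flood", "high surf", "storm surge",
    "hurricane", "tropical storm", "tornado warning", "rip current",
)

def is_flood_relevant(event_name: str) -> bool:
    # Case-insensitive substring match; None-safe via `or ""`.
    e = (event_name or "").lower()
    return any(k in e for k in FLOOD_EVENT_KEYWORDS)

# "Red Flag Warning" (fire weather) is correctly excluded; a missing
# event name never raises.
kept = [e for e in ("Coastal Flood Advisory", "Red Flag Warning",
                    "Flash Flood Warning", None)
        if is_flood_relevant(e)]
```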
app/context/nws_obs.py ADDED
@@ -0,0 +1,108 @@
+"""NWS station observations — latest hourly METAR for the nearest NYC airport.
+
+api.weather.gov/stations/{id}/observations/latest.
+
+Five NYC-region ASOS stations cover the city, plus four Hudson
+Corridor stations; we pick the nearest. Most useful field for flood
+context is hourly precipitation (the `precipitationLastHour`
+quantity, mm). The latest observation is typically <60 min old.
+"""
+from __future__ import annotations
+
+from dataclasses import dataclass
+from math import asin, cos, radians, sin, sqrt
+
+import httpx
+
+DOC_ID = "nws_obs"
+CITATION = "NWS station observations API (api.weather.gov/stations)"
+
+USER_AGENT = "Riprap-NYC/0.1 (civic-flood-tool; +https://huggingface.co/spaces/msradam/riprap-nyc)"
+
+# NYC + Hudson Corridor ASOS stations. Picker is haversine-nearest, so adding
+# upstate stations enables Albany / Poughkeepsie / Newburgh queries without
+# breaking NYC behaviour (NYC stations stay closer for NYC lat/lon).
+STATIONS = [
+    # NYC region
+    ("KNYC", "Central Park, NY", 40.7794, -73.9692),
+    ("KLGA", "LaGuardia Airport, NY", 40.7794, -73.8800),
+    ("KJFK", "JFK Airport, NY", 40.6413, -73.7781),
+    ("KEWR", "Newark Liberty, NJ", 40.6925, -74.1687),
+    ("KFRG", "Republic Farmingdale, NY", 40.7288, -73.4134),
+    # Hudson Corridor (south → north)
+    ("KHPN", "White Plains, NY", 41.0670, -73.7076),
+    ("KSWF", "Newburgh-Stewart, NY", 41.5042, -74.1048),
+    ("KPOU", "Poughkeepsie, NY", 41.6262, -73.8842),
+    ("KALB", "Albany Intl, NY", 42.7475, -73.8025),
+]
+
+
+@dataclass
+class Obs:
+    station_id: str
+    station_name: str
+    distance_km: float
+    obs_time: str | None
+    temp_c: float | None
+    precip_last_hour_mm: float | None
+    precip_last_3h_mm: float | None
+    precip_last_6h_mm: float | None
+    error: str | None = None
+
+
+def _haversine_km(lat1, lon1, lat2, lon2) -> float:
+    R = 6371.0
+    p1, p2 = radians(lat1), radians(lat2)
+    dp = radians(lat2 - lat1); dl = radians(lon2 - lon1)
+    a = sin(dp/2)**2 + cos(p1)*cos(p2)*sin(dl/2)**2
+    return 2 * R * asin(sqrt(a))
+
+
+def _val_mm(props, key) -> float | None:
+    """NWS returns {value: ..., unitCode: 'wmoUnit:mm'} per quantity.
+    Return the value rounded to 2 dp, in whatever unit the API reports
+    (mm for precipitation, °C for temperature); if value is null,
+    return None."""
+    q = (props or {}).get(key) or {}
+    v = q.get("value")
+    if v is None:
+        return None
+    return round(float(v), 2)
+
+
+def obs_at(lat: float, lon: float) -> Obs:
+    sid, name, slat, slon = min(STATIONS, key=lambda s: _haversine_km(lat, lon, s[2], s[3]))
+    dist_km = round(_haversine_km(lat, lon, slat, slon), 1)
+    out = Obs(station_id=sid, station_name=name, distance_km=dist_km,
+              obs_time=None, temp_c=None,
+              precip_last_hour_mm=None, precip_last_3h_mm=None,
+              precip_last_6h_mm=None)
+    try:
+        r = httpx.get(
+            f"https://api.weather.gov/stations/{sid}/observations/latest",
+            headers={"User-Agent": USER_AGENT, "Accept": "application/geo+json"},
+            timeout=8.0,
+        )
+        r.raise_for_status()
+        p = r.json().get("properties", {}) or {}
+        out.obs_time = p.get("timestamp")
+        out.temp_c = _val_mm(p, "temperature")
+        out.precip_last_hour_mm = _val_mm(p, "precipitationLastHour")
+        out.precip_last_3h_mm = _val_mm(p, "precipitationLast3Hours")
+        out.precip_last_6h_mm = _val_mm(p, "precipitationLast6Hours")
+    except Exception as e:
+        out.error = str(e)
+    return out
+
+
+def summary_for_point(lat: float, lon: float) -> dict:
+    o = obs_at(lat, lon)
+    return {
+        "station_id": o.station_id,
+        "station_name": o.station_name,
+        "distance_km": o.distance_km,
+        "obs_time": o.obs_time,
+        "temp_c": o.temp_c,
+        "precip_last_hour_mm": o.precip_last_hour_mm,
+        "precip_last_3h_mm": o.precip_last_3h_mm,
+        "precip_last_6h_mm": o.precip_last_6h_mm,
+        "error": o.error,
+    }
app/context/nyc311.py CHANGED
@@ -35,6 +35,8 @@ class Complaint:
     created_date: str
     address: str | None
     status: str | None
+    lat: float | None = None
+    lon: float | None = None
 
 
 def complaints_near(lat: float, lon: float, radius_m: float = 200,
@@ -46,12 +48,69 @@ def complaints_near(lat: float, lon: float, radius_m: float = 200,
         ts = since.replace(tzinfo=None).isoformat(timespec="seconds")
         where += f" AND created_date >= '{ts}'"
     r = httpx.get(URL, params={
-        "$select": "unique_key, descriptor, created_date, incident_address, status",
+        "$select": "unique_key, descriptor, created_date, incident_address, "
+                   "status, latitude, longitude",
         "$where": where,
         "$order": "created_date desc",
         "$limit": str(limit),
     }, timeout=30)
     r.raise_for_status()
+    out = []
+    for row in r.json():
+        lat = row.get("latitude")
+        lon = row.get("longitude")
+        try:
+            lat = float(lat) if lat is not None else None
+            lon = float(lon) if lon is not None else None
+        except Exception:
+            lat, lon = None, None
+        out.append(Complaint(
+            unique_key=row.get("unique_key", ""),
+            descriptor=row.get("descriptor", ""),
+            created_date=row.get("created_date", ""),
+            address=row.get("incident_address"),
+            status=row.get("status"),
+            lat=lat, lon=lon,
+        ))
+    return out
+
+
+def summary_for_point(lat: float, lon: float, radius_m: float = 200,
+                      years: int = 5) -> dict:
+    since = datetime.now(timezone.utc) - timedelta(days=365 * years)
+    cs = complaints_near(lat, lon, radius_m, since=since, limit=2000)
+    return _summarize(cs, years=years, radius_m=radius_m)
+
+
+def complaints_in_polygon(polygon, polygon_crs: str = "EPSG:4326",
+                          since: datetime | None = None,
+                          limit: int = 5000,
+                          simplify_tolerance: float = 0.0005) -> list[Complaint]:
+    """Pull flood-related complaints inside an arbitrary polygon via
+    Socrata's `within_polygon(location, 'MULTIPOLYGON(...)')` predicate.
+
+    NYC NTA polygons can have thousands of vertices and exceed Socrata's
+    URL length limit (414). We simplify in EPSG:4326 with a default
+    ~50 m tolerance, which collapses vertex count ~10-20× without
+    materially changing the contained-points result.
+
+    Polygon must be EPSG:4326 (lat/lon) for the Socrata query.
+    """
+    import geopandas as gpd
+    g = gpd.GeoDataFrame(geometry=[polygon], crs=polygon_crs).to_crs("EPSG:4326")
+    geom = g.iloc[0].geometry.simplify(simplify_tolerance, preserve_topology=True)
+    wkt = geom.wkt
+    where = f"{_DESC_CLAUSE} AND within_polygon(location, '{wkt}')"
+    if since:
+        ts = since.replace(tzinfo=None).isoformat(timespec="seconds")
+        where += f" AND created_date >= '{ts}'"
+    r = httpx.get(URL, params={
+        "$select": "unique_key, descriptor, created_date, incident_address, status",
+        "$where": where,
+        "$order": "created_date desc",
+        "$limit": str(limit),
+    }, timeout=60)
+    r.raise_for_status()
     return [
         Complaint(
             unique_key=row.get("unique_key", ""),
@@ -64,12 +123,28 @@ def complaints_near(lat: float, lon: float, radius_m: float = 200,
     ]
 
 
-def summary_for_point(lat: float, lon: float, radius_m: float = 200,
-                      years: int = 5) -> dict:
+def summary_for_polygon(polygon, polygon_crs: str = "EPSG:4326",
+                        years: int = 5) -> dict:
+    """Polygon-mode aggregation: counts of flood-related 311 complaints
+    inside the polygon over the trailing window."""
     since = datetime.now(timezone.utc) - timedelta(days=365 * years)
-    cs = complaints_near(lat, lon, radius_m, since=since, limit=2000)
+    cs = complaints_in_polygon(polygon, polygon_crs=polygon_crs, since=since)
+    return _summarize(cs, years=years, radius_m=None)
+
+
+def _summarize(cs: list[Complaint], years: int, radius_m: float | None) -> dict:
     by_year: Counter = Counter(c.created_date[:4] for c in cs if c.created_date)
     by_descriptor: Counter = Counter(c.descriptor for c in cs)
+    # Cap at 60 most-recent points for the map layer — keeps the SSE
+    # payload small while still showing meaningful clustering.
+    points = [
+        {"lat": c.lat, "lon": c.lon,
+         "descriptor": c.descriptor,
+         "date": c.created_date[:10],
+         "address": c.address}
+        for c in cs[:60]
+        if c.lat is not None and c.lon is not None
+    ]
     return {
         "n": len(cs),
         "radius_m": radius_m,
@@ -82,4 +157,5 @@ def summary_for_point(lat: float, lon: float, radius_m: float = 200,
              "address": c.address}
            for c in cs[:5]
        ],
+        "points": points,
    }
app/context/terramind_synthesis.py ADDED
@@ -0,0 +1,357 @@
+"""TerraMind v1 base as a real-time FSM node — DEM → ESRI LULC.
+
+Per user query: take the geocoded (lat, lon), pull a DEM patch from
+Riprap's existing NYC-wide LiDAR raster (already used by the microtopo
+specialist — no STAC dependency), run TerraMind to generate a
+plausible categorical land-cover map from the terrain context, and
+emit class fractions the reconciler can cite as a synthetic-prior
+context layer alongside the empirical and modeled flood evidence.
+
+Why DEM → LULC (and not DEM → S2L2A as initially prototyped):
+- LULC is *categorical* and *interpretable*. The output is one of
+  10 ESRI Land Cover classes per pixel; class fractions like "78%
+  Built Area" go straight into the briefing as cite-able claims.
+- S2L2A is 12-channel reflectance — uninterpretable downstream
+  without a separate segmentation head.
+- LULC is *comparable to ground truth*: NYC PLUTO land-use class
+  is already in the data layer; future calibration possible.
+
+Class label mapping is *tentative* against the ESRI 2020-2022 schema
+(which TerraMesh's LULC tokenizer was trained on). The doc body
+discloses the mapping as tentative and the reconciler is instructed
+to use hedged framing ("the synthetic land-cover prior identifies …
+likely class …") rather than asserting hard labels.
+
+Why this shape:
+- **No STAC dependency.** Microsoft Planetary Computer search has
+  been intermittent during this hackathon; the DEM raster is local
+  and always available.
+- **Real-time.** < 0.3 s synthesis + < 0.5 s DEM patch read on M3
+  CPU once warm.
+- **Honesty discipline.** Synthetic-prior tier, fourth epistemic
+  class alongside empirical / modeled / proxy.
+
+License: Apache-2.0 — `ibm-esa-geospatial/TerraMind-1.0-base`.
+"""
+
+from __future__ import annotations
+
+import logging
+import os
+import random
+import threading
+import time
+from typing import Any
+
+log = logging.getLogger("riprap.terramind")
+
+ENABLE = os.environ.get("RIPRAP_TERRAMIND_ENABLE", "1").lower() in ("1", "true", "yes")
+DEFAULT_STEPS = int(os.environ.get("RIPRAP_TERRAMIND_STEPS", "10"))
+DEFAULT_SEED = int(os.environ.get("RIPRAP_TERRAMIND_SEED", "42"))
+CHIP_PX = int(os.environ.get("RIPRAP_TERRAMIND_CHIP_PX", "224"))
+CHIP_M = CHIP_PX * 30  # NYC DEM is at 30 m -> 6.72 km square
+HALF_M = CHIP_M / 2
+
+_MODEL = None
+_INIT_LOCK = threading.Lock()
+
+# Tentative ESRI 2020-2022 Land Cover class mapping for TerraMind v1's
+# LULC tokenizer output (10 channels, argmax over channel axis -> class
+# index 0-9). The README/docs don't expose the exact mapping and the
+# tokenizer source confirms only "ESRI LULC" without a label table, so
+# the names below are best-effort. The doc body discloses tentativeness.
+LULC_CLASSES = [
+    "water",               # 0
+    "trees",               # 1
+    "grass",               # 2
+    "flooded_vegetation",  # 3
+    "crops",               # 4
+    "scrub_shrub",         # 5
+    "built_area",          # 6
+    "bare_ground",         # 7
+    "snow_ice",            # 8
+    "clouds_or_no_data",   # 9
+]
+
+
+def _has_required_deps() -> tuple[bool, str | None]:
+    """Probe deps. Distinguishes a *truly missing* package
+    (ModuleNotFoundError) from a *transient race* (other ImportError —
+    typically sklearn's "partially initialized module" from concurrent
+    imports inside the parallel-fanout block).
+
+    Truly missing returns (False, names). Transient race returns
+    (True, None) — let the caller try again, the import will resolve
+    on the next attempt once the racing thread finishes.
+    """
+    missing = []
+    for name in ("terratorch", "rasterio"):
+        try:
+            __import__(name)
+        except ModuleNotFoundError:
+            missing.append(name)
+        except ImportError:
+            # sklearn-style partial-init race; treat as available and
+            # let _ensure_model retry. Logged but not surfaced as missing.
+            log.debug("terramind: import race on %s, will retry on demand", name)
+    return (not missing, ", ".join(missing) if missing else None)
+
+
+_DEPS_OK, _DEPS_MISSING = _has_required_deps()
+
+
+def _ensure_model():
+    """Lazy load with a lock so the parallel-block worker can't double-init."""
+    global _MODEL
+    if _MODEL is not None:
+        return _MODEL
+    with _INIT_LOCK:
+        if _MODEL is not None:
+            return _MODEL
+        # Heavy import deferred to first call so module import stays cheap
+        # and HF Spaces (no terratorch) doesn't pay it at all.
+        import terratorch.models.backbones.terramind.model.terramind_register  # noqa
+        from terratorch.registry import FULL_MODEL_REGISTRY
+        log.info("terramind: loading v1 base generate (DEM -> LULC)")
+        m = FULL_MODEL_REGISTRY.build(
+            "terratorch_terramind_v1_base_generate",
+            modalities=["DEM"],
+            output_modalities=["LULC"],
+            pretrained=True,
+            timesteps=DEFAULT_STEPS,
+        )
+        m.eval()
+        _MODEL = m
+        log.info("terramind: model ready")
+        return _MODEL
+
+
+def warm():
+    """Call at app boot to amortize the ~6 s checkpoint load + first-call
+    JIT. No-op when deps are absent."""
+    if ENABLE and _DEPS_OK:
+        try:
+            _ensure_model()
+        except Exception:
+            log.exception("terramind: warm() failed; specialist will no-op")
+
+
+def _read_dem_patch(lat: float, lon: float):
+    """Read a CHIP_PX×CHIP_PX DEM patch centered on (lat, lon) from the
+    local NYC-wide LiDAR raster. Returns (array, bounds_4326) where
+    bounds_4326 is (minlon, minlat, maxlon, maxlat) so the synthesised
+    LULC can be georeferenced onto the same extent for map rendering.
+    Returns None if outside the raster's extent."""
+    from pathlib import Path
+
+    import numpy as np
+    import rasterio
+    from rasterio.windows import from_bounds
+    dem_path = (Path(__file__).resolve().parents[2]
+                / "data" / "nyc_dem_30m.tif")
+    if not dem_path.exists():
+        return None
+    with rasterio.open(dem_path) as src:
+        # The DEM is in EPSG:4326 (geographic) in our cache — convert
+        # the chip extent in the same CRS by building a rough degree
+        # bbox from a meters-square half-side at NYC latitude.
+        # 1 degree lat ≈ 111 km, 1 degree lon ≈ 85 km at 40.7°N.
+        d_lat = (HALF_M / 111_000.0)
+        d_lon = (HALF_M / 85_000.0)
+        win = from_bounds(lon - d_lon, lat - d_lat,
+                          lon + d_lon, lat + d_lat,
+                          src.transform)
+        arr = src.read(1, window=win, boundless=True, fill_value=0).astype("float32")
+    if arr.size == 0 or arr.shape[0] < 8 or arr.shape[1] < 8:
+        return None
+    # Resize to CHIP_PX × CHIP_PX via torch interpolation. The exact
+    # pixel-perfect alignment doesn't matter for a synthetic prior; the
+    # model just needs a real terrain patch to condition on.
+    import torch
+    t = torch.from_numpy(arr).unsqueeze(0).unsqueeze(0)
+    t = torch.nn.functional.interpolate(t, size=(CHIP_PX, CHIP_PX),
+                                        mode="bilinear", align_corners=False)
+    out = t.squeeze(0).numpy()  # (1, CHIP_PX, CHIP_PX)
+    # Replace NaN sentinel values with median elevation so the model
+    # doesn't see NaN tokens.
+    if np.isnan(out).any():
+        med = float(np.nanmedian(out))
+        out = np.nan_to_num(out, nan=med)
+    bounds_4326 = (lon - d_lon, lat - d_lat, lon + d_lon, lat + d_lat)
+    return out, bounds_4326
+
+
+# Map class index -> visual color for the categorical fill on the
+# MapLibre layer. Colors picked to be visually distinct from the
+# existing red (Sandy) / blue (DEP) / cyan (Prithvi) / orange (Ida HWM).
+LULC_FILL_COLORS = {
+    "water": "#0284c7",               # not used (we keep water clear so
+                                      # the underlying basemap shows)
+    "trees": "#16a34a",               # green
+    "grass": "#86efac",               # pale green
+    "flooded_vegetation": "#a3e635",  # lime
+    "crops": "#fde047",               # yellow
+    "scrub_shrub": "#bef264",
+    "built_area": "#9ca3af",          # neutral gray
+    "bare_ground": "#d6d3d1",         # warm light gray
+    "snow_ice": "#f3f4f6",
+    "clouds_or_no_data": "#000000",   # not used (kept transparent)
+}
+# Classes we don't render at all (transparent) — water is best left
+# uncolored so the basemap shoreline reads through; clouds/no-data is
+# semantically meaningless to fill.
+LULC_HIDE_CLASSES = {"water", "clouds_or_no_data"}
+
+
+def _polygonize_lulc(class_idx, bounds_4326: tuple) -> dict:
+    """Vectorize the per-pixel argmax classification into one MultiPolygon
+    per class label, then dump as a single GeoJSON FeatureCollection in
+    EPSG:4326. Each feature carries `label` + `class_idx` properties so
+    the frontend can colour by category.
+    """
+    import json
+
+    import geopandas as gpd
+    from rasterio.features import shapes
+    from rasterio.transform import from_bounds as transform_from_bounds
+    from shapely.geometry import shape
+
+    minlon, minlat, maxlon, maxlat = bounds_4326
+    h, w = class_idx.shape
+    transform = transform_from_bounds(minlon, minlat, maxlon, maxlat, w, h)
+    feats = []
+    for i, label in enumerate(LULC_CLASSES):
+        if label in LULC_HIDE_CLASSES:
+            continue
+        mask = (class_idx == i).astype("uint8")
+        if mask.sum() < 8:  # skip tiny noise
+            continue
+        polys = []
+        for geom, value in shapes(mask, mask=mask.astype(bool),
+                                  transform=transform):
+            if value != 1:
+                continue
+            polys.append(shape(geom))
+        if not polys:
+            continue
+        # Dissolve via geopandas + simplify lightly. The chip is 30 m
+        # per pixel and we don't need pixel-edge fidelity at urban zoom.
+        gdf = gpd.GeoDataFrame({"geometry": polys}, crs="EPSG:4326")
+        gdf["geometry"] = gdf.geometry.simplify(1e-4, preserve_topology=True)
+        for geom in gdf.geometry:
+            feats.append({
+                "type": "Feature",
+                "geometry": json.loads(gpd.GeoSeries([geom],
+                    crs="EPSG:4326").to_json())["features"][0]["geometry"],
+                "properties": {"label": label, "class_idx": i,
+                               "fill_color": LULC_FILL_COLORS.get(label, "#9ca3af")},
+            })
+    return {"type": "FeatureCollection", "features": feats}
+
+
+def fetch(lat: float, lon: float, timeout_s: float = 60.0) -> dict[str, Any]:
+    """Run the specialist. Returns:
+        { ok: bool,
+          skipped: str | None,
+          synthetic_modality: bool,
+          tim_chain: list[str],
+          diffusion_steps: int, diffusion_seed: int,
+          dem_mean_m: float,
+          class_fractions: dict[str, float],  # tentative ESRI labels
+          dominant_class: str,                # highest-fraction label
+          dominant_pct: float,
+          n_classes_observed: int,
+          chip_shape: list[int],
+          elapsed_s: float,
+          err: str | None }
+
+    Designed never to raise. Failures show up as ok=False with reason.
+    """
+    if not ENABLE:
+        return {"ok": False, "skipped": "RIPRAP_TERRAMIND_ENABLE=0"}
+    if not _DEPS_OK:
+        return {"ok": False, "skipped": f"deps unavailable: {_DEPS_MISSING}"}
+    t0 = time.time()
+    try:
+        import numpy as np
+        patch = _read_dem_patch(lat, lon)
+        if patch is None:
+            return {"ok": False, "skipped": "no DEM coverage at this point"}
+        dem, bounds_4326 = patch
+        dem_mean = float(dem.mean())
+
+        import torch
+        random.seed(DEFAULT_SEED)
+        torch.manual_seed(DEFAULT_SEED)
+
+        model = _ensure_model()
+        dem_t = torch.from_numpy(dem).unsqueeze(0).float()  # (1, 1, H, W)
+        if time.time() - t0 > timeout_s:
+            return {"ok": False, "skipped": "terramind exceeded budget"}
+
+        with torch.no_grad():
+            out = model({"DEM": dem_t}, timesteps=DEFAULT_STEPS,
+                        verbose=False)
+        lulc = out["LULC"]
+        if hasattr(lulc, "detach"):
+            lulc = lulc.detach().cpu().numpy()
+        if lulc.ndim == 4:
+            lulc = lulc[0]  # (n_classes, H, W)
+        # Argmax over class channel -> per-pixel class index, then
+        # fraction by class. This is the cite-able structured output.
+        class_idx = lulc.argmax(axis=0)  # (H, W)
+        unique, counts = np.unique(class_idx, return_counts=True)
+        total = float(class_idx.size)
+        fractions: dict[str, float] = {}
+        for u, c in zip(unique, counts, strict=False):
+            label = (LULC_CLASSES[int(u)] if 0 <= int(u) < len(LULC_CLASSES)
+                     else f"class_{int(u)}")
+            fractions[label] = round(100.0 * c / total, 2)
+        # Sort dominant -> tail for deterministic doc body ordering.
+        ordered = dict(sorted(fractions.items(),
+                              key=lambda kv: kv[1], reverse=True))
+        dominant_class = next(iter(ordered)) if ordered else "unknown"
+        dominant_pct = ordered.get(dominant_class, 0.0)
+        # Class indices map to TerraMesh's LULC tokenizer codebook; the
+        # exact label-to-index mapping isn't published. Surface a tentative
+        # name plus the raw index so a reader can see we're not asserting
+        # ground truth.
+        dominant_idx = next((i for i, lbl in enumerate(LULC_CLASSES)
+                             if lbl == dominant_class), -1)
+        dominant_display = (
+            f"class_{dominant_idx} (tentative: {dominant_class})"
+            if dominant_idx >= 0 else dominant_class
+        )
+
+        # Polygonize the categorical raster for the map layer.
+        # Best-effort — failure here doesn't fail the specialist.
+        try:
+            polygons_geojson = _polygonize_lulc(class_idx, bounds_4326)
+        except Exception:
+            log.exception("terramind: polygonize failed; skipping map layer")
+            polygons_geojson = None
+
+        return {
+            "ok": True,
+            "synthetic_modality": True,
+            "tim_chain": ["DEM", "LULC_synthetic"],
+            "diffusion_steps": DEFAULT_STEPS,
+            "diffusion_seed": DEFAULT_SEED,
+            "dem_mean_m": round(dem_mean, 2),
+            "class_fractions": ordered,
+            "dominant_class": dominant_class,
+            "dominant_class_display": dominant_display,
+            "dominant_pct": dominant_pct,
+            "n_classes_observed": len(ordered),
+            "chip_shape": list(lulc.shape),
+            "bounds_4326": list(bounds_4326),
+            "polygons_geojson": polygons_geojson,
+            "label_schema": "ESRI 2020-2022 Land Cover (tentative — "
+                            "TerraMind tokenizer source confirms ESRI but "
+                            "not exact label-to-index mapping)",
+            "elapsed_s": round(time.time() - t0, 2),
+        }
+    except Exception as e:
+        log.exception("terramind: fetch failed")
+        return {"ok": False, "err": f"{type(e).__name__}: {e}",
+                "elapsed_s": round(time.time() - t0, 2)}
app/flood_layers/dep_stormwater.py CHANGED
@@ -8,7 +8,6 @@ Four scenarios, all in EPSG:2263. Polygons are categorized by depth class:
 from __future__ import annotations
 
 from functools import lru_cache
-from pathlib import Path
 
 import geopandas as gpd
 
@@ -71,3 +70,42 @@ def join(assets: gpd.GeoDataFrame, scenario: str) -> gpd.GeoDataFrame:
 
 def label(scenario: str) -> str:
     return SCENARIOS[scenario]["label"]
+
+
+def coverage_for_polygon(polygon, scenario: str,
+                         polygon_crs: str = "EPSG:4326") -> dict:
+    """Polygon-level summary: what fraction of the input polygon falls into
+    each depth class for a given DEP scenario? Used in neighborhood mode.
+
+    Returns:
+        {
+          'scenario': scenario id,
+          'label': human-readable scenario name,
+          'fraction_any': fraction of polygon inside any flooded class,
+          'fraction_class': {1: f, 2: f, 3: f} fraction in each class,
+          'polygon_area_m2': total polygon area,
+        }
+    """
+    z = load(scenario)
+    poly_gdf = gpd.GeoDataFrame(geometry=[polygon], crs=polygon_crs).to_crs(NYC_CRS)
+    poly_geom = poly_gdf.iloc[0].geometry
+    poly_ft2 = float(poly_geom.area)
+    sqft_to_m2 = 0.092903
+    fraction_class = {1: 0.0, 2: 0.0, 3: 0.0}
+    if poly_ft2:
+        for cat in (1, 2, 3):
+            sub = z[z["Flooding_Category"] == cat]
+            if sub.empty:
+                continue
+            inter = sub.geometry.intersection(poly_geom)
+            inter = inter[~inter.is_empty]
+            ft2 = float(inter.area.sum()) if len(inter) else 0.0
+            fraction_class[cat] = round(ft2 / poly_ft2, 4)
+    fraction_any = round(sum(fraction_class.values()), 4)
+    return {
+        "scenario": scenario,
+        "label": label(scenario),
+        "fraction_any": fraction_any,
+        "fraction_class": fraction_class,
+        "polygon_area_m2": round(poly_ft2 * sqft_to_m2, 1),
+    }
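Once the per-class intersection areas are measured (in EPSG:2263 square feet), the summary is pure arithmetic: per-class fraction of the polygon, the sum as the any-class fraction (valid because the depth classes don't overlap), and a square-feet to square-meters conversion. A sketch of just that arithmetic, decoupled from geopandas (helper name is illustrative):

```python
SQFT_TO_M2 = 0.092903  # same constant as coverage_for_polygon

def coverage_summary(poly_ft2: float, class_ft2: dict[int, float]) -> dict:
    # class_ft2 maps depth class -> intersected area in sq ft.
    fraction_class = ({c: round(ft2 / poly_ft2, 4)
                       for c, ft2 in class_ft2.items()}
                      if poly_ft2 else {})
    return {
        "fraction_any": round(sum(fraction_class.values()), 4),
        "fraction_class": fraction_class,
        "polygon_area_m2": round(poly_ft2 * SQFT_TO_M2, 1),
    }

# 10,000 sq ft polygon: 1,200 sq ft in class 1, 300 in class 2.
s = coverage_summary(10_000.0, {1: 1_200.0, 2: 300.0, 3: 0.0})
```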
app/flood_layers/ida_hwm.py CHANGED
@@ -33,6 +33,7 @@ class HWMSummary:
     nearest_site: str | None
     nearest_elev_ft: float | None
     sample_sites: list[str]
+    points: list[dict] | None = None  # per-mark for the map layer


 def _haversine_m(lat1, lon1, lat2, lon2):
@@ -71,6 +72,17 @@ def summary_for_point(lat: float, lon: float, radius_m: int = 1000) -> HWMSummar
              if f["properties"].get("height_above_gnd") is not None]
     sites = [f["properties"].get("site_description") for _, f in in_radius]
     sites = [s for s in sites if s][:5]
+    points = []
+    for d, f in in_radius[:50]:  # cap so SSE payload stays small
+        flon, flat = f["geometry"]["coordinates"]
+        p = f["properties"]
+        points.append({
+            "lat": flat, "lon": flon,
+            "site": p.get("site_description"),
+            "elev_ft": p.get("elev_ft"),
+            "height_above_gnd_ft": p.get("height_above_gnd"),
+            "distance_m": round(d, 1),
+        })
     return HWMSummary(
         n_within_radius=len(in_radius),
         radius_m=radius_m,
@@ -80,4 +92,5 @@ def summary_for_point(lat: float, lon: float, radius_m: int = 1000) -> HWMSummar
         nearest_site=nf["properties"].get("site_description") if nf else None,
         nearest_elev_ft=nf["properties"].get("elev_ft") if nf else None,
         sample_sites=sites,
+        points=points,
     )
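The hunk above ranks high-water marks by distance via `_haversine_m`, whose body sits outside the diff context. For reference, a standard spherical-Earth haversine (a sketch of the usual formula, not necessarily the repo's exact implementation) looks like:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    # Great-circle distance in meters on a spherical Earth (R = 6371 km).
    r = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))
```

At NYC latitudes this is accurate to well under 1% over the sub-kilometer radii used here, which is why a flat-Earth approximation was not needed.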
app/flood_layers/prithvi_live.py ADDED
@@ -0,0 +1,299 @@
+"""Prithvi-EO 2.0 (Sen1Floods11 fine-tune) live water segmentation.
+
+A per-query specialist: pulls the most recent low-cloud Sentinel-2 L2A
+scene over the address from Microsoft Planetary Computer, runs the
+IBM-NASA flood-mapping fine-tune, and reports % water within 500 m.
+
+Distinct from `app/flood_layers/prithvi_water.py`, which serves the
+offline-precomputed 2021 Ida polygons. This one is *fresh observation*
+each query — different doc_id (`prithvi_live`), different epistemic
+claim, additive to the static layer.
+
+Network calls (STAC search + COG band reads) and a 300M-param model
+forward pass make this the slowest specialist after the LLM. Gated by
+RIPRAP_PRITHVI_LIVE_ENABLE so deployments without the deps installed
+silently skip it. Cloud-cover refuses out at 30%+ to honor the
+Sen1Floods11 training distribution.
+
+License: Apache-2.0 (verified — `ibm-nasa-geospatial/Prithvi-EO-2.0-
+300M-TL-Sen1Floods11`). See experiments/shared/licenses.md.
+"""
+
+from __future__ import annotations
+
+import logging
+import os
+import threading
+import time
+from typing import Any
+
+log = logging.getLogger("riprap.prithvi_live")
+
+ENABLE = os.environ.get("RIPRAP_PRITHVI_LIVE_ENABLE", "1").lower() in ("1", "true", "yes")
+SEARCH_DAYS = int(os.environ.get("RIPRAP_PRITHVI_LIVE_SEARCH_DAYS", "120"))
+MAX_CLOUD_PCT = float(os.environ.get("RIPRAP_PRITHVI_LIVE_MAX_CLOUD", "30"))
+DEVICE = os.environ.get("RIPRAP_PRITHVI_LIVE_DEVICE", "cpu")
+REPO = "ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11"
+
+# Sen1Floods11 expects 6 bands in this exact order.
+BANDS = ["B02", "B03", "B04", "B8A", "B11", "B12"]
+IMG_SIZE = 512  # Sen1Floods11 training crop
+CHIP_PX = 1024
+CHIP_M = CHIP_PX * 10
+HALF_M = CHIP_M / 2
+CENTER_RADIUS_M = 500
+PIXEL_M = 10
+
+_MODEL = None
+_RUN_MODEL = None
+_INIT_LOCK = threading.Lock()  # serializes lazy load if multiple threads
+                               # hit fetch() before _MODEL is populated
+
+
+def _has_required_deps() -> tuple[bool, str | None]:
+    """Heavy-EO deps (terratorch / planetary_computer / rioxarray /
+    pystac-client / xarray / einops) live in requirements-experiments.txt
+    only — they don't fit Riprap's HF Spaces' Py3.10 dep cone alongside
+    transformers<5 / hf_hub<1 / granite-tsfm<0.3.4 / mellea<0.4.
+
+    Probe each importable name once at module load. If any are missing,
+    fetch() returns a clean `skipped: deps_unavailable` outcome instead
+    of crashing with a noisy ModuleNotFoundError in the trace. Local
+    dev + AMD path have these installed and the specialist runs."""
+    missing = []
+    for name in ("terratorch", "planetary_computer", "pystac_client",
+                 "rioxarray", "xarray", "einops"):
+        try:
+            __import__(name)
+        except ImportError:
+            missing.append(name)
+    if missing:
+        return False, ", ".join(missing)
+    return True, None
+
+
+_DEPS_OK, _DEPS_MISSING = _has_required_deps()
+
+
+def warm():
+    """Optional pre-load. The FSM action is lazy too — calling warm()
+    here just amortizes the first-query cost at app boot."""
+    if not ENABLE:
+        return
+    try:
+        _ensure_model()
+    except Exception:
+        log.exception("prithvi_live: warm() failed; specialist will no-op")
+
+
+def _ensure_model():
+    global _MODEL, _RUN_MODEL
+    if _MODEL is not None:
+        return _MODEL, _RUN_MODEL
+    with _INIT_LOCK:
+        if _MODEL is not None:  # double-check inside the lock
+            return _MODEL, _RUN_MODEL
+        import importlib.util
+
+        from huggingface_hub import hf_hub_download
+        from terratorch.cli_tools import LightningInferenceModel
+        config_path = hf_hub_download(REPO, "config.yaml")
+        checkpoint = hf_hub_download(REPO, "Prithvi-EO-V2-300M-TL-Sen1Floods11.pt")
+        log.info("prithvi_live: loading model")
+        m = LightningInferenceModel.from_config(config_path, checkpoint)
+        m.model.eval()
+        if DEVICE == "cuda":
+            try:
+                import torch
+                if torch.cuda.is_available():
+                    m.model.cuda()
+            except Exception:
+                log.exception("prithvi_live: cuda move failed")
+
+        inference_py = hf_hub_download(REPO, "inference.py")
+        spec = importlib.util.spec_from_file_location("_prithvi_inference",
+                                                      inference_py)
+        mod = importlib.util.module_from_spec(spec)
+        spec.loader.exec_module(mod)
+        _MODEL = m
+        _RUN_MODEL = mod.run_model
+    return _MODEL, _RUN_MODEL
+
+
+def _search_recent_scene(lat: float, lon: float):
+    """Most recent low-cloud S2 L2A item near (lat, lon) in the last
+    SEARCH_DAYS days, or None."""
+    import datetime as dt
+
+    import planetary_computer as pc
+    from pystac_client import Client
+    end = dt.datetime.utcnow().date()
+    start = end - dt.timedelta(days=SEARCH_DAYS)
+    client = Client.open(
+        "https://planetarycomputer.microsoft.com/api/stac/v1",
+        modifier=pc.sign_inplace,
+    )
+    delta = 0.02
+    search = client.search(
+        collections=["sentinel-2-l2a"],
+        bbox=[lon - delta, lat - delta, lon + delta, lat + delta],
+        datetime=f"{start}/{end}",
+        query={"eo:cloud_cover": {"lt": MAX_CLOUD_PCT}},
+        max_items=20,
+    )
+    items = sorted(
+        search.items(),
+        key=lambda it: (it.properties.get("eo:cloud_cover", 100),
+                        -(it.datetime.timestamp() if it.datetime else 0)),
+    )
+    return items[0] if items else None
+
+
+def _build_chip(item, lat: float, lon: float):
+    """Returns (img, ref_da, epsg) — img is the (6, H, W) center-cropped
+    float32 array; ref_da is the rioxarray DataArray of the reference
+    band BEFORE the center crop (kept so we can compute the affine
+    transform for polygonization in EPSG:4326)."""
+    import numpy as np
+    import rioxarray  # noqa: F401
+    import xarray as xr
+    from pyproj import Transformer
+    if "proj:epsg" in item.properties:
+        epsg = int(item.properties["proj:epsg"])
+    else:
+        code = item.properties.get("proj:code", "")
+        if code.startswith("EPSG:"):
+            epsg = int(code.split(":", 1)[1])
+        else:
+            raise RuntimeError("STAC item missing proj:epsg / proj:code")
+    fwd = Transformer.from_crs("EPSG:4326", f"EPSG:{epsg}", always_xy=True)
+    cx, cy = fwd.transform(lon, lat)
+    xmin, xmax = cx - HALF_M, cx + HALF_M
+    ymin, ymax = cy - HALF_M, cy + HALF_M
+    ref = rioxarray.open_rasterio(item.assets[BANDS[0]].href, masked=False).squeeze(drop=True)
+    ref = ref.rio.clip_box(minx=xmin, miny=ymin, maxx=xmax, maxy=ymax)
+    ref = ref.isel(y=slice(0, CHIP_PX), x=slice(0, CHIP_PX))
+    arrs = [ref.astype("float32")]
+    for b in BANDS[1:]:
+        da = rioxarray.open_rasterio(item.assets[b].href, masked=False).squeeze(drop=True)
+        da = da.rio.clip_box(minx=xmin, miny=ymin, maxx=xmax, maxy=ymax)
+        if da.shape != ref.shape:
+            da = da.rio.reproject_match(ref)
+        arrs.append(da.astype("float32"))
+    stacked = xr.concat(arrs, dim="band", join="override").assign_coords(band=BANDS)
+    img = stacked.values  # (6, H, W)
+    # Center crop to IMG_SIZE x IMG_SIZE.
+    _, h, w = img.shape
+    sy, sx = (h - IMG_SIZE) // 2, (w - IMG_SIZE) // 2
+    img = img[:, sy:sy + IMG_SIZE, sx:sx + IMG_SIZE]
+    if img.mean() > 1:
+        img = img / 10000.0
+    return np.nan_to_num(img.astype("float32")), ref, epsg
+
+
+def _polygonize_mask(pred, ref_da, epsg: int) -> dict | None:
+    """Vectorize the binary water mask into an EPSG:4326 GeoJSON
+    FeatureCollection so the frontend can paint it on the MapLibre
+    map. Returns None on failure (best-effort — never raises into the
+    caller path)."""
+    try:
+        import json
+
+        import geopandas as gpd
+        from rasterio.features import shapes
+        from rasterio.transform import from_origin
+        from shapely.geometry import shape
+        # Reconstruct the affine transform of the center-cropped pred.
+        # ref_da has 1024 px at 10 m; we cropped to the central 512.
+        xs = ref_da.x.values
+        ys = ref_da.y.values
+        if len(xs) < IMG_SIZE or len(ys) < IMG_SIZE:
+            return None
+        # rioxarray gives pixel-centered coords; offset by half a pixel
+        # to the upper-left to build a from_origin transform.
+        sy = (len(ys) - IMG_SIZE) // 2
+        sx = (len(xs) - IMG_SIZE) // 2
+        # ys are descending (top-to-bottom); take the top of the crop.
+        top_y = float(ys[sy]) + (PIXEL_M / 2.0)
+        left_x = float(xs[sx]) - (PIXEL_M / 2.0)
+        transform = from_origin(left_x, top_y, PIXEL_M, PIXEL_M)
+        # Polygonize only the water class (1).
+        mask = (pred == 1).astype("uint8")
+        polys = []
+        for geom, value in shapes(mask, mask=mask.astype(bool),
+                                  transform=transform):
+            if value != 1:
+                continue
+            polys.append(shape(geom))
+        if not polys:
+            return {"type": "FeatureCollection", "features": []}
+        gdf = gpd.GeoDataFrame({"geometry": polys},
+                               crs=f"EPSG:{epsg}").to_crs("EPSG:4326")
+        # Simplify slightly to keep the SSE payload small (10 m raster
+        # over 5 km square = up to ~10 k tiny squares; simplification
+        # collapses adjacent water pixels into smooth polygons).
+        gdf["geometry"] = gdf.geometry.simplify(0.00005, preserve_topology=True)
+        return json.loads(gdf.to_json())
+    except Exception:
+        log.exception("prithvi_live: polygonize failed")
+        return None
+
+
+def fetch(lat: float, lon: float, timeout_s: float = 60.0) -> dict[str, Any]:
+    """Run the specialist. Returns a dict with at minimum:
+        { "ok": bool,
+          "skipped": str | None,   # reason if no observation
+          "item_id": str | None,
+          "item_datetime": str | None,
+          "cloud_cover": float | None,
+          "pct_water_within_500m": float | None,
+          "pct_water_full": float | None }
+    Designed to never raise; failures show up as ok=False with an `err`.
+    """
+    if not ENABLE:
+        return {"ok": False, "skipped": "RIPRAP_PRITHVI_LIVE_ENABLE=0"}
+    if not _DEPS_OK:
+        # Clean "not deployed here" signal instead of a ModuleNotFoundError
+        # surfaced as an exception. Same trace-card layout as ENABLE=0.
+        return {"ok": False,
+                "skipped": f"deps unavailable on this deployment: "
+                           f"{_DEPS_MISSING}"}
+    t0 = time.time()
+    try:
+        item = _search_recent_scene(lat, lon)
+        if item is None:
+            return {"ok": False, "skipped": f"no <{MAX_CLOUD_PCT}% cloud "
+                                            f"S2 in last {SEARCH_DAYS}d"}
+        cc = float(item.properties.get("eo:cloud_cover", -1))
+        if time.time() - t0 > timeout_s:
+            return {"ok": False, "skipped": "stac search exceeded budget"}
+        img, ref_da, epsg = _build_chip(item, lat, lon)
+        if time.time() - t0 > timeout_s:
+            return {"ok": False, "skipped": "chip build exceeded budget"}
+        model, run_model = _ensure_model()
+        x = img[None, :, None, :, :]  # (1, 6, 1, H, W)
+        pred_t = run_model(x, None, None, model.model, model.datamodule, IMG_SIZE)
+        import numpy as np
+        pred = pred_t[0].cpu().numpy().astype("uint8")
+        pct_full = float(100.0 * pred.mean())
+        yy, xx = np.indices(pred.shape)
+        cy, cx = pred.shape[0] // 2, pred.shape[1] // 2
+        radius_px = CENTER_RADIUS_M / PIXEL_M
+        circle = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius_px ** 2
+        pct_500 = float(100.0 * pred[circle].mean()) if circle.sum() else 0.0
+        # Polygonize the water mask into EPSG:4326 GeoJSON for the map.
+        polygons_geojson = _polygonize_mask(pred, ref_da, epsg)
+        return {
+            "ok": True,
+            "item_id": item.id,
+            "item_datetime": str(item.datetime),
+            "cloud_cover": cc,
+            "pct_water_full": pct_full,
+            "pct_water_within_500m": pct_500,
+            "polygons_geojson": polygons_geojson,
+            "elapsed_s": round(time.time() - t0, 2),
+        }
+    except Exception as e:
+        log.exception("prithvi_live: fetch failed")
+        return {"ok": False, "err": f"{type(e).__name__}: {e}",
+                "elapsed_s": round(time.time() - t0, 2)}
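The `fetch()` docstring above pins down a dict-only contract: `ok` / `skipped` / `err`, never a raised exception. A hypothetical caller, illustrating how a trace card might branch on that contract (the `describe` helper is illustrative, not part of the repo):

```python
def describe(result: dict) -> str:
    # One-line trace-card summary for any fetch() outcome.
    if result.get("ok"):
        return (f"scene {result['item_id']} ({result['cloud_cover']:.0f}% cloud): "
                f"{result['pct_water_within_500m']:.1f}% water within 500 m")
    if result.get("skipped"):
        # Disabled, deps missing, no clean scene, or budget exceeded.
        return f"no observation: {result['skipped']}"
    return f"specialist error: {result.get('err')}"
```

Because every outcome is a plain dict, the FSM step that wraps this specialist never needs its own try/except around the call itself.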
app/flood_layers/sandy_inundation.py CHANGED
@@ -17,7 +17,7 @@ def load() -> gpd.GeoDataFrame:
     return g[["geometry"]]


-def join(assets: gpd.GeoDataFrame) -> "gpd.pd.Series":
+def join(assets: gpd.GeoDataFrame) -> gpd.pd.Series:
     """Return a boolean Series indexed like assets: True if inside Sandy zone."""
     z = load()
     # spatial join avoids fragile unary union over messy public polygons
@@ -32,3 +32,31 @@ def join(assets: gpd.GeoDataFrame) -> gpd.pd.Series:
     s[:] = False
     s.iloc[list(flagged)] = True
     return s.reset_index(drop=True)
+
+
+def coverage_for_polygon(polygon, polygon_crs: str = "EPSG:4326") -> dict:
+    """Polygon-level summary: what fraction of the input polygon overlaps
+    the 2012 Sandy inundation extent? Used in neighborhood-mode queries.
+
+    Returns:
+        {
+          'overlap_area_m2': absolute overlap in m2,
+          'polygon_area_m2': total polygon area in m2,
+          'fraction': overlap / polygon_area, range [0, 1],
+          'inside': True if any overlap exists,
+        }
+    """
+    z = load().to_crs("EPSG:2263")  # NY State Plane Long Island, units = ft
+    poly_gdf = gpd.GeoDataFrame(geometry=[polygon], crs=polygon_crs).to_crs("EPSG:2263")
+    poly_geom = poly_gdf.iloc[0].geometry
+    inter = z.intersection(poly_geom)
+    inter = inter[~inter.is_empty]
+    overlap_ft2 = float(inter.area.sum()) if len(inter) else 0.0
+    poly_ft2 = float(poly_geom.area)
+    sqft_to_m2 = 0.092903
+    return {
+        "overlap_area_m2": round(overlap_ft2 * sqft_to_m2, 1),
+        "polygon_area_m2": round(poly_ft2 * sqft_to_m2, 1),
+        "fraction": round(overlap_ft2 / poly_ft2, 4) if poly_ft2 else 0.0,
+        "inside": overlap_ft2 > 0,
+    }
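`coverage_for_polygon` computes areas in EPSG:2263 (units of feet) and converts with `sqft_to_m2 = 0.092903`. The arithmetic of the returned dict can be exercised without any geometry; a minimal sketch mirroring just the area and fraction fields (assumed shape, not repo code):

```python
SQFT_TO_M2 = 0.092903  # same constant as coverage_for_polygon

def coverage_fraction(overlap_ft2: float, poly_ft2: float) -> dict:
    # Mirrors the m2 conversion and guarded division of the real function,
    # including the zero-area polygon edge case.
    return {
        "overlap_area_m2": round(overlap_ft2 * SQFT_TO_M2, 1),
        "polygon_area_m2": round(poly_ft2 * SQFT_TO_M2, 1),
        "fraction": round(overlap_ft2 / poly_ft2, 4) if poly_ft2 else 0.0,
        "inside": overlap_ft2 > 0,
    }
```

Doing the intersection in a projected, equal-unit CRS (State Plane feet) rather than EPSG:4326 degrees is what makes the fraction a true area ratio.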
app/fsm.py CHANGED
@@ -1,12 +1,14 @@
-"""HeliOS-NYC Burr FSM for address-query flood risk.
+"""Riprap Burr FSM linear specialist pipeline for one address.
 
-Linear pipeline; each action degrades gracefully (empty result -> no doc).
-The reconciler (Granite 4.1) only sees documents from specialists that
-actually produced data.
+Each action either produces a structured fact (which becomes a document
+the reconciler can cite) or stays silent on failure. The reconciler
+(Granite 4.1) only sees documents from specialists that actually
+produced data — the silence-over-confabulation contract.
 """
 from __future__ import annotations
 
 import logging
+import threading as _threading
 import time
 from typing import Any
 
@@ -14,15 +16,93 @@ import geopandas as gpd
 from burr.core import ApplicationBuilder, State, action
 from shapely.geometry import Point
 
-from app.context import floodnet, microtopo, nyc311
+from app.context import floodnet, microtopo, noaa_tides, nws_alerts, nws_obs, nyc311
 from app.energy import estimate as energy_estimate
 from app.flood_layers import dep_stormwater, ida_hwm, prithvi_water, sandy_inundation
 from app.geocode import geocode_one
+from app.live import floodnet_forecast as fn_forecast
+from app.live import ttm_forecast
 from app.rag import retrieve as rag_retrieve
 from app.reconcile import reconcile as run_reconcile
+from app.registers import doe_schools as r_schools
+from app.registers import doh_hospitals as r_hospitals
+from app.registers import mta_entrances as r_mta
+from app.registers import nycha as r_nycha
 
-log = logging.getLogger("helios_nyc.fsm")
+log = logging.getLogger("riprap.fsm")
 
+# NYC five-borough bbox. Specialists whose data sources are NYC-only
+# (Sandy 2012, NYC DEP Stormwater, FloodNet, NYC 311, NYC microtopo
+# raster, NYC Hurricane Ida Prithvi polygons) skip with an explicit
+# "out of NYC scope" reason when geocode lands outside this envelope.
+# Live specialists (NWS / NOAA / TTM) and the NY-State Ida HWMs run
+# unconditionally.
+_NYC_S, _NYC_W, _NYC_N, _NYC_E = 40.49, -74.27, 40.92, -73.69
+
+
+def _in_nyc(lat, lon) -> bool:
+    if lat is None or lon is None:
+        return False
+    return _NYC_S <= lat <= _NYC_N and _NYC_W <= lon <= _NYC_E
+
+
+# Thread-local hook so the streaming endpoint can subscribe to per-token
+# Granite output during reconcile, without threading a callback through
+# every Burr action signature.
+_FSM_LOCAL = _threading.local()
+
+
+def set_token_callback(on_token):
+    """Install a per-thread on_token(delta) callable for the next reconcile.
+    Pass None to clear."""
+    _FSM_LOCAL.on_token = on_token
+
+
+def _current_token_callback():
+    return getattr(_FSM_LOCAL, "on_token", None)
+
+
+def set_mellea_attempt_callback(fn):
+    _FSM_LOCAL.on_mellea_attempt = fn
+
+
+def _current_mellea_attempt_callback():
+    return getattr(_FSM_LOCAL, "on_mellea_attempt", None)
+
+
+def set_strict_mode(strict: bool):
+    """Per-thread flag — when True the linear FSM's reconcile step routes
+    through Mellea-validated rejection sampling instead of the standard
+    streaming reconciler. Disables token streaming for that step."""
+    _FSM_LOCAL.strict = bool(strict)
+
+
+def _current_strict_mode() -> bool:
+    return bool(getattr(_FSM_LOCAL, "strict", False))
+
+
+def set_planned_specialists(spec_names):
+    """Install a per-thread set of specialist names from the planner.
+
+    Used by step_reconcile to trim doc messages: documents whose family
+    prefix doesn't match any planned specialist are dropped before the
+    Mellea call. Cuts ~30-50% of prompt tokens on local Ollama, where
+    the FSM otherwise hands the reconciler every specialist's output
+    even if the planner only asked for a subset."""
+    _FSM_LOCAL.planned_specialists = set(spec_names) if spec_names else None
+
+
+def _current_planned_specialists():
+    return getattr(_FSM_LOCAL, "planned_specialists", None)
+
+
+# Canonical Burr: one action per specialist, sequential transitions.
+# A previous version of this module wrapped 16 specialists in a single
+# fan-out action that ran them concurrently in a ThreadPoolExecutor;
+# that path was removed because it sometimes hung after the fan-out
+# completed (Burr-internal post-action cleanup with custom executors)
+# and made the trace UI's per-step timing harder to reason about.
+# Parallelism, when wanted, belongs at the inference layer
+# (vLLM / Ollama NUM_PARALLEL), not the FSM.
 
 def _step(state: State, name: str) -> dict[str, Any]:
     """Append a step record to the trace; returns the dict so the action
@@ -67,7 +147,10 @@ def step_sandy(state: State) -> State:
     try:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
-            return state.update(trace=trace)
+            return state.update(sandy=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(sandy=None, trace=trace)
         pt = gpd.GeoDataFrame(geometry=[Point(state["lon"], state["lat"])], crs="EPSG:4326").to_crs("EPSG:2263")
         flag = bool(sandy_inundation.join(pt).iloc[0])
         rec["ok"] = True; rec["result"] = {"inside": flag}
@@ -75,7 +158,7 @@ def step_sandy(state: State) -> State:
     except Exception as e:
         rec["ok"] = False; rec["err"] = str(e)
         log.exception("sandy failed")
-        return state.update(trace=trace)
+        return state.update(sandy=None, trace=trace)
     finally:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
@@ -86,7 +169,10 @@ def step_dep(state: State) -> State:
     try:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
-            return state.update(trace=trace)
+            return state.update(dep=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(dep=None, trace=trace)
         pt = gpd.GeoDataFrame(geometry=[Point(state["lon"], state["lat"])], crs="EPSG:4326").to_crs("EPSG:2263")
         out: dict[str, Any] = {}
         for scen in ["dep_extreme_2080", "dep_moderate_2050", "dep_moderate_current"]:
@@ -101,7 +187,7 @@ def step_dep(state: State) -> State:
     except Exception as e:
         rec["ok"] = False; rec["err"] = str(e)
         log.exception("dep failed")
-        return state.update(trace=trace)
+        return state.update(dep=None, trace=trace)
     finally:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
@@ -112,7 +198,10 @@ def step_floodnet(state: State) -> State:
     try:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
-            return state.update(trace=trace)
+            return state.update(floodnet=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(floodnet=None, trace=trace)
         s = floodnet.summary_for_point(state["lat"], state["lon"], radius_m=600)
         s["radius_m"] = 600
         rec["ok"] = True
@@ -122,7 +211,7 @@ def step_floodnet(state: State) -> State:
     except Exception as e:
         rec["ok"] = False; rec["err"] = str(e)
         log.exception("floodnet failed")
-        return state.update(trace=trace)
+        return state.update(floodnet=None, trace=trace)
     finally:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
@@ -133,14 +222,17 @@ def step_311(state: State) -> State:
     try:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
-            return state.update(trace=trace)
+            return state.update(nyc311=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(nyc311=None, trace=trace)
         s = nyc311.summary_for_point(state["lat"], state["lon"], radius_m=200, years=5)
         rec["ok"] = True; rec["result"] = {"n": s["n"]}
         return state.update(nyc311=s, trace=trace)
     except Exception as e:
         rec["ok"] = False; rec["err"] = str(e)
         log.exception("311 failed")
-        return state.update(trace=trace)
+        return state.update(nyc311=None, trace=trace)
     finally:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
@@ -151,7 +243,7 @@ def step_ida_hwm(state: State) -> State:
     try:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
-            return state.update(trace=trace)
+            return state.update(ida_hwm=None, trace=trace)
         s = ida_hwm.summary_for_point(state["lat"], state["lon"], radius_m=800)
         if s is None:
             rec["ok"] = False; rec["err"] = "HWM data missing"
@@ -178,6 +270,9 @@ def step_prithvi(state: State) -> State:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
             return state.update(prithvi_water=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(prithvi_water=None, trace=trace)
         s = prithvi_water.summary_for_point(state["lat"], state["lon"])
         if s is None:
             rec["ok"] = False; rec["err"] = "Prithvi mask missing"
@@ -197,13 +292,375 @@ def step_prithvi(state: State) -> State:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
 
 @action(reads=["lat", "lon"], writes=["microtopo", "trace"])
 def step_microtopo(state: State) -> State:
     rec, trace = _step(state, "microtopo_lidar")
     try:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
-            return state.update(trace=trace)
         m = microtopo.microtopo_at(state["lat"], state["lon"])
         if m is None:
             rec["ok"] = False; rec["err"] = "DEM fetch failed"
@@ -223,12 +680,19 @@ def step_microtopo(state: State) -> State:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
 
-@action(reads=["geocode", "sandy", "dep", "floodnet", "nyc311", "microtopo", "ida_hwm", "prithvi_water"],
         writes=["rag", "trace"])
 def step_rag(state: State) -> State:
     rec, trace = _step(state, "rag_granite_embedding")
     try:
         geo = state.get("geocode") or {}
         sandy = state.get("sandy")
         dep = state.get("dep") or {}
         # Build a context-rich query so retrieval pulls policy paragraphs
@@ -240,7 +704,7 @@ def step_rag(state: State) -> State:
             bits.append(f"in {geo['borough']}")
         if sandy:
             bits.append("inside Hurricane Sandy 2012 inundation zone")
-        for k, v in dep.items():
             if v.get("depth_class", 0) > 0:
                 bits.append(f"in {v['depth_label']} pluvial scenario")
         bits.append("flood resilience plan, vulnerability, hardening, mitigation")
@@ -258,11 +722,59 @@ def step_rag(state: State) -> State:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
 
 @action(reads=["geocode", "sandy", "dep", "floodnet", "nyc311", "microtopo",
-               "ida_hwm", "prithvi_water", "rag"],
-        writes=["paragraph", "audit", "trace"])
 def step_reconcile(state: State) -> State:
-    rec, trace = _step(state, "reconcile_granite41")
     try:
         snap = {
             "geocode": state.get("geocode"),
@@ -273,52 +785,141 @@ def step_reconcile(state: State) -> State:
             "microtopo": state.get("microtopo"),
             "ida_hwm": state.get("ida_hwm"),
             "prithvi_water": state.get("prithvi_water"),
             "rag": state.get("rag"),
         }
-        para, audit = run_reconcile(snap, return_audit=True)
         rec["ok"] = True
-        rec["result"] = {
-            "paragraph_chars": len(para),
-            "dropped_sentences": len(audit["dropped"]),
-        }
-        return state.update(paragraph=para, audit=audit, trace=trace)
     except Exception as e:
         rec["ok"] = False; rec["err"] = str(e)
         log.exception("reconcile failed")
-        return state.update(paragraph="", audit={"raw": "", "dropped": []}, trace=trace)
     finally:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
 
 def build_app(query: str):
-    return (
         ApplicationBuilder()
-        .with_actions(
-            geocode=step_geocode,
-            sandy=step_sandy,
-            dep=step_dep,
-            floodnet=step_floodnet,
-            nyc311=step_311,
-            microtopo=step_microtopo,
-            ida_hwm=step_ida_hwm,
-            prithvi=step_prithvi,
-            rag=step_rag,
-            reconcile=step_reconcile,
-        )
-        .with_transitions(
-            ("geocode", "sandy"),
-            ("sandy", "dep"),
-            ("dep", "floodnet"),
-            ("floodnet", "nyc311"),
-            ("nyc311", "microtopo"),
-            ("microtopo", "ida_hwm"),
-            ("ida_hwm", "prithvi"),
-            ("prithvi", "rag"),
-            ("rag", "reconcile"),
-        )
         .with_state(query=query, trace=[])
         .with_entrypoint("geocode")
-        .build()
     )
 
 
@@ -345,37 +946,115 @@ def run(query: str) -> dict[str, Any]:
         "microtopo": final_state.get("microtopo"),
         "ida_hwm": final_state.get("ida_hwm"),
         "prithvi_water": final_state.get("prithvi_water"),
         "rag": final_state.get("rag"),
         "paragraph": final_state.get("paragraph"),
         "audit": final_state.get("audit"),
         "energy": _summarize_energy(trace),
         "trace": trace,
     }
 
 
 def iter_steps(query: str):
-    """Yield (action_name, partial_state_dict) after each Burr action.
 
-    Used by the web UI for SSE streaming each yield is a "step lit up"
-    moment. The final yield carries the reconciled paragraph.
     """
     app = build_app(query)
-    last_trace_len = 0
-    for action_obj, result, state in app.iterate(halt_after=["reconcile"]):
-        trace = list(state.get("trace", []))
-        # Yield only the new trace records since the prior step
-        new_records = trace[last_trace_len:]
-        last_trace_len = len(trace)
-        for rec in new_records:
             yield {
                 "kind": "step",
-                "step": rec["step"],
-                "ok": rec.get("ok"),
-                "elapsed_s": rec.get("elapsed_s"),
-                "result": rec.get("result"),
-                "err": rec.get("err"),
             }
-    # final
     trace = state.get("trace", [])
     yield {
         "kind": "final",
@@ -387,8 +1066,22 @@ def iter_steps(query: str):
         "microtopo": state.get("microtopo"),
         "ida_hwm": state.get("ida_hwm"),
         "prithvi_water": state.get("prithvi_water"),
         "rag": state.get("rag"),
         "paragraph": state.get("paragraph"),
         "audit": state.get("audit"),
         "energy": _summarize_energy(trace),
     }
248
  if s is None:
249
  rec["ok"] = False; rec["err"] = "HWM data missing"
 
270
  if state.get("lat") is None:
271
  rec["ok"] = False; rec["err"] = "no coords"
272
  return state.update(prithvi_water=None, trace=trace)
273
+ if not _in_nyc(state["lat"], state["lon"]):
274
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
275
+ return state.update(prithvi_water=None, trace=trace)
276
  s = prithvi_water.summary_for_point(state["lat"], state["lon"])
277
  if s is None:
278
  rec["ok"] = False; rec["err"] = "Prithvi mask missing"
 
292
  rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
293
 
294
 
295
+ @action(reads=["lat", "lon"], writes=["prithvi_live", "trace"])
296
+ def step_prithvi_live(state: State) -> State:
297
+ """Live Sentinel-2 water segmentation via Prithvi-EO 2.0.
298
+
299
+ Network + 300M-param forward pass per query, so it's the slowest
300
+ specialist by far. Gracefully no-ops via the underlying module if
301
+ `RIPRAP_PRITHVI_LIVE_ENABLE=0` or if STAC / model load fails.
302
+ """
303
+ rec, trace = _step(state, "prithvi_eo_live")
304
+ try:
305
+ if state.get("lat") is None:
306
+ rec["ok"] = False; rec["err"] = "no coords"
307
+ return state.update(prithvi_live=None, trace=trace)
308
+ if not _in_nyc(state["lat"], state["lon"]):
309
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
310
+ return state.update(prithvi_live=None, trace=trace)
311
+ from app.flood_layers import prithvi_live
312
+ s = prithvi_live.fetch(state["lat"], state["lon"])
313
+ rec["ok"] = bool(s.get("ok"))
314
+ if not s.get("ok"):
315
+ rec["err"] = s.get("err") or s.get("skipped") or "no observation"
316
+ else:
317
+ rec["result"] = {
318
+ "scene_date": (s.get("item_datetime") or "")[:10],
319
+ "cloud_cover": s.get("cloud_cover"),
320
+ "pct_water_500m": s.get("pct_water_within_500m"),
321
+ "pct_water_5km": s.get("pct_water_full"),
322
+ }
323
+ return state.update(prithvi_live=s, trace=trace)
324
+ except Exception as e:
325
+ rec["ok"] = False; rec["err"] = str(e)
326
+ log.exception("prithvi_live failed")
327
+ return state.update(prithvi_live=None, trace=trace)
328
+ finally:
329
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
330
+
331
+
332
+ @action(reads=["lat", "lon"], writes=["ttm_311_forecast", "trace"])
333
+ def step_ttm_311_forecast(state: State) -> State:
334
+ """TTM r2 zero-shot forecast on weekly 311 flood-complaint counts
335
+ at this specific address (200 m radius). 52 weeks of context →
336
+ 4 weeks of forecast. Per-query, per-address, citable."""
337
+ rec, trace = _step(state, "ttm_311_forecast")
338
+ try:
339
+ if state.get("lat") is None:
340
+ rec["ok"] = False; rec["err"] = "no coords"
341
+ return state.update(ttm_311_forecast=None, trace=trace)
342
+ if not _in_nyc(state["lat"], state["lon"]):
343
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
344
+ return state.update(ttm_311_forecast=None, trace=trace)
345
+ s = ttm_forecast.weekly_311_forecast_for_point(state["lat"], state["lon"])
346
+ rec["ok"] = bool(s.get("available"))
347
+ if not rec["ok"]:
348
+ rec["err"] = s.get("reason", "unavailable")
349
+ else:
350
+ rec["result"] = {
351
+ "history_total": s.get("history_total_complaints"),
352
+ "history_recent_mean": s.get("history_recent_3mo_mean"),
353
+ "forecast_mean": s.get("forecast_mean_per_week"),
354
+ "forecast_peak": s.get("forecast_peak_per_week"),
355
+ "accelerating": s.get("accelerating"),
356
+ }
357
+ return state.update(ttm_311_forecast=s, trace=trace)
358
+ except Exception as e:
359
+ rec["ok"] = False; rec["err"] = str(e)
360
+ log.exception("ttm_311_forecast failed")
361
+ return state.update(ttm_311_forecast=None, trace=trace)
362
+ finally:
363
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
364
+
365
+
366
+ @action(reads=["lat", "lon"], writes=["terramind", "trace"])
367
+ def step_terramind(state: State) -> State:
368
+ """TerraMind v1 base — DEM → S2L2A synthesis as a per-query
369
+ cognitive-engine node. ~3-7s on M3 CPU. Output is a
370
+ *synthetic-prior* — explicitly fourth epistemic class alongside
371
+ empirical / modeled / proxy. Frame the doc body and reconciler
372
+ narration as 'plausible synthesis from terrain context', never
373
+ 'imaged' or 'reconstructed'."""
374
+ rec, trace = _step(state, "terramind_synthesis")
375
+ try:
376
+ if state.get("lat") is None:
377
+ rec["ok"] = False; rec["err"] = "no coords"
378
+ return state.update(terramind=None, trace=trace)
379
+ if not _in_nyc(state["lat"], state["lon"]):
380
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
381
+ return state.update(terramind=None, trace=trace)
382
+ from app.context import terramind_synthesis
383
+ s = terramind_synthesis.fetch(state["lat"], state["lon"])
384
+ rec["ok"] = bool(s.get("ok"))
385
+ if not s.get("ok"):
386
+ rec["err"] = s.get("err") or s.get("skipped") or "terramind unavailable"
387
+ else:
388
+ rec["result"] = {
389
+ "tim_chain": s.get("tim_chain"),
390
+ "diffusion_steps": s.get("diffusion_steps"),
391
+ "dem_mean_m": s.get("dem_mean_m"),
392
+ "synth_chip_shape": s.get("synth_chip_shape"),
393
+ "elapsed_s": s.get("elapsed_s"),
394
+ }
395
+ return state.update(terramind=s, trace=trace)
396
+ except Exception as e:
397
+ rec["ok"] = False; rec["err"] = str(e)
398
+ log.exception("terramind failed")
399
+ return state.update(terramind=None, trace=trace)
400
+ finally:
401
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
402
+
403
+
404
+ @action(reads=["lat", "lon"], writes=["noaa_tides", "trace"])
405
+ def step_noaa_tides(state: State) -> State:
406
+ rec, trace = _step(state, "noaa_tides")
407
+ try:
408
+ if state.get("lat") is None:
409
+ rec["ok"] = False; rec["err"] = "no coords"
410
+ return state.update(noaa_tides=None, trace=trace)
411
+ s = noaa_tides.summary_for_point(state["lat"], state["lon"])
412
+ rec["ok"] = s.get("error") is None
413
+ rec["result"] = {
414
+ "station": s.get("station_id"),
415
+ "observed_ft_mllw": s.get("observed_ft_mllw"),
416
+ "residual_ft": s.get("residual_ft"),
417
+ }
418
+ if s.get("error"): rec["err"] = s["error"]
419
+ return state.update(noaa_tides=s, trace=trace)
420
+ except Exception as e:
421
+ rec["ok"] = False; rec["err"] = str(e)
422
+ log.exception("noaa_tides failed")
423
+ return state.update(noaa_tides=None, trace=trace)
424
+ finally:
425
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
426
+
427
+
428
+ @action(reads=["lat", "lon"], writes=["nws_alerts", "trace"])
429
+ def step_nws_alerts(state: State) -> State:
430
+ rec, trace = _step(state, "nws_alerts")
431
+ try:
432
+ if state.get("lat") is None:
433
+ rec["ok"] = False; rec["err"] = "no coords"
434
+ return state.update(nws_alerts=None, trace=trace)
435
+ s = nws_alerts.summary_for_point(state["lat"], state["lon"])
436
+ rec["ok"] = s.get("error") is None
437
+ rec["result"] = {"n_active": s.get("n_active")}
438
+ if s.get("error"): rec["err"] = s["error"]
439
+ return state.update(nws_alerts=s, trace=trace)
440
+ except Exception as e:
441
+ rec["ok"] = False; rec["err"] = str(e)
442
+ log.exception("nws_alerts failed")
443
+ return state.update(nws_alerts=None, trace=trace)
444
+ finally:
445
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
446
+
447
+
448
+ @action(reads=["lat", "lon"], writes=["nws_obs", "trace"])
449
+ def step_nws_obs(state: State) -> State:
450
+ rec, trace = _step(state, "nws_obs")
451
+ try:
452
+ if state.get("lat") is None:
453
+ rec["ok"] = False; rec["err"] = "no coords"
454
+ return state.update(nws_obs=None, trace=trace)
455
+ s = nws_obs.summary_for_point(state["lat"], state["lon"])
456
+ rec["ok"] = s.get("error") is None
457
+ rec["result"] = {
458
+ "station": s.get("station_id"),
459
+ "p1h_mm": s.get("precip_last_hour_mm"),
460
+ "p6h_mm": s.get("precip_last_6h_mm"),
461
+ }
462
+ if s.get("error"): rec["err"] = s["error"]
463
+ return state.update(nws_obs=s, trace=trace)
464
+ except Exception as e:
465
+ rec["ok"] = False; rec["err"] = str(e)
466
+ log.exception("nws_obs failed")
467
+ return state.update(nws_obs=None, trace=trace)
468
+ finally:
469
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
470
+
471
+
472
+ @action(reads=["lat", "lon"], writes=["ttm_forecast", "trace"])
473
+ def step_ttm_forecast(state: State) -> State:
474
+ """Granite TTM r2 zero-shot forecast of the Battery surge residual."""
475
+ rec, trace = _step(state, "ttm_forecast")
476
+ try:
477
+ if state.get("lat") is None:
478
+ rec["ok"] = False; rec["err"] = "no coords"
479
+ return state.update(ttm_forecast=None, trace=trace)
480
+ s = ttm_forecast.summary_for_point(state["lat"], state["lon"])
481
+ if not s.get("available"):
482
+ rec["ok"] = False
483
+ rec["err"] = s.get("reason", "TTM unavailable")
484
+ return state.update(ttm_forecast=None, trace=trace)
485
+ rec["ok"] = True
486
+ rec["result"] = {
487
+ "context": s["context_length"],
488
+ "horizon": s["horizon_steps"],
489
+ "forecast_peak_ft": s["forecast_peak_ft"],
490
+ "forecast_peak_min_ahead": s["forecast_peak_minutes_ahead"],
491
+ "interesting": s["interesting"],
492
+ }
493
+ return state.update(ttm_forecast=s, trace=trace)
494
+ except Exception as e:
495
+ rec["ok"] = False; rec["err"] = str(e)
496
+ log.exception("ttm_forecast failed")
497
+ return state.update(ttm_forecast=None, trace=trace)
498
+ finally:
499
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
500
+
501
+
502
+ @action(reads=["lat", "lon"], writes=["floodnet_forecast", "trace"])
503
+ def step_floodnet_forecast(state: State) -> State:
504
+ """TTM r2 forecast of flood-event recurrence at the nearest FloodNet
505
+ sensor. Reuses the same (512, 96) singleton as ttm_311_forecast — no
506
+ additional model loaded into memory. Silent when the sensor has too
507
+ few historical events for a defensible forecast."""
508
+ rec, trace = _step(state, "floodnet_forecast")
509
+ try:
510
+ if state.get("lat") is None:
511
+ rec["ok"] = False; rec["err"] = "no coords"
512
+ return state.update(floodnet_forecast=None, trace=trace)
513
+ if not _in_nyc(state["lat"], state["lon"]):
514
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
515
+ return state.update(floodnet_forecast=None, trace=trace)
516
+ s = fn_forecast.summary_for_point(state["lat"], state["lon"])
517
+ rec["ok"] = bool(s.get("available"))
518
+ if not rec["ok"]:
519
+ rec["err"] = s.get("reason", "unavailable")
520
+ else:
521
+ rec["result"] = {
522
+ "sensor_id": s.get("sensor_id"),
523
+ "distance_m": s.get("distance_from_query_m"),
524
+ "history_28d": s.get("history_recent_28d_events"),
525
+ "forecast_28d": s.get("forecast_28d_expected_events"),
526
+ "accelerating": s.get("accelerating"),
527
+ }
528
+ return state.update(floodnet_forecast=s if rec["ok"] else None,
529
+ trace=trace)
530
+ except Exception as e:
531
+ rec["ok"] = False; rec["err"] = str(e)
532
+ log.exception("floodnet_forecast failed")
533
+ return state.update(floodnet_forecast=None, trace=trace)
534
+ finally:
535
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
536
+
537
+
538
+ @action(reads=["lat", "lon"], writes=["mta_entrances", "trace"])
539
+ def step_mta_entrances(state: State) -> State:
540
+ rec, trace = _step(state, "mta_entrance_exposure")
541
+ try:
542
+ if state.get("lat") is None:
543
+ rec["ok"] = False; rec["err"] = "no coords"
544
+ return state.update(mta_entrances=None, trace=trace)
545
+ if not _in_nyc(state["lat"], state["lon"]):
546
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
547
+ return state.update(mta_entrances=None, trace=trace)
548
+ s = r_mta.summary_for_point(state["lat"], state["lon"])
549
+ if not s.get("available"):
550
+ rec["ok"] = False; rec["err"] = "no entrances within radius"
551
+ return state.update(mta_entrances=None, trace=trace)
552
+ rec["ok"] = True
553
+ rec["result"] = {
554
+ "n_entrances": s["n_entrances"],
555
+ "n_inside_sandy_2012": s["n_inside_sandy_2012"],
556
+ "n_in_dep_extreme_2080": s["n_in_dep_extreme_2080"],
557
+ }
558
+ return state.update(mta_entrances=s, trace=trace)
559
+ except Exception as e:
560
+ rec["ok"] = False; rec["err"] = str(e)
561
+ log.exception("mta_entrances failed")
562
+ return state.update(mta_entrances=None, trace=trace)
563
+ finally:
564
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
565
+
566
+
567
+ @action(reads=["lat", "lon"], writes=["nycha_developments", "trace"])
568
+ def step_nycha(state: State) -> State:
569
+ rec, trace = _step(state, "nycha_development_exposure")
570
+ try:
571
+ if state.get("lat") is None:
572
+ rec["ok"] = False; rec["err"] = "no coords"
573
+ return state.update(nycha_developments=None, trace=trace)
574
+ if not _in_nyc(state["lat"], state["lon"]):
575
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
576
+ return state.update(nycha_developments=None, trace=trace)
577
+ s = r_nycha.summary_for_point(state["lat"], state["lon"])
578
+ if not s.get("available"):
579
+ rec["ok"] = False; rec["err"] = "no NYCHA developments within radius"
580
+ return state.update(nycha_developments=None, trace=trace)
581
+ rec["ok"] = True
582
+ rec["result"] = {
583
+ "n_developments": s["n_developments"],
584
+ "n_majority_inside_sandy_2012": s["n_majority_inside_sandy_2012"],
585
+ "n_with_dep_2080_overlap": s["n_with_dep_2080_overlap"],
586
+ }
587
+ return state.update(nycha_developments=s, trace=trace)
588
+ except Exception as e:
589
+ rec["ok"] = False; rec["err"] = str(e)
590
+ log.exception("nycha failed")
591
+ return state.update(nycha_developments=None, trace=trace)
592
+ finally:
593
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
594
+
595
+
596
+ @action(reads=["lat", "lon"], writes=["doe_schools", "trace"])
597
+ def step_doe_schools(state: State) -> State:
598
+ rec, trace = _step(state, "doe_school_exposure")
599
+ try:
600
+ if state.get("lat") is None:
601
+ rec["ok"] = False; rec["err"] = "no coords"
602
+ return state.update(doe_schools=None, trace=trace)
603
+ if not _in_nyc(state["lat"], state["lon"]):
604
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
605
+ return state.update(doe_schools=None, trace=trace)
606
+ s = r_schools.summary_for_point(state["lat"], state["lon"])
607
+ if not s.get("available"):
608
+ rec["ok"] = False; rec["err"] = "no schools within radius"
609
+ return state.update(doe_schools=None, trace=trace)
610
+ rec["ok"] = True
611
+ rec["result"] = {
612
+ "n_schools": s["n_schools"],
613
+ "n_inside_sandy_2012": s["n_inside_sandy_2012"],
614
+ "n_in_dep_extreme_2080": s["n_in_dep_extreme_2080"],
615
+ }
616
+ return state.update(doe_schools=s, trace=trace)
617
+ except Exception as e:
618
+ rec["ok"] = False; rec["err"] = str(e)
619
+ log.exception("doe_schools failed")
620
+ return state.update(doe_schools=None, trace=trace)
621
+ finally:
622
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
623
+
624
+
625
+ @action(reads=["lat", "lon"], writes=["doh_hospitals", "trace"])
626
+ def step_doh_hospitals(state: State) -> State:
627
+ rec, trace = _step(state, "doh_hospital_exposure")
628
+ try:
629
+ if state.get("lat") is None:
630
+ rec["ok"] = False; rec["err"] = "no coords"
631
+ return state.update(doh_hospitals=None, trace=trace)
632
+ if not _in_nyc(state["lat"], state["lon"]):
633
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
634
+ return state.update(doh_hospitals=None, trace=trace)
635
+ s = r_hospitals.summary_for_point(state["lat"], state["lon"])
636
+ if not s.get("available"):
637
+ rec["ok"] = False; rec["err"] = "no hospitals within radius"
638
+ return state.update(doh_hospitals=None, trace=trace)
639
+ rec["ok"] = True
640
+ rec["result"] = {
641
+ "n_hospitals": s["n_hospitals"],
642
+ "n_inside_sandy_2012": s["n_inside_sandy_2012"],
643
+ "n_in_dep_extreme_2080": s["n_in_dep_extreme_2080"],
644
+ }
645
+ return state.update(doh_hospitals=s, trace=trace)
646
+ except Exception as e:
647
+ rec["ok"] = False; rec["err"] = str(e)
648
+ log.exception("doh_hospitals failed")
649
+ return state.update(doh_hospitals=None, trace=trace)
650
+ finally:
651
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
652
+
653
+
654
  @action(reads=["lat", "lon"], writes=["microtopo", "trace"])
655
  def step_microtopo(state: State) -> State:
656
  rec, trace = _step(state, "microtopo_lidar")
657
  try:
658
  if state.get("lat") is None:
659
  rec["ok"] = False; rec["err"] = "no coords"
660
+ return state.update(microtopo=None, trace=trace)
661
+ if not _in_nyc(state["lat"], state["lon"]):
662
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
663
+ return state.update(microtopo=None, trace=trace)
664
  m = microtopo.microtopo_at(state["lat"], state["lon"])
665
  if m is None:
666
  rec["ok"] = False; rec["err"] = "DEM fetch failed"
 
680
  rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
681
 
682
 
683
+
684
+
685
+ @action(reads=["geocode", "sandy", "dep", "floodnet", "nyc311", "microtopo",
686
+ "ida_hwm", "prithvi_water", "noaa_tides", "nws_alerts", "nws_obs",
687
+ "ttm_forecast"],
688
  writes=["rag", "trace"])
689
  def step_rag(state: State) -> State:
690
  rec, trace = _step(state, "rag_granite_embedding")
691
  try:
692
  geo = state.get("geocode") or {}
693
+ if not _in_nyc(geo.get("lat"), geo.get("lon")):
694
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
695
+ return state.update(rag=[], trace=trace)
696
  sandy = state.get("sandy")
697
  dep = state.get("dep") or {}
698
  # Build a context-rich query so retrieval pulls policy paragraphs
 
704
  bits.append(f"in {geo['borough']}")
705
  if sandy:
706
  bits.append("inside Hurricane Sandy 2012 inundation zone")
707
+ for v in dep.values():
708
  if v.get("depth_class", 0) > 0:
709
  bits.append(f"in {v['depth_label']} pluvial scenario")
710
  bits.append("flood resilience plan, vulnerability, hardening, mitigation")
 
722
  rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
723
 
724
 
725
+ @action(reads=["rag"], writes=["gliner", "trace"])
726
+ def step_gliner(state: State) -> State:
727
+ """GLiNER typed-entity extraction over the top RAG paragraphs.
728
+
729
+ Adds structured fields (`agency`, `dollar_amount`,
730
+ `infrastructure_project`, `nyc_location`, `date_range`) the
731
+ reconciler can cite with `[gliner_<source>]`. Silent no-op when
732
+ disabled via RIPRAP_GLINER_ENABLE=0 or when the model failed to
733
+ load — preserves the existing FSM contract.
734
+ """
735
+ rec, trace = _step(state, "gliner_extract")
736
+ try:
737
+ from app.context.gliner_extract import extract_for_rag_hits
738
+ hits = state.get("rag") or []
739
+ if not hits:
740
+ rec["ok"] = True
741
+ rec["result"] = {"sources": 0, "skipped": "no rag hits"}
742
+ return state.update(gliner={}, trace=trace)
743
+ out = extract_for_rag_hits(hits)
744
+ rec["ok"] = True
745
+ rec["result"] = {
746
+ "sources": len(out),
747
+ "totals_by_label": _label_counts(out),
748
+ }
749
+ return state.update(gliner=out, trace=trace)
750
+ except Exception as e:
751
+ rec["ok"] = False
752
+ rec["err"] = str(e)
753
+ log.exception("gliner failed")
754
+ return state.update(gliner={}, trace=trace)
755
+ finally:
756
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
757
+
758
+
759
+ def _label_counts(gliner_out: dict[str, dict]) -> dict[str, int]:
760
+ counts: dict[str, int] = {}
761
+ for src in gliner_out.values():
762
+ for e in src.get("entities", []):
763
+ counts[e["label"]] = counts.get(e["label"], 0) + 1
764
+ return counts
765
+
766
+
767
  @action(reads=["geocode", "sandy", "dep", "floodnet", "nyc311", "microtopo",
768
+ "ida_hwm", "prithvi_water", "prithvi_live", "terramind",
769
+ "noaa_tides", "nws_alerts", "nws_obs", "ttm_forecast",
770
+ "ttm_311_forecast", "floodnet_forecast", "mta_entrances",
771
+ "nycha_developments", "doe_schools", "doh_hospitals",
772
+ "rag", "gliner"],
773
+ writes=["paragraph", "audit", "mellea", "trace"])
774
  def step_reconcile(state: State) -> State:
775
+ is_strict = _current_strict_mode()
776
+ rec, trace = _step(state, "mellea_reconcile_address" if is_strict else "reconcile_granite41")
777
+ mellea_meta = None
778
  try:
779
  snap = {
780
  "geocode": state.get("geocode"),
 
785
  "microtopo": state.get("microtopo"),
786
  "ida_hwm": state.get("ida_hwm"),
787
  "prithvi_water": state.get("prithvi_water"),
788
+ "noaa_tides": state.get("noaa_tides"),
789
+ "nws_alerts": state.get("nws_alerts"),
790
+ "nws_obs": state.get("nws_obs"),
791
+ "ttm_forecast": state.get("ttm_forecast"),
792
+ "ttm_311_forecast": state.get("ttm_311_forecast"),
793
+ "floodnet_forecast": state.get("floodnet_forecast"),
794
  "rag": state.get("rag"),
795
+ "gliner": state.get("gliner"),
796
+ "prithvi_live": state.get("prithvi_live"),
797
+ "terramind": state.get("terramind"),
798
+ "mta_entrances": state.get("mta_entrances"),
799
+ "nycha_developments": state.get("nycha_developments"),
800
+ "doe_schools": state.get("doe_schools"),
801
+ "doh_hospitals": state.get("doh_hospitals"),
802
  }
803
+ if is_strict:
804
+ from app.mellea_validator import DEFAULT_LOOP_BUDGET, reconcile_strict_streaming
805
+ from app.reconcile import EXTRA_SYSTEM_PROMPT, build_documents, trim_docs_to_plan
806
+ doc_msgs = build_documents(snap)
807
+ doc_msgs = trim_docs_to_plan(doc_msgs, _current_planned_specialists())
808
+ if not doc_msgs:
809
+ para = "No grounded data available for this address."
810
+ audit = {"raw": para, "dropped": []}
811
+ else:
812
+ token_cb = _current_token_callback()
813
+ attempt_cb = _current_mellea_attempt_callback()
814
+ mres = reconcile_strict_streaming(
815
+ doc_msgs, EXTRA_SYSTEM_PROMPT,
816
+ user_prompt="Write the cited paragraph now.",
817
+ loop_budget=DEFAULT_LOOP_BUDGET,
818
+ on_token=(lambda d, _ai: token_cb(d)) if token_cb else None,
819
+ on_attempt_end=attempt_cb,
820
+ )
821
+ para = mres["paragraph"]
822
+ audit = {"raw": para, "dropped": []}
823
+ mellea_meta = {
824
+ "rerolls": mres["rerolls"],
825
+ "n_attempts": mres["n_attempts"],
826
+ "requirements_passed": mres["requirements_passed"],
827
+ "requirements_failed": mres["requirements_failed"],
828
+ "requirements_total": mres["requirements_total"],
829
+ "model": mres["model"],
830
+ "loop_budget": mres["loop_budget"],
831
+ }
832
+ rec["result"] = {
833
+ "rerolls": (mellea_meta or {}).get("rerolls"),
834
+ "passed": (f"{len((mellea_meta or {}).get('requirements_passed') or [])}/"
835
+ f"{(mellea_meta or {}).get('requirements_total') or 0}"),
836
+ "paragraph_chars": len(para),
837
+ }
838
+ else:
839
+ para, audit = run_reconcile(snap, return_audit=True,
840
+ on_token=_current_token_callback())
841
+ rec["result"] = {
842
+ "paragraph_chars": len(para),
843
+ "dropped_sentences": len(audit["dropped"]),
844
+ }
845
  rec["ok"] = True
846
+ return state.update(paragraph=para, audit=audit,
847
+ mellea=mellea_meta, trace=trace)
848
  except Exception as e:
849
  rec["ok"] = False; rec["err"] = str(e)
850
  log.exception("reconcile failed")
851
+ return state.update(paragraph="", audit={"raw": "", "dropped": []},
852
+ mellea=None, trace=trace)
853
  finally:
854
  rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
855
 
856
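The strict path above delegates to Mellea's rejection sampling: generate a candidate paragraph, run every grounding requirement against it, and retry up to the loop budget. The control flow reduces to the loop below — a hypothetical sketch, not the Mellea API (function and check names are illustrative):

```python
from typing import Callable

def rejection_sample(generate: Callable[[], str],
                     checks: dict[str, Callable[[str], bool]],
                     loop_budget: int = 4) -> dict:
    """Return the first candidate that passes every requirement,
    else the last attempt along with the checks it failed."""
    attempt = ""
    failed: list[str] = []
    for n in range(1, loop_budget + 1):
        attempt = generate()
        failed = [name for name, ok in checks.items() if not ok(attempt)]
        if not failed:
            return {"paragraph": attempt, "n_attempts": n,
                    "requirements_failed": []}
    return {"paragraph": attempt, "n_attempts": loop_budget,
            "requirements_failed": failed}

# Toy usage: a "model" whose second draft satisfies both checks.
drafts = iter(["flood risk is TODO", "flood risk is low [sandy_2012]"])
res = rejection_sample(
    lambda: next(drafts),
    {"no_placeholder_tokens": lambda t: "TODO" not in t,
     "citations_present": lambda t: "[" in t},
)
print(res["n_attempts"])  # → 2
```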
 
857
+ import os as _os # noqa: E402
858
+
859
+ # Specialists that involve large spatial joins (every NYCHA development
860
+ # overlapped against multiple flood layers, every DOE school footprint
861
+ # joined to DEM/HAND, etc.) or per-query model inference (Prithvi-EO live
862
+ # STAC + ViT, TerraMind diffusion). They're ~1-3 minutes apiece on a
863
+ # laptop on the FIRST call (the lru_caches inside the registers warm up
864
+ # afterwards). The previous parallel-fan-out FSM hid that cost behind
865
+ # the longest single specialist; the linear FSM exposes it.
866
+ #
867
+ # Default OFF on local-Ollama so the demo briefing returns in well under
868
+ # 90 s. Enable explicitly with RIPRAP_HEAVY_SPECIALISTS=1 (e.g. on the
869
+ # AMD-vLLM path, where the reconciler's ~5 s leaves room for the joins).
870
+ _HEAVY_SPECIALISTS_ENABLED = _os.environ.get(
871
+ "RIPRAP_HEAVY_SPECIALISTS",
872
+ "0" if _os.environ.get("RIPRAP_LLM_PRIMARY", "ollama").lower() == "ollama" else "1",
873
+ ).lower() in ("1", "true", "yes")
874
+
875
+
876
  def build_app(query: str):
877
+ """Linear, single-action-per-step Burr application.
878
+
879
+ Order: cheap-first geo + flood layers, then live live network signals,
880
+ then RAG → reconcile. Heavy specialists (NYCHA / DOE / DOH register
881
+ joins, Prithvi-EO live STAC, TerraMind diffusion) are gated behind
882
+ RIPRAP_HEAVY_SPECIALISTS — see the module-level note above.
883
+ """
884
+ builder = (
885
  ApplicationBuilder()
 
886
  .with_state(query=query, trace=[])
887
  .with_entrypoint("geocode")
888
+ )
889
+
890
+ actions: dict[str, Any] = {
891
+ "geocode": step_geocode,
892
+ "sandy": step_sandy,
893
+ "dep": step_dep,
894
+ "floodnet": step_floodnet,
895
+ "nyc311": step_311,
896
+ "noaa_tides": step_noaa_tides,
897
+ "nws_alerts": step_nws_alerts,
898
+ "nws_obs": step_nws_obs,
899
+ "ttm_forecast": step_ttm_forecast,
900
+ "ttm_311_forecast": step_ttm_311_forecast,
901
+ "floodnet_forecast": step_floodnet_forecast,
902
+ "microtopo": step_microtopo,
903
+ "ida_hwm": step_ida_hwm,
904
+ "mta_entrances": step_mta_entrances,
905
+ "prithvi": step_prithvi, # baked GeoJSON polygons for Ida; cheap
906
+ }
907
+ if _HEAVY_SPECIALISTS_ENABLED:
908
+ actions["nycha"] = step_nycha
909
+ actions["doe_schools"] = step_doe_schools
910
+ actions["doh_hospitals"] = step_doh_hospitals
911
+ actions["prithvi_live"] = step_prithvi_live
912
+ actions["terramind"] = step_terramind
913
+ actions["rag"] = step_rag
914
+ actions["gliner"] = step_gliner
915
+ actions["reconcile"] = step_reconcile
916
+
917
+ # Sequential transitions — pair every adjacent action in the dict order.
918
+ keys = list(actions.keys())
919
+ transitions = list(zip(keys, keys[1:]))
920
+
921
+ return (
922
+ builder.with_actions(**actions).with_transitions(*transitions).build()
923
  )
924
 
925
 
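The transition wiring reduces to pairing each action name with its successor in dict order; the `zip` idiom in isolation, with a shortened action list for illustration:

```python
keys = ["geocode", "sandy", "dep", "rag", "reconcile"]
transitions = list(zip(keys, keys[1:]))
print(transitions[0], transitions[-1])  # → ('geocode', 'sandy') ('rag', 'reconcile')
```

Because the pairing is positional, gating a specialist out of the `actions` dict (as `RIPRAP_HEAVY_SPECIALISTS` does) automatically reroutes the chain around it with no transition-table edits.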
 
946
  "microtopo": final_state.get("microtopo"),
947
  "ida_hwm": final_state.get("ida_hwm"),
948
  "prithvi_water": final_state.get("prithvi_water"),
949
+ "terramind": final_state.get("terramind"),
950
+ "noaa_tides": final_state.get("noaa_tides"),
951
+ "nws_alerts": final_state.get("nws_alerts"),
952
+ "nws_obs": final_state.get("nws_obs"),
953
+ "ttm_forecast": final_state.get("ttm_forecast"),
954
+ "ttm_311_forecast": final_state.get("ttm_311_forecast"),
955
+ "floodnet_forecast": final_state.get("floodnet_forecast"),
956
+ "mta_entrances": final_state.get("mta_entrances"),
957
+ "nycha_developments": final_state.get("nycha_developments"),
958
+ "doe_schools": final_state.get("doe_schools"),
959
+ "doh_hospitals": final_state.get("doh_hospitals"),
960
  "rag": final_state.get("rag"),
961
  "paragraph": final_state.get("paragraph"),
962
  "audit": final_state.get("audit"),
963
+ "mellea": final_state.get("mellea"),
964
  "energy": _summarize_energy(trace),
965
  "trace": trace,
966
  }
967
 
968
 
969
  def iter_steps(query: str):
970
+ """Yield SSE-friendly events as the FSM runs.
971
 
972
+ Each Burr action emits exactly one trace record on completion; we
973
+ yield it as a `step` event the moment the iterate loop returns from
974
+ that action. Reconciler tokens stream through the threadlocal
975
+ `set_token_callback` (installed before this generator is iterated),
976
+ not through this queue.
977
+
978
+ Burr's `app.iterate(halt_after=["reconcile"])` runs synchronously,
979
+ yielding `(action, result, state)` after every action. We drive it
980
+ in a background thread so the per-action SSE events reach the
981
+ client as soon as each action returns, while the reconciler's
982
+ token callback fires concurrently from the same thread.
983
  """
984
+ import queue
985
+
986
+ q: queue.Queue[tuple[str, Any] | None] = queue.Queue()
987
+ seen_keys: set[tuple[str, float]] = set()
988
+
989
+ def _push_step(rec: dict) -> None:
990
+ key = (rec.get("step", ""), rec.get("started_at", 0.0))
991
+ if key in seen_keys:
992
+ return
993
+ seen_keys.add(key)
994
+ q.put(("step", rec))
995
+
996
  app = build_app(query)
997
+ final_state_holder: dict[str, Any] = {}
998
+
999
+ # Threadlocals are per-thread; the request thread (single_address.run
1000
+ # / neighborhood.run) sets the strict-mode flag, planner specialist
1001
+ # set, and token / Mellea-attempt callbacks, but Burr's app.iterate
1002
+ # runs in this generator's thread. Snapshot the request-thread state
1003
+ # and re-install on the iterate thread so step_reconcile sees them.
1004
+ _captured_strict = _current_strict_mode()
1005
+ _captured_planned = _current_planned_specialists()
1006
+ _captured_token_cb = _current_token_callback()
1007
+ _captured_mellea_cb = _current_mellea_attempt_callback()
1008
+
1009
+ def _run_iterate():
1010
+ set_strict_mode(_captured_strict)
1011
+ set_planned_specialists(_captured_planned)
1012
+ set_token_callback(_captured_token_cb)
1013
+ set_mellea_attempt_callback(_captured_mellea_cb)
1014
+ try:
1015
+ for _action_obj, _result, state in app.iterate(halt_after=["reconcile"]):
1016
+ final_state_holder["state"] = state
1017
+ # Each action appends one record to state.trace; emit the
1018
+ # most recent so the SSE client gets the step event the
1019
+ # moment Burr returns from that action.
1020
+ trace = state.get("trace") or []
1021
+ if trace:
1022
+ _push_step(trace[-1])
1023
+ except Exception as e:
1024
+ log.exception("iterate raised")
1025
+ q.put(("error", {"err": f"{type(e).__name__}: {e}"}))
1026
+ finally:
1027
+ set_strict_mode(False)
1028
+ set_planned_specialists(None)
1029
+ set_token_callback(None)
1030
+ set_mellea_attempt_callback(None)
1031
+ q.put(None) # sentinel
1032
+
1033
+ runner = _threading.Thread(target=_run_iterate, name="riprap-fsm",
1034
+ daemon=True)
1035
+ runner.start()
1036
+
1037
+ while True:
1038
+ item = q.get()
1039
+ if item is None:
1040
+ break
1041
+ kind, payload = item
1042
+ if kind == "step":
1043
  yield {
1044
  "kind": "step",
1045
+ "step": payload.get("step"),
1046
+ "ok": payload.get("ok"),
1047
+ "elapsed_s": payload.get("elapsed_s"),
1048
+ "result": payload.get("result"),
1049
+ "err": payload.get("err"),
1050
  }
1051
+ elif kind == "error":
1052
+ yield {"kind": "error", **payload}
1053
+
1054
+ runner.join(timeout=5)
1055
+ state = final_state_holder.get("state")
1056
+ if state is None:
1057
+ return
1058
  trace = state.get("trace", [])
1059
  yield {
1060
  "kind": "final",
 
1066
  "microtopo": state.get("microtopo"),
1067
  "ida_hwm": state.get("ida_hwm"),
1068
  "prithvi_water": state.get("prithvi_water"),
1069
+ "prithvi_live": state.get("prithvi_live"),
1070
+ "terramind": state.get("terramind"),
1071
+ "noaa_tides": state.get("noaa_tides"),
1072
+ "nws_alerts": state.get("nws_alerts"),
1073
+ "nws_obs": state.get("nws_obs"),
1074
+ "ttm_forecast": state.get("ttm_forecast"),
1075
+ "ttm_311_forecast": state.get("ttm_311_forecast"),
1076
+ "floodnet_forecast": state.get("floodnet_forecast"),
1077
+ "mta_entrances": state.get("mta_entrances"),
1078
+ "nycha_developments": state.get("nycha_developments"),
1079
+ "doe_schools": state.get("doe_schools"),
1080
+ "doh_hospitals": state.get("doh_hospitals"),
1081
  "rag": state.get("rag"),
1082
+ "gliner": state.get("gliner"),
1083
  "paragraph": state.get("paragraph"),
1084
  "audit": state.get("audit"),
1085
+ "mellea": state.get("mellea"),
1086
  "energy": _summarize_energy(trace),
1087
  }
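The producer/consumer handoff above (a worker thread feeds a queue, the generator drains it, `None` acts as the sentinel, and thread-local request state is snapshotted on the request thread then re-installed on the worker) can be sketched in isolation. All names below are illustrative stand-ins, not the app's actual helpers:

```python
import queue
import threading

_tls = threading.local()  # thread-local storage: each thread sees its own value

def set_strict_mode(v: bool) -> None:
    _tls.strict = v

def current_strict_mode() -> bool:
    return getattr(_tls, "strict", False)  # worker threads start unset

def stream_steps(steps):
    """Run steps on a worker thread; yield ("step", rec) events as they land."""
    q: queue.Queue = queue.Queue()
    captured = current_strict_mode()  # snapshot on the request thread

    def worker():
        set_strict_mode(captured)  # re-install so the worker sees the flag
        try:
            for name in steps:
                q.put(("step", {"step": name, "strict": current_strict_mode()}))
        finally:
            q.put(None)  # sentinel: tells the consumer the producer is done

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is None:
            break
        yield item

set_strict_mode(True)
events = list(stream_steps(["geocode", "reconcile"]))
```

Without the re-install step the worker would read `strict=False`, which is exactly the failure mode the comment in the source guards against.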
app/geocode.py CHANGED
@@ -1,21 +1,45 @@
1
- """NYC address geocoding via the city's public Geosupport service (no key).
2
 
3
- Uses NYC Department of City Planning's Geoclient-replacement via the open
4
- Geosearch API (geosearch.planninglabs.nyc) no auth required, NYC-only,
5
- runs against the public service. Stays inside the "open civic data" lane.
 
 
6
 
7
  Includes a borough-hint post-filter so Queens hyphenated-style addresses
8
- (e.g. "153-09 90 Ave, Jamaica, Queens") preferentially resolve to the
9
  borough the user named.
10
  """
11
  from __future__ import annotations
12
 
 
13
  import re
14
  from dataclasses import dataclass
15
 
16
  import httpx
17
 
 
 
18
  URL = "https://geosearch.planninglabs.nyc/v2/search"
19
 
20
  _BOROUGHS = ("Manhattan", "Bronx", "Brooklyn", "Queens", "Staten Island")
21
 
@@ -79,15 +103,83 @@ def geocode(text: str, limit: int = 5) -> list[GeocodeHit]:
79
  return out
80
 
81
 
82
  def geocode_one(text: str) -> GeocodeHit | None:
83
- """Return the best NYC match for `text`. If the user mentions a
84
- borough or neighborhood we recognize, filter candidates to that
85
- borough before picking the top hit. Avoids `183-12 Liberty Avenue,
86
- Queens` resolving to a Brooklyn match the API surfaced first."""
87
  hint = _detect_borough(text)
88
  hits = geocode(text, limit=8)
89
  if hint:
90
  in_boro = [h for h in hits if h.borough and h.borough.lower() == hint.lower()]
91
  if in_boro:
92
  return in_boro[0]
93
- return hits[0] if hits else None
1
+ """Address geocoding: NYC primary + national fallback.
2
 
3
+ NYC primary: NYC DCP Geosearch (geosearch.planninglabs.nyc), no auth,
4
+ NYC-only. It will fuzzy-match upstate addresses to NYC streets — e.g.
5
+ '257 Washington Ave, Albany NY' silently maps to Clinton Hill, Brooklyn.
6
+ We detect this via a non-NYC region or non-NYC ZIP and fall back to
7
+ OpenStreetMap Nominatim (no key, free, rate-limited per usage policy).
8
 
9
  Includes a borough-hint post-filter so Queens hyphenated-style addresses
10
+ (e.g. '153-09 90 Ave, Jamaica, Queens') preferentially resolve to the
11
  borough the user named.
12
  """
13
  from __future__ import annotations
14
 
15
+ import logging
16
  import re
17
  from dataclasses import dataclass
18
 
19
  import httpx
20
 
21
+ log = logging.getLogger("riprap.geocode")
22
+
23
  URL = "https://geosearch.planninglabs.nyc/v2/search"
24
+ NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"
25
+ NOMINATIM_UA = "Riprap-NYC/0.1 (civic-flood-tool; +https://huggingface.co/spaces/msradam/riprap-nyc)"
26
+
27
+ # NYC-bbox guard: lat 40.49–40.92, lon -74.27 to -73.69. Anything outside
28
+ # this is probably not NYC; treat NYC Geosearch hits outside it as bogus.
29
+ NYC_BBOX = (40.49, -74.27, 40.92, -73.69)
30
+
31
+ # NYC ZIP prefixes: 100–102 (Manhattan), 103 (Staten Island), 104 (Bronx),
31
+ # 110–111 / 113–114 / 116 (Queens), 112 (Brooklyn). Upstate NY is 12x–14x.
32
+ # We use this only as a HINT to escalate to Nominatim, not as a hard filter.
35
+ _UPSTATE_ZIP_RE = re.compile(r"\b1[2-4]\d{3}\b")
36
+ _NON_NYC_HINTS = re.compile(
37
+ r"\b(albany|troy|schenectady|saratoga|kingston|poughkeepsie|newburgh|"
38
+ r"yonkers|white plains|hudson|rhinebeck|peekskill|beacon|tarrytown|"
39
+ r"new paltz|catskill|tivoli|hyde park|coxsackie|cohoes|amsterdam|"
40
+ r"glens falls|lake george|nyack|garrison|cold spring|highland|saugerties)\b",
41
+ re.IGNORECASE,
42
+ )
43
 
44
  _BOROUGHS = ("Manhattan", "Bronx", "Brooklyn", "Queens", "Staten Island")
45
 
 
103
  return out
104
 
105
 
106
+ def _looks_upstate(text: str) -> bool:
107
+ """Heuristic: should this query bypass NYC Geosearch?"""
108
+ if _UPSTATE_ZIP_RE.search(text):
109
+ return True
110
+ if _NON_NYC_HINTS.search(text):
111
+ return True
112
+ return False
113
+
114
+
115
+ def _in_nyc_bbox(lat: float, lon: float) -> bool:
116
+ s, w, n, e = NYC_BBOX
117
+ return s <= lat <= n and w <= lon <= e
118
+
119
+
120
+ def geocode_nominatim(text: str) -> GeocodeHit | None:
121
+ """National OSM Nominatim fallback. Used when NYC Geosearch can't
122
+ plausibly answer the query."""
123
+ try:
124
+ r = httpx.get(NOMINATIM_URL, params={
125
+ "q": text, "format": "jsonv2", "addressdetails": "1",
126
+ "limit": 1, "countrycodes": "us",
127
+ }, headers={"User-Agent": NOMINATIM_UA}, timeout=10)
128
+ r.raise_for_status()
129
+ rows = r.json()
130
+ except Exception as e:
131
+ log.warning("Nominatim fetch failed: %r", e)
132
+ return None
133
+ if not rows:
134
+ return None
135
+ row = rows[0]
136
+ addr = row.get("address") or {}
137
+ label = row.get("display_name") or text
138
+ return GeocodeHit(
139
+ address=label,
140
+ borough=addr.get("city") or addr.get("town") or addr.get("village") or addr.get("county"),
141
+ lat=float(row["lat"]),
142
+ lon=float(row["lon"]),
143
+ bbl=None,
144
+ bin=None,
145
+ raw={"source": "nominatim", **row},
146
+ )
147
+
148
+
149
  def geocode_one(text: str) -> GeocodeHit | None:
150
+ """Best match for `text`, using NYC Geosearch primary with a national
151
+ OSM Nominatim fallback for upstate / non-NYC queries.
152
+
153
+ Strategy:
154
+ 1. If query mentions a known non-NYC city or has an upstate ZIP,
155
+ go straight to Nominatim — Geosearch will silently fuzzy-snap
156
+ '257 Washington Ave, Albany' to Clinton Hill Brooklyn otherwise.
157
+ 2. Otherwise try Geosearch with the borough-hint post-filter.
158
+ 3. If Geosearch returns nothing OR returns a hit outside the NYC
159
+ bbox (which means even Geosearch knows it isn't NYC), escalate
160
+ to Nominatim.
161
+ """
162
+ if _looks_upstate(text):
163
+ log.info("upstate hint detected in %r — using Nominatim", text)
164
+ hit = geocode_nominatim(text)
165
+ if hit:
166
+ return hit
167
+
168
  hint = _detect_borough(text)
169
  hits = geocode(text, limit=8)
170
  if hint:
171
  in_boro = [h for h in hits if h.borough and h.borough.lower() == hint.lower()]
172
  if in_boro:
173
  return in_boro[0]
174
+
175
+ if hits:
176
+ top = hits[0]
177
+ if top.lat is not None and _in_nyc_bbox(top.lat, top.lon):
178
+ return top
179
+ # Geosearch returned a hit, but it's outside the NYC bbox — that
180
+ # means even the NYC API thinks the answer isn't NYC. Try
181
+ # Nominatim before giving up.
182
+ log.info("Geosearch top hit outside NYC bbox (%s, %s) — falling back",
183
+ top.lat, top.lon)
184
+
185
+ return geocode_nominatim(text)
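The two guards that drive the fallback ladder, `_looks_upstate` and `_in_nyc_bbox`, are pure functions and easy to check in isolation. A self-contained restatement with the constants copied from the module (the ZIP check alone; the module also matches a list of non-NYC city names):

```python
import re

NYC_BBOX = (40.49, -74.27, 40.92, -73.69)         # (south, west, north, east)
_UPSTATE_ZIP_RE = re.compile(r"\b1[2-4]\d{3}\b")  # 12xxx-14xxx = upstate NY

def in_nyc_bbox(lat: float, lon: float) -> bool:
    s, w, n, e = NYC_BBOX
    return s <= lat <= n and w <= lon <= e

def looks_upstate(text: str) -> bool:
    # ZIP hint only; a match escalates straight to Nominatim.
    return bool(_UPSTATE_ZIP_RE.search(text))

print(looks_upstate("257 Washington Ave, Albany NY 12206"))  # True
print(in_nyc_bbox(42.6526, -73.7562))   # False: Albany is north of the box
print(in_nyc_bbox(40.6892, -73.9902))   # True: a Brooklyn point
```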
app/intents/__init__.py ADDED
@@ -0,0 +1,3 @@
1
+ """Per-intent execution modules. Each intent knows how to take a planner
2
+ Plan and run only the specialists relevant to it, returning a
3
+ reconciler-ready set of documents and a paragraph."""
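Each intent module repeats a tiny `_doc` helper that wraps one evidence source into a `document <id>` chat message, so the reconciler prompts can demand citations in square brackets like `[nta_resolve]`. A standalone restatement (the sample strings are illustrative placeholders):

```python
def doc(doc_id: str, body_lines: list[str]) -> dict:
    # Mirrors the `_doc` helper each intent module defines: one message per
    # evidence source, with the doc id embedded in the role string.
    return {"role": f"document {doc_id}", "content": "\n".join(body_lines)}

messages = [
    doc("nta_resolve", ["Source: NYC DCP NTA 2020.",
                        "Target neighborhood: Example Neighborhood (XX01)."]),
    {"role": "system", "content": "Cite document ids in square brackets."},
    {"role": "user", "content": "Write the briefing now."},
]
```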
app/intents/development_check.py ADDED
@@ -0,0 +1,324 @@
1
+ """development_check intent — "what are they building in <X> and is it risky?"
2
+
3
+ Pipeline:
4
+ 1. Resolve target text → NTA polygon
5
+ 2. Pull active DOB construction permits (NB / A1 / DM, last ~18 mo)
6
+ inside the polygon
7
+ 3. Cross-reference each permit with the Sandy + DEP scenarios already
8
+ loaded in memory
9
+ 4. Aggregate counts; rank flagged projects by severity
10
+ 5. Reconcile via Granite 4.1 with a development-briefing prompt that
11
+ names specific projects and addresses
12
+ """
13
+ from __future__ import annotations
14
+
15
+ import logging
16
+ import time
17
+ from typing import Any
18
+
19
+ from app import llm
20
+ from app.areas import nta
21
+ from app.context import dob_permits
22
+ from app.rag import retrieve as rag_retrieve
23
+
24
+ log = logging.getLogger("riprap.intent.development_check")
25
+
26
+ # Reconciler model — see app/reconcile.py for the env-var contract.
27
+ import os as _os # noqa: E402
28
+
29
+ OLLAMA_MODEL = _os.environ.get("RIPRAP_RECONCILER_MODEL",
30
+ _os.environ.get("RIPRAP_OLLAMA_MODEL", "granite4.1:8b"))
31
+
32
+ EXTRA_SYSTEM_PROMPT = """Write a flood-exposure briefing about active construction in an NYC neighborhood. Use ONLY the facts in the provided documents.
33
+
34
+ Output this markdown skeleton verbatim, filling each `<...>` with content drawn only from the documents. After every numerical claim, append the document id in square brackets — e.g. `<count> [dob_permits]`. Bold at most one phrase per section using `**...**`. Omit any section whose supporting facts are absent from the documents.
35
+
36
+ ```
37
+ **Status.**
38
+ <one sentence: name the neighborhood from [nta_resolve] and the headline counts from [dob_permits] (total active projects, fraction in Sandy zone, fraction in DEP scenarios)>.
39
+
40
+ **Flagged projects.**
41
+ - <project address from [dob_permits]> ([dob_permits]). <job_type_label> issued <date>; owner <owner_business>. <flood-layer summary>.
42
+ - <next project from [dob_permits], same pattern>
43
+ - <continue for each flagged project, max 6>
44
+
45
+ **Pattern.**
46
+ <1-2 sentences observing which streets concentrate the flagged projects and the new-building / major-alteration mix from [dob_permits]>.
47
+
48
+ **Policy context.**
49
+ <1 sentence per RAG hit, citing the agency name and [rag_*]>.
50
+ ```
51
+
52
+ Constraints:
53
+ - Copy addresses, BBLs, dates, and owner names verbatim from the documents — no paraphrasing.
54
+ - If [dob_permits] reports 0 flagged projects, omit the **Flagged projects.** section and say so in **Status.**.
55
+ - If only [nta_resolve] is present and no [dob_permits], output exactly: `No grounded data available for this neighborhood.`
56
+ """
57
+
58
+
59
+ def run(plan, query: str, progress_q=None, strict: bool = False) -> dict[str, Any]:
60
+ """Execute the development_check Plan. If progress_q is provided
61
+ (a queue.Queue), each finalized step record is put on it so a
62
+ streaming endpoint can render the trace live.
63
+
64
+ strict=True routes through Mellea-validated reconciliation (rejection
65
+ sampling against four grounding requirements). Disables token
66
+ streaming — the briefing arrives in one shot after Mellea's loop
67
+ settles. Trace gains a `mellea_validate` row with rerolls + which
68
+ requirements passed.
69
+ """
70
+ t0 = time.time()
71
+ trace: list[dict] = []
72
+
73
+ def _emit(r: dict):
74
+ if progress_q is not None:
75
+ progress_q.put({"kind": "step", **r})
76
+
77
+ target_text = next(
78
+ (t["text"] for t in plan.targets if t.get("type") in ("nta", "borough")),
79
+ None,
80
+ )
81
+ rec = {"step": "nta_resolve", "started_at": t0, "ok": False}
82
+ trace.append(rec)
83
+ # Try the planner's target first; if it didn't pick one, fall back to
84
+ # scanning the raw query text for any known neighborhood/borough name.
85
+ matches = nta.resolve(target_text) if target_text else []
86
+ if not matches:
87
+ log.info("planner gave no usable target (%r); scanning query %r",
88
+ target_text, query)
89
+ matches = nta.resolve_from_text(query)
90
+ if not matches:
91
+ rec["err"] = f"no NTA match in target={target_text!r} or query={query!r}"
92
+ rec["elapsed_s"] = round(time.time() - t0, 2)
93
+ return _empty(plan, query, trace, error=rec["err"])
94
+ target = matches[0]
95
+ rec["ok"] = True
96
+ rec["result"] = {"nta_code": target["nta_code"],
97
+ "nta_name": target["nta_name"],
98
+ "borough": target["borough"],
99
+ "bbox": list(target["geometry"].bounds)}
100
+ rec["elapsed_s"] = round(time.time() - t0, 2)
101
+ _emit(rec)
102
+
103
+ poly = target["geometry"]
104
+ docs: list[dict] = []
105
+ permits_summary = None
106
+ rag_out: list = []
107
+
108
+ # ---- DOB permits ------------------------------------------------------
109
+ p_t0 = time.time()
110
+ prec = {"step": "dob_permits_nta", "started_at": p_t0, "ok": False}
111
+ trace.append(prec)
112
+ try:
113
+ # top_n=5: 5 flagged projects in the doc context is the sweet spot —
114
+ # rich enough for a journalist briefing, cheap enough to stay under
115
+ # ~25 s reconcile on T4 with the 8b model.
116
+ permits_summary = dob_permits.summary_for_polygon(poly, top_n=5)
117
+ prec["ok"] = True
118
+ prec["result"] = {
119
+ "n_total": permits_summary["n_total"],
120
+ "n_in_sandy": permits_summary["n_in_sandy"],
121
+ "n_in_dep_any": permits_summary["n_in_dep_any"],
122
+ # Pin data so the UI can render permits the moment this step
123
+ # finishes, instead of waiting for the `final` event.
124
+ "all_pins": permits_summary["all_pins"],
125
+ }
126
+ except Exception as e:
127
+ prec["err"] = str(e)
128
+ log.exception("dob_permits failed")
129
+ prec["elapsed_s"] = round(time.time() - p_t0, 2)
130
+ _emit(prec)
131
+
132
+ # ---- RAG --------------------------------------------------------------
133
+ if "rag" in plan.specialists:
134
+ r_t0 = time.time()
135
+ rrec = {"step": "rag_dev", "started_at": r_t0, "ok": False}
136
+ trace.append(rrec)
137
+ try:
138
+ q = (f"flood resilience new construction development {target['nta_name']} "
139
+ f"{target['borough']} hardening building code")
140
+ rag_out = rag_retrieve(q, k=2, min_score=0.50)
141
+ rrec["ok"] = True
142
+ rrec["result"] = {"hits": len(rag_out)}
143
+ except Exception as e:
144
+ rrec["err"] = str(e)
145
+ rrec["elapsed_s"] = round(time.time() - r_t0, 2)
146
+ _emit(rrec)
147
+
148
+ # ---- documents --------------------------------------------------------
149
+ docs.append(_doc("nta_resolve", [
150
+ "Source: NYC DCP Neighborhood Tabulation Areas 2020.",
151
+ f"Target neighborhood: {target['nta_name']} (NTA {target['nta_code']}), "
152
+ f"in the borough of {target['borough']}.",
153
+ ]))
154
+ if permits_summary:
155
+ ps = permits_summary
156
+ body = [
157
+ "Source: NYC DOB Permit Issuance (Socrata ipu4-2q9a), filtered to "
158
+ "active New Building / Major Alteration / Demolition jobs in the "
159
+ "trailing 18 months. Cross-referenced with NYC Sandy 2012 "
160
+ "inundation extent and 3 DEP Stormwater scenarios.",
161
+ f"Total active major-construction projects in {target['nta_name']}: "
162
+ f"{ps['n_total']}.",
163
+ f"Of these: {ps['n_in_sandy']} fall inside the 2012 Sandy "
164
+ f"inundation zone; {ps['n_in_dep_any']} fall inside at least one "
165
+ f"DEP Stormwater scenario; {ps['n_in_dep_severe']} fall in the "
166
+ f"deeper DEP bands (1-4 ft or >4 ft).",
167
+ ]
168
+ if ps.get("by_job_type"):
169
+ mix = "; ".join(f"{n} {k}" for k, n in ps["by_job_type"].items())
170
+ body.append(f"Job-type mix: {mix}.")
171
+ for p in ps["flagged_top"]:
172
+ scen_str = (", ".join(p["dep_scenarios"]) or "none")
173
+ body.append(
174
+ f"- {p['address']}, {p['borough']} (BBL {p.get('bbl') or 'unknown'}). "
175
+ f"{p['job_type_label']}, permit issued {p['issuance_date']}, "
176
+ f"status {p['permit_status']}. "
177
+ f"Owner: {p.get('owner_business') or 'unknown'}. "
178
+ f"In Sandy zone: {p['in_sandy']}; in DEP scenarios: {scen_str}; "
179
+ f"max DEP depth class: {p['dep_max_class']}."
180
+ )
181
+ docs.append(_doc("dob_permits", body))
182
+ for h in rag_out:
183
+ docs.append(_doc(h["doc_id"], [
184
+ f"Source: {h['citation']}, page {h.get('page', '')}.",
185
+ f"Retrieved passage (verbatim): {h['text']}",
186
+ ]))
187
+
188
+ # ---- reconcile --------------------------------------------------------
189
+ rec_t0 = time.time()
190
+ rec_step = {"step": "reconcile_development", "started_at": rec_t0, "ok": False}
191
+ trace.append(rec_step)
192
+ paragraph = ""
193
+ audit = {"raw": "", "dropped": []}
194
+ mellea_meta = None
195
+ if len(docs) <= 1:
196
+ paragraph = ("**Status.** No active construction permit data available "
197
+ f"for {target['nta_name']} [nta_resolve].")
198
+ audit = {"raw": paragraph, "dropped": []}
199
+ rec_step["ok"] = True
200
+ elif strict:
201
+ # Streaming Mellea path: tokens stream during each attempt; on
202
+ # validation failure we emit a mellea_attempt event and reroll.
203
+ rec_step["step"] = "mellea_reconcile_development"
204
+ try:
205
+ from app.mellea_validator import DEFAULT_LOOP_BUDGET, reconcile_strict_streaming
206
+ from app.reconcile import trim_docs_to_plan as _trim
207
+ docs = _trim(docs, set(plan.specialists or []))
208
+ def _on_token(delta: str, attempt_idx: int):
209
+ if progress_q is not None:
210
+ progress_q.put({"kind": "token", "delta": delta,
211
+ "attempt": attempt_idx})
212
+ def _on_attempt_end(attempt_idx, passed, failed):
213
+ if progress_q is not None:
214
+ progress_q.put({"kind": "mellea_attempt",
215
+ "attempt": attempt_idx,
216
+ "passed": passed, "failed": failed})
217
+ mres = reconcile_strict_streaming(
218
+ docs, EXTRA_SYSTEM_PROMPT,
219
+ user_prompt="Write the development briefing now.",
220
+ model=OLLAMA_MODEL, loop_budget=DEFAULT_LOOP_BUDGET,
221
+ on_token=_on_token if progress_q else None,
222
+ on_attempt_end=_on_attempt_end if progress_q else None,
223
+ )
224
+ paragraph = mres["paragraph"]
225
+ audit = {"raw": paragraph, "dropped": []}
226
+ mellea_meta = {
227
+ "rerolls": mres["rerolls"],
228
+ "n_attempts": mres["n_attempts"],
229
+ "requirements_passed": mres["requirements_passed"],
230
+ "requirements_failed": mres["requirements_failed"],
231
+ "requirements_total": mres["requirements_total"],
232
+ "model": mres["model"],
233
+ "loop_budget": mres["loop_budget"],
234
+ }
235
+ rec_step["ok"] = True
236
+ rec_step["result"] = {
237
+ "rerolls": mellea_meta["rerolls"],
238
+ "passed": f"{len(mellea_meta['requirements_passed'])}/{mellea_meta['requirements_total']}",
239
+ "paragraph_chars": len(paragraph),
240
+ }
241
+ except Exception as e:
242
+ rec_step["err"] = str(e)
243
+ log.exception("Mellea-validated reconcile failed")
244
+ paragraph = ""
245
+ audit = {"raw": "", "dropped": []}
246
+ else:
247
+ def _on_token(delta: str):
248
+ if progress_q is not None:
249
+ progress_q.put({"kind": "token", "delta": delta})
250
+ try:
251
+ paragraph, audit = _reconcile(docs, on_token=_on_token if progress_q else None)
252
+ rec_step["ok"] = True
253
+ rec_step["result"] = {"paragraph_chars": len(paragraph),
254
+ "dropped": len(audit["dropped"])}
255
+ except Exception as e:
256
+ rec_step["err"] = str(e)
257
+ log.exception("development reconcile failed")
258
+ rec_step["elapsed_s"] = round(time.time() - rec_t0, 2)
259
+ _emit(rec_step)
260
+
261
+ target_safe = {k: v for k, v in target.items() if k != "geometry"}
262
+ target_safe["bbox"] = list(target["geometry"].bounds)
263
+ return {
264
+ "intent": "development_check",
265
+ "query": query,
266
+ "plan": {
267
+ "intent": plan.intent,
268
+ "targets": plan.targets,
269
+ "specialists": plan.specialists,
270
+ "rationale": plan.rationale,
271
+ },
272
+ "target": target_safe,
273
+ "n_matches": len(matches),
274
+ "dob_summary": permits_summary,
275
+ "rag": rag_out,
276
+ "paragraph": paragraph,
277
+ "audit": audit,
278
+ "mellea": mellea_meta,
279
+ "trace": trace,
280
+ "total_s": round(time.time() - t0, 2),
281
+ }
282
+
283
+
284
+ def _doc(doc_id: str, body_lines: list[str]) -> dict:
285
+ return {"role": f"document {doc_id}", "content": "\n".join(body_lines)}
286
+
287
+
288
+ def _reconcile(docs: list[dict], on_token=None) -> tuple[str, dict]:
289
+ from app.reconcile import verify_paragraph
290
+ messages = docs + [
291
+ {"role": "system", "content": EXTRA_SYSTEM_PROMPT},
292
+ {"role": "user", "content": "Write the development briefing now."},
293
+ ]
294
+ # num_ctx 6144 covers a typical dev_check prompt: system ~700 + nta
295
+ # doc + DOB body with 5 flagged projects ~3000 + RAG hits ~1000.
296
+ # 12288 was over-allocating KV cache — costly on T4. num_predict caps
297
+ # the briefing at ~600 tokens (4 sections + 5 bullet projects).
298
+ OPTS = {"temperature": 0, "num_ctx": 6144, "num_predict": 600}
299
+ if on_token is None:
300
+ resp = llm.chat(model=OLLAMA_MODEL, messages=messages, options=OPTS)
301
+ raw = resp["message"]["content"].strip()
302
+ else:
303
+ chunks: list[str] = []
304
+ for chunk in llm.chat(model=OLLAMA_MODEL, messages=messages,
305
+ stream=True, options=OPTS):
306
+ delta = (chunk.get("message") or {}).get("content") or ""
307
+ if delta:
308
+ chunks.append(delta)
309
+ on_token(delta)
310
+ raw = "".join(chunks).strip()
311
+ cleaned, dropped = verify_paragraph(raw, docs)
312
+ return cleaned, {"raw": raw, "dropped": dropped}
313
+
314
+
315
+ def _empty(plan, query, trace, error):
316
+ return {
317
+ "intent": "development_check",
318
+ "query": query,
319
+ "error": error,
320
+ "plan": {"intent": plan.intent, "targets": plan.targets,
321
+ "specialists": plan.specialists, "rationale": plan.rationale},
322
+ "trace": trace,
323
+ "paragraph": f"Could not resolve target to an NTA: {error}",
324
+ }
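Every intent module builds its trace the same way: append the record before doing the work, flip `ok` or attach `err`, and stamp `elapsed_s` in a `finally`. A condensed restatement of that convention (`run_step` here is illustrative; live_now's `_run_step` is the in-tree version, and development_check writes the same record shape inline):

```python
import time

def run_step(trace: list, name: str, fn):
    """One trace record per step: ok/result on success, ok=False/err on failure."""
    t0 = time.time()
    rec = {"step": name, "started_at": t0, "ok": False}
    trace.append(rec)  # append first, so a crash still leaves a record
    try:
        rec["result"] = fn()
        rec["ok"] = True
    except Exception as e:
        rec["err"] = f"{type(e).__name__}: {e}"
    finally:
        rec["elapsed_s"] = round(time.time() - t0, 2)
    return rec

trace = []
run_step(trace, "nta_resolve", lambda: {"nta_code": "BK0101"})
run_step(trace, "dob_permits_nta", lambda: 1 / 0)  # deliberate failure
ok_steps = [r["step"] for r in trace if r["ok"]]
```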
app/intents/live_now.py ADDED
@@ -0,0 +1,231 @@
1
+ """live_now intent — only fire live specialists. No geocode, no static
2
+ historic/modeled layers. Reconciler emits a "right now" status note.
3
+
4
+ Targets are usually `{"type": "nyc"}` for the whole city; if the user
5
+ named a specific borough we still query the same gauges (NOAA only
6
+ has 3 NYC stations) and the same NWS forecast zones (the API takes a
7
+ lat/lon point — we use a borough centroid).
8
+ """
9
+ from __future__ import annotations
10
+
11
+ import logging
12
+ import time
13
+ from typing import Any
14
+
15
+ from app import llm
16
+ from app.context import noaa_tides, nws_alerts, nws_obs
17
+ from app.live import ttm_forecast
18
+
19
+ log = logging.getLogger("riprap.intent.live_now")
20
+
21
+ import os as _os # noqa: E402
22
+
23
+ # live_now stays on the smaller model: short outputs, speed matters more.
24
+ OLLAMA_MODEL = _os.environ.get("RIPRAP_LIVE_MODEL",
25
+ _os.environ.get("RIPRAP_OLLAMA_MODEL", "granite4.1:3b"))
26
+
27
+ # NWS API requires a lat/lon point; pick a representative one per borough.
28
+ BOROUGH_POINTS = {
29
+ "Manhattan": (40.7831, -73.9712), # Central Park
30
+ "Brooklyn": (40.6500, -73.9500), # Park Slope-ish
31
+ "Queens": (40.7282, -73.7949), # Forest Hills
32
+ "Bronx": (40.8448, -73.8648), # Fordham
33
+ "Staten Island": (40.5795, -74.1502), # central SI
34
+ "NYC": (40.7128, -74.0060), # Lower Manhattan default
35
+ }
36
+
37
+
38
+ EXTRA_SYSTEM_PROMPT = """Write a current-conditions flood briefing for NYC. Use ONLY the facts in the provided documents.
39
+
40
+ Output this markdown skeleton verbatim, filling each `<...>` with content drawn only from the documents. After every numerical claim, append the document id in square brackets — e.g. `<value> [noaa_tides]`. Bold at most one phrase per section using `**...**`. Omit any section whose supporting facts are absent from the documents.
41
+
42
+ ```
43
+ **Status.**
44
+ <one sentence on whether flood-relevant conditions are active right now, citing the strongest live signal>.
45
+
46
+ **Live signals.**
47
+ <1-3 sentences citing each live signal that fired: NWS alerts from [nws_alerts], tide observation and residual from [noaa_tides], recent precipitation from [nws_obs], any TTM forecast peak from [ttm_forecast]>.
48
+ ```
49
+
50
+ Constraints:
51
+ - Be brief — current-conditions reports are read in seconds.
52
+ - Copy numerical values verbatim from documents. Do not round.
53
+ - Do not invoke historic events (Sandy, Ida) — this is a now-only report.
54
+ - If every live document indicates calm, write only: `**Status.** No active flood-relevant signals at this time [live_target].`
55
+ """
56
+
57
+
58
+ def run(plan, query: str, progress_q=None) -> dict[str, Any]:
59
+ t0 = time.time()
60
+ trace: list[dict] = []
61
+
62
+ def _emit(r: dict):
63
+ if progress_q is not None:
64
+ progress_q.put({"kind": "step", **r})
65
+
66
+ boro = next((t.get("text") for t in plan.targets if t.get("type") == "borough"), None)
67
+ if boro and boro in BOROUGH_POINTS:
68
+ lat, lon = BOROUGH_POINTS[boro]
69
+ place = boro
70
+ else:
71
+ lat, lon = BOROUGH_POINTS["NYC"]
72
+ place = "NYC"
73
+
74
+ docs: list[dict] = []
75
+ tides_out = alerts_out = obs_out = ttm_out = None
76
+
77
+ if "noaa_tides" in plan.specialists:
78
+ tides_out = _run_step(trace, "noaa_tides", lambda: noaa_tides.summary_for_point(lat, lon), progress_q)
79
+ if "nws_alerts" in plan.specialists:
80
+ alerts_out = _run_step(trace, "nws_alerts", lambda: nws_alerts.summary_for_point(lat, lon), progress_q)
81
+ if "nws_obs" in plan.specialists:
82
+ obs_out = _run_step(trace, "nws_obs", lambda: nws_obs.summary_for_point(lat, lon), progress_q)
83
+ if "ttm_forecast" in plan.specialists:
84
+ ttm_out = _run_step(trace, "ttm_forecast", lambda: ttm_forecast.summary_for_point(lat, lon), progress_q)
85
+
86
+ # ---- documents ----
87
+ docs.append({"role": "document live_target", "content":
88
+ f"Source: planner. Live-conditions report for {place}. "
89
+ f"Coordinates used for NWS lookups: {lat:.4f}, {lon:.4f}."})
90
+
91
+ if tides_out and tides_out.get("observed_ft_mllw") is not None:
92
+ body = [
93
+ f"Source: NOAA CO-OPS Tides & Currents. Station: {tides_out['station_name']} "
94
+ f"(NOAA {tides_out['station_id']}, {tides_out['distance_km']} km from {place}).",
95
+ f"Observation time: {tides_out.get('obs_time') or 'unknown'}.",
96
+ f"Observed water level: {tides_out['observed_ft_mllw']} ft above MLLW.",
97
+ ]
98
+ if tides_out.get("predicted_ft_mllw") is not None:
99
+ body.append(f"Astronomical tide prediction at the same instant: "
100
+ f"{tides_out['predicted_ft_mllw']} ft.")
101
+ if tides_out.get("residual_ft") is not None:
102
+ body.append(f"Residual (observed - predicted): {tides_out['residual_ft']} ft. "
103
+ f"Positive = surge component above tide; negative = setdown.")
104
+ docs.append(_doc("noaa_tides", body))
105
+
106
+ if alerts_out and alerts_out.get("n_active", 0) > 0:
107
+ body = [f"Source: NWS Public Alerts API. Active flood-relevant alerts: "
108
+ f"{alerts_out['n_active']}."]
109
+ for a in alerts_out["alerts"][:4]:
110
+ body.append(
111
+ f"- {a.get('event','?')} (severity: {a.get('severity','?')}, "
112
+ f"urgency: {a.get('urgency','?')}); expires {a.get('expires','')[:16]}; "
113
+ f"area: {(a.get('areaDesc') or '')[:120]}."
114
+ )
115
+ if a.get("headline"):
116
+ body.append(f" Headline: {a['headline'][:240]}")
117
+ docs.append(_doc("nws_alerts", body))
118
+
119
+ if obs_out and (obs_out.get("precip_last_hour_mm") is not None
120
+ or obs_out.get("precip_last_6h_mm") is not None):
121
+ body = [
122
+ f"Source: NWS Station Observations. Nearest ASOS: {obs_out['station_name']} "
123
+ f"({obs_out['station_id']}, {obs_out['distance_km']} km).",
124
+ f"Observation time: {obs_out.get('obs_time') or 'unknown'}.",
125
+ ]
126
+ if obs_out.get("precip_last_hour_mm") is not None:
127
+ body.append(f"Precipitation last 1 h: {obs_out['precip_last_hour_mm']} mm.")
128
+ if obs_out.get("precip_last_6h_mm") is not None:
129
+ body.append(f"Precipitation last 6 h: {obs_out['precip_last_6h_mm']} mm.")
130
+ docs.append(_doc("nws_obs", body))
131
+
132
+ if ttm_out and ttm_out.get("available") and ttm_out.get("interesting"):
133
+ docs.append(_doc("ttm_forecast", [
134
+ "Source: Granite TimeSeries TTM r2 (Ekambaram et al. 2024).",
135
+ f"Forecast peak surge residual at {ttm_out['station_name']}: "
136
+ f"{ttm_out['forecast_peak_ft']} ft, expected "
137
+ f"{ttm_out['forecast_peak_minutes_ahead']} minutes from now.",
138
+ f"Recent peak |residual| in context window: "
139
+ f"{ttm_out['history_peak_abs_ft']} ft.",
140
+ ]))
141
+
142
+ # ---- reconcile ----
143
+ rec_t0 = time.time()
144
+ rec_step = {"step": "reconcile_live_now", "started_at": rec_t0, "ok": False}
145
+ trace.append(rec_step)
146
+ if len(docs) <= 1: # only the live_target doc, no actual signals
147
+ paragraph = ("**Status.** **No active flood-relevant signals at this time** for "
148
+ f"{place} [live_target].")
149
+ audit = {"raw": paragraph, "dropped": []}
150
+ rec_step["ok"] = True
151
+ else:
152
+ def _on_token(delta: str):
153
+ if progress_q is not None:
154
+ progress_q.put({"kind": "token", "delta": delta})
155
+ try:
156
+ paragraph, audit = _reconcile(docs, on_token=_on_token if progress_q else None)
157
+ rec_step["ok"] = True
158
+ except Exception as e:
159
+ paragraph = "Could not produce a live-conditions report."
160
+ audit = {"raw": "", "dropped": []}
161
+ rec_step["err"] = str(e)
162
+ rec_step["elapsed_s"] = round(time.time() - rec_t0, 2)
163
+ _emit(rec_step)
164
+
165
+ return {
166
+ "intent": "live_now",
167
+ "query": query,
168
+ "place": place,
169
+ "plan": {
170
+ "intent": plan.intent,
171
+ "targets": plan.targets,
172
+ "specialists": plan.specialists,
173
+ "rationale": plan.rationale,
174
+ },
175
+ "noaa_tides": tides_out,
176
+ "nws_alerts": alerts_out,
177
+ "nws_obs": obs_out,
178
+ "ttm_forecast": ttm_out,
179
+ "paragraph": paragraph,
180
+ "audit": audit,
181
+ "trace": trace,
182
+ "total_s": round(time.time() - t0, 2),
183
+ }
184
+
185
+
186
+ def _run_step(trace: list, name: str, fn, progress_q=None) -> Any:
187
+ t0 = time.time()
188
+ rec = {"step": name, "started_at": t0, "ok": False}
189
+ trace.append(rec)
190
+ try:
191
+ out = fn()
192
+ rec["ok"] = True
193
+ rec["result"] = {k: out.get(k) for k in list(out.keys())[:3]} if isinstance(out, dict) else None
194
+ return out
195
+ except Exception as e:
196
+ rec["err"] = str(e)
197
+ log.exception("%s failed", name)
198
+ return None
199
+ finally:
200
+ rec["elapsed_s"] = round(time.time() - t0, 2)
201
+ if progress_q is not None:
202
+ progress_q.put({"kind": "step", **rec})
203
+
204
+
205
+ def _doc(doc_id: str, body_lines: list[str]) -> dict:
206
+ return {"role": f"document {doc_id}", "content": "\n".join(body_lines)}
207
+
208
+
209
+ def _reconcile(docs: list[dict], on_token=None) -> tuple[str, dict]:
210
+ from app.reconcile import verify_paragraph
211
+ messages = docs + [
212
+ {"role": "system", "content": EXTRA_SYSTEM_PROMPT},
213
+ {"role": "user", "content": "Write the live-conditions briefing now."},
214
+ ]
215
+ # live_now is the smallest intent: ~4 live docs, short briefing.
216
+ # num_predict 200 caps to a 2-section status note.
217
+ OPTS = {"temperature": 0, "num_ctx": 2048, "num_predict": 200}
218
+ if on_token is None:
219
+ resp = llm.chat(model=OLLAMA_MODEL, messages=messages, options=OPTS)
220
+ raw = resp["message"]["content"].strip()
221
+ else:
222
+ chunks: list[str] = []
223
+ for chunk in llm.chat(model=OLLAMA_MODEL, messages=messages,
224
+ stream=True, options=OPTS):
225
+ delta = (chunk.get("message") or {}).get("content") or ""
226
+ if delta:
227
+ chunks.append(delta)
228
+ on_token(delta)
229
+ raw = "".join(chunks).strip()
230
+ cleaned, dropped = verify_paragraph(raw, docs)
231
+ return cleaned, {"raw": raw, "dropped": dropped}
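The `verify_paragraph` call above is the grounding gate: the raw reconciler output is checked against the documents before it reaches the user, and unverifiable sentences are dropped. The repo's actual implementation is not shown in this diff; the sketch below illustrates only the sentence-scoped citation check (every `[doc_id]` must resolve to a provided document), with the function name and regex being assumptions for illustration.

```python
import re


def drop_uncited_sentences(paragraph: str, known_doc_ids: set) -> tuple:
    """Keep sentences whose [doc_id] citations all resolve to known docs.

    Hypothetical sketch of one of the grounding checks; uncited prose is
    passed through, sentences with unresolvable citations are dropped.
    """
    kept, dropped = [], []
    for sent in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
        cites = re.findall(r"\[([a-z0-9_*]+)\]", sent)
        if cites and not all(c in known_doc_ids for c in cites):
            dropped.append(sent)  # cites a document we never provided
        else:
            kept.append(sent)
    return " ".join(kept), dropped
```

The returned `dropped` list is what a caller could surface in an audit field, mirroring the `{"raw": ..., "dropped": ...}` shape used above.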
app/intents/neighborhood.py ADDED
@@ -0,0 +1,492 @@
+"""neighborhood intent — resolve target text to one or more NTA polygons,
+then run polygon-level specialists and reconcile.
+
+The set of polygon-capable specialists is currently:
+  - sandy_inundation.coverage_for_polygon
+  - dep_stormwater.coverage_for_polygon (per scenario)
+  - nyc311.summary_for_polygon
+  - microtopo.microtopo_for_polygon
+
+Other specialists (FloodNet, Ida HWM, Prithvi) are still point-based;
+in Phase 2 we'll add polygon support for them. For now, neighborhood
+mode produces the four signals above + RAG, and the reconciler emits
+a structurally different briefing aimed at a place rather than an
+address.
+"""
+from __future__ import annotations
+
+import logging
+import time
+from typing import Any
+
+from app import llm
+from app.areas import nta
+from app.context import microtopo, nyc311
+from app.flood_layers import dep_stormwater, sandy_inundation
+from app.rag import retrieve as rag_retrieve
+
+log = logging.getLogger("riprap.intent.neighborhood")
+
+import os as _os  # noqa: E402
+
+OLLAMA_MODEL = _os.environ.get("RIPRAP_RECONCILER_MODEL",
+                               _os.environ.get("RIPRAP_OLLAMA_MODEL", "granite4.1:8b"))
+
+EXTRA_SYSTEM_PROMPT = """Write a flood-exposure briefing for an NYC neighborhood. Use ONLY the facts in the provided documents.
+
+Output this markdown skeleton verbatim, filling each `<...>` with content drawn only from the documents. After every numerical claim, append the document id in square brackets — e.g. `<value> [sandy_nta]`. Bold at most one phrase per section using `**...**`. Omit any section whose supporting facts are absent from the documents.
+
+```
+**Status.**
+<one sentence: name the neighborhood from [nta_resolve] and the dominant exposure pattern>.
+
+**Empirical evidence.**
+<1-3 sentences citing observed flood evidence: Sandy coverage from [sandy_nta], 311 counts from [nyc311_nta], any FloodNet or HWM signals>.
+
+**Modeled scenarios.**
+<1-2 sentences citing modeled flooding from [dep_*_nta] (fraction of polygon in each scenario) and terrain from [microtopo_nta] (median HAND, fraction of polygon with HAND below 1 m)>.
+
+**Policy context.**
+<1 sentence per RAG hit, citing the agency name and [rag_*]>.
+```
+
+Constraints:
+- Copy numerical values verbatim from documents. Do not round, paraphrase, or substitute.
+- Speak about the place as a polygon (use phrases like "of the neighborhood" or "of the NTA"), not as an address.
+- If only [nta_resolve] is present and no other documents, output exactly: `No grounded data available for this neighborhood.`
+"""
+
+
+def run(plan, query: str, progress_q=None, strict: bool = False) -> dict[str, Any]:
+    """Execute the planner's neighborhood Plan.
+
+    Resolves all targets to NTAs, picks the largest matching NTA (or the
+    first if several match equally well), runs the polygon specialists,
+    and reconciles via Granite 4.1.
+
+    strict=True routes the reconciler through Mellea-validated rejection
+    sampling and disables token streaming.
+    """
+    t0 = time.time()
+    trace: list[dict] = []
+
+    def _emit(r: dict):
+        if progress_q is not None:
+            progress_q.put({"kind": "step", **r})
+
+    # Resolve targets to NTAs. Try the planner's pick first; if it gave no
+    # usable target, scan the raw query text for any known neighborhood name.
+    target_text = next(
+        (t["text"] for t in plan.targets if t.get("type") in ("nta", "borough")),
+        None,
+    )
+    rec = {"step": "nta_resolve", "started_at": t0, "ok": False}
+    trace.append(rec)
+    matches = nta.resolve(target_text) if target_text else []
+    if not matches:
+        matches = nta.resolve_from_text(query)
+    if not matches:
+        rec["err"] = f"no NTA match in target={target_text!r} or query={query!r}"
+        rec["elapsed_s"] = round(time.time() - t0, 2)
+        return _empty_result(plan, query, trace, error=rec["err"])
+    target = matches[0]
+    rec["ok"] = True
+    rec["result"] = {
+        "nta_code": target["nta_code"],
+        "nta_name": target["nta_name"],
+        "borough": target["borough"],
+        "n_matches": len(matches),
+        # Bbox lets the UI fly-to and render the polygon while the rest
+        # of the specialists are still running.
+        "bbox": list(target["geometry"].bounds),
+    }
+    rec["elapsed_s"] = round(time.time() - t0, 2)
+    _emit(rec)
+
+    poly = target["geometry"]
+    docs: list[dict] = []
+    sandy_out = None
+    dep_out = {}
+    nyc311_out = None
+    micro_out = None
+    rag_out = []
+    prithvi_live_out = None
+    terramind_out = None
+
+    # ---- sandy ----
+    if "sandy" in plan.specialists:
+        s_t0 = time.time()
+        srec = {"step": "sandy_nta", "started_at": s_t0, "ok": False}
+        trace.append(srec)
+        try:
+            sandy_out = sandy_inundation.coverage_for_polygon(poly)
+            srec["ok"] = True
+            srec["result"] = {"fraction": sandy_out["fraction"], "inside": sandy_out["inside"]}
+        except Exception as e:
+            srec["err"] = str(e)
+            log.exception("sandy polygon failed")
+        srec["elapsed_s"] = round(time.time() - s_t0, 2)
+        _emit(srec)
+
+    # ---- dep_stormwater ----
+    if "dep_stormwater" in plan.specialists:
+        for scen in ["dep_extreme_2080", "dep_moderate_2050", "dep_moderate_current"]:
+            d_t0 = time.time()
+            drec = {"step": f"{scen}_nta", "started_at": d_t0, "ok": False}
+            trace.append(drec)
+            try:
+                cov = dep_stormwater.coverage_for_polygon(poly, scen)
+                dep_out[scen] = cov
+                drec["ok"] = True
+                drec["result"] = {"fraction_any": cov["fraction_any"]}
+            except Exception as e:
+                drec["err"] = str(e)
+                log.exception("%s polygon failed", scen)
+            drec["elapsed_s"] = round(time.time() - d_t0, 2)
+            _emit(drec)
+
+    # ---- nyc311 ----
+    if "nyc311" in plan.specialists:
+        n_t0 = time.time()
+        nrec = {"step": "nyc311_nta", "started_at": n_t0, "ok": False}
+        trace.append(nrec)
+        try:
+            nyc311_out = nyc311.summary_for_polygon(poly, years=3)
+            nrec["ok"] = True
+            nrec["result"] = {"n": nyc311_out["n"]}
+        except Exception as e:
+            nrec["err"] = str(e)
+            log.exception("nyc311 polygon failed")
+        nrec["elapsed_s"] = round(time.time() - n_t0, 2)
+        _emit(nrec)
+
+    # ---- microtopo ----
+    if "microtopo" in plan.specialists:
+        m_t0 = time.time()
+        mrec = {"step": "microtopo_nta", "started_at": m_t0, "ok": False}
+        trace.append(mrec)
+        try:
+            micro_out = microtopo.microtopo_for_polygon(poly)
+            mrec["ok"] = micro_out is not None
+            mrec["result"] = {
+                "elev_median_m": (micro_out or {}).get("elev_median_m"),
+                "frac_hand_lt1": (micro_out or {}).get("frac_hand_lt1"),
+            }
+        except Exception as e:
+            mrec["err"] = str(e)
+            log.exception("microtopo polygon failed")
+        mrec["elapsed_s"] = round(time.time() - m_t0, 2)
+        _emit(mrec)
+
+    # ---- Prithvi-EO live water mask (NTA centroid) ----
+    # Polygon-scoped queries don't have a single point of interest, but
+    # the NTA centroid is a fair sampling point: the 5 km chip the
+    # specialist fetches comfortably covers any NTA. The reconciler
+    # gets a [prithvi_live] doc with the % water observed today, and
+    # the frontend gets a GeoJSON layer to paint over the NTA polygon.
+    try:
+        from app.flood_layers import prithvi_live as plive_mod
+        if plive_mod.ENABLE:
+            p_t0 = time.time()
+            prec = {"step": "prithvi_eo_live", "started_at": p_t0, "ok": False}
+            trace.append(prec)
+            centroid = poly.centroid
+            prithvi_live_out = plive_mod.fetch(centroid.y, centroid.x)
+            prec["ok"] = bool(prithvi_live_out and prithvi_live_out.get("ok"))
+            if prec["ok"]:
+                prec["result"] = {
+                    "scene_date": (prithvi_live_out.get("item_datetime") or "")[:10],
+                    "cloud_cover": prithvi_live_out.get("cloud_cover"),
+                    "pct_water_5km": prithvi_live_out.get("pct_water_full"),
+                }
+            else:
+                prec["err"] = ((prithvi_live_out or {}).get("err")
+                               or (prithvi_live_out or {}).get("skipped")
+                               or "no observation")
+            prec["elapsed_s"] = round(time.time() - p_t0, 2)
+            _emit(prec)
+    except Exception as e:
+        log.exception("prithvi_live (neighborhood) failed")
+        prithvi_live_out = {"ok": False, "err": str(e)}
+
+    # ---- TerraMind synthesis (NTA centroid) ----
+    # Generative-prior tier — synthesized ESRI Land Cover from the
+    # local LiDAR DEM at the NTA centroid. Renders as dashed-outline
+    # polygons on the map alongside the polygon-aggregated specialists.
+    try:
+        from app.context import terramind_synthesis as tm_mod
+        if tm_mod.ENABLE:
+            t_t0 = time.time()
+            trec = {"step": "terramind_synthesis", "started_at": t_t0, "ok": False}
+            trace.append(trec)
+            centroid = poly.centroid
+            terramind_out = tm_mod.fetch(centroid.y, centroid.x)
+            trec["ok"] = bool(terramind_out and terramind_out.get("ok"))
+            if trec["ok"]:
+                trec["result"] = {
+                    "tim_chain": terramind_out.get("tim_chain"),
+                    "dominant_class": (terramind_out.get("dominant_class_display")
+                                       or terramind_out.get("dominant_class")),
+                    "dominant_pct": terramind_out.get("dominant_pct"),
+                    "n_classes": terramind_out.get("n_classes_observed"),
+                }
+            else:
+                trec["err"] = ((terramind_out or {}).get("err")
+                               or (terramind_out or {}).get("skipped")
+                               or "no synthesis")
+            trec["elapsed_s"] = round(time.time() - t_t0, 2)
+            _emit(trec)
+    except Exception as e:
+        log.exception("terramind (neighborhood) failed")
+        terramind_out = {"ok": False, "err": str(e)}
+
+    # ---- rag ----
+    if "rag" in plan.specialists:
+        r_t0 = time.time()
+        rrec = {"step": "rag_nta", "started_at": r_t0, "ok": False}
+        trace.append(rrec)
+        try:
+            q = (f"flood exposure {target['nta_name']} {target['borough']} "
+                 "vulnerability hardening mitigation")
+            rag_out = rag_retrieve(q, k=3, min_score=0.45)
+            rrec["ok"] = True
+            rrec["result"] = {"hits": len(rag_out)}
+        except Exception as e:
+            rrec["err"] = str(e)
+            log.exception("rag polygon failed")
+        rrec["elapsed_s"] = round(time.time() - r_t0, 2)
+        _emit(rrec)
+
+    # ---- build documents ----
+    docs.append(_doc("nta_resolve", [
+        "Source: NYC DCP Neighborhood Tabulation Areas 2020.",
+        f"Target neighborhood: {target['nta_name']} (NTA {target['nta_code']}), "
+        f"in the borough of {target['borough']}.",
+        f"Community District: {target.get('cdta') or 'unknown'}.",
+    ]))
+    if sandy_out and sandy_out["inside"]:
+        docs.append(_doc("sandy_nta", [
+            "Source: NYC Sandy Inundation Zone (NYC OD 5xsi-dfpx).",
+            f"Fraction of {target['nta_name']} inside the 2012 inundation extent: "
+            f"{sandy_out['fraction'] * 100:.1f}%.",
+            f"Total NTA area: {sandy_out['polygon_area_m2'] / 1e6:.2f} km².",
+        ]))
+    for scen, cov in dep_out.items():
+        if cov["fraction_any"] > 0:
+            cls = cov["fraction_class"]
+            docs.append(_doc(f"{scen}_nta", [
+                f"Source: {cov['label']}.",
+                f"Fraction of {target['nta_name']} inside any modeled flooded area: "
+                f"{cov['fraction_any'] * 100:.1f}%.",
+                f"Of which: {cls.get(1, 0) * 100:.1f}% in nuisance band (>4 in to 1 ft), "
+                f"{cls.get(2, 0) * 100:.1f}% in 1-4 ft band, "
+                f"{cls.get(3, 0) * 100:.1f}% in >4 ft band.",
+            ]))
+    if nyc311_out and nyc311_out.get("n", 0) > 0:
+        body = [
+            "Source: NYC 311 service requests (Socrata erm2-nwe9), aggregated inside the NTA polygon.",
+            f"Flood-related complaints in the last 3 years inside {target['nta_name']}: "
+            f"{nyc311_out['n']}.",
+        ]
+        if nyc311_out.get("by_descriptor"):
+            top = "; ".join(f"{k}: {v}" for k, v in list(nyc311_out["by_descriptor"].items())[:3])
+            body.append(f"Top descriptors: {top}.")
+        docs.append(_doc("nyc311_nta", body))
+    if micro_out and micro_out.get("n_cells", 0) > 0:
+        body = [
+            "Source: USGS 3DEP DEM (precomputed citywide GeoTIFF) with derived HAND and TWI rasters; aggregated over NTA polygon.",
+            f"Polygon contains {micro_out['n_cells']} 30-m DEM cells.",
+            f"Median elevation: {micro_out['elev_median_m']} m; "
+            f"10th-percentile elevation: {micro_out['elev_p10_m']} m.",
+        ]
+        if micro_out.get("hand_median_m") is not None:
+            body.append(
+                f"Median HAND (Height Above Nearest Drainage): "
+                f"{micro_out['hand_median_m']} m. "
+                f"Fraction of polygon cells with HAND below 1 m "
+                f"(near-channel, water reaches at flood): "
+                f"{(micro_out.get('frac_hand_lt1') or 0) * 100:.1f}%."
+            )
+        if micro_out.get("twi_median") is not None:
+            body.append(
+                f"Median TWI: {micro_out['twi_median']}. "
+                f"Fraction of polygon cells with TWI > 10 (saturation-prone): "
+                f"{(micro_out.get('frac_twi_gt10') or 0) * 100:.1f}%."
+            )
+        docs.append(_doc("microtopo_nta", body))
+    if prithvi_live_out and prithvi_live_out.get("ok"):
+        docs.append(_doc("prithvi_live", [
+            "Source: Prithvi-EO 2.0 (Sen1Floods11 fine-tune) live "
+            "segmentation over a Sentinel-2 L2A scene from Microsoft "
+            f"Planetary Computer, sampled at the NTA centroid of "
+            f"{target['nta_name']}.",
+            f"Sentinel-2 scene id: {prithvi_live_out.get('item_id')}.",
+            f"Observation date: "
+            f"{(prithvi_live_out.get('item_datetime') or '')[:10]}.",
+            f"Cloud cover: {prithvi_live_out.get('cloud_cover', 0):.3f}%.",
+            f"% water across the 5 km chip around the centroid: "
+            f"{prithvi_live_out.get('pct_water_full', 0):.2f}.",
+        ]))
+    if terramind_out and terramind_out.get("ok"):
+        body = [
+            "Source: TerraMind 1.0 base (IBM/ESA, Apache-2.0) any-to-any "
+            "generative foundation model. SYNTHETIC PRIOR — generated "
+            "categorical land-cover from the LiDAR DEM at the NTA "
+            f"centroid of {target['nta_name']}; not a measurement.",
+            f"Chain: {' -> '.join(terramind_out.get('tim_chain') or ['DEM', 'LULC_synthetic'])}.",
+            f"Diffusion steps: {terramind_out.get('diffusion_steps')}.",
+            f"Diffusion seed: {terramind_out.get('diffusion_seed')}.",
+            f"Dominant synthetic class: "
+            f"{terramind_out.get('dominant_class_display') or terramind_out.get('dominant_class')} "
+            f"at {terramind_out.get('dominant_pct', 0):.1f}% (tentative ESRI "
+            "Land Cover labels).",
+        ]
+        for label, pct in (terramind_out.get("class_fractions") or {}).items():
+            body.append(f"  - {label}: {pct:.1f}%")
+        body.append("Use 'TerraMind generated a plausible synthetic "
+                    "land-cover prior' framing — never 'imaged' or "
+                    "'reconstructed'.")
+        docs.append(_doc("terramind_synthetic", body))
+    for h in rag_out:
+        docs.append(_doc(h["doc_id"], [
+            f"Source: {h['citation']}, page {h.get('page', '')}.",
+            f"Retrieved passage (verbatim): {h['text']}",
+        ]))
+
+    # ---- reconcile ----
+    rec_t0 = time.time()
+    rec_step = {"step": "reconcile_neighborhood", "started_at": rec_t0, "ok": False}
+    trace.append(rec_step)
+    paragraph = ""
+    audit = {"raw": "", "dropped": []}
+    mellea_meta = None
+    if docs and strict:
+        rec_step["step"] = "mellea_reconcile_neighborhood"
+        try:
+            from app.mellea_validator import DEFAULT_LOOP_BUDGET, reconcile_strict_streaming
+            from app.reconcile import trim_docs_to_plan as _trim
+            docs = _trim(docs, set(plan.specialists or []))
+
+            def _on_token(delta: str, attempt_idx: int):
+                if progress_q is not None:
+                    progress_q.put({"kind": "token", "delta": delta,
+                                    "attempt": attempt_idx})
+
+            def _on_attempt_end(attempt_idx, passed, failed):
+                if progress_q is not None:
+                    progress_q.put({"kind": "mellea_attempt",
+                                    "attempt": attempt_idx,
+                                    "passed": passed, "failed": failed})
+
+            mres = reconcile_strict_streaming(
+                docs, EXTRA_SYSTEM_PROMPT,
+                user_prompt="Write the cited briefing now.",
+                model=OLLAMA_MODEL, loop_budget=DEFAULT_LOOP_BUDGET,
+                on_token=_on_token if progress_q else None,
+                on_attempt_end=_on_attempt_end if progress_q else None,
+            )
+            paragraph = mres["paragraph"]
+            audit = {"raw": paragraph, "dropped": []}
+            mellea_meta = {
+                "rerolls": mres["rerolls"],
+                "n_attempts": mres["n_attempts"],
+                "requirements_passed": mres["requirements_passed"],
+                "requirements_failed": mres["requirements_failed"],
+                "requirements_total": mres["requirements_total"],
+                "model": mres["model"], "loop_budget": mres["loop_budget"],
+            }
+            rec_step["ok"] = True
+            rec_step["result"] = {
+                "rerolls": mellea_meta["rerolls"],
+                "passed": f"{len(mellea_meta['requirements_passed'])}/{mellea_meta['requirements_total']}",
+                "paragraph_chars": len(paragraph),
+            }
+        except Exception as e:
+            rec_step["err"] = str(e)
+            log.exception("Mellea-validated reconcile failed")
+    elif docs:
+        def _on_token(delta: str):
+            if progress_q is not None:
+                progress_q.put({"kind": "token", "delta": delta})
+        try:
+            paragraph, audit = _reconcile(docs, on_token=_on_token if progress_q else None)
+            rec_step["ok"] = True
+            rec_step["result"] = {"paragraph_chars": len(paragraph),
+                                  "dropped": len(audit["dropped"])}
+        except Exception as e:
+            rec_step["err"] = str(e)
+            log.exception("neighborhood reconcile failed")
+    else:
+        paragraph = "No grounded data available for this neighborhood."
+        rec_step["ok"] = True
+        rec_step["result"] = {"paragraph_chars": len(paragraph)}
+    rec_step["elapsed_s"] = round(time.time() - rec_t0, 2)
+    _emit(rec_step)
+
+    target_safe = {k: v for k, v in target.items() if k != "geometry"}
+    target_safe["bbox"] = list(target["geometry"].bounds)  # [minx, miny, maxx, maxy]
+    return {
+        "intent": "neighborhood",
+        "query": query,
+        "plan": {
+            "intent": plan.intent,
+            "targets": plan.targets,
+            "specialists": plan.specialists,
+            "rationale": plan.rationale,
+        },
+        "target": target_safe,
+        "n_matches": len(matches),
+        "sandy_nta": sandy_out,
+        "dep_nta": dep_out,
+        "nyc311_nta": nyc311_out,
+        "microtopo_nta": micro_out,
+        "prithvi_live": prithvi_live_out,
+        "terramind": terramind_out,
+        "rag": rag_out,
+        "paragraph": paragraph,
+        "audit": audit,
+        "mellea": mellea_meta,
+        "trace": trace,
+        "total_s": round(time.time() - t0, 2),
+    }
+
+
+def _doc(doc_id: str, body_lines: list[str]) -> dict:
+    return {"role": f"document {doc_id}", "content": "\n".join(body_lines)}
+
+
+def _reconcile(docs: list[dict], on_token=None) -> tuple[str, dict]:
+    from app.reconcile import verify_paragraph
+    messages = docs + [
+        {"role": "system", "content": EXTRA_SYSTEM_PROMPT},
+        {"role": "user", "content": "Write the cited briefing now."},
+    ]
+    # num_ctx 4096 covers our actual prompt (system ~600 + 6 docs ~2000)
+    # with margin; 8192 was over-allocating KV cache. num_predict caps the
+    # briefing at ~400 tokens — enough for 4 sections, no runaway.
+    OPTS = {"temperature": 0, "num_ctx": 4096, "num_predict": 400}
+    if on_token is None:
+        resp = llm.chat(model=OLLAMA_MODEL, messages=messages, options=OPTS)
+        raw = resp["message"]["content"].strip()
+    else:
+        chunks: list[str] = []
+        for chunk in llm.chat(model=OLLAMA_MODEL, messages=messages,
+                              stream=True, options=OPTS):
+            delta = (chunk.get("message") or {}).get("content") or ""
+            if delta:
+                chunks.append(delta)
+                on_token(delta)
+        raw = "".join(chunks).strip()
+    cleaned, dropped = verify_paragraph(raw, docs)
+    return cleaned, {"raw": raw, "dropped": dropped}
+
+
+def _empty_result(plan, query: str, trace: list, error: str) -> dict:
+    return {
+        "intent": "neighborhood",
+        "query": query,
+        "error": error,
+        "plan": {
+            "intent": plan.intent,
+            "targets": plan.targets,
+            "specialists": plan.specialists,
+            "rationale": plan.rationale,
+        },
+        "trace": trace,
+        "paragraph": f"Could not resolve target to an NTA: {error}",
+    }
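The `microtopo_nta` document above reports polygon-level aggregates (median HAND, fraction of cells below a threshold). The actual `microtopo_for_polygon` raster sampling is not shown in this diff; the sketch below only illustrates the final aggregation step over already-sampled cell values, with the function name and output keys being assumptions for illustration.

```python
def hand_fractions(hand_values, threshold_m=1.0):
    """Aggregate per-cell HAND samples into the polygon-level summary.

    Hypothetical sketch: None cells (nodata) are dropped, then we report
    the cell count, a simple median, and the fraction below threshold_m.
    """
    vals = [v for v in hand_values if v is not None]
    if not vals:
        return None  # silence over confabulation: no cells, no summary
    below = sum(1 for v in vals if v < threshold_m)
    return {
        "n_cells": len(vals),
        "median_m": sorted(vals)[len(vals) // 2],
        "frac_below": below / len(vals),
    }
```

Multiplying `frac_below` by 100 and formatting with `:.1f` gives the "Fraction of polygon cells with HAND below 1 m" percentage quoted in the document body.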
app/intents/single_address.py ADDED
@@ -0,0 +1,73 @@
+"""single_address intent — the existing linear FSM, wrapped behind the
+planner-aware execution interface. The planner's specialist list is
+respected only as an OPT-OUT: if the planner explicitly omitted a
+specialist we'd otherwise run, we skip it. The fixed FSM stays the
+canonical path because (a) it's well tested, (b) order of execution
+matters slightly (geocode before everything else), and (c) executor
+parallelism for an address is bounded by Granite 4.1 reconcile time
+anyway."""
+from __future__ import annotations
+
+from app.fsm import run as run_linear
+
+
+def run(plan, query: str, progress_q=None, strict: bool = False) -> dict:
+    """Execute the planner's single_address Plan via the existing linear
+    FSM. If progress_q is provided, FSM steps and Granite reconcile tokens
+    are forwarded to it for live streaming.
+
+    strict=True flips the FSM's reconcile step to Mellea-validated
+    rejection sampling (via a thread-local flag) and disables token
+    streaming for that step."""
+    from app.fsm import (
+        iter_steps,
+        set_mellea_attempt_callback,
+        set_planned_specialists,
+        set_strict_mode,
+        set_token_callback,
+    )
+    planner_addr = next(
+        (t["text"] for t in plan.targets if t.get("type") == "address"),
+        None,
+    )
+    addr = planner_addr if (planner_addr and len(planner_addr) >= len(query) * 0.7) else query
+    set_strict_mode(strict)
+    set_planned_specialists(plan.specialists or [])
+    if progress_q is not None:
+        def _on_token(delta: str):
+            progress_q.put({"kind": "token", "delta": delta})
+
+        def _on_mellea_attempt(attempt_idx, passed, failed):
+            progress_q.put({"kind": "mellea_attempt",
+                            "attempt": attempt_idx,
+                            "passed": passed, "failed": failed})
+
+        # Streaming Mellea now emits tokens during each attempt — wire
+        # the token callback for both strict and non-strict paths.
+        set_token_callback(_on_token)
+        set_mellea_attempt_callback(_on_mellea_attempt)
+        try:
+            final = None
+            for ev in iter_steps(addr):
+                if ev["kind"] == "step":
+                    progress_q.put({"kind": "step", **ev})
+                else:
+                    final = ev
+            out = {**(final or {}), "trace": []}
+        finally:
+            set_token_callback(None)
+            set_mellea_attempt_callback(None)
+            set_strict_mode(False)
+            set_planned_specialists(None)
+    else:
+        try:
+            out = run_linear(addr)
+        finally:
+            set_strict_mode(False)
+            set_planned_specialists(None)
+    out["intent"] = "single_address"
+    out["plan"] = {
+        "intent": plan.intent,
+        "targets": plan.targets,
+        "specialists": plan.specialists,
+        "rationale": plan.rationale,
+    }
+    return out
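The `addr` line above encodes a small guard against planner truncation: the planner-extracted address is trusted only if it retains at least 70% of the raw query's length; otherwise the raw query goes to the geocoder. The same rule, isolated as a standalone function (the name `pick_address` is hypothetical, for illustration only):

```python
from __future__ import annotations


def pick_address(planner_addr, query):
    """Prefer the planner's extracted address unless it looks truncated.

    Heuristic from the intent wrapper: an extraction shorter than 70% of
    the query probably dropped part of the address, so fall back to
    geocoding the raw query text.
    """
    if planner_addr and len(planner_addr) >= 0.7 * len(query):
        return planner_addr
    return query
```

So for a query that is mostly address text, the extraction wins; for a long question containing a short place fragment, the raw query is geocoded instead.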
app/live/__init__.py ADDED
File without changes
app/live/floodnet_forecast.py ADDED
@@ -0,0 +1,184 @@
1
+ """Granite TimeSeries TTM r2 forecast on FloodNet sensor flood events.
2
+
3
+ This is the strongest single TTM win for the NYU CUSP audience.
4
+ FloodNet (CUSP/Brooklyn College, Charlie Mydlarz + Andrea Silverman)
5
+ operates the sensor network and publishes the historical events; they
6
+ do not publish per-sensor forecasts. Riprap producing a forecast on
7
+ FloodNet's own data is a genuine ecosystem-extension capability — and
8
+ unlike the surge / 311 forecasts, the audience explicitly cares about
9
+ this dataset.
10
+
11
+ Architecture:
12
+ - Nearest FloodNet sensor to the queried address (reuse
13
+ `app.context.floodnet.sensors_near`).
14
+ - 512 days of binary daily-event history at that sensor (1 if any
15
+ labeled flood event started on that day, else 0).
16
+ - TTM r2 (512 → 96) reused from `app.live.ttm_forecast._load_model` —
17
+ *no new model class loaded into memory*. The existing 311 daily
18
+ forecaster has already paid this load cost.
19
+ - 96-day-ahead daily forecast → aggregated into 4-week and 12-week
20
+ expected counts so the briefing narration stays human-readable.
21
+
22
+ Silence over confabulation: returns `available: False` with a
23
+ reason field on every failure path. Sensors with fewer than 5
24
+ flood events in their entire history yield no forecast (the TTM
25
+ output on near-empty histories is dominated by quantization noise).
26
+
27
+ Doc-id format: `floodnet_forecast_<deployment_id>` so it's distinct
28
+ from the existing `[floodnet]` event-history doc.
29
+ """
30
+ from __future__ import annotations
31
+
32
+ import logging
33
+ from datetime import datetime, timedelta, timezone
34
+
35
+ import numpy as np
36
+
37
+ from app.context.floodnet import flood_events_for, sensors_near
38
+ from app.live.ttm_forecast import (
39
+ DAILY_CONTEXT,
40
+ DAILY_PREDICTION,
41
+ _MODEL_LOAD_ERROR,
42
+ _run_ttm,
43
+ )
44
+
45
+ log = logging.getLogger("riprap.floodnet_forecast")
46
+
47
+ DOC_ID_PREFIX = "floodnet_forecast"
48
+ CITATION = (
49
+ "FloodNet NYC ultrasonic depth sensors (api.floodnet.nyc) + "
50
+ "IBM Granite TimeSeries TTM r2 (Ekambaram et al. 2024, NeurIPS) "
51
+ "via granite-tsfm — daily flood-event recurrence forecast"
52
+ )
53
+
54
+ # A sensor with <5 historical events in 512 days has too sparse a
55
+ # signal for TTM to produce a meaningful forecast. The model still
56
+ # runs, but the output is dominated by quantization noise around
57
+ # zero; emitting a doc from that state is exactly the kind of
58
+ # pseudo-quantitative claim the four-tier discipline guards against.
59
+ MIN_EVENTS_FOR_FORECAST = 5
60
+
61
+ # Search radius for nearest-sensor lookup. Wider than the existing
62
+ # `floodnet` specialist's 600 m (which scans for *all* sensors at
63
+ # the address) — we just need *one* relevant sensor for the forecast.
64
+ NEAREST_SENSOR_RADIUS_M = 1500
65
+
66
+
67
+ def _haversine_m(lat1, lon1, lat2, lon2) -> float:
68
+ from math import asin, cos, radians, sin, sqrt
69
+ R = 6371000.0
70
+ p1, p2 = radians(lat1), radians(lat2)
71
+ dp = radians(lat2 - lat1); dl = radians(lon2 - lon1)
72
+ a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
73
+ return 2 * R * asin(sqrt(a))
74
+
75
+
76
+ def _build_daily_event_series(
77
+ deployment_id: str, days: int
78
+ ) -> tuple[np.ndarray, list[str], int]:
79
+ """Pull flood events for one sensor over `days` days, return a
80
+ daily binary series (1 if ≥1 flood event started that day, 0
81
+ otherwise) plus the event count."""
82
+ since = datetime.now(timezone.utc) - timedelta(days=days + 2)
83
+     events = flood_events_for([deployment_id], since=since)
+     end = datetime.now(timezone.utc).date()
+     start = end - timedelta(days=days - 1)
+     by_day: dict[str, int] = {}
+     for e in events:
+         ds = (e.start_time or "")[:10]
+         if not ds:
+             continue
+         by_day[ds] = by_day.get(ds, 0) + 1
+     series: list[int] = []
+     labels: list[str] = []
+     for i in range(days):
+         d = start + timedelta(days=i)
+         d_iso = d.isoformat()
+         labels.append(d_iso)
+         series.append(by_day.get(d_iso, 0))
+     return np.array(series, dtype=np.float32), labels, len(events)
+ 
+ 
+ def summary_for_point(lat: float, lon: float) -> dict:
+     """Forecast flood-event recurrence at the nearest FloodNet sensor.
+ 
+     Returns a dict with `available: bool`. On success, includes the
+     sensor identity, history summary, and forecast aggregates.
+     """
+     try:
+         sensors = sensors_near(lat, lon, NEAREST_SENSOR_RADIUS_M)
+     except Exception as e:
+         log.warning("FloodNet sensor lookup failed: %r", e)
+         return {"available": False, "reason": "FloodNet API unreachable"}
+ 
+     if not sensors:
+         return {"available": False,
+                 "reason": f"no FloodNet sensor within {NEAREST_SENSOR_RADIUS_M} m"}
+ 
+     # Closest by haversine. Some deployments have null geometry; skip those.
+     geo_sensors = [s for s in sensors if s.lat is not None and s.lon is not None]
+     if not geo_sensors:
+         return {"available": False, "reason": "nearest sensor has no geometry"}
+     nearest = min(geo_sensors,
+                   key=lambda s: _haversine_m(lat, lon, s.lat, s.lon))
+     distance_m = _haversine_m(lat, lon, nearest.lat, nearest.lon)
+ 
+     try:
+         history, labels, total_events = _build_daily_event_series(
+             nearest.deployment_id, days=DAILY_CONTEXT
+         )
+     except Exception as e:
+         log.warning("FloodNet history fetch failed for %s: %r",
+                     nearest.deployment_id, e)
+         return {"available": False, "reason": "history fetch failed"}
+ 
+     if total_events < MIN_EVENTS_FOR_FORECAST:
+         return {
+             "available": False,
+             "reason": (f"sensor has only {total_events} historical events "
+                        f"(<{MIN_EVENTS_FOR_FORECAST}); forecast omitted"),
+             "sensor_id": nearest.deployment_id,
+             "sensor_name": nearest.name,
+         }
+ 
+     forecast = _run_ttm(history, DAILY_CONTEXT, DAILY_PREDICTION)
+     if forecast is None:
+         return {"available": False,
+                 "reason": _MODEL_LOAD_ERROR or "TTM inference failed"}
+ 
+     fc = np.clip(forecast, 0, None)
+     fc28 = float(fc[:28].sum())
+     fc_total = float(fc.sum())
+     fc_peak_offset = int(fc.argmax()) + 1
+     fc_peak_value = float(fc.max())
+ 
+     hist_total = int(history.sum())
+     hist_recent_28d = float(history[-28:].sum())
+ 
+     # "Accelerating" if the next-28-days expected count materially
+     # exceeds the prior-28-days observed count.
+     accelerating = (hist_recent_28d > 0
+                     and fc28 > 1.5 * hist_recent_28d)
+ 
+     return {
+         "available": True,
+         "doc_id": f"{DOC_ID_PREFIX}_{nearest.deployment_id}",
+         "sensor_id": nearest.deployment_id,
+         "sensor_name": nearest.name,
+         "sensor_street": nearest.street,
+         "sensor_borough": nearest.borough,
+         "sensor_lat": nearest.lat,
+         "sensor_lon": nearest.lon,
+         "distance_from_query_m": round(distance_m, 1),
+         "history_window_days": DAILY_CONTEXT,
+         "history_total_events": hist_total,
+         "history_recent_28d_events": int(hist_recent_28d),
+         "forecast_horizon_days": DAILY_PREDICTION,
+         "forecast_28d_expected_events": round(fc28, 2),
+         "forecast_total_horizon_events": round(fc_total, 2),
+         "forecast_peak_day_offset": fc_peak_offset,
+         "forecast_peak_day_value": round(fc_peak_value, 3),
+         "accelerating": accelerating,
+         "model": "granite-timeseries-ttm-r2",
+         "citation": CITATION,
+     }
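The zero-filled daily-series construction above (bucket events by ISO date, then walk the window day by day so silent days become explicit zeros) can be exercised standalone. This is a minimal sketch with hypothetical event dates, not FloodNet's actual API shapes:

```python
from datetime import date, timedelta

def zero_filled_daily_counts(event_dates: list[str], end: date, days: int):
    """Bucket ISO-date strings into per-day counts, then emit a
    fixed-length window ending at `end`, zero-filling silent days."""
    by_day: dict[str, int] = {}
    for ds in event_dates:
        by_day[ds] = by_day.get(ds, 0) + 1
    start = end - timedelta(days=days - 1)
    labels = [(start + timedelta(days=i)).isoformat() for i in range(days)]
    return [by_day.get(d, 0) for d in labels], labels

counts, labels = zero_filled_daily_counts(
    ["2024-01-02", "2024-01-02"], end=date(2024, 1, 3), days=3)
# counts == [0, 2, 0] — Jan 1 and Jan 3 had no events, Jan 2 had two.
```

The explicit zeros matter for TTM: the model needs a regular-cadence series, so days with no API rows must appear as 0, not be absent.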
app/live/ttm_forecast.py ADDED
@@ -0,0 +1,363 @@
+ """Granite TimeSeries TTM r2 — short-horizon nowcast for the live tide
+ residual (storm surge / wind setup) at the NYC harbor entrance.
+ 
+ Why TTM here, vs the existing live NOAA fetcher:
+ - The existing `noaa_tides` specialist returns a single 6-min snapshot:
+   observed, predicted, residual = observed - predicted. That's "right now."
+ - TTM forecasts the next ~9.6 hours of the *residual* — the meteorologic
+   component (surge + wind setup). NOAA already publishes the astronomical
+   tide; TTM tells us if the surge component is about to peak.
+ - This is the genuinely useful add: a nowcast of the part NOAA *doesn't*
+   predict.
+ 
+ Architecture: ibm-granite/granite-timeseries-ttm-r2, ~1.5M params,
+ zero-shot multivariate (we use it univariate here on the residual
+ series). 512-step context @ 6-min cadence = ~51 h of history;
+ 96-step horizon = ~9.6 h ahead.
+ 
+ Citation: Ekambaram, V., et al. (2024). "Tiny Time Mixers (TTMs):
+ Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of
+ Multivariate Time Series." NeurIPS 2024.
+ 
+ Gated emission: a doc is only added when the forecast peak residual
+ exceeds an absolute threshold (default 0.3 ft / 9 cm). On a calm day
+ the model still runs, but the reconciler sees no doc — silence over
+ confabulation.
+ """
+ from __future__ import annotations
+ 
+ import logging
+ from datetime import datetime, timedelta
+ 
+ import httpx
+ import numpy as np
+ 
+ log = logging.getLogger("riprap.ttm_forecast")
+ 
+ DOC_ID = "ttm_forecast"
+ CITATION = ("IBM Granite TimeSeries TTM r2 (Ekambaram et al. 2024, NeurIPS); "
+             "ibm-granite/granite-timeseries-ttm-r2 via granite-tsfm")
+ 
+ # Three NOAA stations covering NYC harbor + Long Island Sound + Bight.
+ # step_ttm_forecast picks the closest to the queried address (matches the
+ # existing nearest-gauge behaviour in step_noaa_tides). This means an
+ # inland-Queens query forecasts at Kings Point (Long Island Sound), a
+ # Coney Island query forecasts at Sandy Hook (Bight), and a Manhattan
+ # query forecasts at the Battery — each gauge characterises a different
+ # storm-surge regime.
+ STATIONS = [
+     ("8518750", "The Battery, NY", 40.7006, -74.0142),
+     ("8516945", "Kings Point, NY", 40.8103, -73.7649),
+     ("8531680", "Sandy Hook, NJ", 40.4669, -74.0094),
+ ]
+ NOAA_URL = "https://api.tidesandcurrents.noaa.gov/api/prod/datagetter"
+ 
+ CONTEXT_LENGTH = 512    # ~51 h at 6-min cadence (surge forecast)
+ PREDICTION_LENGTH = 96  # ~9.6 h horizon (surge forecast)
+ MIN_INTERESTING_RESIDUAL_FT = 0.3  # ~9 cm — gate for doc emission
+ 
+ # 311 daily-counts forecast — TTM r2's smallest pretrained config is
+ # 512 context, which is awkward for weekly counts on a single address.
+ # Daily aggregation (512 days ≈ 17 months of complaint history) lets
+ # the model run natively at its standard resolution; we forecast the
+ # next 96 days (~3 months).
+ DAILY_CONTEXT = 512
+ DAILY_PREDICTION = 96
+ NYC_311_URL = "https://data.cityofnewyork.us/resource/erm2-nwe9.json"
+ NYC_311_FLOOD_DESCRIPTORS = (
+     "Sewer Backup (Use Comments) (SA)",
+     "Catch Basin Clogged/Flooding (Use Comments) (SC)",
+     "Street Flooding (SJ)",
+     "Manhole Overflow (Use Comments) (SA1)",
+     "Flooding on Street",
+ )
+ 
+ 
+ # ---- Lazy-loaded model singleton -----------------------------------------
+ 
+ _MODELS: dict[tuple[int, int], object] = {}
+ _MODEL_LOAD_ERROR: str | None = None
+ 
+ 
+ def _load_model(context_length: int = CONTEXT_LENGTH,
+                 prediction_length: int = PREDICTION_LENGTH):
+     """TTM r2 is configured per (context, prediction) length pair. Cache
+     by that pair so each distinct shape gets its own model handle on
+     first use. (The surge forecaster and the daily 311 forecaster
+     currently both use 512→96, so they share one handle.)"""
+     global _MODEL_LOAD_ERROR
+     key = (context_length, prediction_length)
+     if key in _MODELS:
+         return _MODELS[key]
+     if _MODEL_LOAD_ERROR is not None:
+         return None
+     try:
+         import torch  # noqa: F401
+         from tsfm_public.toolkit.get_model import get_model
+         m = get_model(
+             "ibm-granite/granite-timeseries-ttm-r2",
+             context_length=context_length,
+             prediction_length=prediction_length,
+         )
+         m.eval()
+         _MODELS[key] = m
+         log.info("TTM r2 loaded (context=%d horizon=%d)",
+                  context_length, prediction_length)
+         return m
+     except Exception as e:
+         _MODEL_LOAD_ERROR = repr(e)
+         log.exception("TTM model load failed; future calls will be skipped")
+         return None
+ 
+ 
+ # Closest-of-three station selection (mirrors app/context/noaa_tides.py).
+ def _haversine_km(lat1, lon1, lat2, lon2) -> float:
+     from math import asin, cos, radians, sin, sqrt
+     R = 6371.0
+     p1, p2 = radians(lat1), radians(lat2)
+     dp = radians(lat2 - lat1); dl = radians(lon2 - lon1)
+     a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
+     return 2 * R * asin(sqrt(a))
+ 
+ 
+ def _nearest_station(lat: float, lon: float):
+     return min(STATIONS, key=lambda s: _haversine_km(lat, lon, s[2], s[3]))
+ 
+ 
+ # ---- NOAA history fetch --------------------------------------------------
+ 
+ def _fetch_noaa_series(begin_iso: str, end_iso: str, product: str,
+                        station_id: str) -> dict:
+     """One-shot NOAA datagetter for a date range. Returns the JSON body."""
+     r = httpx.get(NOAA_URL, params={
+         "begin_date": begin_iso, "end_date": end_iso,
+         "station": station_id, "product": product,
+         "datum": "MLLW", "units": "english", "time_zone": "lst_ldt",
+         "format": "json",
+     }, timeout=15.0)
+     r.raise_for_status()
+     return r.json()
+ 
+ 
+ def _residual_series(station_id: str,
+                      n_obs_needed: int = CONTEXT_LENGTH) -> tuple[np.ndarray, list[str]] | None:
+     """Build the recent residual series (observed - predicted) at 6-min
+     cadence, length CONTEXT_LENGTH. Returns (values_ft, timestamps_iso).
+     Returns None if NOAA refused, returned mismatched shapes, or the
+     series is too short."""
+     # Fetch slightly more than we need to absorb the occasional missing
+     # 6-min sample; we'll trim to exact length below.
+     end = datetime.utcnow()
+     # NOAA recommends LST/LDT for time_zone matching across products
+     begin = end - timedelta(minutes=6 * (n_obs_needed + 50))
+     fmt = "%Y%m%d %H:%M"
+     begin_s = begin.strftime(fmt)
+     end_s = end.strftime(fmt)
+     try:
+         obs_j = _fetch_noaa_series(begin_s, end_s, "water_level", station_id)
+         pred_j = _fetch_noaa_series(begin_s, end_s, "predictions", station_id)
+     except Exception as e:
+         log.warning("NOAA fetch failed: %r", e)
+         return None
+     obs_data = obs_j.get("data") or []
+     pred_data = pred_j.get("predictions") or []
+     if not obs_data or not pred_data:
+         return None
+     # Both products are 6-min cadence and share timestamps; align by t.
+     obs_by_t = {row["t"]: float(row["v"]) for row in obs_data if row.get("v")}
+     pred_by_t = {row["t"]: float(row["v"]) for row in pred_data if row.get("v")}
+     common_ts = sorted(set(obs_by_t) & set(pred_by_t))
+     if len(common_ts) < n_obs_needed:
+         log.warning("only %d aligned NOAA samples (need %d)",
+                     len(common_ts), n_obs_needed)
+         return None
+     common_ts = common_ts[-n_obs_needed:]
+     residual = np.array([obs_by_t[t] - pred_by_t[t] for t in common_ts],
+                         dtype=np.float32)
+     return residual, common_ts
+ 
+ 
+ # ---- Forecast --------------------------------------------------------------
+ 
+ def _run_ttm(history: np.ndarray,
+              context_length: int = CONTEXT_LENGTH,
+              prediction_length: int = PREDICTION_LENGTH) -> np.ndarray | None:
+     """Channel-wise standardize, run model, de-standardize. Returns a
+     `prediction_length`-step de-standardized forecast in input units."""
+     model = _load_model(context_length, prediction_length)
+     if model is None:
+         return None
+     import torch
+     mu = float(history.mean())
+     sigma = float(history.std() + 1e-6)
+     normed = (history - mu) / sigma
+     x = torch.from_numpy(normed.astype(np.float32))[None, :, None]
+     try:
+         with torch.no_grad():
+             out = model(past_values=x)
+     except Exception as e:
+         log.exception("TTM inference failed: %r", e)
+         return None
+     pred = out.prediction_outputs[0, :, 0].cpu().numpy()
+     return pred * sigma + mu
+ 
+ 
+ def summary_for_point(lat: float, lon: float) -> dict:
+     """Surge forecast at the NOAA gauge nearest the queried address.
+ 
+     Three gauges cover NYC: Battery (harbor entrance), Kings Point
+     (LI Sound), Sandy Hook (Bight). Surge regimes differ — Sandy 2012
+     peaked at +14 ft at the Battery vs. lower at Kings Point because
+     the gauges respond to different forcing geometries. Picking the
+     closest gauge to the queried address makes the forecast
+     address-relevant rather than always city-wide.
+     """
+     sid, sname, slat, slon = _nearest_station(lat, lon)
+     distance_km = round(_haversine_km(lat, lon, slat, slon), 1)
+ 
+     series = _residual_series(sid)
+     if series is None:
+         return {"available": False,
+                 "reason": "NOAA history fetch returned insufficient data",
+                 "station_id": sid, "station_name": sname,
+                 "distance_km": distance_km}
+     history, timestamps = series
+     forecast = _run_ttm(history, CONTEXT_LENGTH, PREDICTION_LENGTH)
+     if forecast is None:
+         return {"available": False,
+                 "reason": _MODEL_LOAD_ERROR or "TTM inference failed",
+                 "station_id": sid, "station_name": sname,
+                 "distance_km": distance_km}
+ 
+     history_peak = float(np.max(np.abs(history)))
+     fc_peak_idx = int(np.argmax(np.abs(forecast)))
+     fc_peak_ft = float(forecast[fc_peak_idx])
+     fc_peak_minutes_ahead = (fc_peak_idx + 1) * 6
+     fc_peak_time = datetime.utcnow() + timedelta(minutes=fc_peak_minutes_ahead)
+ 
+     interesting = (abs(fc_peak_ft) >= MIN_INTERESTING_RESIDUAL_FT or
+                    history_peak >= MIN_INTERESTING_RESIDUAL_FT)
+ 
+     return {
+         "available": True,
+         "interesting": interesting,
+         "station_id": sid,
+         "station_name": sname,
+         "distance_km": distance_km,
+         "context_length": int(len(history)),
+         "horizon_steps": int(len(forecast)),
+         "history_peak_abs_ft": round(history_peak, 2),
+         "history_recent_ft": round(float(history[-1]), 2),
+         "forecast_peak_ft": round(fc_peak_ft, 2),
+         "forecast_peak_minutes_ahead": fc_peak_minutes_ahead,
+         "forecast_peak_time_utc": fc_peak_time.isoformat(timespec="minutes") + "Z",
+         "threshold_ft": MIN_INTERESTING_RESIDUAL_FT,
+     }
+ 
+ 
+ # ---- Per-address daily 311 flood-complaint forecast ----------------------
+ 
+ def _fetch_311_flood_daily(lat: float, lon: float,
+                            radius_m: int = 200,
+                            days: int = DAILY_CONTEXT,
+                            ) -> tuple[np.ndarray, list[str]] | None:
+     """Pull `days` of daily flood-complaint counts within `radius_m` of
+     (lat, lon) from NYC OpenData. Returns (counts_array_length_days,
+     date_labels) or None on failure. Missing days are zero-filled."""
+     from collections import defaultdict
+     from datetime import datetime as _dt
+     from datetime import timedelta as _td
+     end = _dt.utcnow().date()
+     start = end - _td(days=days + 1)
+     descs = " OR ".join(f"descriptor='{d}'" for d in NYC_311_FLOOD_DESCRIPTORS)
+     where = (
+         f"created_date between '{start.isoformat()}T00:00:00' and "
+         f"'{end.isoformat()}T23:59:59' AND "
+         f"latitude IS NOT NULL AND longitude IS NOT NULL AND "
+         f"({descs}) AND "
+         f"within_circle(location, {lat}, {lon}, {radius_m})"
+     )
+     try:
+         r = httpx.get(NYC_311_URL,
+                       params={"$select": "created_date",
+                               "$where": where,
+                               "$limit": "50000"},
+                       timeout=20.0)
+         r.raise_for_status()
+         rows = r.json()
+     except Exception as e:
+         log.warning("311 flood fetch for TTM failed: %r", e)
+         return None
+ 
+     counts: dict[str, int] = defaultdict(int)
+     for row in rows or []:
+         ds = (row.get("created_date") or "")[:10]
+         if not ds:
+             continue
+         counts[ds] += 1
+ 
+     series: list[int] = []
+     labels: list[str] = []
+     for i in range(days):
+         d = end - _td(days=days - 1 - i)
+         d_iso = d.isoformat()
+         labels.append(d_iso)
+         series.append(counts.get(d_iso, 0))
+     return np.array(series, dtype=np.float32), labels
+ 
+ 
+ def weekly_311_forecast_for_point(lat: float, lon: float,
+                                   radius_m: int = 200) -> dict:
+     """TTM r2 zero-shot forecast on per-address daily 311
+     flood-complaint counts. Despite the name — kept for FSM-call-site
+     stability — this now operates on daily resolution (TTM r2's
+     smallest native config is 512 context, awkward for weekly).
+     History: 512 days (~17 months); forecast: 96 days (~3 months).
+     Returns daily and weekly summaries so the reconciler narration
+     stays human-readable.
+ 
+     Designed not to raise. Returns `available: False` with a reason
+     field on any failure path."""
+     series = _fetch_311_flood_daily(lat, lon, radius_m=radius_m)
+     if series is None:
+         return {"available": False, "reason": "311 history fetch failed"}
+     history, labels = series
+     forecast = _run_ttm(history, DAILY_CONTEXT, DAILY_PREDICTION)
+     if forecast is None:
+         return {"available": False,
+                 "reason": _MODEL_LOAD_ERROR or "TTM inference failed"}
+ 
+     fc_clipped = np.clip(forecast, 0, None)
+     hist_total = int(history.sum())
+     hist_mean_per_day = float(history.mean())
+     hist_recent_mean_30d = float(history[-30:].mean())
+     fc_total = float(fc_clipped.sum())
+     fc_mean_per_day = float(fc_clipped.mean())
+     fc_peak_day = float(fc_clipped.max())
+     fc_peak_day_offset = int(fc_clipped.argmax()) + 1
+ 
+     # Aggregate to weekly equivalents for the briefing narration —
+     # readers think in weeks, not days.
+     history_weekly_mean = hist_mean_per_day * 7
+     forecast_weekly_mean = fc_mean_per_day * 7
+ 
+     accelerating = (hist_recent_mean_30d > 0 and
+                     fc_mean_per_day > 1.5 * hist_recent_mean_30d)
+ 
+     return {
+         "available": True,
+         "radius_m": radius_m,
+         "days_context": DAILY_CONTEXT,
+         "days_horizon": DAILY_PREDICTION,
+         "history_total_complaints": hist_total,
+         "history_mean_per_day": round(hist_mean_per_day, 3),
+         "history_recent_30d_mean": round(hist_recent_mean_30d, 3),
+         "history_weekly_equivalent": round(history_weekly_mean, 2),
+         "forecast_total_next_horizon": round(fc_total, 1),
+         "forecast_mean_per_day": round(fc_mean_per_day, 3),
+         "forecast_weekly_equivalent": round(forecast_weekly_mean, 2),
+         "forecast_peak_day": round(fc_peak_day, 2),
+         "forecast_peak_day_offset": fc_peak_day_offset,
+         "accelerating": accelerating,
+         "context_window_start": labels[0] if labels else None,
+         "context_window_end": labels[-1] if labels else None,
+     }
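The standardize → forecast → de-standardize round trip in `_run_ttm` can be sanity-checked without loading TTM at all, by substituting a trivial stand-in model. The persistence forecaster below is a hypothetical stand-in for illustration, not anything this file ships:

```python
import numpy as np

def run_standardized(history: np.ndarray, model_fn, horizon: int) -> np.ndarray:
    """Mirror of _run_ttm's wrapper math: z-score the context, run the
    model in normalized space, then map the forecast back to input units."""
    mu = float(history.mean())
    sigma = float(history.std() + 1e-6)   # epsilon guards a flat series
    normed = (history - mu) / sigma
    pred = model_fn(normed, horizon)       # forecast in z-space
    return pred * sigma + mu               # back to feet / counts / etc.

def persistence(normed: np.ndarray, horizon: int) -> np.ndarray:
    # Trivial stand-in "model": repeat the last observed value.
    return np.full(horizon, normed[-1], dtype=np.float32)

hist = np.array([0.0, 0.1, 0.2, 0.4], dtype=np.float32)
fc = run_standardized(hist, persistence, horizon=3)
# De-standardization recovers the input scale: each step is ~0.4,
# the last raw value, regardless of the units of the series.
```

The wrapper is scale-invariant by construction, which is why the same `_run_ttm` serves both the residual-in-feet surge series and the complaints-per-day 311 series.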
app/llm.py ADDED
@@ -0,0 +1,296 @@
+ """LiteLLM-backed shim around the ollama.chat call surface.
+ 
+ Single function `chat(model, messages, options, stream)` that returns the
+ same dict / iterator-of-dicts shape `ollama.chat` returns, so existing
+ call sites swap `import ollama` -> `from app import llm` with no other
+ changes.
+ 
+ Backend selection (env):
+     RIPRAP_LLM_PRIMARY  = "vllm" | "ollama"   (default: ollama)
+     RIPRAP_LLM_BASE_URL = http://amd:8000/v1  (vllm only)
+     RIPRAP_LLM_API_KEY  = <token>             (vllm only)
+     RIPRAP_LLM_FALLBACK = "ollama" | ""       (default: "ollama" when
+                                                primary=vllm, else "")
+     OLLAMA_BASE_URL     = http://host:11434   (ollama backend only)
+ 
+ Model routing: callers may pass either Ollama tags ("granite4.1:8b") or
+ logical aliases ("granite-8b"). Mapped to:
+     vllm   -> openai/granite-4.1-{3b,8b}     on RIPRAP_LLM_BASE_URL
+     ollama -> ollama_chat/granite4.1:{3b,8b} on OLLAMA_BASE_URL
+ 
+ When primary=vllm with fallback=ollama, the LiteLLM Router auto-fails
+ over to the local Ollama deployment if the AMD endpoint errors (timeout,
+ connection refused, 5xx). Existing call sites are unaware of the swap.
+ """
+ 
+ from __future__ import annotations
+ 
+ import logging
+ import os
+ import re
+ from collections.abc import Iterator
+ from typing import Any
+ 
+ import litellm
+ from litellm import Router
+ 
+ log = logging.getLogger(__name__)
+ 
+ litellm.suppress_debug_info = True
+ litellm.drop_params = True  # silently drop unsupported params instead of erroring
+ 
+ _VLLM_BASE = os.environ.get("RIPRAP_LLM_BASE_URL", "").rstrip("/")
+ _VLLM_KEY = os.environ.get("RIPRAP_LLM_API_KEY", "") or "EMPTY"
+ _PRIMARY = os.environ.get("RIPRAP_LLM_PRIMARY", "ollama").lower()
+ _FALLBACK = os.environ.get(
+     "RIPRAP_LLM_FALLBACK",
+     "ollama" if _PRIMARY == "vllm" else "",
+ ).lower()
+ 
+ _OLLAMA_BASE = os.environ.get(
+     "OLLAMA_BASE_URL",
+     os.environ.get("OLLAMA_HOST", "http://localhost:11434"),
+ )
+ if not _OLLAMA_BASE.startswith("http"):
+     _OLLAMA_BASE = "http://" + _OLLAMA_BASE
+ 
+ # alias -> (vllm-served-name, ollama-tag)
+ # In our hackathon vLLM deployment only the 8B is served (one served-name
+ # per vLLM process and we don't want a second container). Override the
+ # 3B served-name with RIPRAP_LLM_VLLM_3B_NAME if you stand up a second
+ # vLLM serving the 3B and want the planner to hit it specifically.
+ _VLLM_8B = os.environ.get("RIPRAP_LLM_VLLM_8B_NAME", "granite-4.1-8b")
+ _VLLM_3B = os.environ.get("RIPRAP_LLM_VLLM_3B_NAME", _VLLM_8B)
+ # Ollama tag overrides: HF Spaces' build disk fills past the threshold
+ # when both granite4.1:3b and granite4.1:8b are pulled alongside the
+ # Phase 1 / Phase 4 EO toolchain. Set RIPRAP_OLLAMA_3B_TAG=granite4.1:8b
+ # on disk-constrained deployments — the planner output is short, so
+ # the 8B-vs-3B difference is latency, not correctness.
+ #
+ # RIPRAP_OLLAMA_8B_TAG is also the cheapest knob for swapping quants
+ # without touching code: e.g. "granite4.1:8b-q3_K_M" gives ~1 GB of
+ # memory back vs the default Q4_K_M, at minor grounding-discipline cost
+ # (re-run the Hollis probe before committing — see CLAUDE.md).
+ _OLLAMA_3B_TAG = os.environ.get("RIPRAP_OLLAMA_3B_TAG", "granite4.1:3b")
+ _OLLAMA_8B_TAG = os.environ.get("RIPRAP_OLLAMA_8B_TAG", "granite4.1:8b")
+ _LOGICAL: dict[str, tuple[str, str]] = {
+     "granite-3b": (_VLLM_3B, _OLLAMA_3B_TAG),
+     "granite-8b": (_VLLM_8B, _OLLAMA_8B_TAG),
+ }
+ _OLLAMA_TO_LOGICAL = {v[1]: k for k, v in _LOGICAL.items()}
+ # Also accept the canonical hardcoded tag names so callers passing
+ # `granite4.1:3b` resolve to the alias even when the env override
+ # remapped that alias to `granite4.1:8b`.
+ _OLLAMA_TO_LOGICAL.setdefault("granite4.1:3b", "granite-3b")
+ _OLLAMA_TO_LOGICAL.setdefault("granite4.1:8b", "granite-8b")
+ 
+ 
+ def _build_router() -> Router:
+     model_list: list[dict[str, Any]] = []
+     fallbacks: list[dict[str, list[str]]] = []
+     use_vllm = _PRIMARY == "vllm" and bool(_VLLM_BASE)
+ 
+     for alias, (vllm_name, ollama_tag) in _LOGICAL.items():
+         if use_vllm:
+             model_list.append({
+                 "model_name": alias,
+                 "litellm_params": {
+                     "model": f"openai/{vllm_name}",
+                     "api_base": _VLLM_BASE,
+                     "api_key": _VLLM_KEY,
+                     "timeout": 240,
+                     "stream_timeout": 240,
+                 },
+             })
+             if _FALLBACK == "ollama":
+                 fb_alias = f"{alias}-ollama"
+                 model_list.append({
+                     "model_name": fb_alias,
+                     "litellm_params": {
+                         "model": f"ollama_chat/{ollama_tag}",
+                         "api_base": _OLLAMA_BASE,
+                         "timeout": 240,
+                         "stream_timeout": 240,
+                     },
+                 })
+                 fallbacks.append({alias: [fb_alias]})
+         else:
+             model_list.append({
+                 "model_name": alias,
+                 "litellm_params": {
+                     "model": f"ollama_chat/{ollama_tag}",
+                     "api_base": _OLLAMA_BASE,
+                     "timeout": 240,
+                     "stream_timeout": 240,
+                 },
+             })
+ 
+     log.info("llm router primary=%s fallback=%s vllm_base=%s ollama_base=%s",
+              _PRIMARY, _FALLBACK or "<none>",
+              _VLLM_BASE or "<unset>", _OLLAMA_BASE)
+     return Router(
+         model_list=model_list,
+         fallbacks=fallbacks,
+         num_retries=0,  # Router fallback handles the failover; no point
+                         # burning seconds re-hitting a dead endpoint.
+         timeout=240,
+     )
+ 
+ 
+ _router = _build_router()
+ 
+ 
+ def _resolve_alias(model: str) -> str:
+     if model in _LOGICAL:
+         return model
+     if model in _OLLAMA_TO_LOGICAL:
+         return _OLLAMA_TO_LOGICAL[model]
+     return model  # pass through; let the router report unknowns
+ 
+ 
+ def _opts_to_kwargs(options: dict | None) -> dict:
+     """Translate ollama-style options dict to LiteLLM kwargs.
+ 
+     Ollama-only knobs (num_ctx) are forwarded via extra_body so that the
+     ollama_chat backend still receives them; OpenAI/vLLM ignores them
+     (litellm.drop_params=True).
+     """
+     kw: dict[str, Any] = {}
+     extra: dict[str, Any] = {}
+     if options:
+         if "temperature" in options:
+             kw["temperature"] = options["temperature"]
+         if "top_p" in options:
+             kw["top_p"] = options["top_p"]
+         if "num_predict" in options:
+             kw["max_tokens"] = options["num_predict"]
+         for k in ("num_ctx",):
+             if k in options:
+                 extra[k] = options[k]
+     if extra:
+         kw["extra_body"] = extra
+     return kw
+ 
+ 
+ def _extract_documents(messages: list[dict]) -> list[dict]:
+     """Pull document-role messages into Granite's HF chat-template format.
+ 
+     Ollama's Modelfile template recognizes `role: "document <id>"` and
+     bundles the message into a <documents> block automatically. The HF
+     tokenizer chat template (used by vLLM) does *not* — it silently
+     drops non-standard roles. To make vLLM honor the same grounding
+     contract, we extract the documents into the chat-template kwarg
+     `documents=[{"doc_id": ..., "text": ...}]` while leaving the
+     original document-role messages in place so the Ollama backend
+     keeps working unchanged on the fallback path.
+     """
+     docs: list[dict] = []
+     for m in messages:
+         role = m.get("role", "")
+         if role.startswith("document "):
+             docs.append({
+                 "doc_id": role.split(" ", 1)[1],
+                 "text": m.get("content", ""),
+             })
+     return docs
+ 
+ 
+ # vLLM's Granite chat template emits citations as `[doc_id=foo]`; the rest
+ # of Riprap (Mellea checks, frontend chip rendering, citations regex) all
+ # expect the bare `[foo]` form that Ollama's Modelfile template produces.
+ # Normalize transparently so the two backends are interchangeable.
+ _CITE_NORMALIZE_RE = re.compile(r"\[doc_id=([A-Za-z0-9_]+)\]")
+ 
+ 
+ def _normalize_citations(text: str) -> str:
+     return _CITE_NORMALIZE_RE.sub(r"[\1]", text)
+ 
+ 
+ def _to_ollama_shape(resp) -> dict:
+     msg = resp.choices[0].message
+     content = _normalize_citations(msg.content or "")
+     return {"message": {"role": "assistant", "content": content}}
+ 
+ 
+ def _stream_to_ollama_shape(stream) -> Iterator[dict]:
+     for chunk in stream:
+         try:
+             delta = chunk.choices[0].delta
+             content = getattr(delta, "content", None) or ""
+         except (IndexError, AttributeError):
+             content = ""
+         # Per-chunk normalize is safe: `[doc_id=X]` arrives as a single
+         # token sequence inside one chunk in practice, and the regex is
+         # idempotent / no-op on partial matches.
+         if content:
+             content = _normalize_citations(content)
+         yield {"message": {"role": "assistant", "content": content}}
+ 
+ 
+ def _default_hardware_label() -> str:
+     """Best-guess hardware label for the UI badge.
+ 
+     Auto-detected from env. Operators can override with
+     RIPRAP_HARDWARE_LABEL (e.g. "AMD MI300X" / "NVIDIA T4" / "Apple M3 Pro").
+     """
+     if _PRIMARY == "vllm" and _VLLM_BASE:
+         return "AMD MI300X"
+     if os.environ.get("SPACE_ID") or os.environ.get("HF_SPACE_ID"):
+         return "NVIDIA T4"
+     return "Local"
+ 
+ 
+ def backend_info() -> dict[str, Any]:
+     """Static description of the active LLM routing for the /api/backend
+     endpoint and the UI badge. Does not perform a network call; the
+     /api/backend handler does its own reachability ping."""
+     primary_engine = "vLLM" if _PRIMARY == "vllm" and _VLLM_BASE else "Ollama"
+     fallback_engine = (
+         "Ollama" if (_PRIMARY == "vllm" and _FALLBACK == "ollama")
+         else None
+     )
+     return {
+         "primary": _PRIMARY if _VLLM_BASE or _PRIMARY != "vllm" else "ollama",
+         "engine": os.environ.get("RIPRAP_ENGINE_LABEL", primary_engine),
+         "hardware": os.environ.get("RIPRAP_HARDWARE_LABEL",
+                                    _default_hardware_label()),
+         "model": os.environ.get("RIPRAP_RECONCILER_MODEL", _OLLAMA_8B_TAG),
+         "vllm_base_url": _VLLM_BASE or None,
+         "ollama_base_url": _OLLAMA_BASE,
+         "fallback_engine": fallback_engine,
+     }
+ 
+ 
+ def chat(model: str, messages: list[dict], options: dict | None = None,
+          stream: bool = False, format: str | None = None):
+     """Drop-in replacement for ollama.chat with router-managed failover.
+ 
+     Returns:
+       - stream=False: dict shaped like ollama's response
+         ({"message": {"role": "assistant", "content": "..."}}).
+       - stream=True: iterator yielding chunk dicts of the same shape.
+ 
+     `format="json"` mirrors Ollama's JSON-mode forcing — translated to
+     OpenAI's response_format for vLLM, and passed through unchanged for
+     the Ollama backend.
+     """
+     alias = _resolve_alias(model)
+     kwargs = _opts_to_kwargs(options)
+     docs = _extract_documents(messages)
+     if docs:
+         # Merge into extra_body so Granite's HF chat template (vLLM)
+         # picks them up. Ollama backend ignores extra_body and keeps
+         # using the role="document <id>" messages already in `messages`.
+         eb = kwargs.setdefault("extra_body", {})
+         eb["documents"] = docs
+         eb.setdefault("chat_template_kwargs", {})["documents"] = docs
+     if format == "json":
+         # OpenAI/vLLM path
+         kwargs["response_format"] = {"type": "json_object"}
+         # Ollama path (LiteLLM forwards this via extra_body for ollama_chat)
+         kwargs.setdefault("extra_body", {})["format"] = "json"
+     if stream:
+         s = _router.completion(model=alias, messages=messages,
+                                stream=True, **kwargs)
+         return _stream_to_ollama_shape(s)
+     resp = _router.completion(model=alias, messages=messages, **kwargs)
+     return _to_ollama_shape(resp)
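The citation-normalization contract that makes the two backends interchangeable is easy to exercise in isolation; the regex and substitution below reproduce the `_normalize_citations` logic from this file:

```python
import re

CITE_NORMALIZE_RE = re.compile(r"\[doc_id=([A-Za-z0-9_]+)\]")

def normalize_citations(text: str) -> str:
    # vLLM's Granite chat template emits [doc_id=foo]; downstream
    # consumers (Mellea checks, citation chips) expect the bare [foo]
    # form that Ollama's Modelfile template produces.
    return CITE_NORMALIZE_RE.sub(r"[\1]", text)

vllm_style = "Residual peaked at 0.8 ft [doc_id=ttm_forecast]."
assert normalize_citations(vllm_style) == "Residual peaked at 0.8 ft [ttm_forecast]."
# Already-bare citations pass through unchanged, so the call is idempotent:
assert normalize_citations("0.8 ft [ttm_forecast].") == "0.8 ft [ttm_forecast]."
```

Idempotence is what makes the per-chunk application on the streaming path safe: applying the substitution to already-normalized text is a no-op.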
app/mellea_validator.py ADDED
@@ -0,0 +1,501 @@
+ """Mellea-validated reconciliation for Riprap.
+ 
+ Wraps the existing Granite-via-Ollama reconciliation in IBM Research's
+ Mellea framework: typed output + programmatic post-conditions +
+ rejection sampling. Replaces post-hoc sentence-dropping with
+ "don't accept output until requirements pass."
+ 
+ Streaming and rejection sampling are mutually exclusive — by the time
+ we'd validate, the user has watched the bad output appear. Strict mode
+ trades streaming for compliance; the UI shows a "validating" skeleton
+ instead of token-by-token render.
+ 
+ The four invariants ported from the parent project's mellea_probe:
+ 
+     1. no_invented_numbers — every number in output appears in source
+     2. no_placeholder_tokens — output never contains "[source]" or
+        raw <document> markup
+     3. every_claim_cited — each numeric token has a [doc_id] within
+        ~40 chars
+     4. referenced_doc_ids_exist — cited doc_ids ⊆ input doc_ids
+ """
+ from __future__ import annotations
+ 
+ import logging
+ import os
+ import re
+ import time
+ from typing import Any
+ 
+ from mellea import start_session
+ from mellea.stdlib.requirements import req, simple_validate
+ from mellea.stdlib.sampling import RejectionSamplingStrategy
+ 
+ from app import llm
+ 
+ log = logging.getLogger("riprap.mellea")
+ 
+ # Default reconciler model — same env-var contract as app/reconcile.py.
+ DEFAULT_MODEL = os.environ.get(
+     "RIPRAP_RECONCILER_MODEL",
+     os.environ.get("RIPRAP_OLLAMA_MODEL", "granite4.1:8b"),
+ )
+ 
+ # Loop budget — try up to N samples before falling back to the last
+ # candidate even if it didn't pass all requirements. Low ceiling so a
+ # pathological case can't run away with latency.
+ #
+ # Override at process start with RIPRAP_MELLEA_MAX_ATTEMPTS. We default
+ # to 2 on the local Ollama path (where each attempt is 30-90 s on the
+ # Mac) and 3 on remote/vLLM (where attempts are seconds). This caps
+ # worst-case demo latency without giving up the principal grounding
+ # guarantee — the first-attempt pass rate on the curated probes is >85%.
+ def _default_loop_budget() -> int:
+     try:
+         n = int(os.environ.get("RIPRAP_MELLEA_MAX_ATTEMPTS", "0"))
+         if n > 0:
+             return n
+     except ValueError:
+         pass
+     return 2 if os.environ.get("RIPRAP_LLM_PRIMARY", "ollama").lower() == "ollama" else 3
+ 
+ 
+ DEFAULT_LOOP_BUDGET = _default_loop_budget()
+ 
+ # Number tokens — \b enforces a word boundary so identifier codes like
+ # QN1206, B12 (community board), or M14 (bus route) are skipped entirely.
67
+ # Inside QN1206 there's no \b between any chars, so no submatch leaks.
68
+ _NUM_RE = re.compile(r"\b-?\d[\d,]*(?:\.\d+)?\b")
69
+ _CITE_RE = re.compile(r"\[(?P<id>[a-z][a-z0-9_]*)\]")
70
+ # Same trivial-numbers list as the post-hoc verifier — well-known service
71
+ # line numbers, single digits.
72
+ _TRIVIAL_NUMS = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "100",
73
+ "311", "911", "211"}
74
+
75
+
76
+ def _strip_markdown_for_check(text: str) -> str:
77
+ """Drop bold markers + citation tags so the numeric scan is clean."""
78
+ text = re.sub(r"\*\*([^*]+)\*\*", r"\1", text)
79
+ text = re.sub(r"\[[a-z0-9_]+\]", "", text, flags=re.I)
80
+ return text
81
+
82
+
83
+ def _normalize_num(s: str) -> set[str]:
84
+ forms = {s}
85
+ no_comma = s.replace(",", "")
86
+ forms.add(no_comma)
87
+ if "." in no_comma:
88
+ forms.add(no_comma.rstrip("0").rstrip("."))
89
+ return {f for f in forms if f}
90
+
91
+
92
+ def _haystack(doc_msgs: list[dict]) -> str:
93
+ return "\n".join(m.get("content", "") for m in doc_msgs)
94
+
95
+
96
+ def _doc_ids(doc_msgs: list[dict]) -> set[str]:
97
+ """Each doc message has role like "document <id>"; extract ids."""
98
+ out = set()
99
+ for m in doc_msgs:
100
+ role = m.get("role", "")
101
+ if role.startswith("document "):
102
+ out.add(role.split(" ", 1)[1].strip())
103
+ return out
104
+
105
+
106
+ # --- the four invariants ---------------------------------------------------
107
+
108
+
109
+ def _check_no_invented_numbers(doc_msgs: list[dict]):
110
+ haystack = _haystack(doc_msgs)
111
+ def _fn(text: str):
112
+ clean = _strip_markdown_for_check(text)
113
+ invented = []
114
+ for n in _NUM_RE.findall(clean):
115
+ if n in _TRIVIAL_NUMS:
116
+ continue
117
+ forms = _normalize_num(n)
118
+ if not any(f in haystack for f in forms):
119
+ invented.append(n)
120
+ return not invented # pass = no invented numbers
121
+ return _fn
122
+
123
+
124
+ def _check_no_placeholder_tokens():
125
+ def _fn(text: str):
126
+ bad = []
127
+ if "[source]" in text.lower():
128
+ bad.append("[source]")
129
+ if "<document" in text:
130
+ bad.append("<document>")
131
+ if "</document" in text:
132
+ bad.append("</document>")
133
+ return not bad
134
+ return _fn
135
+
136
+
137
+ def _check_every_claim_cited():
138
+ """Each non-trivial numeric token must have a [doc_id] somewhere in
139
+ the same sentence. Sentence boundaries are conservative: a period
140
+ followed by whitespace, or end of text. This matches how a reader
141
+ actually attributes claims — the citation can be anywhere in the
142
+ sentence, not just adjacent to the number."""
143
+ # Sentence end = `. ` or `.\n` or end-of-string. Question/exclamation
144
+ # marks rarely appear in these briefings; period is enough.
145
+ _SENT_END = re.compile(r"\.[\s)]|\.$")
146
+
147
+ def _sentence_span(text: str, pos: int) -> tuple[int, int]:
148
+ # Walk backwards to the previous sentence terminator.
149
+ start = 0
150
+ for m in _SENT_END.finditer(text, 0, pos):
151
+ start = m.end()
152
+ # Walk forwards to the next.
153
+ m = _SENT_END.search(text, pos)
154
+ end = m.start() + 1 if m else len(text)
155
+ return start, end
156
+
157
+ def _fn(text: str):
158
+ clean = re.sub(r"\*\*([^*]+)\*\*", r"\1", text)
159
+ for m in _NUM_RE.finditer(clean):
160
+ n = m.group(0)
161
+ if n in _TRIVIAL_NUMS:
162
+ continue
163
+ s, e = _sentence_span(clean, m.start())
164
+ if not _CITE_RE.search(clean[s:e]):
165
+ return False
166
+ return True
167
+ return _fn
168
+
169
+
170
+ def _failing_sentences_for_citations(text: str) -> list[str]:
171
+ """Return the sentences in `text` that contain a non-trivial number
172
+ but no [doc_id] citation. Used to give the model targeted reroll
173
+ feedback so it can fix the exact spots that failed."""
174
+ clean = re.sub(r"\*\*([^*]+)\*\*", r"\1", text)
175
+ sents = re.split(r"\.[\s)]|\.$", clean)
176
+ bad = []
177
+ for s in sents:
178
+ nums = [n for n in _NUM_RE.findall(s) if n not in _TRIVIAL_NUMS]
179
+ if nums and not _CITE_RE.search(s):
180
+ bad.append(s)
181
+ return bad
182
+
183
+
184
+ def _check_referenced_doc_ids_exist(doc_msgs: list[dict]):
185
+ valid = _doc_ids(doc_msgs)
186
+ def _fn(text: str):
187
+ cited = {m.group("id") for m in _CITE_RE.finditer(text)}
188
+ rogue = cited - valid
189
+ return not rogue
190
+ return _fn
191
+
192
+
193
+ # --- main entry point ------------------------------------------------------
194
+
195
+
196
+ def reconcile_strict(doc_msgs: list[dict],
197
+ system_prompt: str,
198
+ user_prompt: str = "Write the cited briefing now.",
199
+ model: str | None = None,
200
+ loop_budget: int = DEFAULT_LOOP_BUDGET,
201
+ ollama_options: dict | None = None) -> dict[str, Any]:
202
+ """Run Granite reconciliation with Mellea rejection sampling.
203
+
204
+ Returns a dict with:
205
+ paragraph — final validated text
206
+ rerolls — number of resamples (0 = passed first try)
207
+ requirements_passed — list of requirement names that passed in the
208
+ accepted sample
209
+ requirements_failed — list of requirement names that failed
210
+ (empty on accepted sample)
211
+ elapsed_s — total seconds including rerolls
212
+ model — model id used
213
+ loop_budget — configured budget
214
+ """
215
+ model = model or DEFAULT_MODEL
216
+ t0 = time.time()
217
+
218
+ # Per-requirement closures wired with the doc context.
219
+ # Keep the validator functions in our own table so we can re-run them
220
+ # on the final paragraph to produce reliable pass/fail metadata for
221
+ # the report — Mellea's internal validation-result objects vary by
222
+ # version and aren't great for downstream display.
223
+ checks = [
224
+ ("numerics_grounded",
225
+ "All numbers in the output must appear verbatim in the source documents.",
226
+ _check_no_invented_numbers(doc_msgs)),
227
+ ("no_placeholder_tokens",
228
+ "The output must not contain placeholder tokens like [source] or raw <document> markup.",
229
+ _check_no_placeholder_tokens()),
230
+ ("citations_dense",
231
+ "Every numeric claim must have a [doc_id] citation within ~120 characters.",
232
+ _check_every_claim_cited()),
233
+ ("citations_resolve",
234
+ "Every cited [doc_id] must correspond to a real source document.",
235
+ _check_referenced_doc_ids_exist(doc_msgs)),
236
+ ]
237
+ requirements = [
238
+ req(desc, validation_fn=simple_validate(fn, reason=name))
239
+ for name, desc, fn in checks
240
+ ]
241
+
242
+ session = start_session(backend_name="ollama", model_id=model,
243
+ model_options=ollama_options or {})
244
+ try:
245
+ # Build the prompt: system + serialized doc context + user task.
246
+ # Mellea's instruct() takes the whole instruction; we serialize
247
+ # the doc messages into the description so the haystack is
248
+ # available to the model the same way it would be via
249
+ # ollama.chat with role="document <id>" messages.
250
+ doc_block = "\n\n".join(
251
+ f"<document id=\"{m['role'].split(' ', 1)[1] if m['role'].startswith('document ') else 'unknown'}\">\n"
252
+ f"{m['content']}\n</document>"
253
+ for m in doc_msgs
254
+ )
255
+ instruction = (
256
+ f"{system_prompt}\n\n"
257
+ f"DOCUMENTS:\n{doc_block}\n\n"
258
+ f"TASK: {user_prompt}"
259
+ )
260
+
261
+ result = session.instruct(
262
+ description=instruction,
263
+ strategy=RejectionSamplingStrategy(
264
+ loop_budget=loop_budget,
265
+ requirements=requirements,
266
+ ),
267
+ requirements=requirements,
268
+ return_sampling_results=True,
269
+ model_options={"temperature": 0,
270
+ "num_ctx": int(os.environ.get("RIPRAP_MELLEA_NUM_CTX", "4096")),
271
+ "num_predict": int(os.environ.get("RIPRAP_MELLEA_NUM_PREDICT", "400")),
272
+ **(ollama_options or {})},
273
+ )
274
+
275
+ paragraph = _extract_text(result).strip()
276
+ n_attempts = _extract_attempts(result)
277
+ rerolls = max(0, n_attempts - 1)
278
+ finally:
279
+ try:
280
+ session.cleanup()
281
+ except Exception:
282
+ pass
283
+
284
+ # Re-run our own checks on the final paragraph for clean pass/fail
285
+ # metadata. This is what shows up in the report's compliance section.
286
+ passed: list[str] = []
287
+ failed: list[str] = []
288
+ for name, _desc, fn in checks:
289
+ try:
290
+ if fn(paragraph):
291
+ passed.append(name)
292
+ else:
293
+ failed.append(name)
294
+ except Exception as e:
295
+ log.warning("requirement %s raised: %r", name, e)
296
+ failed.append(name)
297
+
298
+ return {
299
+ "paragraph": paragraph,
300
+ "rerolls": rerolls,
301
+ "n_attempts": n_attempts,
302
+ "requirements_total": len(checks),
303
+ "requirements_passed": passed,
304
+ "requirements_failed": failed,
305
+ "elapsed_s": round(time.time() - t0, 2),
306
+ "model": model,
307
+ "loop_budget": loop_budget,
308
+ }
309
+
310
+
311
+ def reconcile_strict_streaming(
312
+ doc_msgs: list[dict],
313
+ system_prompt: str,
314
+ user_prompt: str = "Write the cited briefing now.",
315
+ model: str | None = None,
316
+ loop_budget: int = DEFAULT_LOOP_BUDGET,
317
+ ollama_options: dict | None = None,
318
+ on_token=None,
319
+ on_attempt_end=None,
320
+ ) -> dict[str, Any]:
321
+ """Hand-rolled rejection sampler that *streams* each attempt to the
322
+ user instead of waiting silently for Mellea to validate behind the
323
+ scenes. Same compliance contract as reconcile_strict — runs the
324
+ same four checks, accepts the first attempt that passes, falls back
325
+ to the last attempt if the budget is exhausted.
326
+
327
+ Callbacks (both optional, both fire on the calling thread):
328
+ on_token(delta: str, attempt_idx: int)
329
+ — fires for every token chunk as it arrives from Granite.
330
+ on_attempt_end(attempt_idx: int, passed: list[str], failed: list[str])
331
+ — fires after each attempt with its per-requirement outcome.
332
+ The frontend uses this to render reroll banners + reset the
333
+ briefing buffer when a new attempt begins.
334
+ """
335
+ model = model or DEFAULT_MODEL
336
+ t0 = time.time()
337
+
338
+ checks = [
339
+ ("numerics_grounded",
340
+ _check_no_invented_numbers(doc_msgs)),
341
+ ("no_placeholder_tokens",
342
+ _check_no_placeholder_tokens()),
343
+ ("citations_dense",
344
+ _check_every_claim_cited()),
345
+ ("citations_resolve",
346
+ _check_referenced_doc_ids_exist(doc_msgs)),
347
+ ]
348
+
349
+ base_messages = doc_msgs + [
350
+ {"role": "system", "content": system_prompt},
351
+ {"role": "user", "content": user_prompt},
352
+ ]
353
+ # num_ctx 4096 fits a typical trimmed prompt (≈700 system + ≈2500 docs);
354
+ # num_predict 400 caps the 4-section briefing at ≈300-350 tokens. With
355
+ # RIPRAP_TRIM_DOCS=1 and the planner picking 6-9 specialists, the 4096
356
+ # window has been sufficient on every probe; the previous 6144/600 was
357
+ # sized for the *untrimmed* fan-out and was forcing Ollama to grow the
358
+ # KV cache (33% more memory + a full re-init) every Mellea attempt.
359
+ # Override with RIPRAP_MELLEA_NUM_CTX / RIPRAP_MELLEA_NUM_PREDICT.
360
+ base_opts = {"temperature": 0,
361
+ "num_ctx": int(os.environ.get("RIPRAP_MELLEA_NUM_CTX", "4096")),
362
+ "num_predict": int(os.environ.get("RIPRAP_MELLEA_NUM_PREDICT", "400")),
363
+ **(ollama_options or {})}
364
+
365
+ paragraph = ""
366
+ last_passed: list[str] = []
367
+ last_failed: list[str] = [name for name, _ in checks]
368
+ last_paragraph = ""
369
+ attempts = 0
370
+
371
+ for attempt_idx in range(loop_budget):
372
+ attempts = attempt_idx + 1
373
+ # On reroll, append a tight feedback message naming what failed AND
374
+ # the specific failing sentences (so the model knows exactly which
375
+ # ones to fix). Granite responds well to surgical corrections.
376
+ messages = list(base_messages)
377
+ if attempt_idx > 0 and last_failed:
378
+ feedback = [
379
+ f"Your previous draft failed: {', '.join(last_failed)}.",
380
+ ]
381
+ if "citations_dense" in last_failed and last_paragraph:
382
+ bad = _failing_sentences_for_citations(last_paragraph)
383
+ if bad:
384
+ feedback.append(
385
+ "Specific sentences with uncited numbers:"
386
+ )
387
+ for s in bad[:3]:
388
+ feedback.append(f" - {s.strip()}")
389
+ feedback.append(
390
+ "Add a [doc_id] citation at the end of each. "
391
+ "Re-emit the FULL briefing."
392
+ )
393
+ else:
394
+ feedback.append(
395
+ "Re-write so every sentence containing a number ends "
396
+ "with a [doc_id] citation."
397
+ )
398
+ messages.append({"role": "user", "content": "\n".join(feedback)})
399
+
400
+ chunks: list[str] = []
401
+ for chunk in llm.chat(model=model, messages=messages,
402
+ stream=True, options=base_opts):
403
+ delta = (chunk.get("message") or {}).get("content") or ""
404
+ if delta:
405
+ chunks.append(delta)
406
+ if on_token is not None:
407
+ try:
408
+ on_token(delta, attempt_idx)
409
+ except Exception:
410
+ log.exception("on_token callback raised")
411
+ paragraph = "".join(chunks).strip()
412
+
413
+ passed: list[str] = []
414
+ failed: list[str] = []
415
+ for name, fn in checks:
416
+ try:
417
+ (passed if fn(paragraph) else failed).append(name)
418
+ except Exception as e:
419
+ log.warning("requirement %s raised: %r", name, e)
420
+ failed.append(name)
421
+
422
+ last_passed = passed
423
+ last_failed = failed
424
+ last_paragraph = paragraph
425
+ if on_attempt_end is not None:
426
+ try:
427
+ on_attempt_end(attempt_idx, passed, failed)
428
+ except Exception:
429
+ log.exception("on_attempt_end callback raised")
430
+
431
+ if not failed:
432
+ break
433
+
434
+ return {
435
+ "paragraph": paragraph,
436
+ "rerolls": max(0, attempts - 1),
437
+ "n_attempts": attempts,
438
+ "requirements_total": len(checks),
439
+ "requirements_passed": last_passed,
440
+ "requirements_failed": last_failed,
441
+ "elapsed_s": round(time.time() - t0, 2),
442
+ "model": model,
443
+ "loop_budget": loop_budget,
444
+ }
445
+
446
+
447
+ def _extract_text(result) -> str:
448
+ """SamplingResult / ModelOutputThunk text extraction."""
449
+ for attr in ("sample", "result", "value", "content"):
450
+ v = getattr(result, attr, None)
451
+ if v is not None:
452
+ if hasattr(v, "value"):
453
+ return str(v.value)
454
+ return str(v)
455
+ return str(result)
456
+
457
+
458
+ def _extract_attempts(result) -> int:
459
+ """How many samples were drawn before stopping."""
460
+ for attr in ("n_attempts", "num_attempts", "attempts"):
461
+ v = getattr(result, attr, None)
462
+ if isinstance(v, int):
463
+ return v
464
+ samples = getattr(result, "sample_validations", None) or getattr(result, "samples", None)
465
+ if isinstance(samples, list):
466
+ return len(samples)
467
+ return 1
468
+
469
+
470
+ def _extract_pass_fail(result) -> tuple[list[str], list[str]]:
471
+ """Best-effort extraction of which requirements passed on the
472
+ accepted sample. mellea v0.4 exposes sample_validations as a list
473
+ where each entry is itself a list of (Requirement, ValidationResult)
474
+ tuples — duck-type defensively.
475
+ """
476
+ validations = getattr(result, "sample_validations", None)
477
+ if not validations:
478
+ return [], []
479
+ last = validations[-1] if isinstance(validations, list) else validations
480
+ passed: list[str] = []
481
+ failed: list[str] = []
482
+ items = last if isinstance(last, list) else [last]
483
+ for item in items:
484
+ # Item might be (Requirement, ValidationResult) tuple, or a single
485
+ # ValidationResult, or a Requirement, depending on mellea version.
486
+ ok = None
487
+ descr = ""
488
+ if isinstance(item, tuple) and len(item) >= 2:
489
+ descr = str(item[0])[:80]
490
+ v = item[1]
491
+ ok = bool(getattr(v, "passed", getattr(v, "is_valid",
492
+ getattr(v, "result", False))))
493
+ else:
494
+ descr = str(getattr(item, "requirement", item))[:80]
495
+ ok = bool(getattr(item, "passed", getattr(item, "is_valid",
496
+ getattr(item, "result", False))))
497
+ if ok:
498
+ passed.append(descr)
499
+ else:
500
+ failed.append(descr)
501
+ return passed, failed
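The four invariants are plain predicates over the output text, so they can be exercised without Mellea in the loop. A minimal standalone sketch of the numerics-grounding check (regexes and trivial-numbers list copied from the module; the sample briefing and `sandy` doc id are invented for illustration):

```python
import re

# Same patterns as app/mellea_validator.py: the \b word boundaries keep
# identifier codes like QN1206 or M14 from matching as bare numbers.
NUM_RE = re.compile(r"\b-?\d[\d,]*(?:\.\d+)?\b")
TRIVIAL = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "100",
           "311", "911", "211"}

def invented_numbers(text: str, haystack: str) -> list[str]:
    """Non-trivial numbers in `text` that appear nowhere in the source docs."""
    out = []
    for n in NUM_RE.findall(text):
        if n in TRIVIAL:
            continue
        forms = {n, n.replace(",", "")}  # accept comma-grouped or plain form
        if not any(f in haystack for f in forms):
            out.append(n)
    return out

docs = "Sandy inundation reached 3,100 buildings in the NTA."
good = "Sandy flooded 3,100 buildings [sandy]."
bad = "Sandy flooded 4,200 buildings [sandy]."
print(invented_numbers(good, docs))  # []
print(invented_numbers(bad, docs))   # ['4,200']
```

Rejection sampling then simply rerolls the generation until this (and the other three predicates) return clean, up to the loop budget.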
app/planner.py ADDED
@@ -0,0 +1,250 @@
+"""Riprap query planner — Granite 4.1 routes a natural-language query
+to one of several intents and selects which specialists to invoke.
+
+This is the agentic kernel: instead of running every specialist on
+every query, the planner reads the query and emits a structured plan.
+The executor then runs only the relevant specialists, in parallel
+where dependencies permit.
+
+Output is a single JSON object with a fixed schema (see
+PLAN_SCHEMA_DESC). We use Ollama's `format='json'` constrained-decoding
+mode so Granite 4.1 cannot emit malformed structure. A deterministic
+post-validator sanity-checks the plan against the supported intents and
+specialists.
+"""
+from __future__ import annotations
+
+import json
+import logging
+import os
+from dataclasses import dataclass
+from typing import Any
+
+from app import llm
+
+log = logging.getLogger("riprap.planner")
+
+# Routing is a small structured-output task; speed wins over depth here.
+# Pin to the 3b variant explicitly — even if a deployment pulls 8b for
+# reconciliation, the planner stays small to keep TTFB low.
+OLLAMA_MODEL = os.environ.get("RIPRAP_PLANNER_MODEL",
+                              os.environ.get("RIPRAP_OLLAMA_MODEL", "granite4.1:3b"))
+
+# ---- Plan schema -----------------------------------------------------------
+#
+# The set of intents Riprap currently supports. Every plan picks exactly
+# one; the executor maps intent → action graph in app/intents/.
+
+INTENTS = {
+    "single_address": (
+        "Use ONLY when the query contains a specific street ADDRESS — "
+        "house number + street name (e.g. '116-50 Sutphin Blvd', '350 5th "
+        "Ave Manhattan'). If the query names only a neighborhood or "
+        "borough without a house number, the intent is 'neighborhood', "
+        "even if phrased as a yes/no question like 'is X at risk?' or "
+        "'is X safe?'."
+    ),
+    "neighborhood": (
+        "Use when the query names a NEIGHBORHOOD or BOROUGH with no "
+        "specific street address (e.g. 'Brighton Beach', 'Carroll "
+        "Gardens', 'Brooklyn', 'is Red Hook at risk?', 'show me Hollis "
+        "flooding'). Skip geocoding; resolve to NTA polygon(s) and run "
+        "polygon-level specialists."
+    ),
+    "live_now": (
+        "User asked about CURRENT CONDITIONS in NYC (e.g. 'is there "
+        "flooding right now', 'what's the surge tonight'). Skip historic "
+        "and modeled specialists; focus on live-data specialists."
+    ),
+    "development_check": (
+        "User asked about CURRENT/IN-PROGRESS CONSTRUCTION OR DEVELOPMENT "
+        "in a place, with implicit interest in flood risk for those projects "
+        "(e.g. 'what are they building in Gowanus and is it risky?', "
+        "'show me new construction in flood zones', 'are there projects "
+        "underway in Red Hook?'). Resolve target to NTA polygon, pull active "
+        "DOB construction permits inside it, cross-reference each project "
+        "with Sandy + DEP flood layers, return a flagged-projects list."
+    ),
+}
+
+SPECIALISTS = {
+    # name: (description, which intents may invoke it)
+    "geocode":        ("Resolve address text to lat/lon via NYC DCP Geosearch.", ["single_address"]),
+    "nta_resolve":    ("Resolve a neighborhood or borough name to NTA polygon(s).", ["neighborhood"]),
+    "sandy":          ("2012 Sandy inundation extent (point-in-polygon or % of NTA).", ["single_address", "neighborhood"]),
+    "dep_stormwater": ("DEP Stormwater Maps — 3 modeled scenarios.", ["single_address", "neighborhood"]),
+    "floodnet":       ("Live FloodNet ultrasonic sensors + trigger history.", ["single_address", "neighborhood", "live_now"]),
+    "nyc311":         ("NYC 311 flood-related complaints in buffer or polygon.", ["single_address", "neighborhood"]),
+    "noaa_tides":     ("Live NOAA Battery / Kings Pt / Sandy Hook water level.", ["single_address", "neighborhood", "live_now"]),
+    "nws_alerts":     ("Live NWS active flood-relevant alerts at point.", ["single_address", "neighborhood", "live_now"]),
+    "nws_obs":        ("Live NWS hourly precip from nearest ASOS station.", ["single_address", "neighborhood", "live_now"]),
+    "ttm_forecast":   ("Granite TTM r2 surge-residual nowcast at the Battery.", ["single_address", "neighborhood", "live_now"]),
+    "microtopo":      ("LiDAR-derived terrain (HAND, TWI, percentile) at point or aggregated over polygon.", ["single_address", "neighborhood"]),
+    "ida_hwm":        ("USGS Hurricane Ida 2021 high-water marks proximity.", ["single_address", "neighborhood"]),
+    "prithvi":        ("Prithvi-EO 2.0 Hurricane Ida 2021 satellite flood polygons.", ["single_address", "neighborhood"]),
+    "rag":            ("Retrieve relevant agency-report passages over the policy corpus.", ["single_address", "neighborhood", "development_check"]),
+    "dob_permits":    ("Active NYC DOB construction permits inside a polygon, each cross-referenced with Sandy + DEP flood scenarios. Use for 'what are they building' / 'projects in progress' queries.", ["development_check"]),
+}
+
+
+@dataclass
+class Plan:
+    intent: str
+    targets: list[dict[str, str]]
+    specialists: list[str]
+    rationale: str
+
+
+PLAN_SCHEMA_DESC = """The output JSON must have exactly these keys:
+
+{
+  "intent": one of [single_address, neighborhood, live_now, development_check],
+  "targets": [
+    // one or more target objects, each with:
+    // {"type": "address", "text": "<address text>"}  when intent=single_address
+    // {"type": "nta", "text": "<neighborhood>"}      when intent=neighborhood
+    // {"type": "borough", "text": "<borough>"}       when intent=neighborhood (boro-wide)
+    // {"type": "nyc", "text": "NYC"}                 when intent=live_now (no specific place)
+  ],
+  "specialists": [list of specialist names from the SPECIALISTS catalog the executor should run],
+  "rationale": "<one sentence: why this intent + this set of specialists>"
+}
+
+Hard rules:
+- Pick ONE intent only.
+- Specialists must be drawn from the catalog and must be applicable to the chosen intent.
+- For intent=single_address: ALWAYS include "geocode". Typically include all static + live specialists.
+- For intent=neighborhood: ALWAYS include "nta_resolve". Skip "geocode". Include polygon-capable specialists.
+- For intent=live_now: ONLY live specialists. Skip historic/modeled (sandy, dep_*, ida_hwm, prithvi).
+- For intent=development_check: ALWAYS include "nta_resolve" AND "dob_permits". Sandy + DEP are also useful so the model can compare project locations to flood layers.
+- IMPORTANT — TARGETS: extract neighborhood/borough names directly from the query text. If the query says "in Gowanus", "what about Brighton Beach", "around Carroll Gardens", etc., the target MUST be {"type": "nta", "text": "<the place name>"}. Use {"type": "nyc"} ONLY when the query mentions NYC as a whole and no specific place. Failing to extract a place name will cause the executor to give up — be explicit.
+- "targets" is a list because the user may name multiple places (e.g. "compare Brighton Beach and Coney Island").
+- "rationale" is one short sentence — what your reasoning was.
+"""
+
+
+SYSTEM_PROMPT = f"""You are Riprap's query planner. You read a user's natural-language flood-risk query and emit a structured execution plan.
+
+You do NOT have access to any data. You only decide which intent fits the query and which specialists are relevant. Another component (the executor) will run the specialists.
+
+Available intents:
+{chr(10).join(f"  - {k}: {v}" for k, v in INTENTS.items())}
+
+Available specialists (and which intents they apply to):
+{chr(10).join(f"  - {name}: {desc} (intents: {', '.join(intents)})" for name, (desc, intents) in SPECIALISTS.items())}
+
+{PLAN_SCHEMA_DESC}
+
+Output ONLY the JSON object. No commentary, no markdown."""
+
+
+# ---- Planner call ----------------------------------------------------------
+
+def plan(query: str, model: str = OLLAMA_MODEL, on_token=None) -> Plan:
+    """Ask Granite 4.1 to plan a query. Returns a validated Plan.
+
+    If on_token is provided, the planner runs in streaming mode and
+    on_token(delta) is called for each chunk of the JSON output as
+    Granite generates it. The streaming endpoint uses this to show the
+    agent's reasoning forming live in the UI.
+    """
+    messages = [
+        {"role": "system", "content": SYSTEM_PROMPT},
+        {"role": "user", "content": query},
+    ]
+    if on_token is None:
+        resp = llm.chat(model=model, messages=messages,
+                        format="json", options={"temperature": 0})
+        raw = resp["message"]["content"].strip()
+    else:
+        chunks: list[str] = []
+        for chunk in llm.chat(model=model, messages=messages,
+                              format="json", stream=True,
+                              options={"temperature": 0}):
+            delta = (chunk.get("message") or {}).get("content") or ""
+            if delta:
+                chunks.append(delta)
+                on_token(delta)
+        raw = "".join(chunks).strip()
+    log.info("planner raw: %s", raw[:400])
+    try:
+        d = json.loads(raw)
+    except json.JSONDecodeError as e:
+        raise ValueError(f"planner emitted non-JSON: {raw!r}") from e
+    return _validate(d, raw_query=query)
+
+
+def _validate(d: dict[str, Any], raw_query: str) -> Plan:
+    """Defensive parse + sanitize. The model might pick an invalid intent
+    or a specialist that isn't applicable; fall back to single_address
+    with the raw query as the address (the most common case)."""
+    intent = d.get("intent")
+    if intent not in INTENTS:
+        log.warning("planner picked invalid intent %r; defaulting to single_address", intent)
+        intent = "single_address"
+
+    raw_targets = d.get("targets") or []
+    targets: list[dict[str, str]] = []
+    for t in raw_targets:
+        if not isinstance(t, dict):
+            continue
+        t_type = t.get("type")
+        t_text = (t.get("text") or "").strip()
+        if not t_text or t_type not in ("address", "nta", "borough", "nyc"):
+            continue
+        targets.append({"type": t_type, "text": t_text})
+    if not targets:
+        # Reasonable fallback: assume the raw query IS the target
+        if intent == "single_address":
+            targets = [{"type": "address", "text": raw_query}]
+        elif intent == "neighborhood":
+            targets = [{"type": "nta", "text": raw_query}]
+        else:
+            targets = [{"type": "nyc", "text": "NYC"}]
+
+    raw_specialists = d.get("specialists") or []
+    specialists: list[str] = []
+    for s in raw_specialists:
+        if isinstance(s, str) and s in SPECIALISTS:
+            _, applicable = SPECIALISTS[s]
+            if intent in applicable:
+                specialists.append(s)
+    # Enforce a floor: each intent has canonical specialists that should
+    # always run. The planner picks ADDITIONS; we ensure the minimum.
+    required = _required_specialists(intent)
+    added = [s for s in required if s not in specialists]
+    if added:
+        log.info("planner missed required %s for intent=%s; adding", added, intent)
+        specialists = list(dict.fromkeys(specialists + required))
+    if not specialists:
+        specialists = _default_specialists(intent)
+
+    rationale = (d.get("rationale") or "").strip() or "(no rationale provided)"
+    return Plan(intent=intent, targets=targets, specialists=specialists, rationale=rationale)
+
+
+def _required_specialists(intent: str) -> list[str]:
+    """Floor: specialists that are ALWAYS run for an intent regardless of
+    what the planner emitted. Captures load-bearing signals the planner
+    sometimes forgets (sandy / dep for neighborhood; geocode for address)."""
+    if intent == "single_address":
+        return ["geocode", "sandy", "dep_stormwater", "microtopo"]
+    if intent == "neighborhood":
+        return ["nta_resolve", "sandy", "dep_stormwater", "nyc311"]
+    if intent == "live_now":
+        return ["nws_alerts", "noaa_tides"]
+    if intent == "development_check":
+        return ["nta_resolve", "dob_permits", "sandy", "dep_stormwater"]
+    return []
+
+
+def _default_specialists(intent: str) -> list[str]:
+    if intent == "single_address":
+        return ["geocode", "sandy", "dep_stormwater", "floodnet", "nyc311",
+                "noaa_tides", "nws_alerts", "nws_obs", "ttm_forecast",
+                "microtopo", "ida_hwm", "prithvi", "rag"]
+    if intent == "neighborhood":
+        return ["nta_resolve", "sandy", "dep_stormwater", "nyc311",
+                "microtopo", "rag"]
+    if intent == "live_now":
+        return ["noaa_tides", "nws_alerts", "nws_obs", "ttm_forecast", "floodnet"]
+    return []
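The deterministic post-validator is the load-bearing half of the planner: whatever JSON the model emits, the executor only ever sees a plan with a legal intent, at least one target, and the required specialist floor. A reduced sketch of that contract (intent names mirror the module; the function, catalog, and query are illustrative, not the shipped code):

```python
INTENTS = {"single_address", "neighborhood", "live_now", "development_check"}
REQUIRED = {
    "single_address": ["geocode", "sandy"],
    "neighborhood": ["nta_resolve", "sandy"],
    "live_now": ["nws_alerts", "noaa_tides"],
    "development_check": ["nta_resolve", "dob_permits"],
}

def validate_plan(d: dict, raw_query: str) -> dict:
    intent = d.get("intent")
    if intent not in INTENTS:
        intent = "single_address"  # safest default: treat the query as an address
    specialists = [s for s in d.get("specialists") or [] if isinstance(s, str)]
    # Floor: required specialists always run; the planner only picks additions.
    for s in REQUIRED[intent]:
        if s not in specialists:
            specialists.append(s)
    # Fallback target: assume the raw query IS the place being asked about.
    targets = d.get("targets") or [{"type": "address", "text": raw_query}]
    return {"intent": intent, "targets": targets, "specialists": specialists}

plan = validate_plan({"intent": "bogus", "specialists": []}, "350 5th Ave")
print(plan["intent"])       # single_address
print(plan["specialists"])  # ['geocode', 'sandy']
```

This mirrors the belt-and-suspenders split in the module: `format='json'` guarantees well-formed JSON, and the validator guarantees a well-formed *plan*.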
app/rag.py CHANGED
@@ -12,6 +12,7 @@ The index is small (~1k chunks across 5 PDFs).
 from __future__ import annotations

 import logging
 import re
 from dataclasses import dataclass
 from pathlib import Path
@@ -102,6 +103,18 @@ def _chunks_from_pdf(path: Path, target_chars: int = 700) -> list[Chunk]:


 _INDEX: dict | None = None


 def _ensure_index():
@@ -132,8 +145,28 @@ def _ensure_index():
     return _INDEX


 def warm():
     _ensure_index()


 def retrieve(query: str, k: int = 4, min_score: float = 0.30) -> list[dict]:
@@ -142,19 +175,57 @@ def retrieve(query: str, k: int = 4, min_score: float = 0.30) -> list[dict]:
         return []
     qv = idx["model"].encode([query], convert_to_numpy=True,
                              normalize_embeddings=True).astype("float32")
-     # cosine similarity (vectors are L2-normalized)
     sims = (idx["embs"] @ qv.T).ravel()
-     top = np.argsort(-sims)[:k * 3]  # over-fetch then de-dupe per doc
-     out: list[dict] = []
-     seen_per_doc: dict[str, int] = {}
     for i in top:
         if sims[i] < min_score:
             continue
         c = idx["chunks"][i]
-         if seen_per_doc.get(c.doc_id, 0) >= 1:  # at most 1 chunk per doc
             continue
-         seen_per_doc[c.doc_id] = seen_per_doc.get(c.doc_id, 0) + 1
-         out.append({
             "doc_id": c.doc_id,
             "title": c.title,
             "citation": c.citation,
@@ -163,6 +234,6 @@ def retrieve(query: str, k: int = 4, min_score: float = 0.30) -> list[dict]:
             "text": c.text,
             "score": float(sims[i]),
         })
-         if len(out) >= k:
             break
-     return out
 from __future__ import annotations

 import logging
+ import os
 import re
 from dataclasses import dataclass
 from pathlib import Path

 _INDEX: dict | None = None
+ _RERANKER = None  # lazy CrossEncoder
+
+ # Reranker switch: when "1", retrieve() over-fetches K*5 candidates without
+ # the per-doc dedup, scores them via the Granite Embedding Reranker R2
+ # cross-encoder, then dedups to K. Falls back to the baseline ranker when
+ # disabled. See experiments/03_granite_reranker/RESULTS.md for the
+ # reasoning behind inverting dedup vs rerank.
+ _RERANKER_ENABLE = os.environ.get("RIPRAP_RERANKER_ENABLE", "").lower() in ("1", "true", "yes")
+ _RERANKER_MODEL_NAME = os.environ.get(
+     "RIPRAP_RERANKER_MODEL",
+     "ibm-granite/granite-embedding-reranker-english-r2",
+ )


 def _ensure_index():
     return _INDEX


+ def _ensure_reranker():
+     """Lazy-load the cross-encoder. Returns None if disabled or load fails;
+     callers fall back to the baseline ranker silently."""
+     global _RERANKER
+     if not _RERANKER_ENABLE:
+         return None
+     if _RERANKER is not None:
+         return _RERANKER
+     try:
+         from sentence_transformers import CrossEncoder
+         log.info("rag: loading reranker %s", _RERANKER_MODEL_NAME)
+         _RERANKER = CrossEncoder(_RERANKER_MODEL_NAME)
+         log.info("rag: reranker ready")
+     except Exception:
+         log.exception("rag: reranker load failed; falling back to baseline")
+         _RERANKER = False  # sentinel: don't retry every call
+     return _RERANKER or None
+
+
 def warm():
     _ensure_index()
+     _ensure_reranker()


 def retrieve(query: str, k: int = 4, min_score: float = 0.30) -> list[dict]:
         return []
     qv = idx["model"].encode([query], convert_to_numpy=True,
                              normalize_embeddings=True).astype("float32")
     sims = (idx["embs"] @ qv.T).ravel()
+
+     reranker = _ensure_reranker()
+     if reranker is not None:
+         # Over-fetch K*5 candidates (no per-doc dedup yet), rerank, then
+         # dedup to K. This keeps high-relevance chunks alive long enough
+         # for the cross-encoder to see them — the legacy path's
+         # dedup-before-rank threw them away.
+         cand_n = min(len(idx["chunks"]), max(k * 5, 20))
+         top_idx = np.argsort(-sims)[:cand_n]
+         candidates = [(int(i), idx["chunks"][int(i)],
+                        float(sims[int(i)])) for i in top_idx
+                       if float(sims[int(i)]) >= min_score]
+         if not candidates:
+             return []
+         pairs = [[query, c.text] for _, c, _ in candidates]
+         scores = reranker.predict(pairs)
+         ranked = sorted(zip(candidates, scores, strict=True),
+                         key=lambda x: float(x[1]), reverse=True)
+         out: list[dict] = []
+         seen_per_doc: dict[str, int] = {}
+         for (_i, c, retr_score), rerank_score in ranked:
+             if seen_per_doc.get(c.doc_id, 0) >= 1:
+                 continue
+             seen_per_doc[c.doc_id] = 1
+             out.append({
+                 "doc_id": c.doc_id,
+                 "title": c.title,
+                 "citation": c.citation,
+                 "file": c.file,
+                 "page": c.page,
+                 "text": c.text,
+                 "score": float(rerank_score),
+                 "retriever_score": retr_score,
+             })
+             if len(out) >= k:
+                 break
+         return out
+
+     # Baseline ranker (unchanged behaviour when reranker disabled)
+     top = np.argsort(-sims)[:k * 3]
+     out2: list[dict] = []
+     seen_per_doc2: dict[str, int] = {}
     for i in top:
         if sims[i] < min_score:
             continue
         c = idx["chunks"][i]
+         if seen_per_doc2.get(c.doc_id, 0) >= 1:
             continue
+         seen_per_doc2[c.doc_id] = 1
+         out2.append({
             "doc_id": c.doc_id,
             "title": c.title,
             "citation": c.citation,
             "text": c.text,
             "score": float(sims[i]),
         })
+         if len(out2) >= k:
             break
+     return out2
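The inversion the comment describes (over-fetch wide, rerank, then dedup) can be isolated from the embedding machinery. A sketch with a stand-in scoring function in place of the CrossEncoder (`rerank_then_dedup` and `score_fn` are illustrative names, not the shipped API):

```python
def rerank_then_dedup(candidates, score_fn, k):
    """Score every over-fetched candidate first, THEN dedup to one chunk
    per doc_id — the ordering the reranker path switches to.
    `candidates` is (doc_id, text) pairs; `score_fn` stands in for a
    cross-encoder's relevance score on a single passage."""
    ranked = sorted(candidates, key=lambda c: score_fn(c[1]), reverse=True)
    out, seen = [], set()
    for doc_id, text in ranked:
        if doc_id in seen:
            continue  # keep only the best-scoring chunk per document
        seen.add(doc_id)
        out.append((doc_id, text))
        if len(out) >= k:
            break
    return out


cands = [("a", "weak a"), ("b", "strong b"), ("a", "strong a"), ("c", "mid c")]
scores = {"weak a": 0.1, "strong b": 0.9, "strong a": 0.8, "mid c": 0.5}
print(rerank_then_dedup(cands, scores.__getitem__, k=2))
# [('b', 'strong b'), ('a', 'strong a')]
```

Under the legacy dedup-before-rank order, whichever chunk of doc `a` the retriever surfaced first would have survived even if it was the weaker one; scoring first lets the better chunk win.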
app/reconcile.py CHANGED
@@ -21,41 +21,73 @@ import os
 import re
 from typing import Any

- import ollama

 log = logging.getLogger("riprap.reconcile")

- OLLAMA_MODEL = os.environ.get("HELIOS_NYC_OLLAMA_MODEL", "granite4.1:3b")
-
- # Granite auto-prepends its own grounded-generation system prompt when the
- # message list contains "document" roles. This adds *additional* rules.
- EXTRA_SYSTEM_PROMPT = """You are Riprap's grounded reconciler. Produce a SHORT factual paragraph (4-7 sentences) summarising flood risk at a NYC address. Use ONLY information from the documents provided.
-
- Citation format — STRICT:
- - After every factual or numerical claim, cite the originating document by its doc_id in square brackets, e.g. [sandy] or [floodnet].
- - Use square brackets [ and ]. Never parentheses, never the word "source".
- - A claim drawn from multiple documents may carry multiple tags, e.g. [sandy][floodnet].
-
- Hard rules — non-negotiable:
  - Copy numerical values verbatim from documents. Do not round.
- - Do NOT name a specific weather event (Hurricane Sandy, Ida, Henri, Ophelia, etc.) unless THIS document set explicitly mentions that event applies to THIS address. The fact that a RAG passage discusses an event in passing is NOT licence to apply it to the address. If you mention an event, you must cite the specific document supporting that the event affected this address.
- - Do NOT invent dates, sensor IDs, hazard categories, or street/neighborhood names beyond what the documents contain.
- - For RAG documents whose id starts with `rag_`: paraphrase the retrieved passage at the policy / agency level: talk about what the agency report SAYS about flood risk in general or for this asset class — do not assert findings the report did not make about this specific address. Cite with the doc_id.
- - Stay neutral. No editorialising. No future speculation.
- - If no documents are present, output exactly: No grounded data available for this address.
-
- Microtopo interpretation hint:
- - A LOW percentile (e.g. 5%) means the address is at a topographic LOW POINT in its surroundings — water tends to pool there. A HIGH percentile (e.g. 80%) means the address sits on relatively HIGH ground. Get this direction right or omit the percentile.
 """


 # ---- Hallucination guardrail: numeric grounding post-check -----------------

- _NUM_RE = re.compile(r"-?\d[\d,]*(?:\.\d+)?")
 _SENTENCE_END_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z\[])")
- # Strings that are too generic to be useful as grounding evidence; ignore
- # them when matching numeric tokens.
- _TRIVIAL_NUMS = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "100"}


 def _normalize_num(s: str) -> set[str]:
@@ -77,44 +109,94 @@ def _docs_corpus(doc_msgs: list[dict]) -> str:
     return "\n".join(m.get("content", "") for m in doc_msgs)


 def verify_paragraph(paragraph: str, doc_msgs: list[dict]) -> tuple[str, list[dict]]:
     """Drop sentences whose numeric tokens don't appear in any source doc.

-     Returns (clean_paragraph, dropped_sentences_with_reason). Sentences are
-     split on sentence-end punctuation followed by whitespace + a capital
-     letter or '['. The bracketed-citation tags `[doc_id]` and document
-     roles in the source message list are excluded from the haystack so we
-     don't accidentally accept fabricated values that happen to be
-     substrings of doc_ids.
     """
     haystack = _docs_corpus(doc_msgs)
-
-     sentences = _SENTENCE_END_RE.split(paragraph.strip())
-     kept: list[str] = []
     dropped: list[dict] = []
-
-     for sent in sentences:
-         sent_stripped = sent.strip()
-         if not sent_stripped:
-             continue
-         # remove citation tags before extracting numbers (they're not claims)
-         sent_no_cites = re.sub(r"\[[a-z0-9_]+\]", "", sent_stripped, flags=re.I)
-         nums = _NUM_RE.findall(sent_no_cites)
-         ungrounded = []
-         for n in nums:
-             if n in _TRIVIAL_NUMS:
                  continue
-             forms = _normalize_num(n)
-             if not any(f in haystack for f in forms):
-                 ungrounded.append(n)
-
-         if ungrounded:
-             dropped.append({"sentence": sent_stripped, "ungrounded_numbers": ungrounded})
-             log.warning("dropped ungrounded sentence: %r (nums: %s)", sent_stripped, ungrounded)
-             continue
-         kept.append(sent_stripped)

-     cleaned = " ".join(kept).strip()
     if not cleaned:
         cleaned = "Could not produce a verifiable summary; see the data panels."
     return cleaned, dropped
@@ -127,15 +209,194 @@ def _doc_message(doc_id: str, body_lines: list[str]) -> dict:
     return {"role": f"document {doc_id}", "content": "\n".join(body_lines)}


 def build_documents(state: dict[str, Any]) -> list[dict]:
     """Build Granite-native document-role messages, gated so absent
-     specialists emit no document at all."""
     docs: list[dict] = []

-     geo = state.get("geocode")
     if geo:
         body = [
-             f"Source: NYC DCP Geosearch (geosearch.planninglabs.nyc).",
             f"Resolved address: {geo['address']}.",
             f"Borough: {geo.get('borough') or 'unknown'}.",
             f"Coordinates: {geo['lat']:.5f} N, {geo['lon']:.5f} W.",
@@ -148,7 +409,7 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
     # 2012 extent. Granite has a strong training prior associating NYC + flood
     # + Brooklyn with Sandy and will misread "outside" as "inside" if given
     # the chance — silence-over-confabulation rules.
-     if state.get("sandy") is True:
         body = [
             "Source: NYC Sandy Inundation Zone (NYC OpenData 5xsi-dfpx, "
             "empirical extent of areas flooded by Hurricane Sandy in 2012).",
@@ -159,18 +420,18 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         docs.append(_doc_message("sandy", body))

     dep = state.get("dep")
-     if dep:
         for scen, info in dep.items():
             if info.get("depth_class", 0) > 0:
                 body = [
                     f"Source: {info['citation']}.",
-                     f"Address inside scenario footprint: yes.",
                     f"Modeled depth class: {info['depth_label']}.",
                 ]
                 docs.append(_doc_message(scen, body))

     fn = state.get("floodnet")
-     if fn and fn.get("n_sensors", 0) > 0:
         body = [
             "Source: FloodNet NYC ultrasonic depth sensor network (api.floodnet.nyc).",
             f"Sensors within {fn['radius_m']} m: {fn['n_sensors']}.",
@@ -187,7 +448,7 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         docs.append(_doc_message("floodnet", body))

     pw = state.get("prithvi_water")
-     if pw and pw.get("nearest_distance_m") is not None:
         body = [
             "Source: Prithvi-EO 2.0 (300M params, NASA/IBM, Apache-2.0). "
             "Sen1Floods11 fine-tune for water/flood semantic segmentation, "
@@ -211,7 +472,7 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         docs.append(_doc_message("prithvi_water", body))

     ida = state.get("ida_hwm")
-     if ida and (ida.get("n_within_radius") or 0) > 0:
         body = [
             "Source: USGS STN Hurricane Ida 2021 high-water marks (Event 312, NY State).",
             f"USGS HWMs within {ida['radius_m']} m: {ida['n_within_radius']}.",
@@ -225,7 +486,7 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         docs.append(_doc_message("ida_hwm", body))

     mt = state.get("microtopo")
-     if mt:
         # Compute a categorical topographic position so Granite can't flip
         # the directional reading of the percentile.
         p200 = mt["rel_elev_pct_200m"]
@@ -280,7 +541,7 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         )
         docs.append(_doc_message("microtopo", body))

-     rag_hits = state.get("rag") or []
     for h in rag_hits:
         body = [
             f"Source: {h['citation']}, page {h['page']}.",
@@ -288,8 +549,275 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         ]
         docs.append(_doc_message(h["doc_id"], body))

     nyc311 = state.get("nyc311")
-     if nyc311 and nyc311.get("n", 0) > 0:
         body = [
             "Source: NYC 311 service requests (Socrata erm2-nwe9, 2010-present).",
             f"311 flood-related complaints within {nyc311['radius_m']} m, last {nyc311['years']} years: {nyc311['n']}.",
@@ -302,13 +830,145 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
             body.append(f"Per-year counts: {yrs}.")
         docs.append(_doc_message("nyc311", body))

     return docs


 def reconcile(state: dict[str, Any], model: str = OLLAMA_MODEL,
-               return_audit: bool = False):
     """Run Granite reconciliation, then drop sentences with ungrounded numbers.

     If return_audit=True, returns (paragraph, audit_dict) where audit_dict
     has 'raw' (Granite's original output) and 'dropped' (list of dropped
     sentences with their ungrounded numeric tokens).
@@ -318,21 +978,28 @@ def reconcile(state: dict[str, Any], model: str = OLLAMA_MODEL,
         msg = "No grounded data available for this address."
         return (msg, {"raw": msg, "dropped": []}) if return_audit else msg

-     messages = (
-         doc_msgs
-         + [
-             {"role": "system", "content": EXTRA_SYSTEM_PROMPT},
-             {"role": "user", "content": "Write the cited paragraph now."},
-         ]
-     )
-     resp = ollama.chat(
-         model=model,
-         messages=messages,
-         options={"temperature": 0, "num_ctx": 8192},
-     )
-     raw = resp["message"]["content"].strip()
-     cleaned, dropped = verify_paragraph(raw, doc_msgs)

     if return_audit:
         return cleaned, {"raw": raw, "dropped": dropped}
     return cleaned
 import re
 from typing import Any

+ from app import llm

 log = logging.getLogger("riprap.reconcile")

+ # Reconciliation is the synthesis step — citation discipline + structured
+ # output adherence both improve materially with the 8b variant.
+ # RIPRAP_RECONCILER_MODEL is the canonical name; RIPRAP_OLLAMA_MODEL is
+ # kept as a back-compat fallback. Default is now 8b on production
+ # deployments (HF Space ships granite4.1:8b in the container).
+ OLLAMA_MODEL = os.environ.get("RIPRAP_RECONCILER_MODEL",
+                               os.environ.get("RIPRAP_OLLAMA_MODEL", "granite4.1:8b"))
+
+ CITATION_NOAA_TIDES = ("NOAA CO-OPS Tides & Currents API "
+                        "(api.tidesandcurrents.noaa.gov), 6-min cadence")
+ CITATION_NWS_ALERTS = ("NWS Public Alerts API (api.weather.gov/alerts/active), "
+                        "filtered to flood-relevant event types")
+ CITATION_NWS_OBS = ("NWS Station Observations API "
+                     "(api.weather.gov/stations/<id>/observations/latest)")
+ CITATION_TTM_FORECAST = (
+     "Granite TimeSeries TTM r2 (Ekambaram et al. 2024, NeurIPS) — "
+     "ibm-granite/granite-timeseries-ttm-r2 via granite-tsfm. "
+     "Zero-shot forecast of the surge residual (observed minus astronomical "
+     "tide) at the Battery, NY (NOAA station 8518750). 6-min cadence, "
+     "~51 h context, ~9.6 h horizon."
+ )
+
+ # The Ollama chat template auto-prepends Granite's own grounded-generation
+ # system suffix once the message list contains role="document" entries.
+ # This text is OUR additional system prompt, prepended to that suffix.
+ EXTRA_SYSTEM_PROMPT = """Write a flood-exposure briefing for an NYC address. Use ONLY the facts in the provided documents.
+
+ Output this markdown skeleton verbatim, filling each `<...>` with content drawn only from the documents. **Every sentence that contains a number MUST end with a `[doc_id]` citation — including derived measurements (TWI, percentile, ratio).** Repeat the source citation if the value is reused. Bold at most one phrase per section using `**...**`. Omit any section whose supporting facts are absent from the documents.
+
+ ```
+ **Status.**
+ <one sentence: dominant exposure signal(s) for this address, citing the strongest documents>.
+
+ **Empirical evidence.**
+ <1-3 sentences citing observed flood evidence: Sandy from [sandy], 311 counts from [nyc311], FloodNet from [floodnet], Ida HWMs from [ida_hwm], Prithvi polygons from [prithvi_water]>.
+
+ **Modeled scenarios.**
+ <1-2 sentences citing modeled flooding from [dep_*] and terrain from [microtopo] (HAND, TWI, percentile). When a [floodnet_forecast_*] doc is present, add one sentence on the forecast event recurrence at the cited sensor>.
+
+ **Policy context.**
+ <1 sentence per RAG hit, citing the agency name and [rag_*]>.
+ ```
+
+ Constraints:
 - Copy numerical values verbatim from documents. Do not round.
+ - Name a specific weather event only if a document explicitly applies it to this address.
+ - For RAG documents (doc_ids starting with `rag_`): describe what the report SAYS at the policy or asset-class level. Do not assert findings the report did not make about this specific address.
+ - Microtopo percentile direction: a LOW percentile means topographic LOW POINT (water pools); HIGH percentile means HIGH GROUND. State the direction correctly or omit the percentile.
+ - If no documents are present, output exactly: `No grounded data available for this address.`
 """
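The prompt's rule that every number-bearing sentence must end in a citation is also checkable mechanically, in the spirit of the `citations_dense` validator named in the commit message. An illustrative sketch only — the regexes and sentence split here are assumptions, not the shipped check:

```python
import re

# A sentence "carries" a citation if it ends with a [doc_id] tag
# (optionally followed by the closing period).
_CITE_END_RE = re.compile(r"\[[a-z0-9_]+\]\.?\s*$")
_HAS_DIGIT_RE = re.compile(r"\d")


def citations_dense(paragraph: str) -> bool:
    """True iff every sentence containing a digit ends with a [doc_id] tag."""
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return all(
        _CITE_END_RE.search(s) or not _HAS_DIGIT_RE.search(s)
        for s in sentences
    )


print(citations_dense("Depth was 2.3 ft [floodnet]. No alerts are active."))  # True
print(citations_dense("Depth was 2.3 ft. No alerts are active."))  # False
```

Scoping the requirement to sentences (rather than the whole paragraph) is what makes the check useful for rejection sampling: one uncited number fails the draft without penalising citation-free narrative sentences.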


 # ---- Hallucination guardrail: numeric grounding post-check -----------------

+ # Numbers must be preceded by whitespace, start-of-string, or punctuation
+ # OTHER than '-'. This prevents `Extreme-2080` from being parsed as the
+ # negative number `-2080` (the hyphen is a word separator, not a sign).
+ _NUM_RE = re.compile(r"(?:(?<=^)|(?<=[\s(\[/]))-?\d[\d,]*(?:\.\d+)?")
 _SENTENCE_END_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z\[])")
+ # Strings that are too generic OR are well-known NYC system names rather
+ # than measurements (311, 911 are city service lines, not values).
+ _TRIVIAL_NUMS = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "100",
+                  "311", "911", "211"}


 def _normalize_num(s: str) -> set[str]:
     return "\n".join(m.get("content", "") for m in doc_msgs)


+ # Recognise structured-output section headers like `**Status.**` on their
+ # own line. These are NOT sentences and are kept verbatim.
+ _SECTION_HEADER_RE = re.compile(r"^\s*\*\*[A-Z][A-Za-z\s/]+\.\*\*\s*$", re.MULTILINE)
+
+ # Granite sometimes emits the four headers inline rather than on their own
+ # lines (e.g. `**Status.** This address ... **Empirical evidence.** ...`).
+ # Normalise to one-per-line so the section-renderer regex matches.
+ _KNOWN_SECTION_HEADERS = ["Status", "Empirical evidence", "Modeled scenarios",
+                           "Policy context"]
+ _INLINE_HEADER_RE = re.compile(
+     r"\*\*(" + "|".join(re.escape(h) for h in _KNOWN_SECTION_HEADERS) + r")\.\*\*"
+ )
+
+
+ def _split_inline_headers(text: str) -> str:
+     """Inject a newline before each `**Header.**` so headers sit on their own
+     line. The render path and verifier both depend on this."""
+     text = _INLINE_HEADER_RE.sub(lambda m: f"\n**{m.group(1)}.**\n", text)
+     text = re.sub(r"\n{3,}", "\n\n", text)
+     return text.strip()
+
+
+ def _strip_markdown(text: str) -> str:
+     """Remove bold markers and citation tags so the numeric scan operates on
+     raw content. Used only for the haystack-substring check, not the rendered
+     output."""
+     text = re.sub(r"\*\*([^*]+)\*\*", r"\1", text)  # **bold** -> bold
+     text = re.sub(r"\[[a-z0-9_]+\]", "", text, flags=re.I)  # drop [doc_id]
+     return text
+
+
 def verify_paragraph(paragraph: str, doc_msgs: list[dict]) -> tuple[str, list[dict]]:
     """Drop sentences whose numeric tokens don't appear in any source doc.

+     Section-header lines (e.g. `**Status.**`) and inline bold (`**foo**`)
+     are preserved verbatim; the verifier strips them only for the
+     numeric-grounding check. Sentences are split on sentence-end
+     punctuation followed by whitespace + a capital letter or '['.
+
+     Returns (clean_paragraph, dropped_sentences_with_reason).
     """
+     paragraph = _split_inline_headers(paragraph)
     haystack = _docs_corpus(doc_msgs)
+     out_blocks: list[str] = []
     dropped: list[dict] = []
+     body_buf: list[str] = []
+
+     def flush_body():
+         if not body_buf:
+             return
+         body = " ".join(body_buf).strip()
+         body_buf.clear()
+         if not body:
+             return
+         sentences = _SENTENCE_END_RE.split(body)
+         kept_sents: list[str] = []
+         for sent in sentences:
+             sent_stripped = sent.strip()
+             if not sent_stripped:
+                 continue
+             sent_clean = _strip_markdown(sent_stripped)
+             nums = _NUM_RE.findall(sent_clean)
+             ungrounded = []
+             for n in nums:
+                 if n in _TRIVIAL_NUMS:
+                     continue
+                 forms = _normalize_num(n)
+                 if not any(f in haystack for f in forms):
+                     ungrounded.append(n)
+             if ungrounded:
+                 dropped.append({"sentence": sent_stripped,
+                                 "ungrounded_numbers": ungrounded})
+                 log.warning("dropped ungrounded sentence: %r (nums: %s)",
+                             sent_stripped, ungrounded)
+                 continue
+             kept_sents.append(sent_stripped)
+         if kept_sents:
+             out_blocks.append(" ".join(kept_sents))
+
+     for line in paragraph.splitlines():
+         if _SECTION_HEADER_RE.match(line):
+             flush_body()
+             out_blocks.append(line.strip())
+         else:
+             body_buf.append(line.strip())
+     flush_body()

+     cleaned = "\n".join(b for b in out_blocks if b).strip()
     if not cleaned:
         cleaned = "Could not produce a verifiable summary; see the data panels."
     return cleaned, dropped

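Stripped of the section-header bookkeeping, the numeric-grounding drop reduces to: split into sentences, ignore citation tags and trivial numbers, and keep only sentences whose numbers occur verbatim in the document corpus. A simplified, self-contained sketch (not the shipped `verify_paragraph`; the regexes and trivial-number set are abbreviated):

```python
import re

_NUM = re.compile(r"\d[\d,]*(?:\.\d+)?")
TRIVIAL = {"0", "1", "2", "3", "10", "100", "311", "911"}


def drop_ungrounded(paragraph: str, haystack: str) -> tuple[str, list[str]]:
    """Keep only sentences whose non-trivial numbers appear verbatim in
    the source corpus; return (cleaned paragraph, dropped sentences)."""
    kept, dropped = [], []
    for sent in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
        # Citation tags like [floodnet] are not claims; strip before scanning.
        nums = [n for n in _NUM.findall(re.sub(r"\[[a-z0-9_]+\]", "", sent))
                if n not in TRIVIAL]
        if all(n in haystack for n in nums):
            kept.append(sent)
        else:
            dropped.append(sent)
    return " ".join(kept), dropped


hay = "Sensors within 500 m: 2. Peak depth 7.4 in."
good = "Two sensors sit within 500 m [floodnet]."
bad = "Peak depth reached 9.9 in [floodnet]."
clean, gone = drop_ungrounded(f"{good} {bad}", hay)
print(clean)  # the grounded sentence survives
print(gone)   # the fabricated 9.9 sentence is dropped
```

The substring test is deliberately crude: it never validates that a number means what the sentence claims, only that the model did not invent the digits — which is exactly the silence-over-confabulation trade the module's comments describe.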
     return {"role": f"document {doc_id}", "content": "\n".join(body_lines)}


+ def trim_docs_to_plan(doc_msgs: list[dict],
+                       planned_specialists: set[str] | None) -> list[dict]:
+     """Drop document messages whose doc_id family wasn't in the planner's
+     specialist list.
+
+     The FSM's parallel fan-out runs every specialist regardless of what
+     the planner asked for; this lets the user see all the data come in
+     via the trace + map. But for the reconciler we want only what the
+     planner judged relevant, both to cut prompt tokens (≈30-50% on
+     typical single_address queries) and to keep the briefing focused.
+
+     Doc IDs are mapped to specialist family prefixes:
+         sandy              -> {sandy}
+         dep_stormwater     -> {dep_*}
+         floodnet           -> {floodnet}
+         nyc311             -> {nyc311}
+         microtopo          -> {microtopo}
+         mta_entrances      -> {mta_entrance_*}
+         nycha_developments -> {nycha_dev_*}
+         doe_schools        -> {doe_school_*}
+         doh_hospitals      -> {nyc_hospital_*}   # historical id naming
+         ida_hwm            -> {ida_hwm}
+         prithvi_water      -> {prithvi_water}
+         noaa_tides         -> {noaa_tides}
+         nws_alerts         -> {nws_alerts}
+         nws_obs            -> {nws_obs}
+         ttm_forecast       -> {ttm_forecast}
+         ttm_311_forecast   -> {ttm_311_forecast}
+         floodnet_forecast  -> {floodnet_forecast_*}
+         terramind          -> {terramind_*, syn_*}
+         rag                -> {rag_*}
+         nta_resolve        -> {nta_resolve, nta_*}
+         dob_permits        -> {dob_*}
+
+     Always preserved (never trimmed):
+         geocode, scope_note, nta_resolve — they orient the briefing or
+         gate scope and the planner doesn't always name them explicitly.
+
+     Set RIPRAP_TRIM_DOCS=0 to disable (defaults on).
+     """
+     import os as _os  # local import to keep module top tidy
+     if not planned_specialists or not doc_msgs:
+         return doc_msgs
+     if _os.environ.get("RIPRAP_TRIM_DOCS", "1").lower() in ("0", "false", "no"):
+         return doc_msgs
+
+     # Build the allowed-prefix set from the planner's specialists.
+     PREFIXES_BY_SPECIALIST: dict[str, tuple[str, ...]] = {
+         "sandy": ("sandy",),
+         "dep_stormwater": ("dep_",),
+         "floodnet": ("floodnet",),
+         "nyc311": ("nyc311",),
+         "microtopo": ("microtopo",),
+         "ida_hwm": ("ida_hwm",),
+         "prithvi_water": ("prithvi_water",),
+         "noaa_tides": ("noaa_tides",),
+         "nws_alerts": ("nws_alerts",),
+         "nws_obs": ("nws_obs",),
+         "ttm_forecast": ("ttm_forecast",),
+         "ttm_311_forecast": ("ttm_311_forecast",),
+         "floodnet_forecast": ("floodnet_forecast",),
+         "terramind": ("terramind", "syn_"),
+         "rag": ("rag_",),
+         "rag_mta": ("rag_",),
+         "nta_resolve": ("nta_resolve", "nta_"),
+         "dob_permits": ("dob_",),
+         "mta_entrances": ("mta_entrance",),
+         "nycha_developments": ("nycha_dev",),
+         "doe_schools": ("doe_school", "nyc_school"),
+         "doh_hospitals": ("doh_hospital", "nyc_hospital"),
+     }
+     ALWAYS_KEEP = ("geocode", "scope_note", "nta_resolve")
+
+     allowed_prefixes: set[str] = set()
+     for spec in planned_specialists:
+         for p in PREFIXES_BY_SPECIALIST.get(spec, ()):
+             allowed_prefixes.add(p)
+     if not allowed_prefixes:
+         return doc_msgs  # planner gave us nothing matchable; bail safely
+
+     kept: list[dict] = []
+     for m in doc_msgs:
+         role = m.get("role", "")
+         if not role.startswith("document "):
+             kept.append(m)
+             continue
+         doc_id = role[len("document "):].strip()
+         if doc_id.startswith(ALWAYS_KEEP):
+             kept.append(m)
+             continue
+         if any(doc_id.startswith(p) for p in allowed_prefixes):
+             kept.append(m)
+     return kept
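The prefix-family trimming can be exercised with a toy message list. This condenses the logic to its core (abbreviated prefix table; `str.removeprefix` requires Python 3.9+):

```python
PREFIXES = {"floodnet": ("floodnet",), "rag": ("rag_",)}
ALWAYS_KEEP = ("geocode", "scope_note")


def trim(doc_msgs, planned):
    """Keep a document iff its doc_id starts with an allowed family prefix
    or is in the always-keep set — condensed from trim_docs_to_plan."""
    allowed = tuple(p for s in planned for p in PREFIXES.get(s, ()))
    kept = []
    for m in doc_msgs:
        doc_id = m["role"].removeprefix("document ").strip()
        if doc_id.startswith(ALWAYS_KEEP) or doc_id.startswith(allowed):
            kept.append(m)
    return kept


msgs = [{"role": "document geocode"},
        {"role": "document floodnet"},
        {"role": "document rag_fema_2021"},
        {"role": "document sandy"}]
print([m["role"] for m in trim(msgs, {"floodnet"})])
# ['document geocode', 'document floodnet']
```

`str.startswith` accepting a tuple of prefixes is what makes the family mapping cheap: one membership test per document, no regex needed.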
+
+
  def build_documents(state: dict[str, Any]) -> list[dict]:
308
  """Build Granite-native document-role messages, gated so absent
309
+ specialists emit no document at all.
310
+
311
+ Scope guard: if the resolved address is OUTSIDE the NYC bbox, only
312
+ the geocode + live national specialists emit documents. NYC-specific
313
+ layers (Sandy, DEP, FloodNet, NYC 311, microtopo, Ida HWMs, Prithvi,
314
+ NYC RAG corpus) are suppressed and a `scope_note` doc is added telling
315
+ the reconciler not to invoke NYC content.
316
+ """
317
  docs: list[dict] = []
318
 
319
+ geo = state.get("geocode") or {}
320
+ NYC_S, NYC_W, NYC_N, NYC_E = 40.49, -74.27, 40.92, -73.69
321
+ out_of_nyc = (
322
+ geo.get("lat") is not None and geo.get("lon") is not None and not (
323
+ NYC_S <= geo["lat"] <= NYC_N and NYC_W <= geo["lon"] <= NYC_E
324
+ )
325
+ )
326
+ if out_of_nyc:
327
+ # Compose a single live-conditions snapshot from whatever the
328
+ # national specialists produced. This always emits when out_of_nyc,
329
+ # even on a calm day, so the reconciler has SOMETHING grounded to
330
+ # report instead of only a list of what doesn't apply.
331
+ place_label = (geo.get("borough") or geo.get("address") or
332
+ f"{geo['lat']:.4f}, {geo['lon']:.4f}")
333
+ body = [
334
+ "Source: Riprap planner + national live specialists. Scope "
335
+ "guard: this address is OUTSIDE NYC; NYC-specific datasets "
336
+ "are not in scope at this location.",
337
+ f"Resolved location: {place_label} ({geo['lat']:.4f}, "
338
+ f"{geo['lon']:.4f}).",
339
+ ]
340
+ tides = state.get("noaa_tides") or {}
341
+ if tides.get("station_id") and tides.get("error") is None:
342
+ tline = (f"NOAA Tides & Currents — nearest gauge: "
343
+ f"{tides.get('station_name')} (NOAA "
344
+ f"{tides.get('station_id')}, "
345
+ f"{tides.get('distance_km')} km from address).")
346
+ body.append(tline)
347
+ if tides.get("observed_ft_mllw") is not None:
348
+ body.append(
349
+ f"Observed water level: {tides['observed_ft_mllw']} ft "
350
+ f"above MLLW; predicted: "
351
+ f"{tides.get('predicted_ft_mllw')} ft; residual "
352
+ f"(observed minus predicted): "
353
+ f"{tides.get('residual_ft')} ft."
354
+ )
355
+ else:
356
+ body.append("No water-level observation reported by the "
357
+ "gauge in the last poll.")
358
+ alerts = state.get("nws_alerts") or {}
359
+ body.append(
360
+ f"NWS Public Alerts at point: {alerts.get('n_active', 0)} "
361
+ "active flood-relevant alert(s)."
362
+ )
363
+ if alerts.get("alerts"):
364
+ for a in alerts["alerts"][:3]:
365
+ body.append(
366
+ f"- {a.get('event','?')} (severity "
367
+ f"{a.get('severity','?')}, urgency "
368
+ f"{a.get('urgency','?')}); expires "
369
+ f"{(a.get('expires') or '')[:16]}; area: "
370
+ f"{(a.get('areaDesc') or '')[:120]}."
371
+ )
372
+ obs = state.get("nws_obs") or {}
373
+ if obs.get("station_id") and obs.get("error") is None:
374
+ line = (f"Nearest NWS ASOS: {obs.get('station_name')} "
375
+ f"({obs.get('station_id')}, "
376
+ f"{obs.get('distance_km')} km).")
377
+ body.append(line)
378
+ if obs.get("precip_last_hour_mm") is not None:
379
+ body.append(
380
+ f"Precipitation last 1 h: "
381
+ f"{obs['precip_last_hour_mm']} mm; last 6 h: "
382
+ f"{obs.get('precip_last_6h_mm')} mm."
383
+ )
384
+ else:
385
+ body.append("No precipitation reported in the last hourly "
386
+ "observation.")
387
+ ttm = state.get("ttm_forecast") or {}
388
+ if ttm.get("available") and ttm.get("interesting"):
389
+ body.append(
390
+ f"Granite TTM r2 surge forecast at the Battery: peak "
391
+ f"residual {ttm.get('forecast_peak_ft')} ft expected in "
392
+ f"{ttm.get('forecast_peak_minutes_ahead')} minutes — note "
393
+ f"this gauge is in NYC harbor, not local to this address."
394
+ )
395
+ docs.append(_doc_message("scope_note", body))
396
+
397
  if geo:
398
  body = [
399
+ "Source: NYC DCP Geosearch (geosearch.planninglabs.nyc).",
400
  f"Resolved address: {geo['address']}.",
401
  f"Borough: {geo.get('borough') or 'unknown'}.",
402
  f"Coordinates: {geo['lat']:.5f} N, {geo['lon']:.5f} W.",
 
      # 2012 extent. Granite has a strong training prior associating NYC + flood
      # + Brooklyn with Sandy and will misread "outside" as "inside" if given
      # the chance — silence-over-confabulation rules.
+     if not out_of_nyc and state.get("sandy") is True:
          body = [
              "Source: NYC Sandy Inundation Zone (NYC OpenData 5xsi-dfpx, "
              "empirical extent of areas flooded by Hurricane Sandy in 2012).",

          docs.append(_doc_message("sandy", body))

      dep = state.get("dep")
+     if not out_of_nyc and dep:
          for scen, info in dep.items():
              if info.get("depth_class", 0) > 0:
                  body = [
                      f"Source: {info['citation']}.",
+                     "Address inside scenario footprint: yes.",
                      f"Modeled depth class: {info['depth_label']}.",
                  ]
                  docs.append(_doc_message(scen, body))

      fn = state.get("floodnet")
+     if not out_of_nyc and fn and fn.get("n_sensors", 0) > 0:
          body = [
              "Source: FloodNet NYC ultrasonic depth sensor network (api.floodnet.nyc).",
              f"Sensors within {fn['radius_m']} m: {fn['n_sensors']}.",

          docs.append(_doc_message("floodnet", body))

      pw = state.get("prithvi_water")
+     if not out_of_nyc and pw and pw.get("nearest_distance_m") is not None:
          body = [
              "Source: Prithvi-EO 2.0 (300M params, NASA/IBM, Apache-2.0). "
              "Sen1Floods11 fine-tune for water/flood semantic segmentation, "

          docs.append(_doc_message("prithvi_water", body))

      ida = state.get("ida_hwm")
+     if not out_of_nyc and ida and (ida.get("n_within_radius") or 0) > 0:
          body = [
              "Source: USGS STN Hurricane Ida 2021 high-water marks (Event 312, NY State).",
              f"USGS HWMs within {ida['radius_m']} m: {ida['n_within_radius']}.",

          docs.append(_doc_message("ida_hwm", body))

      mt = state.get("microtopo")
+     if not out_of_nyc and mt:
          # Compute a categorical topographic position so Granite can't flip
          # the directional reading of the percentile.
          p200 = mt["rel_elev_pct_200m"]

          )
          docs.append(_doc_message("microtopo", body))

+     rag_hits = [] if out_of_nyc else (state.get("rag") or [])
      for h in rag_hits:
          body = [
              f"Source: {h['citation']}, page {h['page']}.",

          ]
          docs.append(_doc_message(h["doc_id"], body))

+     # ---- GLiNER typed extractions (Phase 2 specialist) -------------------
+     # Per-source structured fields the reconciler can cite as
+     # [gliner_<source>] in addition to the parent [rag_<source>].
+     gliner = (state.get("gliner") or {})
+     if not out_of_nyc and gliner:
+         for source, payload in gliner.items():
+             ents = payload.get("entities") or []
+             if not ents:
+                 continue
+             body = [
+                 f"Source PDF (parent retriever doc_id: {payload.get('rag_doc_id', '?')}, "
+                 f"title: {payload.get('title', '?')}).",
+                 f"Paragraph excerpt: \"{payload.get('paragraph_excerpt', '')}\"",
+                 "Typed entities extracted by GLiNER (verbatim spans):",
+             ]
+             for e in ents:
+                 body.append(
+                     f" - [{e['label']}] {e['text']} (score={e.get('score', 0):.2f})"
+                 )
+             docs.append(_doc_message(f"gliner_{source}", body))
+
+     # ---- TerraMind synthesis (Phase 4 cognitive engine) ------------------
+     # Synthetic-prior tier — explicitly fourth epistemic class alongside
+     # empirical / modeled / proxy. Reconciler narration must frame this
+     # as "TerraMind generated a plausible land-cover map from terrain
+     # context", never "imaged" or "reconstructed". Class labels are
+     # tentative against ESRI Land Cover 2020-2022 schema.
+     tm = state.get("terramind")
+     if not out_of_nyc and tm and tm.get("ok"):
+         body = [
+             "Source: TerraMind 1.0 base (IBM/ESA, Apache-2.0) any-to-any "
+             "generative foundation model. This is a SYNTHETIC PRIOR, "
+             "not a measurement: TerraMind generates plausible categorical "
+             "land-cover maps from terrain context, never observations.",
+             f"Chain: {' -> '.join(tm.get('tim_chain') or ['DEM', 'LULC_synthetic'])}.",
+             f"Diffusion steps: {tm.get('diffusion_steps', '?')}.",
+             f"Diffusion seed (reproducibility): {tm.get('diffusion_seed', '?')}.",
+             f"Input DEM mean elevation at this address: "
+             f"{tm.get('dem_mean_m', 0):.2f} m (NYC 30 m LiDAR raster).",
+             f"Label schema: {tm.get('label_schema', 'ESRI Land Cover, tentative')}.",
+             f"Dominant synthetic land-cover class: "
+             f"{tm.get('dominant_class_display') or tm.get('dominant_class', 'unknown')} at "
+             f"{tm.get('dominant_pct', 0):.1f}% of the 5 km area.",
+             f"Synthetic class fractions ({tm.get('n_classes_observed', 0)} "
+             f"classes observed):",
+         ]
+         for label, pct in (tm.get("class_fractions") or {}).items():
+             body.append(f" - {label}: {pct:.1f}%")
+         body.extend([
+             "synthetic_modality: true",
+             "Use only the careful framing 'TerraMind generated a "
+             "plausible synthetic land-cover prior from the terrain "
+             "context, with class labels tentatively aligned to ESRI "
+             "schema'. Do NOT claim measurement, imaging, observation, "
+             "or reconstruction.",
+         ])
+         docs.append(_doc_message("terramind_synthetic", body))
+
+     # ---- Prithvi-EO live water (Phase 1 specialist) ----------------------
+     # Per-query Sentinel-2 water-segmentation observation. Distinct from
+     # `prithvi_water` (the offline 2021 Ida polygons) — this one fires
+     # against today's imagery and emits a dated observation.
+     plive = state.get("prithvi_live")
+     if not out_of_nyc and plive and plive.get("ok"):
+         body = [
+             "Source: Prithvi-EO 2.0 (Sen1Floods11 fine-tune) live "
+             "segmentation over a Sentinel-2 L2A scene from Microsoft "
+             "Planetary Computer.",
+             f"Sentinel-2 scene id: {plive.get('item_id', 'unknown')}.",
+             f"Observation date: {(plive.get('item_datetime') or 'unknown')[:10]}.",
+             f"Cloud cover: {plive.get('cloud_cover', 0):.3f}%.",
+             f"% water within 500 m of address: "
+             f"{plive.get('pct_water_within_500m', 0):.2f}.",
+             f"% water across 5 km chip: "
+             f"{plive.get('pct_water_full', 0):.2f}.",
+         ]
+         docs.append(_doc_message("prithvi_live", body))
+
+     # ---- live signals -------------------------------------------------------
+     # NOAA tides, NWS alerts, NWS hourly obs change by the minute; reconciler
+     # treats these as "right now" context, not historical record.
+
+     # Live signals fold into scope_note for out-of-NYC; only emit standalone
+     # docs when the address is inside NYC (where the briefing has multiple
+     # sections that each cite different live sources).
+     tides = state.get("noaa_tides")
+     if not out_of_nyc and tides and tides.get("observed_ft_mllw") is not None:
+         body = [
+             f"Source: {CITATION_NOAA_TIDES}.",
+             f"Nearest tide gauge: {tides['station_name']} (NOAA station "
+             f"{tides['station_id']}, {tides['distance_km']} km away).",
+             f"Observation time (LST/LDT): {tides.get('obs_time') or 'unknown'}.",
+             f"Current observed water level above MLLW: {tides['observed_ft_mllw']} ft.",
+         ]
+         if tides.get("predicted_ft_mllw") is not None:
+             body.append(
+                 f"Astronomical tide prediction at the same instant: "
+                 f"{tides['predicted_ft_mllw']} ft above MLLW."
+             )
+         if tides.get("residual_ft") is not None:
+             interp = (
+                 "approximately at predicted level"
+                 if abs(tides["residual_ft"]) < 0.5 else
+                 "elevated above prediction (positive residual is consistent with "
+                 "wind-driven setup or storm surge)"
+                 if tides["residual_ft"] > 0 else
+                 "below prediction (negative residual is consistent with offshore wind)"
+             )
+             body.append(
+                 f"Residual (observed minus predicted): {tides['residual_ft']} ft — "
+                 f"{interp}."
+             )
+         body.append(
+             "Note: this is real-time tidal context for nearby coastal water level. "
+             "The address itself may be inland — the reading describes the bay/harbor "
+             "level the gauge is in, not the address."
+         )
+         docs.append(_doc_message("noaa_tides", body))
+
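The residual interpretation above is a three-way band test: within ±0.5 ft counts as "at predicted level", otherwise the sign decides. A standalone sketch of that logic (the function name and the `band_ft` parameter are illustrative, not part of the module):

```python
def classify_residual(residual_ft: float, band_ft: float = 0.5) -> str:
    """Three-way reading of observed-minus-predicted tide level.

    Mirrors the inline conditional: within +/- band_ft counts as "at
    predicted"; positive residuals suggest wind setup / storm surge,
    negative ones offshore wind.
    """
    if abs(residual_ft) < band_ft:
        return "approximately at predicted level"
    if residual_ft > 0:
        return "elevated above prediction"
    return "below prediction"
```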
+     alerts = state.get("nws_alerts") or {}
+     active = alerts.get("alerts") or []
+     if not out_of_nyc and active:
+         body = [
+             f"Source: {CITATION_NWS_ALERTS}.",
+             f"Active flood-relevant alerts at this address right now: {len(active)}.",
+         ]
+         for a in active[:4]:
+             body.append(
+                 f"- {a.get('event','(event)')} (severity: {a.get('severity','?')}, "
+                 f"urgency: {a.get('urgency','?')}); issued {a.get('sent','')[:16]}, "
+                 f"expires {a.get('expires','')[:16]}; "
+                 f"sender: {a.get('sender_name','NWS')}; "
+                 f"area: {(a.get('areaDesc') or '')[:120]}."
+             )
+             if a.get("headline"):
+                 body.append(f"  Headline (verbatim): {a['headline'][:240]}")
+         body.append(
+             "These are official NWS alerts retrieved live; if any FLOOD or "
+             "FLASH FLOOD WARNING/WATCH is in this list, it applies to the "
+             "address right now and should be foregrounded."
+         )
+         docs.append(_doc_message("nws_alerts", body))
+
+     ttm = state.get("ttm_forecast")
+     if not out_of_nyc and ttm and ttm.get("available") and ttm.get("interesting"):
+         body = [
+             f"Source: {CITATION_TTM_FORECAST}.",
+             f"Gauge: {ttm['station_name']} (NOAA {ttm['station_id']}, "
+             f"{ttm.get('distance_km', '?')} km from address — closest of "
+             "Battery / Kings Point / Sandy Hook).",
+             f"Context window: {ttm['context_length']} samples (~"
+             f"{ttm['context_length']*6/60:.1f} h of 6-min residual).",
+             f"Forecast horizon: {ttm['horizon_steps']} samples (~"
+             f"{ttm['horizon_steps']*6/60:.1f} h ahead).",
+             f"Recent residual: {ttm['history_recent_ft']} ft "
+             f"(residual = observed water level minus astronomical prediction).",
+             f"Recent peak |residual| in context: {ttm['history_peak_abs_ft']} ft.",
+             f"Forecast peak residual: {ttm['forecast_peak_ft']} ft, expected "
+             f"{ttm['forecast_peak_minutes_ahead']} minutes from now "
+             f"(at {ttm['forecast_peak_time_utc']} UTC).",
+             "INTERPRETATION: positive residual is a wind-driven setup or "
+             "storm-surge component on top of the tide; the model predicts the "
+             "non-tidal part NOAA's astronomical predictor does not cover.",
+         ]
+         docs.append(_doc_message("ttm_forecast", body))
+
+     # Per-address 311 flood-complaint forecast — different time scale,
+     # different signal entirely. TTM r2 zero-shot on daily counts
+     # (~17 months of history → ~3 months of forecast). Aggregated to
+     # weekly for the narration since readers think in weeks.
+     ttm311 = state.get("ttm_311_forecast")
+     if not out_of_nyc and ttm311 and ttm311.get("available"):
+         accel = ('YES — forecast > 50% above recent 30-day baseline'
+                  if ttm311.get('accelerating')
+                  else 'no — forecast in line with recent baseline')
+         body = [
+             "Source: IBM Granite TimeSeries TTM r2 (Ekambaram et al. 2024, "
+             "NeurIPS) zero-shot forecast on NYC 311 flood-complaint history "
+             "(Sewer Backup, Catch Basin Clogged/Flooding, Street Flooding, "
+             "Manhole Overflow) within "
+             f"{ttm311.get('radius_m', 200)} m of the address.",
+             f"Context window: {ttm311['days_context']} days "
+             f"({ttm311['days_context'] // 7} weeks) ending "
+             f"{ttm311.get('context_window_end', '?')}.",
+             f"Total complaints in context window: "
+             f"{ttm311['history_total_complaints']}.",
+             f"History recent 30-day rate: {ttm311['history_recent_30d_mean']} "
+             f"complaints/day "
+             f"(≈{ttm311['history_weekly_equivalent']} per week).",
+             f"Forecast horizon: {ttm311['days_horizon']} days "
+             f"({ttm311['days_horizon'] // 7} weeks) ahead.",
+             f"Forecast rate: {ttm311['forecast_mean_per_day']} complaints/day "
+             f"(≈{ttm311['forecast_weekly_equivalent']} per week).",
+             f"Forecast peak day: {ttm311['forecast_peak_day']} complaints, "
+             f"day +{ttm311['forecast_peak_day_offset']}.",
+             f"Acceleration cue: {accel}.",
+             "INTERPRETATION: this is a per-address pattern forecast, not "
+             "a city-wide trend. Zero-history addresses get a zero-baseline "
+             "forecast (legitimate); the more relevant cite is when there's "
+             "a multi-month complaint history that the model is extrapolating.",
+         ]
+         docs.append(_doc_message("ttm_311_forecast", body))
+
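The "acceleration cue" in both TTM-based specialists reduces to one comparison: is the forecast rate more than 50% above the recent baseline? A sketch of that decision (the function name, the zero-baseline handling, and the `threshold` parameter are illustrative assumptions, not the specialist's actual API):

```python
def is_accelerating(forecast_mean_per_day: float,
                    baseline_30d_mean: float,
                    threshold: float = 0.5) -> bool:
    """True when the forecast rate exceeds the recent baseline by more
    than `threshold` (0.5 == 50%). Zero-history addresses (baseline 0)
    only count as accelerating if the forecast is actually positive.
    """
    if baseline_30d_mean == 0:
        return forecast_mean_per_day > 0
    return forecast_mean_per_day > (1 + threshold) * baseline_30d_mean
```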
+     # FloodNet sensor forecast — TTM r2 on the nearest sensor's
+     # historical flood-event recurrence. Reuses the (512, 96)
+     # singleton from ttm_311_forecast — same model class, different
+     # data stream. Doc id includes the sensor deployment id so the
+     # citation is unambiguous when multiple sensors are nearby.
+     fnf = state.get("floodnet_forecast")
+     if not out_of_nyc and fnf and fnf.get("available"):
+         accel = ("YES — next-28-day forecast > 50% above prior-28-day "
+                  "observed count"
+                  if fnf.get("accelerating")
+                  else "no — forecast in line with recent baseline")
+         doc_id = fnf.get("doc_id") or "floodnet_forecast"
+         body = [
+             "Source: FloodNet NYC ultrasonic depth sensor network "
+             "(api.floodnet.nyc) historical flood events, forecast by "
+             "IBM Granite TimeSeries TTM r2 (Ekambaram et al. 2024, "
+             "NeurIPS).",
+             f"Sensor: {fnf['sensor_name']} (deployment "
+             f"{fnf['sensor_id']}) at {fnf['sensor_street']}, "
+             f"{fnf['sensor_borough']}.",
+             f"Distance from query: {fnf['distance_from_query_m']} m.",
+             f"History window: {fnf['history_window_days']} days; "
+             f"{fnf['history_total_events']} flood events observed total, "
+             f"{fnf['history_recent_28d_events']} in the most recent "
+             f"28 days.",
+             f"Forecast horizon: {fnf['forecast_horizon_days']} days.",
+             f"Forecast next-28-day expected events: "
+             f"{fnf['forecast_28d_expected_events']}.",
+             f"Forecast peak day offset: +{fnf['forecast_peak_day_offset']} "
+             f"(value {fnf['forecast_peak_day_value']}).",
+             f"Acceleration cue: {accel}.",
+             "INTERPRETATION: this is a per-sensor recurrence forecast — "
+             "expected count of labelled flood events at that specific "
+             "deployment over the horizon, not an above-curb-event "
+             "probability. CUSP/Brooklyn College operates the sensors and "
+             "publishes the historical events; this forecast is Riprap's "
+             "extension to the same dataset, computable per-query.",
+         ]
+         docs.append(_doc_message(doc_id, body))
+
+     obs = state.get("nws_obs")
+     if not out_of_nyc and obs and obs.get("station_id") and obs.get("error") is None and (
+         obs.get("precip_last_hour_mm") is not None or
+         obs.get("precip_last_6h_mm") is not None
+     ):
+         body = [
+             f"Source: {CITATION_NWS_OBS}.",
+             f"Nearest hourly METAR station: {obs['station_name']} ({obs['station_id']}, "
+             f"{obs['distance_km']} km away).",
+             f"Observation time: {obs.get('obs_time') or 'unknown'}.",
+         ]
+         if obs.get("precip_last_hour_mm") is not None:
+             body.append(f"Precipitation last 1 h: {obs['precip_last_hour_mm']} mm.")
+         if obs.get("precip_last_3h_mm") is not None:
+             body.append(f"Precipitation last 3 h: {obs['precip_last_3h_mm']} mm.")
+         if obs.get("precip_last_6h_mm") is not None:
+             body.append(f"Precipitation last 6 h: {obs['precip_last_6h_mm']} mm.")
+         body.append(
+             "Heavy short-duration rainfall (e.g. >25 mm/h or >50 mm/6 h) is the "
+             "primary driver of NYC pluvial / sewer-backup flooding; the static "
+             "DEP scenarios assume specific rainfall intensities."
+         )
+         docs.append(_doc_message("nws_obs", body))
+
      nyc311 = state.get("nyc311")
+     if not out_of_nyc and nyc311 and nyc311.get("n", 0) > 0:
          body = [
              "Source: NYC 311 service requests (Socrata erm2-nwe9, 2010-present).",
              f"311 flood-related complaints within {nyc311['radius_m']} m, last {nyc311['years']} years: {nyc311['n']}.",

          body.append(f"Per-year counts: {yrs}.")
          docs.append(_doc_message("nyc311", body))

+     # ---- Register specialists (transit / housing / education / healthcare) ----
+     # Each emits one doc per asset so the reconciler can cite specifically
+     # (e.g. [mta_entrance_54], [nycha_dev_004]). Caps keep the total payload
+     # bounded; specialists already truncated to their per-query maxes.
+     mta = state.get("mta_entrances")
+     if not out_of_nyc and mta and mta.get("available"):
+         for e in mta.get("entrances", [])[:6]:
+             sid = e.get("station_id")
+             body = [
+                 "Source: MTA Open Data subway entrances "
+                 "+ NYC OEM Sandy 2012 Inundation Zone (5xsi-dfpx) "
+                 "+ NYC DEP Stormwater Flood Maps + USGS 3DEP DEM.",
+                 (f"Station {e.get('station_name')} ({e.get('daytime_routes')}), "
+                  f"entrance type {e.get('entrance_type')}, "
+                  f"{e.get('distance_m')} m from query."),
+                 (f"Entrance elevation {e.get('elevation_m')} m, "
+                  f"HAND (height above nearest drainage) {e.get('hand_m')} m."),
+             ]
+             if e.get("inside_sandy_2012"):
+                 body.append("This entrance is inside the 2012 Sandy "
+                             "Inundation Zone (empirical).")
+             else:
+                 body.append("This entrance is NOT inside the 2012 Sandy "
+                             "Inundation Zone.")
+             if (e.get("dep_extreme_2080_class") or 0) > 0:
+                 body.append(
+                     f"NYC DEP Extreme-2080 scenario: "
+                     f"{e.get('dep_extreme_2080_label')}.")
+             if (e.get("dep_moderate_2050_class") or 0) > 0:
+                 body.append(
+                     f"NYC DEP Moderate-2050 scenario: "
+                     f"{e.get('dep_moderate_2050_label')}.")
+             body.append("ADA-accessible (heuristic from entrance_type): "
+                         f"{'yes' if e.get('ada_accessible') else 'no'}.")
+             docs.append(_doc_message(f"mta_entrance_{sid}", body))
+
+     nycha = state.get("nycha_developments")
+     if not out_of_nyc and nycha and nycha.get("available"):
+         for d in nycha.get("developments", [])[:4]:
+             tds = d.get("tds_num")
+             body = [
+                 "Source: NYC Open Data NYCHA Developments (phvi-damg) "
+                 "+ NYC OEM Sandy 2012 Inundation Zone (5xsi-dfpx) "
+                 "+ NYC DEP Stormwater Flood Maps + USGS 3DEP DEM.",
+                 (f"NYCHA development {d.get('development')} (TDS {tds}, "
+                  f"{d.get('borough')}), footprint {d.get('footprint_km2')} km², "
+                  f"{d.get('distance_m')} m from query."),
+                 (f"Representative-point elevation {d.get('rep_elevation_m')} m, "
+                  f"HAND {d.get('rep_hand_m')} m."),
+                 (f"{d.get('pct_inside_sandy_2012')}% of footprint inside the "
+                  "2012 Sandy Inundation Zone (empirical)."),
+             ]
+             if (d.get("pct_in_dep_extreme_2080") or 0) > 0:
+                 body.append(
+                     f"{d.get('pct_in_dep_extreme_2080')}% of footprint inside "
+                     "NYC DEP Extreme-2080 scenario "
+                     f"(of which {d.get('pct_in_dep_extreme_2080_deep')}% in the "
+                     "deepest >4 ft band).")
+             if (d.get("pct_in_dep_moderate_2050") or 0) > 0:
+                 body.append(
+                     f"{d.get('pct_in_dep_moderate_2050')}% of footprint inside "
+                     "NYC DEP Moderate-2050 scenario.")
+             docs.append(_doc_message(f"nycha_dev_{tds}", body))
+
+     schools = state.get("doe_schools")
+     if not out_of_nyc and schools and schools.get("available"):
+         for s in schools.get("schools", [])[:5]:
+             lc = s.get("loc_code")
+             body = [
+                 "Source: NYC DOE Locations Points "
+                 "+ NYC OEM Sandy 2012 Inundation Zone (5xsi-dfpx) "
+                 "+ NYC DEP Stormwater Flood Maps + USGS 3DEP DEM.",
+                 (f"School {s.get('loc_name')} ({lc}, {s.get('address')}, "
+                  f"{s.get('borough')}), {s.get('distance_m')} m from query."),
+                 (f"School-point elevation {s.get('elevation_m')} m, "
+                  f"HAND {s.get('hand_m')} m."),
+             ]
+             if s.get("inside_sandy_2012"):
+                 body.append("This school is inside the 2012 Sandy "
+                             "Inundation Zone (empirical).")
+             else:
+                 body.append("This school is NOT inside the 2012 Sandy "
+                             "Inundation Zone (centroid-point join; "
+                             "building-footprint join is a documented "
+                             "follow-up).")
+             if (s.get("dep_extreme_2080_class") or 0) > 0:
+                 body.append(
+                     f"NYC DEP Extreme-2080 scenario: "
+                     f"{s.get('dep_extreme_2080_label')}.")
+             if (s.get("dep_moderate_2050_class") or 0) > 0:
+                 body.append(
+                     f"NYC DEP Moderate-2050 scenario: "
+                     f"{s.get('dep_moderate_2050_label')}.")
+             docs.append(_doc_message(f"doe_school_{lc}", body))
+
+     hospitals = state.get("doh_hospitals")
+     if not out_of_nyc and hospitals and hospitals.get("available"):
+         for h in hospitals.get("hospitals", [])[:4]:
+             fid = h.get("fac_id")
+             body = [
+                 "Source: NYS DOH Health Facility Certification (vn5v-hh5r) "
+                 "+ NYC OEM Sandy 2012 Inundation Zone (5xsi-dfpx) "
+                 "+ NYC DEP Stormwater Flood Maps + USGS 3DEP DEM.",
+                 (f"Hospital {h.get('facility_name')} (NYS DOH facility "
+                  f"{fid}, {h.get('address')}, {h.get('borough')}), "
+                  f"operator {h.get('operator_name')}, "
+                  f"ownership {h.get('ownership_type')}, "
+                  f"{h.get('distance_m')} m from query."),
+                 (f"Hospital-point elevation {h.get('elevation_m')} m, "
+                  f"HAND {h.get('hand_m')} m."),
+             ]
+             if h.get("inside_sandy_2012"):
+                 body.append("This hospital is inside the 2012 Sandy "
+                             "Inundation Zone (empirical).")
+             else:
+                 body.append("This hospital is NOT inside the 2012 Sandy "
+                             "Inundation Zone (centroid-point join; "
+                             "building-footprint join is a documented "
+                             "follow-up).")
+             if (h.get("dep_extreme_2080_class") or 0) > 0:
+                 body.append(
+                     f"NYC DEP Extreme-2080 scenario: "
+                     f"{h.get('dep_extreme_2080_label')}.")
+             if (h.get("dep_moderate_2050_class") or 0) > 0:
+                 body.append(
+                     f"NYC DEP Moderate-2050 scenario: "
+                     f"{h.get('dep_moderate_2050_label')}.")
+             docs.append(_doc_message(f"nyc_hospital_{fid}", body))
+
      return docs


  def reconcile(state: dict[str, Any], model: str = OLLAMA_MODEL,
+               return_audit: bool = False, on_token=None):
      """Run Granite reconciliation, then drop sentences with ungrounded numbers.

+     If on_token is provided, the model is run in streaming mode and
+     on_token(delta) is called for each chunk as Granite generates.
+
      If return_audit=True, returns (paragraph, audit_dict) where audit_dict
      has 'raw' (Granite's original output) and 'dropped' (list of dropped
      sentences with their ungrounded numeric tokens).

          msg = "No grounded data available for this address."
          return (msg, {"raw": msg, "dropped": []}) if return_audit else msg

+     messages = doc_msgs + [
+         {"role": "system", "content": EXTRA_SYSTEM_PROMPT},
+         {"role": "user", "content": "Write the cited paragraph now."},
+     ]
+     # single_address: 13 specialists may fire, doc bodies are short.
+     # num_ctx 4096 covers ~700 system + ~2500 docs. num_predict 400 caps
+     # the 4-section briefing at ~300-350 tokens.
+     OPTS = {"temperature": 0, "num_ctx": 4096, "num_predict": 400}
+     if on_token is None:
+         resp = llm.chat(model=model, messages=messages, options=OPTS)
+         raw = resp["message"]["content"].strip()
+     else:
+         chunks: list[str] = []
+         for chunk in llm.chat(model=model, messages=messages, stream=True,
+                               options=OPTS):
+             delta = (chunk.get("message") or {}).get("content") or ""
+             if delta:
+                 chunks.append(delta)
+                 on_token(delta)
+         raw = "".join(chunks).strip()

+     cleaned, dropped = verify_paragraph(raw, doc_msgs)
      if return_audit:
          return cleaned, {"raw": raw, "dropped": dropped}
      return cleaned
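The `verify_paragraph` helper itself is outside this hunk; the sentence-scoped numeric-grounding idea the docstring describes (drop any sentence whose numbers do not appear in the source docs) can be sketched as follows. The function name, the regexes, and the sentence splitting are all simplifications of mine, not the real checker:

```python
import re

_NUM = re.compile(r"\d+(?:\.\d+)?")


def verify_numbers_sketch(paragraph: str, doc_texts: list[str]):
    """Drop any sentence containing a numeric token that does not
    appear verbatim in the grounding documents. Returns the cleaned
    paragraph and the list of dropped sentences."""
    grounded: set[str] = set()
    for text in doc_texts:
        grounded.update(_NUM.findall(text))
    kept, dropped = [], []
    for sent in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
        if not sent:
            continue
        if all(n in grounded for n in _NUM.findall(sent)):
            kept.append(sent)
        else:
            dropped.append(sent)
    return " ".join(kept), dropped
```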
app/register_builder.py CHANGED
@@ -9,15 +9,16 @@ from __future__ import annotations
  import json
  import sys
  import time

  from pathlib import Path
- from typing import Any, Callable

  import geopandas as gpd
- from shapely.geometry import Point

  from app.context import floodnet, microtopo, nyc311
  from app.flood_layers import dep_stormwater, ida_hwm, sandy_inundation
- from app.rag import retrieve as rag_retrieve, warm as rag_warm

  from app.reconcile import reconcile as run_reconcile
  from app.score import score_frame


  import json
  import sys
  import time
+ from collections.abc import Callable
  from pathlib import Path
+ from typing import Any

  import geopandas as gpd

  from app.context import floodnet, microtopo, nyc311
  from app.flood_layers import dep_stormwater, ida_hwm, sandy_inundation
+ from app.rag import retrieve as rag_retrieve
+ from app.rag import warm as rag_warm
  from app.reconcile import reconcile as run_reconcile
  from app.score import score_frame

app/registers/__init__.py ADDED
File without changes
app/registers/_footprint.py ADDED
@@ -0,0 +1,84 @@
+ """Buffered point-overlap helpers for the register specialists.
+
+ The four register specialists (MTA entrances, NYCHA developments,
+ DOE schools, NYS DOH hospitals) all need to test whether an asset
+ intersects a flood polygon. NYCHA developments are already polygons
+ (real building-group footprints), so polygon-vs-polygon `intersects`
+ is correct. The other three are stored as point centroids:
+
+ - MTA entrances are physical entrances; the point is the centerline
+ - DOE schools are address centroids (administrative point), but the
+   actual building extends ~50 m around it
+ - NYS DOH hospitals are address centroids; campuses are 80–250 m wide
+
+ Pure point-in-polygon on the centroid produces false negatives at
+ the boundary: NYU Langone, Stuyvesant HS, P.S. 89 all sit on
+ buildings whose footprints overlap the 2012 Sandy zone but whose
+ recorded centroid points just miss it.
+
+ The honest fix is a join against the actual NYC Building Footprints
+ + PLUTO BBL → footprint dataset (~400 MB). That's a separate
+ ingestion task. This module is the surgical-and-shippable
+ intermediate fix: buffer the centroid by an asset-class-appropriate
+ radius, then ask `intersects` against the same Sandy / DEP polygons
+ the existing helpers use. The `footprint_buffer_m` is recorded in
+ the specialist output so the trace UI shows what radius was used —
+ auditability over hidden assumptions.
+ """
+ from __future__ import annotations
+
+ import logging
+
+ log = logging.getLogger("riprap.register.footprint")
+
+ # Per-asset-class footprint buffer (metres). Conservative enough to
+ # catch known canonical false-negatives (NYU Langone, Stuyvesant HS,
+ # P.S. 89) without sweeping in obviously-distant buildings.
+ BUFFER_MTA_ENTRANCE_M = 8
+ BUFFER_DOE_SCHOOL_M = 50
+ BUFFER_DOH_HOSPITAL_M = 100
+
+
+ def inside_sandy_buffered(lat: float, lon: float, buffer_m: float) -> bool:
+     """True if the buffer of (lat, lon) by buffer_m metres intersects
+     the 2012 Sandy Inundation Zone."""
+     try:
+         import geopandas as gpd
+         from shapely.geometry import Point
+
+         from app.flood_layers import sandy_inundation
+         # Project before buffering so the buffer is metric. EPSG:2263
+         # is NYC State Plane (feet) — convert metres to feet for buffer.
+         ft = buffer_m * 3.280839895
+         pt = gpd.GeoDataFrame(
+             geometry=[Point(lon, lat)], crs="EPSG:4326"
+         ).to_crs("EPSG:2263")
+         pt["geometry"] = pt.geometry.buffer(ft)
+         return bool(sandy_inundation.join(pt).iloc[0])
+     except Exception:
+         log.exception("buffered sandy join failed")
+         return False
+
+
+ def dep_class_buffered(lat: float, lon: float, buffer_m: float,
+                        scenario: str) -> tuple[int | None, str | None]:
+     """Max DEP depth class within `buffer_m` of (lat, lon).
+
+     Returns (depth_class, depth_label). Higher class wins on overlap,
+     matching `dep_stormwater.join`'s semantics. None on failure.
+     """
+     try:
+         import geopandas as gpd
+         from shapely.geometry import Point
+
+         from app.flood_layers import dep_stormwater
+         ft = buffer_m * 3.280839895
+         pt = gpd.GeoDataFrame(
+             geometry=[Point(lon, lat)], crs="EPSG:4326"
+         ).to_crs("EPSG:2263")
+         pt["geometry"] = pt.geometry.buffer(ft)
+         j = dep_stormwater.join(pt, scenario).iloc[0]
+         return int(j["depth_class"]), str(j["depth_label"])
+     except Exception:
+         log.exception("buffered dep join failed for %s", scenario)
+         return None, None
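The buffered-overlap test above reduces to "does a disc of radius r around the centroid touch the polygon". Without GeoPandas, the same predicate in planar coordinates is a point-in-polygon test plus a minimum point-to-edge distance. A self-contained sketch in arbitrary planar units (this is my illustration of the geometry, not the module's EPSG:2263 pipeline):

```python
import math


def _point_in_polygon(x: float, y: float, poly: list) -> bool:
    # Ray casting: count crossings of a horizontal ray from (x, y).
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def _dist_to_segment(px, py, ax, ay, bx, by) -> float:
    # Distance from (px, py) to the segment (ax, ay)-(bx, by).
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def buffered_intersects(x: float, y: float, radius: float, poly: list) -> bool:
    # The disc intersects the polygon iff the centre is inside, or
    # some edge passes within `radius` of the centre.
    if _point_in_polygon(x, y, poly):
        return True
    n = len(poly)
    return any(
        _dist_to_segment(x, y, *poly[i], *poly[(i + 1) % n]) <= radius
        for i in range(n)
    )
```

With `radius = 0` this degenerates to plain point-in-polygon, which is exactly the false-negative mode the docstring describes.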
app/registers/doe_schools.py ADDED
@@ -0,0 +1,200 @@
+ """doe_school_exposure — flood-exposure briefing per NYC public school.
+
+ Point-based register specialist (1,992 NYC DOE school points). Same
+ join pattern as the MTA-entrance specialist. Per queried (lat, lon),
+ returns up to N schools within `radius_m`, enriched with:
+
+ - inside_sandy_2012 (point-in-polygon, empirical)
+ - dep_extreme_2080_class (point-in-polygon, modeled)
+ - dep_moderate_2050_class (point-in-polygon, modeled)
+ - elevation_m (USGS 3DEP DEM, proxy)
+ - hand_m (derived HAND raster, proxy)
+
+ doc_id format: `doe_school_<loc_code>`. Schools are physical
+ buildings that serve as evacuation hubs in city OEM plans, so
+ "this school sits inside the 2012 Sandy zone" is a structural
+ claim that's directly relevant to flood planning.
+ """
+
+ from __future__ import annotations
+
+ import json
+ import logging
+ import math
+ import sys
+ from dataclasses import dataclass
+ from functools import lru_cache
+ from pathlib import Path
+
+ _ROOT = Path(__file__).resolve().parents[2]
+ if str(_ROOT) not in sys.path:
+     sys.path.insert(0, str(_ROOT))
+
+ log = logging.getLogger("riprap.doe_school")
+
+ DATA = _ROOT / "data"
+ SCHOOLS = DATA / "schools.geojson"
+
+ DEFAULT_RADIUS_M = 1500
+ DEFAULT_MAX_PER_QUERY = 6
+
+ BORO_NAME = {"1": "MANHATTAN", "2": "BRONX", "3": "BROOKLYN",
+              "4": "QUEENS", "5": "STATEN ISLAND"}
+
+ MANAGED_BY_LABEL = {"1": "DOE-managed", "2": "Charter or other"}
+
+
+ @dataclass
+ class SchoolFinding:
+     loc_code: str
+     loc_name: str
+     address: str
+     borough: str
+     bin: str
+     bbl: str
+     managed_by: str
+     school_lat: float
+     school_lon: float
+     distance_m: float
+     elevation_m: float | None
+     hand_m: float | None
+     inside_sandy_2012: bool
+     dep_extreme_2080_class: int | None
+     dep_extreme_2080_label: str | None
+     dep_moderate_2050_class: int | None
+     dep_moderate_2050_label: str | None
+
+
+ def _haversine_m(lat1, lon1, lat2, lon2) -> float:
+     R = 6371000.0
+     p1, p2 = math.radians(lat1), math.radians(lat2)
+     dp = math.radians(lat2 - lat1)
+     dl = math.radians(lon2 - lon1)
+     a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
+     return 2 * R * math.asin(math.sqrt(a))
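One degree of latitude is about 111.19 km at the R = 6371 km Earth radius used above, which gives a quick sanity check on the haversine helper. A standalone restatement (my own check, not part of the module's test suite):

```python
import math


def haversine_m(lat1, lon1, lat2, lon2) -> float:
    # Same great-circle formula as _haversine_m above, restated standalone.
    R = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))


# One degree of latitude at this Earth radius: ~111,195 m.
d = haversine_m(40.0, -74.0, 41.0, -74.0)
```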
+
+
+ @lru_cache(maxsize=1)
+ def _load_schools():
+     import geopandas as gpd
+     gdf = gpd.read_file(SCHOOLS)
+     gdf["lat"] = gdf.geometry.y
+     gdf["lon"] = gdf.geometry.x
+     return gdf.reset_index(drop=True)
+
+
+ def _schools_near(lat: float, lon: float, radius_m: float):
+     gdf = _load_schools()
+     deg = radius_m / 90_000
+     sub = gdf[(gdf["lat"].between(lat - deg, lat + deg))
+               & (gdf["lon"].between(lon - deg, lon + deg))].copy()
+     if sub.empty:
+         return sub
+     sub["distance_m"] = sub.apply(
+         lambda r: _haversine_m(lat, lon, r["lat"], r["lon"]), axis=1)
+     return sub[sub["distance_m"] <= radius_m].sort_values("distance_m")
+
+
+ def _sample_raster(raster_path: Path, lat: float, lon: float) -> float | None:
+     if not raster_path.exists():
+         return None
+     try:
+         import rasterio
+         with rasterio.open(raster_path) as src:
+             v = next(src.sample([(lon, lat)]))[0]
+             v = float(v)
+             if math.isnan(v) or v == src.nodata:
+                 return None
+             return v
+     except Exception:
+         log.exception("raster sample failed for %s", raster_path)
+         return None
+
+
+ from app.registers._footprint import (
+     BUFFER_DOE_SCHOOL_M,
+     dep_class_buffered,
+     inside_sandy_buffered,
+ )
+
+
+ def _inside_sandy(lat: float, lon: float) -> bool:
+     return inside_sandy_buffered(lat, lon, BUFFER_DOE_SCHOOL_M)
+
+
+ def _dep_class(lat: float, lon: float, scenario: str):
+     return dep_class_buffered(lat, lon, BUFFER_DOE_SCHOOL_M, scenario)
+
+
+ def summary_for_point(lat: float, lon: float,
+                       radius_m: float = DEFAULT_RADIUS_M,
+                       max_schools: int = DEFAULT_MAX_PER_QUERY) -> dict:
+     near = _schools_near(lat, lon, radius_m)
+     if near.empty:
+         return {"available": False,
+                 "n_schools": 0,
+                 "radius_m": radius_m,
+                 "schools": []}
+
+     near = near.head(max_schools)
+     findings: list[SchoolFinding] = []
+     for _, row in near.iterrows():
+         slat, slon = float(row["lat"]), float(row["lon"])
+         elev = _sample_raster(DATA / "nyc_dem_30m.tif", slat, slon)
+         hand = _sample_raster(DATA / "hand.tif", slat, slon)
+         in_sandy = _inside_sandy(slat, slon)
+         d80c, d80l = _dep_class(slat, slon, "dep_extreme_2080")
+         d50c, d50l = _dep_class(slat, slon, "dep_moderate_2050")
+         boronum = str(row.get("boronum", ""))
+         findings.append(SchoolFinding(
+             loc_code=str(row["loc_code"]),
+             loc_name=str(row["loc_name"]),
+             address=str(row["address"]).strip(),
+             borough=BORO_NAME.get(boronum, boronum),
+             bin=str(row["bin"]),
+             bbl=str(row["bbl"]),
+             managed_by=MANAGED_BY_LABEL.get(str(row["managed_by"]),
+                                             str(row["managed_by"])),
+             school_lat=round(slat, 5),
+             school_lon=round(slon, 5),
159
+ distance_m=round(float(row["distance_m"]), 1),
160
+ elevation_m=round(elev, 2) if elev is not None else None,
161
+ hand_m=round(hand, 2) if hand is not None else None,
162
+ inside_sandy_2012=in_sandy,
163
+ dep_extreme_2080_class=d80c,
164
+ dep_extreme_2080_label=d80l,
165
+ dep_moderate_2050_class=d50c,
166
+ dep_moderate_2050_label=d50l,
167
+ ))
168
+
169
+ n_in_sandy = sum(1 for f in findings if f.inside_sandy_2012)
170
+ n_dep_2080 = sum(1 for f in findings
171
+ if (f.dep_extreme_2080_class or 0) > 0)
172
+ return {
173
+ "available": True,
174
+ "n_schools": len(findings),
175
+ "radius_m": radius_m,
176
+ "footprint_buffer_m": BUFFER_DOE_SCHOOL_M,
177
+ "n_inside_sandy_2012": n_in_sandy,
178
+ "n_in_dep_extreme_2080": n_dep_2080,
179
+ "schools": [vars(f) for f in findings],
180
+ "citation": ("NYC DOE Locations Points + NYC OEM Sandy 2012 "
181
+ "Inundation Zone (5xsi-dfpx) + NYC DEP Stormwater "
182
+ "Flood Maps + USGS 3DEP DEM"),
183
+ }
184
+
185
+
186
+ def main() -> int:
187
+ import argparse
188
+ ap = argparse.ArgumentParser()
189
+ ap.add_argument("--lat", type=float, required=True)
190
+ ap.add_argument("--lon", type=float, required=True)
191
+ ap.add_argument("--radius", type=float, default=DEFAULT_RADIUS_M)
192
+ ap.add_argument("--max", type=int, default=DEFAULT_MAX_PER_QUERY)
193
+ args = ap.parse_args()
194
+ s = summary_for_point(args.lat, args.lon, args.radius, args.max)
195
+ print(json.dumps(s, indent=2, default=str))
196
+ return 0
197
+
198
+
199
+ if __name__ == "__main__":
200
+ sys.exit(main())
app/registers/doh_hospitals.py ADDED
@@ -0,0 +1,194 @@
1
+ """nys_doh_hospital_exposure — flood-exposure briefing per NYC hospital.
2
+
3
+ Point-based register specialist on 67 NYC hospitals from the NYS DOH
4
+ Health Facility Certification Information dataset (Article 28
5
+ hospitals only, filtered to the 5 NYC counties). Same join pattern
6
+ as MTA entrances and DOE schools.
7
+
8
+ Hospitals are essential infrastructure: a hospital inside the 2012
9
+ Sandy Inundation Zone tells planners and emergency-management
10
+ audiences something concrete about lifeline-asset exposure. NYU
11
+ Langone, Bellevue, and Coney Island Hospital all evacuated patients
12
+ during Sandy — those events are public-record and well-documented.
13
+
14
+ doc_id format: `nyc_hospital_<fac_id>` (NYS DOH facility ID).
15
+ """
16
+
17
+ from __future__ import annotations
18
+
19
+ import json
20
+ import logging
21
+ import math
22
+ import sys
23
+ from dataclasses import dataclass
24
+ from functools import lru_cache
25
+ from pathlib import Path
26
+
27
+ _ROOT = Path(__file__).resolve().parents[2]
28
+ if str(_ROOT) not in sys.path:
29
+ sys.path.insert(0, str(_ROOT))
30
+
31
+ log = logging.getLogger("riprap.hospital")
32
+
33
+ DATA = _ROOT / "data"
34
+ HOSPITALS = DATA / "hospitals.geojson"
35
+
36
+ DEFAULT_RADIUS_M = 3000 # hospitals are sparse; wider radius
37
+ DEFAULT_MAX_PER_QUERY = 5
38
+
39
+ COUNTY_TO_BOROUGH = {
40
+ "New York": "MANHATTAN", "Kings": "BROOKLYN", "Bronx": "BRONX",
41
+ "Queens": "QUEENS", "Richmond": "STATEN ISLAND",
42
+ }
43
+
44
+
45
+ @dataclass
46
+ class HospitalFinding:
47
+ fac_id: str
48
+ facility_name: str
49
+ address: str
50
+ borough: str
51
+ operator_name: str
52
+ ownership_type: str
53
+ hospital_lat: float
54
+ hospital_lon: float
55
+ distance_m: float
56
+ elevation_m: float | None
57
+ hand_m: float | None
58
+ inside_sandy_2012: bool
59
+ dep_extreme_2080_class: int | None
60
+ dep_extreme_2080_label: str | None
61
+ dep_moderate_2050_class: int | None
62
+ dep_moderate_2050_label: str | None
63
+
64
+
65
+ def _haversine_m(lat1, lon1, lat2, lon2) -> float:
66
+ R = 6371000.0
67
+ p1, p2 = math.radians(lat1), math.radians(lat2)
68
+ dp = math.radians(lat2 - lat1); dl = math.radians(lon2 - lon1)
69
+ a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
70
+ return 2 * R * math.asin(math.sqrt(a))
71
+
72
+
73
+ @lru_cache(maxsize=1)
74
+ def _load_hospitals():
75
+ import geopandas as gpd
76
+ gdf = gpd.read_file(HOSPITALS)
77
+ gdf["lat"] = gdf.geometry.y
78
+ gdf["lon"] = gdf.geometry.x
79
+ return gdf.reset_index(drop=True)
80
+
81
+
82
+ def _hospitals_near(lat: float, lon: float, radius_m: float):
83
+ gdf = _load_hospitals()
84
+ deg = radius_m / 80_000 # 1 deg of longitude is ~84.4 km at NYC latitude; 80 km/deg over-covers both axes
85
+ sub = gdf[(gdf["lat"].between(lat - deg, lat + deg))
86
+ & (gdf["lon"].between(lon - deg, lon + deg))].copy()
87
+ if sub.empty:
88
+ return sub
89
+ sub["distance_m"] = sub.apply(
90
+ lambda r: _haversine_m(lat, lon, r["lat"], r["lon"]), axis=1)
91
+ return sub[sub["distance_m"] <= radius_m].sort_values("distance_m")
92
+
93
+
94
+ def _sample_raster(raster_path: Path, lat: float, lon: float) -> float | None:
95
+ if not raster_path.exists():
96
+ return None
97
+ try:
98
+ import rasterio
99
+ with rasterio.open(raster_path) as src:
100
+ v = next(src.sample([(lon, lat)]))[0]
101
+ v = float(v)
102
+ if math.isnan(v) or v == src.nodata:
103
+ return None
104
+ return v
105
+ except Exception:
106
+ log.exception("raster sample failed for %s", raster_path)
107
+ return None
108
+
109
+
110
+ from app.registers._footprint import (
111
+ BUFFER_DOH_HOSPITAL_M,
112
+ dep_class_buffered,
113
+ inside_sandy_buffered,
114
+ )
115
+
116
+
117
+ def _inside_sandy(lat: float, lon: float) -> bool:
118
+ return inside_sandy_buffered(lat, lon, BUFFER_DOH_HOSPITAL_M)
119
+
120
+
121
+ def _dep_class(lat: float, lon: float, scenario: str):
122
+ return dep_class_buffered(lat, lon, BUFFER_DOH_HOSPITAL_M, scenario)
123
+
124
+
125
+ def summary_for_point(lat: float, lon: float,
126
+ radius_m: float = DEFAULT_RADIUS_M,
127
+ max_hospitals: int = DEFAULT_MAX_PER_QUERY) -> dict:
128
+ near = _hospitals_near(lat, lon, radius_m)
129
+ if near.empty:
130
+ return {"available": False,
131
+ "n_hospitals": 0,
132
+ "radius_m": radius_m,
133
+ "hospitals": []}
134
+
135
+ near = near.head(max_hospitals)
136
+ findings: list[HospitalFinding] = []
137
+ for _, row in near.iterrows():
138
+ hlat, hlon = float(row["lat"]), float(row["lon"])
139
+ elev = _sample_raster(DATA / "nyc_dem_30m.tif", hlat, hlon)
140
+ hand = _sample_raster(DATA / "hand.tif", hlat, hlon)
141
+ in_sandy = _inside_sandy(hlat, hlon)
142
+ d80c, d80l = _dep_class(hlat, hlon, "dep_extreme_2080")
143
+ d50c, d50l = _dep_class(hlat, hlon, "dep_moderate_2050")
144
+ findings.append(HospitalFinding(
145
+ fac_id=str(row["fac_id"]),
146
+ facility_name=str(row["facility_name"]),
147
+ address=f"{row['address1']}, {row['city']}".strip(", "),
148
+ borough=COUNTY_TO_BOROUGH.get(str(row["county"]), str(row["county"])),
149
+ operator_name=str(row["operator_name"]),
150
+ ownership_type=str(row["ownership_type"]),
151
+ hospital_lat=round(hlat, 5),
152
+ hospital_lon=round(hlon, 5),
153
+ distance_m=round(float(row["distance_m"]), 1),
154
+ elevation_m=round(elev, 2) if elev is not None else None,
155
+ hand_m=round(hand, 2) if hand is not None else None,
156
+ inside_sandy_2012=in_sandy,
157
+ dep_extreme_2080_class=d80c,
158
+ dep_extreme_2080_label=d80l,
159
+ dep_moderate_2050_class=d50c,
160
+ dep_moderate_2050_label=d50l,
161
+ ))
162
+
163
+ n_in_sandy = sum(1 for f in findings if f.inside_sandy_2012)
164
+ n_dep_2080 = sum(1 for f in findings
165
+ if (f.dep_extreme_2080_class or 0) > 0)
166
+ return {
167
+ "available": True,
168
+ "n_hospitals": len(findings),
169
+ "radius_m": radius_m,
170
+ "footprint_buffer_m": BUFFER_DOH_HOSPITAL_M,
171
+ "n_inside_sandy_2012": n_in_sandy,
172
+ "n_in_dep_extreme_2080": n_dep_2080,
173
+ "hospitals": [vars(f) for f in findings],
174
+ "citation": ("NYS DOH Health Facility Certification (vn5v-hh5r) + "
175
+ "NYC OEM Sandy 2012 Inundation Zone (5xsi-dfpx) + "
176
+ "NYC DEP Stormwater Flood Maps + USGS 3DEP DEM"),
177
+ }
178
+
179
+
180
+ def main() -> int:
181
+ import argparse
182
+ ap = argparse.ArgumentParser()
183
+ ap.add_argument("--lat", type=float, required=True)
184
+ ap.add_argument("--lon", type=float, required=True)
185
+ ap.add_argument("--radius", type=float, default=DEFAULT_RADIUS_M)
186
+ ap.add_argument("--max", type=int, default=DEFAULT_MAX_PER_QUERY)
187
+ args = ap.parse_args()
188
+ s = summary_for_point(args.lat, args.lon, args.radius, args.max)
189
+ print(json.dumps(s, indent=2, default=str))
190
+ return 0
191
+
192
+
193
+ if __name__ == "__main__":
194
+ sys.exit(main())
app/registers/mta_entrances.py ADDED
@@ -0,0 +1,244 @@
1
+ """mta_entrance_exposure — flood-exposure briefing per subway entrance.
2
+
3
+ The headline new specialist for the IBM senior technical staffer's
4
+ "subway entrances" reaction. Joins:
5
+
6
+ - MTA Open Data subway-entrance geometry (data/mta_entrances.geojson,
7
+ 2120 entrances city-wide).
8
+ - NYC OEM Sandy 2012 Inundation Zone (data/sandy_inundation.geojson)
9
+ — empirical evidence (a flood actually happened here).
10
+ - NYC DEP Stormwater Flood Maps for Extreme-2080, Moderate-2050,
11
+ Moderate-current scenarios — modeled evidence.
12
+ - USGS 3DEP DEM (data/nyc_dem_30m.tif) for entrance-level elevation.
13
+ - HAND raster (data/hand.tif) for height above nearest drainage.
14
+ - Entrance type → ADA-status heuristic (Elevator / Ramp = accessible).
15
+
16
+ Per queried address, returns the entrances within a configurable
17
+ radius (default 800 m) with structured per-entrance claims the
18
+ reconciler can cite. doc_id format: `mta_entrance_<station_id>`.
19
+
20
+ Honest scope (per Riprap discipline):
21
+ - This is an EXPOSURE specialist, not a damage forecast. We say
22
+ "this entrance sits inside the 2012 Sandy zone" — we don't say
23
+ "this entrance will flood again in the next storm".
24
+ - The Sandy / DEP layers are point-in-polygon over public-record
25
+ geometry; ADA status from the MTA Open Data `entrance_type`
26
+ column is a heuristic, not the authoritative MTA accessibility
27
+ list.
28
+ - Documented MTA Sandy-recovery records for specific stations are
29
+ NOT included in this first cut — only the empirical-inundation
30
+ membership. Adding station-level recovery citations requires
31
+ parsing the MTA's "Hurricane Sandy: Three Years Later" report
32
+ and is a follow-up.
33
+ """
34
+
35
+ from __future__ import annotations
36
+
37
+ import json
38
+ import logging
39
+ import math
40
+ import sys
41
+ from dataclasses import dataclass
42
+ from functools import lru_cache
43
+ from pathlib import Path
44
+
45
+ # Ensure `app/` is importable when this experiment is invoked directly
46
+ # from its own subdir.
47
+ _ROOT = Path(__file__).resolve().parents[2]
48
+ if str(_ROOT) not in sys.path:
49
+ sys.path.insert(0, str(_ROOT))
50
+
51
+ log = logging.getLogger("riprap.mta_entrance")
52
+
53
+ DATA = Path(__file__).resolve().parents[2] / "data"
54
+ MTA_ENTRANCES = DATA / "mta_entrances.geojson"
55
+
56
+ ADA_ACCESSIBLE_TYPES = {"Elevator", "Ramp"}
57
+
58
+ DEFAULT_RADIUS_M = 800
59
+ DEFAULT_MAX_PER_QUERY = 8 # cap per station so doc payload stays small
60
+
61
+
62
+ @dataclass
63
+ class EntranceFinding:
64
+ station_id: str
65
+ station_name: str
66
+ daytime_routes: str
67
+ borough: str
68
+ entrance_type: str
69
+ entrance_lat: float
70
+ entrance_lon: float
71
+ distance_m: float
72
+ ada_accessible: bool
73
+ elevation_m: float | None
74
+ hand_m: float | None # height above nearest drainage
75
+ inside_sandy_2012: bool
76
+ dep_extreme_2080_class: int | None # 0/1/2/3
77
+ dep_extreme_2080_label: str | None
78
+ dep_moderate_2050_class: int | None
79
+ dep_moderate_2050_label: str | None
80
+
81
+
82
+ def _haversine_m(lat1, lon1, lat2, lon2) -> float:
83
+ R = 6371000.0
84
+ p1, p2 = math.radians(lat1), math.radians(lat2)
85
+ dp = math.radians(lat2 - lat1); dl = math.radians(lon2 - lon1)
86
+ a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
87
+ return 2 * R * math.asin(math.sqrt(a))
88
+
89
+
90
+ @lru_cache(maxsize=1)
91
+ def _load_entrances():
92
+ import geopandas as gpd
93
+ import pandas as pd
94
+ gdf = gpd.read_file(MTA_ENTRANCES)
95
+ # The lat/lon columns are strings in this GeoJSON; coerce so we
96
+ # can do range comparisons in the bbox prefilter.
97
+ gdf["entrance_latitude"] = pd.to_numeric(gdf["entrance_latitude"],
98
+ errors="coerce")
99
+ gdf["entrance_longitude"] = pd.to_numeric(gdf["entrance_longitude"],
100
+ errors="coerce")
101
+ gdf = gdf[gdf["entrance_latitude"].notna()
102
+ & gdf["entrance_longitude"].notna()].copy()
103
+ return gdf.reset_index(drop=True)
104
+
105
+
106
+ def _entrances_near(lat: float, lon: float, radius_m: float):
107
+ gdf = _load_entrances()
108
+ # Coarse bbox prefilter to avoid haversine on 2120 rows every call.
109
+ deg = radius_m / 80_000 # 1 deg of longitude is ~84.4 km at NYC latitude; 80 km/deg over-covers both axes
110
+ sub = gdf[
111
+ (gdf["entrance_latitude"].between(lat - deg, lat + deg))
112
+ & (gdf["entrance_longitude"].between(lon - deg, lon + deg))
113
+ ].copy()
114
+ if sub.empty:
115
+ return sub
116
+ sub["distance_m"] = sub.apply(
117
+ lambda r: _haversine_m(lat, lon, r["entrance_latitude"],
118
+ r["entrance_longitude"]),
119
+ axis=1,
120
+ )
121
+ sub = sub[sub["distance_m"] <= radius_m].sort_values("distance_m")
122
+ return sub
123
+
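The bbox prefilter converts metres to degrees with a single divisor; for that shortcut to never drop an in-radius point, the divisor must not exceed the smallest metres-per-degree across both axes. A quick stdlib check of those figures (40.7 deg N as a stand-in for NYC is an assumption):

```python
import math

R = 6371000.0
m_per_deg_lat = math.pi * R / 180                              # ~111,195 m, everywhere
m_per_deg_lon = m_per_deg_lat * math.cos(math.radians(40.7))   # shrinks with latitude, ~84,300 m at NYC

# A metres->degrees divisor larger than m_per_deg_lon would shave the east/west
# bbox edges, silently excluding points before the exact haversine pass.
```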
124
+
125
+ def _sample_raster(raster_path: Path, lat: float, lon: float) -> float | None:
126
+ """Read one pixel from a raster at (lat, lon). Returns None if the
127
+ point is outside the raster or the raster is missing.
128
+
129
+ The cached NYC rasters are all EPSG:4326. rasterio.sample handles
130
+ coordinate-to-pixel translation directly — simpler than building
131
+ a windowed read."""
132
+ if not raster_path.exists():
133
+ return None
134
+ try:
135
+ import rasterio
136
+ with rasterio.open(raster_path) as src:
137
+ v = next(src.sample([(lon, lat)]))[0]
138
+ if v is None:
139
+ return None
140
+ v = float(v)
141
+ if math.isnan(v) or v == src.nodata:
142
+ return None
143
+ return v
144
+ except Exception:
145
+ log.exception("raster sample failed for %s", raster_path)
146
+ return None
147
+
148
+
149
+ from app.registers._footprint import (
150
+ BUFFER_MTA_ENTRANCE_M,
151
+ dep_class_buffered,
152
+ inside_sandy_buffered,
153
+ )
154
+
155
+
156
+ def _inside_sandy(lat: float, lon: float) -> bool:
157
+ """Sandy join with a small (8 m) buffer to capture entrances at the
158
+ polygon edge — the entrance point is the centerline of the stair
159
+ well, the actual opening is wider."""
160
+ return inside_sandy_buffered(lat, lon, BUFFER_MTA_ENTRANCE_M)
161
+
162
+
163
+ def _dep_class(lat: float, lon: float, scenario: str) -> tuple[int | None, str | None]:
164
+ return dep_class_buffered(lat, lon, BUFFER_MTA_ENTRANCE_M, scenario)
165
+
166
+
167
+ def summary_for_point(lat: float, lon: float,
168
+ radius_m: float = DEFAULT_RADIUS_M,
169
+ max_entrances: int = DEFAULT_MAX_PER_QUERY) -> dict:
170
+ """Return all subway entrances within `radius_m` of (lat, lon),
171
+ enriched with flood-exposure fields. Empty list when no entrances
172
+ are nearby (silence over confabulation)."""
173
+ near = _entrances_near(lat, lon, radius_m)
174
+ if near.empty:
175
+ return {"available": False,
176
+ "n_entrances": 0,
177
+ "radius_m": radius_m,
178
+ "entrances": []}
179
+
180
+ near = near.head(max_entrances)
181
+ findings: list[EntranceFinding] = []
182
+ for _, row in near.iterrows():
183
+ elat, elon = float(row["entrance_latitude"]), float(row["entrance_longitude"])
184
+ ada = str(row["entrance_type"]) in ADA_ACCESSIBLE_TYPES
185
+ elev = _sample_raster(DATA / "nyc_dem_30m.tif", elat, elon)
186
+ hand = _sample_raster(DATA / "hand.tif", elat, elon)
187
+ in_sandy = _inside_sandy(elat, elon)
188
+ dep_2080_class, dep_2080_label = _dep_class(elat, elon, "dep_extreme_2080")
189
+ dep_2050_class, dep_2050_label = _dep_class(elat, elon, "dep_moderate_2050")
190
+ findings.append(EntranceFinding(
191
+ station_id=str(row["station_id"]),
192
+ station_name=str(row["stop_name"]),
193
+ daytime_routes=str(row["daytime_routes"]),
194
+ borough=str(row["borough"]),
195
+ entrance_type=str(row["entrance_type"]),
196
+ entrance_lat=elat, entrance_lon=elon,
197
+ distance_m=round(float(row["distance_m"]), 1),
198
+ ada_accessible=ada,
199
+ elevation_m=round(elev, 2) if elev is not None else None,
200
+ hand_m=round(hand, 2) if hand is not None else None,
201
+ inside_sandy_2012=in_sandy,
202
+ dep_extreme_2080_class=dep_2080_class,
203
+ dep_extreme_2080_label=dep_2080_label,
204
+ dep_moderate_2050_class=dep_2050_class,
205
+ dep_moderate_2050_label=dep_2050_label,
206
+ ))
207
+
208
+ # Citywide rollups across the returned entrances.
209
+ n_in_sandy = sum(1 for f in findings if f.inside_sandy_2012)
210
+ n_in_dep_2080 = sum(1 for f in findings
211
+ if (f.dep_extreme_2080_class or 0) > 0)
212
+ n_ada = sum(1 for f in findings if f.ada_accessible)
213
+ return {
214
+ "available": True,
215
+ "n_entrances": len(findings),
216
+ "radius_m": radius_m,
217
+ "footprint_buffer_m": BUFFER_MTA_ENTRANCE_M,
218
+ "n_inside_sandy_2012": n_in_sandy,
219
+ "n_in_dep_extreme_2080": n_in_dep_2080,
220
+ "n_ada_accessible": n_ada,
221
+ "entrances": [vars(f) for f in findings],
222
+ "citation": ("MTA Open Data subway entrances + NYC OEM Sandy 2012 "
223
+ "Inundation Zone (5xsi-dfpx) + NYC DEP Stormwater "
224
+ "Flood Maps + USGS 3DEP DEM"),
225
+ }
226
+
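The roll-up counts in `summary_for_point` are plain generator sums over the findings; the same pattern works on the serialized dicts. A toy illustration (the findings below are invented, not real entrances):

```python
findings = [  # hypothetical per-entrance claims
    {"inside_sandy_2012": True,  "dep_extreme_2080_class": 3,    "ada_accessible": False},
    {"inside_sandy_2012": False, "dep_extreme_2080_class": 0,    "ada_accessible": True},
    {"inside_sandy_2012": True,  "dep_extreme_2080_class": None, "ada_accessible": False},
]

n_in_sandy = sum(1 for f in findings if f["inside_sandy_2012"])
# `or 0` maps a None ("layer unavailable") class to "not exposed" before the > 0 test.
n_in_dep_2080 = sum(1 for f in findings if (f["dep_extreme_2080_class"] or 0) > 0)
n_ada = sum(1 for f in findings if f["ada_accessible"])
```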
227
+
228
+ def main() -> int:
229
+ """CLI smoke test."""
230
+ import argparse
231
+ ap = argparse.ArgumentParser()
232
+ ap.add_argument("--lat", type=float, required=True)
233
+ ap.add_argument("--lon", type=float, required=True)
234
+ ap.add_argument("--radius", type=float, default=DEFAULT_RADIUS_M)
235
+ ap.add_argument("--max", type=int, default=DEFAULT_MAX_PER_QUERY)
236
+ args = ap.parse_args()
237
+ s = summary_for_point(args.lat, args.lon, args.radius, args.max)
238
+ print(json.dumps(s, indent=2, default=str))
239
+ return 0
240
+
241
+
242
+ if __name__ == "__main__":
243
244
+ sys.exit(main())
app/registers/nycha.py ADDED
@@ -0,0 +1,270 @@
1
+ """nycha_development_exposure — flood-exposure briefing per NYCHA development.
2
+
3
+ Same pattern as the MTA-entrance specialist, but NYCHA developments are
4
+ *polygons* not points, so the metrics shift to overlap fractions:
5
+
6
+ - % of footprint inside the 2012 Sandy Inundation Zone (empirical)
7
+ - % of footprint inside DEP Extreme-2080 / Moderate-2050 scenarios
8
+ (modeled, broken out by depth class)
9
+ - Representative-point elevation, HAND, TWI (proxy)
10
+ - Footprint area (km²)
11
+ - Distance from query point to development boundary
12
+
13
+ Joins:
14
+ - data/nycha.geojson (NYC Open Data, 218 NYCHA developments)
15
+ - data/sandy_inundation.geojson
16
+ - DEP Stormwater Flood Map polygons (3 scenarios)
17
+ - data/nyc_dem_30m.tif, data/hand.tif
18
+
19
+ Per queried (lat, lon), returns developments whose centroid is within
20
+ the radius (default 2000 m — NYCHA developments are sparser than
21
+ subway entrances, so the radius is wider).
22
+
23
+ Honest scope:
24
+ - This is exposure, not damage forecast. We say "85% of this
25
+ development's footprint is inside the 2012 Sandy zone" — not
26
+ "this development will flood next storm".
27
+ - All overlap fractions are computed in EPSG:2263 (NYC State Plane,
28
+ feet) for accurate area arithmetic in the city.
29
+ """
30
+
31
+ from __future__ import annotations
32
+
33
+ import json
34
+ import logging
35
+ import math
36
+ import sys
37
+ from dataclasses import dataclass
38
+ from functools import lru_cache
39
+ from pathlib import Path
40
+
41
+ _ROOT = Path(__file__).resolve().parents[2]
42
+ if str(_ROOT) not in sys.path:
43
+ sys.path.insert(0, str(_ROOT))
44
+
45
+ log = logging.getLogger("riprap.nycha")
46
+
47
+ DATA = _ROOT / "data"
48
+ NYCHA = DATA / "nycha.geojson"
49
+
50
+ DEFAULT_RADIUS_M = 2000
51
+ DEFAULT_MAX_PER_QUERY = 5
52
+
53
+
54
+ @dataclass
55
+ class DevelopmentFinding:
56
+ development: str
57
+ tds_num: str
58
+ borough: str
59
+ centroid_lat: float
60
+ centroid_lon: float
61
+ distance_m: float
62
+ footprint_km2: float
63
+ rep_elevation_m: float | None
64
+ rep_hand_m: float | None
65
+ pct_inside_sandy_2012: float
66
+ pct_in_dep_extreme_2080: float # any-depth (class>=1)
67
+ pct_in_dep_extreme_2080_deep: float # class==3 only ("Deep Contiguous")
68
+ pct_in_dep_moderate_2050: float
69
+
70
+
71
+ @lru_cache(maxsize=1)
72
+ def _load_nycha():
73
+ import geopandas as gpd
74
+ gdf = gpd.read_file(NYCHA).to_crs("EPSG:2263") # feet, accurate areas
75
+ gdf["centroid_2263"] = gdf.geometry.centroid
76
+ return gdf.reset_index(drop=True)
77
+
78
+
79
+ @lru_cache(maxsize=1)
80
+ def _load_sandy_2263():
81
+ """Load the Sandy zone in EPSG:2263 once. Already used by
82
+ app.flood_layers.sandy_inundation but we want the geometry directly
83
+ for overlap-fraction math."""
84
+ import geopandas as gpd
85
+ g = gpd.read_file(DATA / "sandy_inundation.geojson").to_crs("EPSG:2263")
86
+ # Some NYC OEM Sandy polygons have hole-orientation issues that
87
+ # blow up unary_union. buffer(0) fixes self-intersections without
88
+ # changing the footprint at sub-foot precision.
89
+ g["geometry"] = g.geometry.buffer(0)
90
+ return g.geometry.union_all()
91
+
92
+
93
+ @lru_cache(maxsize=4)
94
+ def _load_dep_2263(scenario: str):
95
+ """DEP scenario polygons in EPSG:2263, with depth_class column."""
96
+ import geopandas as gpd
97
+ p = DATA / "dep" / f"{scenario}.geojson"
98
+ if not p.exists():
99
+ # Fallback to whatever the existing dep_stormwater module loaded.
100
+ from app.flood_layers import dep_stormwater
101
+ gdf = dep_stormwater.load(scenario)
102
+ return gdf.to_crs("EPSG:2263") if gdf.crs is not None else gdf
103
+ return gpd.read_file(p).to_crs("EPSG:2263")
104
+
105
+
106
+ def _haversine_m(lat1, lon1, lat2, lon2) -> float:
107
+ R = 6371000.0
108
+ p1, p2 = math.radians(lat1), math.radians(lat2)
109
+ dp = math.radians(lat2 - lat1); dl = math.radians(lon2 - lon1)
110
+ a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
111
+ return 2 * R * math.asin(math.sqrt(a))
112
+
113
+
114
+ def _sample_raster(raster_path: Path, lat: float, lon: float) -> float | None:
115
+ if not raster_path.exists():
116
+ return None
117
+ try:
118
+ import rasterio
119
+ with rasterio.open(raster_path) as src:
120
+ v = next(src.sample([(lon, lat)]))[0]
121
+ v = float(v)
122
+ if math.isnan(v) or v == src.nodata:
123
+ return None
124
+ return v
125
+ except Exception:
126
+ log.exception("raster sample failed for %s", raster_path)
127
+ return None
128
+
129
+
130
+ def _developments_near(lat: float, lon: float, radius_m: float):
131
+ """Return developments whose centroid is within `radius_m` of
132
+ (lat, lon). Uses haversine on centroids re-projected back to
133
+ EPSG:4326 — the bbox prefilter gets us close, then exact distance."""
134
+ import geopandas as gpd
135
+ gdf = _load_nycha()
136
+ # Re-project centroids to 4326 for haversine
137
+ cents_4326 = gpd.GeoSeries(gdf["centroid_2263"], crs="EPSG:2263").to_crs("EPSG:4326")
138
+ deg = radius_m / 80_000 # 1 deg of longitude is ~84.4 km at NYC latitude; 80 km/deg over-covers both axes
139
+ cent_lat = cents_4326.y
140
+ cent_lon = cents_4326.x
141
+ mask = ((cent_lat >= lat - deg) & (cent_lat <= lat + deg)
142
+ & (cent_lon >= lon - deg) & (cent_lon <= lon + deg))
143
+ sub = gdf[mask].copy()
144
+ if sub.empty:
145
+ return sub, []
146
+ sub["clat"] = cent_lat[mask].values
147
+ sub["clon"] = cent_lon[mask].values
148
+ sub["distance_m"] = sub.apply(
149
+ lambda r: _haversine_m(lat, lon, r["clat"], r["clon"]),
150
+ axis=1,
151
+ )
152
+ sub = sub[sub["distance_m"] <= radius_m].sort_values("distance_m")
153
+ return sub, sub.index.tolist()
154
+
155
+
156
+ def _overlap_pct(geom_2263, mask_geom_2263) -> float:
157
+ """% of geom_2263's area that intersects mask_geom_2263."""
158
+ if mask_geom_2263 is None or mask_geom_2263.is_empty:
159
+ return 0.0
160
+ inter = geom_2263.intersection(mask_geom_2263)
161
+ if inter.is_empty:
162
+ return 0.0
163
+ return round(100.0 * inter.area / max(geom_2263.area, 1e-9), 2)
164
+
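For axis-aligned boxes the overlap fraction reduces to interval arithmetic, which makes `_overlap_pct`'s contract easy to sanity-check without GeoPandas. A stdlib stand-in (not the shapely path used above):

```python
def rect_overlap_pct(a, b):
    """% of rectangle a's area covered by rectangle b; rects are (minx, miny, maxx, maxy)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    w = max(0.0, min(ax1, bx1) - max(ax0, bx0))   # overlap width, 0 if disjoint in x
    h = max(0.0, min(ay1, by1) - max(ay0, by0))   # overlap height, 0 if disjoint in y
    area_a = max((ax1 - ax0) * (ay1 - ay0), 1e-9)  # same divide-by-zero guard as _overlap_pct
    return round(100.0 * w * h / area_a, 2)

# A 10x10 footprint half-covered by a shifted flood polygon -> 50.0
half = rect_overlap_pct((0, 0, 10, 10), (5, 0, 15, 10))
```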
165
+
166
+ def _dep_overlap(geom_2263, scenario: str) -> tuple[float, float]:
167
+ """Return (pct_any_depth, pct_deep_contiguous) of a polygon's area
168
+ inside the DEP scenario."""
169
+ try:
170
+ gdf = _load_dep_2263(scenario)
171
+ except Exception:
172
+ log.exception("DEP load failed for %s", scenario)
173
+ return 0.0, 0.0
174
+ if gdf is None or gdf.empty:
175
+ return 0.0, 0.0
176
+ # Bbox-prefilter the DEP polygons to those near our development.
177
+ minx, miny, maxx, maxy = geom_2263.bounds
178
+ cand = gdf.cx[minx:maxx, miny:maxy]
179
+ if cand.empty:
180
+ return 0.0, 0.0
181
+ # DEP NYC stormwater FGDB uses `Flooding_Category` (int16):
182
+ # 1=nuisance, 2=shallow, 3=deep contiguous (>4 ft).
183
+ cat_col = "Flooding_Category" if "Flooding_Category" in cand.columns else None
184
+ any_geom = cand.geometry.buffer(0).union_all()
185
+ if cat_col:
186
+ deep = cand[cand[cat_col] == 3]
187
+ deep_geom = deep.geometry.buffer(0).union_all() if not deep.empty else None
188
+ else:
189
+ deep_geom = None
190
+ pct_any = _overlap_pct(geom_2263, any_geom)
191
+ pct_deep = _overlap_pct(geom_2263, deep_geom) if deep_geom is not None else 0.0
192
+ return pct_any, pct_deep
193
+
194
+
195
+ def summary_for_point(lat: float, lon: float,
196
+ radius_m: float = DEFAULT_RADIUS_M,
197
+ max_developments: int = DEFAULT_MAX_PER_QUERY) -> dict:
198
+ near, _ = _developments_near(lat, lon, radius_m)
199
+ if near.empty:
200
+ return {"available": False,
201
+ "n_developments": 0,
202
+ "radius_m": radius_m,
203
+ "developments": []}
204
+
205
+ near = near.head(max_developments)
206
+ sandy_2263 = _load_sandy_2263()
207
+
208
+ findings: list[DevelopmentFinding] = []
209
+ for _, row in near.iterrows():
210
+ geom = row.geometry
211
+ # Representative interior point gives a more meaningful elevation
212
+ # than the centroid for irregular development footprints.
213
+ rep = geom.representative_point()
214
+ # Re-project the rep point to 4326 for raster sampling
215
+ import geopandas as gpd
216
+ rep_4326 = gpd.GeoSeries([rep], crs="EPSG:2263").to_crs("EPSG:4326").iloc[0]
217
+ rep_lat, rep_lon = rep_4326.y, rep_4326.x
218
+
219
+ elev = _sample_raster(DATA / "nyc_dem_30m.tif", rep_lat, rep_lon)
220
+ hand = _sample_raster(DATA / "hand.tif", rep_lat, rep_lon)
221
+ pct_sandy = _overlap_pct(geom, sandy_2263)
222
+ pct_2080_any, pct_2080_deep = _dep_overlap(geom, "dep_extreme_2080")
223
+ pct_2050_any, _ = _dep_overlap(geom, "dep_moderate_2050")
224
+
225
+ findings.append(DevelopmentFinding(
226
+ development=str(row["developmen"]),
227
+ tds_num=str(row["tds_num"]),
228
+ borough=str(row["borough"]),
229
+ centroid_lat=round(float(row["clat"]), 5),
230
+ centroid_lon=round(float(row["clon"]), 5),
231
+ distance_m=round(float(row["distance_m"]), 1),
232
+ footprint_km2=round(geom.area / 10.7639 / 1_000_000, 4), # sq-ft -> km²
233
+ rep_elevation_m=round(elev, 2) if elev is not None else None,
234
+ rep_hand_m=round(hand, 2) if hand is not None else None,
235
+ pct_inside_sandy_2012=pct_sandy,
236
+ pct_in_dep_extreme_2080=pct_2080_any,
237
+ pct_in_dep_extreme_2080_deep=pct_2080_deep,
238
+ pct_in_dep_moderate_2050=pct_2050_any,
239
+ ))
240
+
241
+ n_majority_sandy = sum(1 for f in findings if f.pct_inside_sandy_2012 >= 50)
242
+ n_any_2080 = sum(1 for f in findings if f.pct_in_dep_extreme_2080 > 0)
243
+ return {
244
+ "available": True,
245
+ "n_developments": len(findings),
246
+ "radius_m": radius_m,
247
+ "n_majority_inside_sandy_2012": n_majority_sandy,
248
+ "n_with_dep_2080_overlap": n_any_2080,
249
+ "developments": [vars(f) for f in findings],
250
+ "citation": ("NYC Open Data NYCHA Developments (phvi-damg) + "
251
+ "NYC OEM Sandy 2012 Inundation Zone (5xsi-dfpx) + "
252
+ "NYC DEP Stormwater Flood Maps + USGS 3DEP DEM"),
253
+ }
254
+
255
+
256
+ def main() -> int:
257
+ import argparse
258
+ ap = argparse.ArgumentParser()
259
+ ap.add_argument("--lat", type=float, required=True)
260
+ ap.add_argument("--lon", type=float, required=True)
261
+ ap.add_argument("--radius", type=float, default=DEFAULT_RADIUS_M)
262
+ ap.add_argument("--max", type=int, default=DEFAULT_MAX_PER_QUERY)
263
+ args = ap.parse_args()
264
+ s = summary_for_point(args.lat, args.lon, args.radius, args.max)
265
+ print(json.dumps(s, indent=2, default=str))
266
+ return 0
267
+
268
+
269
+ if __name__ == "__main__":
270
+ sys.exit(main())
app/score.py CHANGED
@@ -1,47 +1,345 @@
- """Transparent exposure scoring rubric. Published, not a black box.
-
- Each signal contributes a small integer; sum -> tier 1..4.
  """
  from __future__ import annotations

  import pandas as pd

- WEIGHTS = {
-     "sandy": 3,                 # empirical Sandy 2012 inundation
-     "dep_extreme_2080": 2,      # pluvial scenario, 3.66 in/hr + 2080 SLR
-     "dep_moderate_2050": 2,     # pluvial scenario, 2.13 in/hr + 2050 SLR
-     "dep_moderate_current": 1,  # pluvial scenario, 2.13 in/hr current
-     "complaints_3plus": 1,      # >=3 flood-related 311s within 200m, last 5 years
-     "floodnet_trigger": 1,      # FloodNet sensor within 400m with >=1 trigger event
-     "policy_named": 1,          # named in HMP/NPCC4/agency plan paragraph (RAG hit)
  }


  def tier(score: int) -> int:
-     if score >= 6:
-         return 1
-     if score >= 4:
-         return 2
-     if score >= 2:
-         return 3
-     if score >= 1:
-         return 4
      return 0


  def score_row(signals: dict) -> tuple[int, int]:
-     s = 0
-     for k, w in WEIGHTS.items():
-         if signals.get(k):
-             s += w
-     return s, tier(s)


  def score_frame(df: pd.DataFrame) -> pd.DataFrame:
      out = df.copy()
-     out["score"] = 0
-     for k, w in WEIGHTS.items():
-         if k in out.columns:
-             out["score"] += out[k].astype(bool).astype(int) * w
-     out["tier"] = out["score"].map(tier)
      return out
+ """Riprap exposure scoring: a research-grounded deterministic rubric.
+
+ This is an EXPOSURE index, not a damage probability. It produces a tier
+ 1-4 (0 when no signal fires) from a thematic additive composite over
+ min-max-normalized indicators within sub-indices. The same input always
+ produces the same tier; live signals (NWS alerts, surge residual, hourly
+ precip) are NOT in this score — they are surfaced as a separate "current
+ conditions" badge per NPCC4 / IPCC AR6 WG II's distinction between
+ exposure (a quasi-stationary property of place) and event occurrence
+ (time-varying).
+
+ Methodology:
+ - Cutter, Boruff & Shirley, 2003. "Social Vulnerability to Environmental
+   Hazards." Social Science Quarterly 84(2): 242-261. — hazards-of-place
+   composite construction.
+ - Tate, 2012. "Social Vulnerability Indices: A Comparative Assessment
+   Using Uncertainty and Sensitivity Analysis." Natural Hazards 63:
+   325-347. — equal weights within thematic groups are the most
+   rank-stable default; differential weighting is hard to defend.
+ - Balica, Wright & van der Meulen, 2012. "A Flood Vulnerability Index
+   for Coastal Cities." Natural Hazards 64: 73-105. — multiplicative
+   override behaviour; we recover the important part as a "max-empirical
+   floor" rather than a full multiplicative form.
+
+ Per-indicator citations:
+ - HAND breakpoints: Nobre et al., 2011. "Height Above the Nearest
+   Drainage." J. Hydrology 404: 13-29.
+ - TWI: Beven & Kirkby, 1979. Hydrological Sciences Bulletin 24; Sørensen,
+   Zinko & Seibert, 2006. HESS 10: 101-112. (Half-weight because TWI is
+   noisier than HAND in flat urban DEMs; we percentile-bin rather than
+   use absolute cutoffs.)
+ - Zone hierarchy: NYC NPCC4 (2024) Ch. 3; NYC Hazard Mitigation Plan 2024.
+ - USGS HWM proximity floor: USGS HWM positional uncertainty is typically
+   5-30 m horizontal, so 100 m gives ~3σ headroom for a true "this
+   address was inundated" signal.
+
+ Scope limit: we have no labeled flood-damage outcomes. The tier is a
+ literature-grounded exposure prior, not a calibrated loss prediction.
+ For insurance pricing, use FEMA Risk Rating 2.0 (claims-driven GLM).
  """
  from __future__ import annotations

  import pandas as pd

+ # ---------- Indicator schemas ----------------------------------------------
+ #
+ # Each sub-index is a mapping {indicator_name: weight}. Within a sub-index,
+ # the weighted sum is normalized by the maximum possible weight, giving a
+ # 0-1 score per sub-index. The composite is the sum of the three sub-index
+ # scores (range 0-3), then mapped to tiers.
+ #
+ # Why equal weights within thematic groups: Tate 2012's uncertainty
+ # analysis showed that differential weighting is the most-attacked axis
+ # of any composite vulnerability/exposure index. Equal weights are the
+ # safest default; agency tiering (which puts FEMA 1% above 0.2%, Sandy
+ # above modeled scenarios) supplies the remaining structure.

+ REGULATORY = {
+     # FEMA NFHL — regulatory baseline. SFHA (1%) is the mandate threshold.
+     "fema_1pct": 1.00,
+     "fema_02pct": 0.50,
+     # NYC DEP Stormwater Maps (2021) — modeled pluvial scenarios.
+     # Moderate-2050 is weighted above Extreme-2080 because NPCC4
+     # explicitly designates 2080 SLR + 7 in/hr as a TAIL scenario.
+     "dep_moderate_2050": 0.75,
+     "dep_extreme_2080": 0.50,
+     "dep_tidal_2050": 0.75,
+ }

+ HYDROLOGICAL = {
+     # HAND (Height Above Nearest Drainage), banded per Nobre et al. 2011.
+     # Bands: <1 m (channel/floodplain, near-certain wet) -> 1.0
+     #        1-3 m (floodplain)                          -> 0.66
+     #        3-10 m (transitional)                       -> 0.33
+     #        >10 m (hillslope, dry)                      -> 0
+     "hand_band": 1.00,
+     # TWI quartile (top quartile = saturation-prone). Half-weight
+     # because TWI is noisier than HAND in urban DEMs; we percentile-bin
+     # within NYC rather than using absolute cutoffs.
+     "twi_quartile": 0.50,
+     # Local-relief inversions: low percentile = topographic low point.
+     # Bins: <10th=1.0, 10-25th=0.66, 25-50th=0.33, >=50th=0.
+     "elev_pct_200m_inv": 0.50,
+     "elev_pct_750m_inv": 0.50,
+     # Basin relief contributes a small additional terrain term.
+     "basin_relief_band": 0.25,
+ }

+ EMPIRICAL = {
+     # Sandy 2012 inundation — empirical post-event extent. Also triggers
+     # the max-empirical FLOOR rule below.
+     "sandy": 1.00,
+     # USGS Hurricane Ida 2021 high-water marks. Within 100 m -> "direct"
+     # (also triggers the floor); 100-800 m -> "neighborhood proximity".
+     "ida_hwm_within_100m": 1.00,
+     "ida_hwm_within_800m": 0.50,
+     # Prithvi-EO 2.0 satellite-derived inundation polygon (Hurricane Ida
+     # pre/post diff) — semi-empirical: model-derived but conditioned on
+     # observed Sentinel-2 imagery.
+     "prithvi_polygon": 0.75,
+     # NYC 311 flood-related complaint count, banded over a 5-year window:
+     #     >=10 -> 1.0, 3-9 -> 0.66, 1-2 -> 0.33, 0 -> 0
+     # Weight capped at 0.75 because 311 has documented socio-economic
+     # reporting bias (engagement varies by neighborhood).
+     "complaints_band": 0.75,
+     # FloodNet trigger flag (any labeled flood event at any sensor
+     # within 600 m, last 3 years). Same 0.75 cap as 311, since both have
+     # spatial coverage bias.
+     "floodnet_trigger": 0.75,
+ }


+ def _hand_band(hand_m: float | None) -> float:
+     """Nobre et al. 2011 HAND classes adapted for NYC's flat urban terrain."""
+     if hand_m is None:
+         return 0.0
+     if hand_m < 1.0:
+         return 1.0
+     if hand_m < 3.0:
+         return 0.66
+     if hand_m < 10.0:
+         return 0.33
+     return 0.0


+ def _percentile_inv_band(pct: float | None) -> float:
+     """Inverted relief percentile: lower = more exposed (water pools here)."""
+     if pct is None:
+         return 0.0
+     if pct < 10:
+         return 1.0
+     if pct < 25:
+         return 0.66
+     if pct < 50:
+         return 0.33
+     return 0.0


+ def _twi_quartile(twi: float | None) -> float:
+     """TWI thresholds calibrated to NYC's flat 30 m DEM. The top-quartile
+     cutoff comes from the NYC-wide TWI distribution; here we approximate
+     it with literature-typical breakpoints (per Sørensen 2006's
+     site-specific advice)."""
+     if twi is None:
+         return 0.0
+     if twi >= 12:
+         return 1.0
+     if twi >= 10:
+         return 0.66
+     if twi >= 8:
+         return 0.33
+     return 0.0


+ def _basin_relief_band(relief_m: float | None) -> float:
+     if relief_m is None:
+         return 0.0
+     # Higher basin relief in a flat area means the address sits in a real
+     # depression. Banding is empirical for NYC.
+     if relief_m >= 8:
+         return 1.0
+     if relief_m >= 4:
+         return 0.66
+     if relief_m >= 2:
+         return 0.33
+     return 0.0


+ def _complaints_band(n: int | None) -> float:
+     if not n:
+         return 0.0
+     if n >= 10:
+         return 1.0
+     if n >= 3:
+         return 0.66
+     if n >= 1:
+         return 0.33
+     return 0.0


+ # ---------- Sub-index computation ------------------------------------------

+ def _normalize(weighted: float, weights: dict[str, float]) -> float:
+     max_w = sum(weights.values())
+     return weighted / max_w if max_w else 0.0


+ def regulatory_subindex(s: dict) -> float:
+     """0..1. All inputs are binary (inside zone or not)."""
+     w = REGULATORY
+     raw = sum(w[k] * (1.0 if s.get(k) else 0.0) for k in w)
+     return _normalize(raw, w)


+ def hydrological_subindex(s: dict) -> float:
+     """0..1. Inputs are continuous; convert to ordinal bands first."""
+     w = HYDROLOGICAL
+     bands = {
+         "hand_band": _hand_band(s.get("hand_m")),
+         "twi_quartile": _twi_quartile(s.get("twi")),
+         "elev_pct_200m_inv": _percentile_inv_band(s.get("rel_elev_pct_200m")),
+         "elev_pct_750m_inv": _percentile_inv_band(s.get("rel_elev_pct_750m")),
+         "basin_relief_band": _basin_relief_band(s.get("basin_relief_m")),
+     }
+     raw = sum(w[k] * bands[k] for k in w)
+     return _normalize(raw, w)


+ def empirical_subindex(s: dict) -> float:
+     """0..1. Mix of binary and banded count signals."""
+     w = EMPIRICAL
+     vals = {
+         "sandy": 1.0 if s.get("sandy") else 0.0,
+         "ida_hwm_within_100m": 1.0 if s.get("ida_hwm_within_100m") else 0.0,
+         "ida_hwm_within_800m": 1.0 if s.get("ida_hwm_within_800m") else 0.0,
+         "prithvi_polygon": 1.0 if s.get("prithvi_polygon") else 0.0,
+         "complaints_band": _complaints_band(s.get("complaints_count")),
+         "floodnet_trigger": 1.0 if s.get("floodnet_trigger") else 0.0,
+     }
+     raw = sum(w[k] * vals[k] for k in w)
+     return _normalize(raw, w)


+ # ---------- Composite + tier mapping ---------------------------------------

+ # Tier breakpoints over the composite (range 0-3, since each sub-index is
+ # 0-1). Tuned so that "Sandy + DEP-2050 + HAND<1m" lands in Tier 1, and a
+ # single positive signal lands in Tier 4. Documented in METHODOLOGY.md.
+ TIER_BREAKPOINTS = [
+     (1.50, 1),  # high — multiple sub-indices saturated
+     (1.00, 2),  # elevated — at least one strong sub-index
+     (0.50, 3),  # moderate — partial signals across categories
+     (0.01, 4),  # limited — a single contextual signal
+ ]

+ TIER_LABELS = {
+     1: ("High exposure", "Multiple sub-indices saturated; empirical and/or "
+        "modeled scenarios both indicate substantial exposure."),
+     2: ("Elevated exposure", "At least one sub-index near saturation; "
+        "significant overlap with empirical or modeled scenarios."),
+     3: ("Moderate exposure", "Partial signals across categories; scenario- "
+        "or neighborhood-specific exposure."),
+     4: ("Limited exposure", "A single contextual signal; no positive "
+        "scenario hits."),
+     0: ("No flagged exposure", "No positive flood signal across the "
+        "assessed sources."),
  }


+ def composite(signals: dict) -> dict:
+     """Compute sub-indices, composite score, and tier with the floor rule.
+
+     Returns: {
+         'subindices': {'regulatory': 0..1, 'hydrological': 0..1,
+                        'empirical': 0..1},
+         'composite': 0..3,
+         'tier': 0..4,
+         'floor_applied': bool,
+     }
+
+     Max-empirical floor: if Sandy 2012 inundation OR a USGS Ida HWM within
+     100 m fired, the tier is capped at 2 (cannot be worse). This recovers
+     the multiplicative behavior — empirical evidence overrides terrain or
+     modeled scenarios — without giving up additive transparency.
+     """
+     reg = regulatory_subindex(signals)
+     hyd = hydrological_subindex(signals)
+     emp = empirical_subindex(signals)
+     composite_score = reg + hyd + emp
+
+     raw_tier = 0
+     for bp, t in TIER_BREAKPOINTS:  # `bp`, not `breakpoint` (shadows builtin)
+         if composite_score >= bp:
+             raw_tier = t
+             break
+
+     floor_applied = bool(signals.get("sandy") or signals.get("ida_hwm_within_100m"))
+     if floor_applied and (raw_tier == 0 or raw_tier > 2):
+         final_tier = 2
+     else:
+         final_tier = raw_tier
+
+     return {
+         "subindices": {
+             "regulatory": round(reg, 3),
+             "hydrological": round(hyd, 3),
+             "empirical": round(emp, 3),
+         },
+         "composite": round(composite_score, 3),
+         "tier": final_tier,
+         "floor_applied": floor_applied,
+     }


+ # ---------- Backward-compat shims ------------------------------------------
+ # The register CLI and register_builder consume a flat `tier` column on a
+ # DataFrame. The shim materializes composite() over rows and writes back
+ # `score` (composite scaled 0-100) and `tier`.

  def tier(score: int) -> int:
+     """Legacy bridge for callers that still pass the old small-integer
+     additive score. Keeps the original integer breakpoints unchanged;
+     prefer composite() for new code."""
+     if score >= 6:
+         return 1
+     if score >= 4:
+         return 2
+     if score >= 2:
+         return 3
+     if score >= 1:
+         return 4
      return 0


+ # Legacy WEIGHTS map kept so riprap.py and any external consumer
+ # continue to import without breaking. The new composite() is the
+ # authoritative scorer.
+ WEIGHTS = {
+     "sandy": 3,
+     "dep_extreme_2080": 2,
+     "dep_moderate_2050": 2,
+     "dep_moderate_current": 1,
+     "complaints_3plus": 1,
+     "floodnet_trigger": 1,
+     "policy_named": 1,
+ }


  def score_row(signals: dict) -> tuple[int, int]:
+     """Legacy-shape wrapper around composite(). Returns (composite_x100, tier)."""
+     c = composite(signals)
+     return int(round(c["composite"] * 100)), c["tier"]


  def score_frame(df: pd.DataFrame) -> pd.DataFrame:
+     """Vectorized composite over a DataFrame whose columns name our
+     indicators. Missing columns are treated as 0 / None.
+
+     Adds columns: subindex_regulatory, subindex_hydrological,
+     subindex_empirical, composite, score, tier, floor_applied.
+     `score` is the composite scaled 0-100 for register CSV legibility.
+     """
      out = df.copy()
+     rows = out.to_dict(orient="records")
+     results = [composite(r) for r in rows]
+     out["subindex_regulatory"] = [r["subindices"]["regulatory"] for r in results]
+     out["subindex_hydrological"] = [r["subindices"]["hydrological"] for r in results]
+     out["subindex_empirical"] = [r["subindices"]["empirical"] for r in results]
+     out["composite"] = [r["composite"] for r in results]
+     out["score"] = (out["composite"] * 100).round().astype(int)
+     out["tier"] = [r["tier"] for r in results]
+     out["floor_applied"] = [r["floor_applied"] for r in results]
      return out
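A minimal, self-contained sketch of how the new scorer arrives at a tier, including the max-empirical floor. The weight sums (3.5 regulatory, 2.75 hydrological, 4.75 empirical) are copied from the schemas in the diff above; the signal values are illustrative, not from any real address:

```python
# Condensed re-derivation of the composite + floor math from app/score.py.
REG_MAX, HYD_MAX, EMP_MAX = 3.5, 2.75, 4.75
TIER_BREAKPOINTS = [(1.50, 1), (1.00, 2), (0.50, 3), (0.01, 4)]

# Hypothetical address: inside Sandy extent, HAND < 1 m, in DEP 2050 zone.
signals = {"sandy": True, "hand_m": 0.5, "dep_moderate_2050": True}

reg = 0.75 / REG_MAX   # only dep_moderate_2050 (weight 0.75) fires
hyd = 1.00 / HYD_MAX   # hand_m < 1 m -> band 1.0 at weight 1.0
emp = 1.00 / EMP_MAX   # sandy fires at weight 1.0
composite_score = reg + hyd + emp               # ~0.788 of a possible 3

raw_tier = next((t for bp, t in TIER_BREAKPOINTS
                 if composite_score >= bp), 0)  # -> 3 ("moderate")
floor = bool(signals.get("sandy"))              # empirical floor fires
final_tier = 2 if floor and raw_tier > 2 else raw_tier

print(round(composite_score, 3), raw_tier, final_tier)  # 0.788 3 2
```

This shows the design point of the floor rule: the additive composite alone would call this address Tier 3, but the observed Sandy inundation caps the tier at 2 without disturbing the transparent additive score.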
audit/2026-05-03-evening-audit.md ADDED
@@ -0,0 +1,183 @@
+ # Riprap Hackathon Week Audit — 2026-05-03 Evening
+
+ ## TL;DR
+
+ The four register specialists (MTA entrances, NYCHA, DOE schools, DOH hospitals) are **shipped, FSM-wired, and validated end-to-end** as of commit `86861be` — the "subway-entrance specialist drift" hypothesis is wrong; it landed Sunday afternoon. The TerraMind-NYC fine-tune is running in its dedicated session (eval spec v2 in place, v1 postmortemed). The biggest real drift is in **pitch artifacts**: `pitch/cold_open.md` was deleted by commit `1cb5ee6` (Sunday 18:59 ET) along with the entire `pitch/` directory — `MONDAY.md` still shows it as ✓. The Build-in-Public posts, methodology paper PDF, ASCE materials, historical-event mode, and the four extra TTM specialists are **not started**. Visual identity v0.4.1/v0.4.2 is largely landed in `web/sveltekit/`. The planner-level refusal shim from Phase 6 is **documented as shipping in the FSM but is not actually wired into `app/planner.py`**.
+
+ ## Specialist roster
+
+ | Specialist | Exists | Wired into FSM | Tested | Tier | Last touched | Notes |
+ |---|---|---|---|---|---|---|
+ | `geocode` | ✓ | ✓ | ✓ (integration) | reference | baseline | `app/geocode.py` |
+ | `sandy_inundation` | ✓ | ✓ | ✓ | empirical | baseline | NYC-only gated |
+ | `dep_stormwater` | ✓ | ✓ | ✓ | modeled | baseline | 3 scenarios |
+ | `floodnet` | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `nyc311` | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `noaa_tides` | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `nws_alerts` | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `nws_obs` | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `ttm_forecast` (Battery surge) | ✓ | ✓ | ✓ | modeled | baseline | TTM r2 |
+ | `ttm_311_forecast` | ✓ | ✓ | ✓ | modeled | baseline | per-address TTM r2 |
+ | `microtopo_lidar` | ✓ | ✓ | ✓ | proxy | baseline | |
+ | `ida_hwm_2021` | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `prithvi_eo_v2` (baked Ida polys) | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `prithvi_eo_live` (Sentinel-2) | ✓ | ✓ (heavy) | ✓ | empirical | baseline | gated by `RIPRAP_HEAVY_SPECIALISTS` |
+ | `terramind_synthesis` (DEM→LULC) | ✓ | ✓ (heavy) | — | synthetic | baseline | |
+ | `rag_granite_embedding` | ✓ | ✓ | ✓ | reference | baseline | |
+ | `gliner_extract` | ✓ | ✓ | ✓ | reference | baseline | |
+ | **`mta_entrance_exposure`** | ✓ | ✓ | ✗ (no per-specialist test) | mixed | 2026-05-03 | first output Sheepshead Bay |
+ | **`nycha_development_exposure`** | ✓ | ✓ (heavy) | ✗ | mixed | 2026-05-03 | first output Red Hook |
+ | **`doe_school_exposure`** | ✓ | ✓ (heavy) | ✗ | mixed | 2026-05-03 | first output Coney Island |
+ | **`doh_hospital_exposure`** | ✓ | ✓ (heavy) | ✗ | mixed | 2026-05-03 | first output Coney Island |
+ | FEMA OpenFEMA NFIP claims | ✗ | ✗ | ✗ | — | — | not started |
+ | NWS NWPS reach forecast | ✗ | ✗ | ✗ | — | — | not started |
+ | USGS NWIS streamgages | ✗ | ✗ | ✗ | — | — | not started |
+ | NYC DEP CSO/Bluebelt/GI | ✗ | ✗ | ✗ | — | — | not started |
+ | TTM streamgage stage forecast | ✗ | ✗ | ✗ | — | — | not started |
+ | TTM FloodNet sensor depth | ✗ | ✗ | ✗ | — | — | not started |
+ | TTM NWS rainfall accumulation | ✗ | ✗ | ✗ | — | — | not started |
+ | TTM citywide 311 sewer-backup | ✗ | ✗ | ✗ | — | — | not started |
+ | Granite Guardian terminal check | ✗ | ✗ (pivoted) | — | — | — | replaced by planner-level refusal in design — but see Anomalies |
+
+ ## Foundation models
+
+ | Model | Imported | Instantiated | Called | Routed via LiteLLM | Notes |
+ |---|---|---|---|---|---|
+ | Granite 4.1:3b (planner) | ✓ | ✓ | ✓ | ✓ | `app/planner.py` via `app.llm.chat` |
+ | Granite 4.1:8b (reconciler) | ✓ | ✓ | ✓ | ✓ | `app/reconcile.py`, `app/mellea_validator.py` |
+ | Granite Embedding 278M | ✓ | ✓ | ✓ | n/a (HF transformers) | `app/rag.py` |
+ | Granite Reranker R2 | ✓ | ✓ | ✓ (when enabled) | n/a | gated; see test_phase3 |
+ | GLiNER medium v2.1 | ✓ | ✓ | ✓ | n/a | `app/context/gliner_extract.py` |
+ | Prithvi-EO 2.0 Sen1Floods11 | ✓ | ✓ (heavy) | ✓ (live) | n/a | `app/flood_layers/prithvi_live.py` |
+ | TerraMind 1.0 base | ✓ | ✓ (heavy) | ✓ | n/a | `app/context/terramind_synthesis.py` |
+ | Granite TTM r2 (surge + 311) | ✓ | ✓ | ✓ | n/a | `app/live/ttm_forecast.py` — only 2 of planned 6 instances |
+ | Granite Guardian 3.2 3B-A800M | ✗ | ✗ | ✗ | — | dropped per Phase 6 pivot |
+
+ ## Data sources
+
+ | Source | Status | Consumer | Notes |
+ |---|---|---|---|
+ | Sandy Inundation 2012 (NYC OEM) | implemented | `sandy_inundation`, all 4 registers | `data/sandy_inundation.geojson` |
+ | NYC DEP Stormwater Flood Map | implemented | `dep_stormwater`, registers | 3 scenarios |
+ | FloodNet sensors | implemented | `floodnet` | |
+ | NYC 311 service requests | implemented | `nyc311`, `ttm_311_forecast` | |
+ | NOAA CO-OPS tides | implemented | `noaa_tides`, `ttm_forecast` | |
+ | NWS alerts + obs | implemented | `nws_alerts`, `nws_obs` | |
+ | Hurricane Ida HWMs (USGS) | implemented | `ida_hwm` | |
+ | Prithvi-EO Ida polygons (baked) | implemented | `prithvi_water` | |
+ | Sentinel-2 via Planetary Computer | implemented | `prithvi_live` | heavy |
+ | MTA Subway Entrances 2024 | implemented | `mta_entrances` | `data/mta_entrances.geojson` (2120 entrances) |
+ | USGS 3DEP DEM 1m / HAND | implemented | registers | `data/nyc_dem_30m.tif`, `data/hand.tif` |
+ | NYCHA Development Data Book | implemented | `nycha` | per `b196bd8`+ |
+ | NYC DOE School Locations | implemented | `doe_schools` | |
+ | NYS DOH / NYC hospitals | implemented | `doh_hospitals` | |
+ | MTA Sandy-recovery report | not started | (queued) | Monday plan — `[mta_recovery_<station_id>]` doc messages |
+ | FEMA OpenFEMA NFIP claims | not started | — | |
+ | NWS NWPS reach forecast API | not started | — | |
+ | USGS NWIS streamgages | not started | — | |
+ | NYC DEP CSO outfalls | not started | — | |
+ | NYC DEP Bluebelt | not started | — | |
+ | NYC DEP Green Infrastructure DB | not started | — | |
+ | PLUTO building footprints | not started | (queued — fixes centroid-edge) | NYU Langone/Stuyvesant/P.S. 89 false-negatives |
+
+ ## Design system v0.4.1/v0.4.2 integration
+
+ | Item | Status | Notes |
+ |---|---|---|
+ | Carto Positron / Voyager basemap | ✓ | `web/sveltekit/src/lib/components/map/baseStyle.ts` |
+ | IBM Plex Sans/Mono/Serif | ✓ | `tokens.css` |
+ | Four-tier color palette (CSS vars, WCAG-fixed values) | ✓ | `tokens.css` matches the spec hex codes |
+ | Epistemic-tier glyph SVG | ✓ | `lib/components/glyphs/TierGlyph.svelte` |
+ | Per-claim margin glyph rendering | ✓ | `Briefing.svelte` + `Claim.svelte` |
+ | Section-head tier badges | ✓ | `SectionHead.svelte`, `TierBadge.svelte` |
+ | Hoverable inline citations + drawer | ✓ | `Cite.svelte`, `CitationDrawer.svelte` |
+ | Trace UI as `<details>` tree with tier badges | ✓ | `TraceUI.svelte`, `TraceRow.svelte` |
+ | Layers panel with tier badges | ✓ | `MapLegend.svelte` (4 layer entries hit by demo deck) |
+ | Cold-start with sample queries | ✓ | `ColdStart.svelte` |
+ | Trust-signal footer | ✓ | `AppFooter.svelte` |
+ | WeasyPrint PDF template | ✗ | only routed at `/print/{query_id}` (browser print); no WeasyPrint dep in `requirements.txt` |
+ | Browser print stylesheet | ✓ | `lib/print.css` |
+ | Loading / skeleton states | ✓ | `SkeletonBriefing.svelte` |
+ | Error states | ✓ | `ErrorCard.svelte` |
+ | Refusal state UI | ✓ component exists | `GuardianRefusal.svelte` — but back-end refusal classifier not wired (see Anomalies) |
+ | Reroll banner | ✓ | `RerollBanner.svelte` |
+ | Synthetic-stripe SVG pattern | ✓ | `synStripe.ts`, `ThumbStripe.svelte` |
+ | Granite version string = 4.1 | ✓ (sampled) | |
+ | RegisterCard evidence format | ✓ | `RegisterCard.svelte` (rendered in `nyu-langone` demo run) |
+ | Dark mode | unverified | not searched |
+
+ ## Accessibility
+
+ | WCAG 2.2 AA item | Status | Notes |
+ |---|---|---|
+ | Tier color contrast verified | ✓ | tokens.css comments document per-color ratios + AA/AAA passes |
+ | Color independence (glyph shape) | ✓ | TierGlyph component exists |
+ | Skip-links | ✓ | `SkipLinks.svelte` |
+ | Focus rings | unverified | `--accent-graphical: #D17C00` token exists but per-element outline rules not audited |
+ | Heading hierarchy | unverified | not audited |
+ | Touch-target sizing | unverified | not audited |
+ | `role="log"` aria-live polite for streaming | ✓ | found in `agentStream.ts`, `Briefing.svelte`, `RerollBanner.svelte`, `SkeletonBriefing.svelte` |
+ | Map `role="application"` + alt-text | unverified | grep didn't surface — needs walk through `RipMap.svelte` |
+ | `prefers-reduced-motion` respected | ✓ (partial) | rules in `tokens.css` and `styles.css`; per-component coverage unverified |
+ | Plain-language redirect for resident queries | unverified | `ColdStart.svelte` mentions FloodHelpNY redirect per spec — not visually verified |
+ | Glyph alt-text (`role="img"`) | unverified | not audited |
+
+ ## Keep-list and pitch artifacts
+
+ - ✓ `experiments/05_terramind_nyc_finetune/eval/eval_spec.md` — present (also `eval_spec_v2.md` with v1 postmortem at `eval/v1_synth_sar_postmortem.md`)
+ - ⚠ `pitch/cold_open.md` — **DELETED** by commit `1cb5ee6` (2026-05-03 18:59 ET, "Demo deck: 10/10 live SSE tests"). MONDAY.md still says ✓. Last good content is in commit `b4239de`.
+ - ✓ `experiments/06_granite_guardian/adversarial_queries.jsonl` — present + planner-pivot results in `planner_refusal_summary.md` and `RESULTS.md`
+ - ✗ `experiments/07_historical_event_mode/` — does not exist
+ - ✗ Methodology paper draft (6-8 page PDF) — only `METHODOLOGY.md` (264 lines, scoring-methodology only, not the publication paper draft)
+ - ✗ `pitch/` directory — gone (deleted with cold_open.md). Demo-side artifacts now live in `web/sveltekit/tests/e2e/demo-script.md` and the (gitignored) `pitch/screenshots-2026-05-03/`
+ - ✗ `asce/` — does not exist
+ - ✗ Build-in-Public posts — no `posts/`, `build_in_public/`, or comparable directory
+
+ ## Integration tests
+
+ - 26 tests collected in `tests/test_integration.py` (parametrized over `brighton`, `hollis`, `hunts`); plus `test_agent_e2e.py`, `test_agent_full.py`, `test_sample_queries.py`. Not executed (would exceed the 30 s budget — they hit the live SSE stream).
+ - The 4 new register specialists have **no per-specialist integration test** in `tests/`. Coverage is via the e2e `demo-queries.spec.ts` Playwright suite (`web/sveltekit/tests/e2e/`), which runs them in the FSM during `nyu-langone`, `red-hook-houses-nycha`, `coney-island`, `sheepshead-bay` queries.
+ - Frontend Playwright suites: `coldstart`, `demo-queries`, `layers`, `print`, `sample`, `states`, `sticky-map` (7 spec files).
+
+ ## experiments/ directory
+
+ - `00_endpoints` — completed (RESULTS.md, 8/8 endpoint smokes)
+ - `01_prithvi_live_water` — completed
+ - `02_gliner_extraction` — completed
+ - `03_granite_reranker` — completed
+ - `04_terramind_synthetic_sar` — parked-as-research per commit `271e673`
+ - `05_sam2_promptable` — empty directory (mid-flight or abandoned scaffolding)
+ - `05_terramind_finetune` — early micro-FT scaffold (`micro.py` + `RESULTS.md`); superseded by `05_terramind_nyc_finetune/`
+ - `05_terramind_nyc_finetune` — **active in another session**; eval_spec_v2 in place, training subdir present
+ - `06_chronos_bolt_forecast` — empty directory (not started)
+ - `06_granite_guardian` — completed-as-pivot (Guardian → planner shim; `planner_refusal_summary.md` documents FAIL on the 5% FP gate)
+ - `07_mta_entrances` — completed and migrated to `app/registers/mta_entrances.py`
+ - `08_nycha_developments` — completed and migrated
+ - `09_doe_schools` — completed and migrated
+ - `10_doh_hospitals` — completed and migrated
+
+ ## Anomalies and weird things
+
+ - `experiments/05_sam2_promptable/` and `experiments/06_chronos_bolt_forecast/` are **empty directories** — either abandoned scaffolds or interrupted sessions. There is also `experiments/05_terramind_finetune/` (early micro-FT) sitting next to `experiments/05_terramind_nyc_finetune/` (current).
+ - **Numbering collision at `05_*` and `06_*`** between the empty/legacy dirs and the active ones.
+ - **Planner-level refusal shim is documented as shipping but is not in `app/`.** `experiments/06_granite_guardian/RESULTS.md` and MONDAY.md both say "the planner-level refusal shim still ships in the FSM as a polite-refusal layer." A grep for `refusal|guardian` across `app/` (including `app/planner.py`) returns no hits. The frontend `GuardianRefusal.svelte` component exists but has no backend signal to display.
+ - **`pitch/cold_open.md` deletion** by `1cb5ee6` is almost certainly accidental — that commit's message describes adding 6 demo queries and a `demo-script.md`; deleting the cold-open is unrelated and not mentioned. Likely a casualty of moving screenshots into a gitignored path.
+ - **`Riprap.zip` at repo root** is untracked — a leftover archive.
+ - **CLAUDE.md / MONDAY.md disagree on the AMD droplet IP**: CLAUDE.md never mentions an IP (uses `<droplet-ip>` placeholders); MONDAY.md explicitly says CLAUDE.md is wrong (cites `165.245.134.44`) and that `129.212.182.52` is production. A grep of CLAUDE.md doesn't surface the wrong IP, so the MONDAY.md note may itself be stale.
+ - **MONDAY.md status table is out of sync** with the deletion of `pitch/cold_open.md`.
+ - No TODO/FIXME/XXX comments in `app/` Python or in `web/sveltekit/src/`.
+ - No imports from `experiments.*` inside `app/` or `web/`.
+ - No specialists registered in `app/fsm.py` are missing from `app/registers/` or `app/context/` (and vice versa — clean).
+ - WeasyPrint is referenced in MONDAY.md / the spec but is **not in `requirements.txt`** — the `/print/{query_id}` route serves a browser-print page only.
+
+ ## The single most important gap
+
+ The originally-suspected subway-entrance specialist gap is not real — that work shipped Sunday afternoon and is wired through the FSM, the reconciler, and the demo Playwright suite. The single most important *actual* gap is the **deletion of `pitch/cold_open.md` (and the entire `pitch/` directory) in commit `1cb5ee6`**. The cold-open phrasing was an explicit Sunday keep-list item ("seven-tunnels framing, no inflated dollar figure"), Sunday's MONDAY.md handoff still treats it as ✓, and the AMD demo on May 10 will require it. The content is recoverable from `git show b4239de:pitch/cold_open.md` and should be restored before any other Monday work begins.
+
+ ## Recommended next-session priorities
+
+ 1. **Restore `pitch/cold_open.md`** — `git show b4239de:pitch/cold_open.md > pitch/cold_open.md` and commit. ~5 min. Done = file present, MONDAY.md row still accurate, content matches the seven-tunnels framing.
+ 2. **Wire the planner-level refusal shim into `app/planner.py`** — the documented contract from Phase 6 (FN=0% safety-critical) is not actually live. ~30–60 min. Done = planner returns a `refusal_reason` field on the 50 should-refuse adversarial queries; `GuardianRefusal.svelte` renders end-to-end on at least one out-of-scope query in the e2e suite.
+ 3. **PLUTO building-footprint join for register centroid-edge cases** — a single change unlocks NYU Langone / Stuyvesant / P.S. 89 flipping to `inside_sandy_2012=true` across all four register specialists. Pre-existing queue item from MONDAY.md. ~2–3 hr. Done = the three known-failing addresses each show `inside_sandy_2012=true` in the FSM trace and the briefing cites `[doh_hospital_*]` / `[doe_school_*]` accordingly.
+ 4. **MapLibre rendering for the 4 register specialists** — entrance points coloured by Sandy/DEP, NYCHA polygon fills graded by `pct_inside_sandy`, school + hospital points in the same color ramp. The data is in state; the map layers aren't yet drawing it. ~2–4 hr. Done = layers panel shows ≥4 new layer entries on `red-hook-houses-nycha` and `nyu-langone`; e2e screenshot diff captures them.
+ 5. **Remove dead/empty experiment dirs and clarify numbering** — delete `experiments/05_sam2_promptable/` and `experiments/06_chronos_bolt_forecast/`, and decide whether `experiments/05_terramind_finetune/` should be folded into the NYC fine-tune dir or kept as a separate phase artifact. ~15 min. Done = no empty dirs; numbering collision documented or resolved.
audit/2026-05-04-morning-handoff.md ADDED
@@ -0,0 +1,165 @@
+ # Riprap overnight handoff — Monday 2026-05-04
+
+ Continuation point for the wake-up session. Read this first; everything
+ points outwards from here.
+
+ ## TL;DR
+
+ All eight priorities from the overnight wiring pass landed. The
+ audit-flagged drift items are closed: cold-open restored, Guardian
+ gone, trace UI now clickable, register-specialist Sandy false-negatives
+ fixed, register pins on the map, FloodNet TTM forecast wired,
+ TTM specialists grouped in trace, `experiments/` cleaned. The single
+ load-bearing UX feature to verify in the morning is the trace-UI
+ drilldown — clicking any specialist row reveals its raw structured
+ output, which is the auditability contract for the entire system.
+ ## Commits landed (overnight)
+
+ | Commit | Priority | What |
+ |---------|----------|------|
+ | `a2143fc` | P1 | Restore `pitch/cold_open.md` from `b4239de` (accidentally deleted in `1cb5ee6`). |
+ | `4b9e55e` | P2 | Remove `GuardianRefusal.svelte`, `RefusalCategory` type, `.guardian-*` CSS, Playwright assertion. Mellea is the sole grounding mechanism. |
+ | `3e4f922` | P3 | **Trace UI clickable drilldown.** Click any row → raw structured output panel (formatted JSON, copy button, status-aware label, max-height + scroll). |
+ | `47ed3fb` | P4 | Buffered-footprint overlap (`app/registers/_footprint.py`) — MTA 8m / DOE 50m / DOH 100m. NYU Langone, Stuyvesant HS, P.S. 89 flip to `inside_sandy_2012=true`. |
+ | `792f4ee` | P5 | Map: register-asset pins (subway 4px / school 5px / hospital 6px / NYCHA-centroid 7px), colored by Sandy exposure, click popup with name + `[doc_id]`. |
+ | `3d991e9` | P6 | **`floodnet_forecast` specialist.** TTM r2 (512, 96) forecast on nearest FloodNet sensor's daily flood-event series — reuses the existing model singleton, no new model class loaded. |
+ | `90644e4` | P7 | Trace UI groups TTM specialists under `forecasting.granite-timeseries-ttm-r2 [N instances]`. `leafSteps` walks recursively so children still count toward fired/silent/errors. |
+ | `36e28d1` | P8 | Drop `Riprap.zip`, empty `05_sam2_promptable/`, empty `06_chronos_bolt_forecast/`. Rename `05_terramind_finetune` → `05a_terramind_finetune_micro` (dedupe with active NYC fine-tune dir). |
+
+ Two further commits update MONDAY.md and add this handoff.
+
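Commit `47ed3fb`'s buffered overlap reduces to one question per asset: is the point inside the Sandy zone polygon, or within its asset class's buffer distance of the boundary? A self-contained sketch of that idea — the real `app/registers/_footprint.py` is not shown in this diff, so the function names, the single-ring assumption, and the flat-earth metre conversion below are all illustrative; only the per-class buffer radii come from the commit message:

```python
import math

# Per-asset-class buffers (metres), per commit 47ed3fb.
BUFFER_M = {"mta_entrance": 8.0, "doe_school": 50.0, "doh_hospital": 100.0}


def _to_xy(lon: float, lat: float, lat0: float = 40.7) -> tuple[float, float]:
    # Equirectangular degrees→metres approximation; fine at city scale.
    m_per_deg = 111_320.0
    return (lon * m_per_deg * math.cos(math.radians(lat0)), lat * m_per_deg)


def _point_in_poly(pt, poly) -> bool:
    # Ray casting over a closed ring of (x, y) vertices.
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        if (y1 > y) != (y2 > y) and x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
            inside = not inside
    return inside


def _dist_to_segment(p, a, b) -> float:
    # Distance from point p to segment ab, all in metres.
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def inside_sandy(asset_kind: str, lonlat, zone_ring) -> bool:
    """True if the buffered asset point overlaps the Sandy zone ring."""
    pt = _to_xy(*lonlat)
    poly = [_to_xy(lon, lat) for lon, lat in zone_ring]
    if _point_in_poly(pt, poly):
        return True
    buf = BUFFER_M[asset_kind]
    return any(
        _dist_to_segment(pt, a, b) <= buf
        for a, b in zip(poly, poly[1:] + poly[:1])
    )
```

A production version would use a proper geometry library and projected CRS, but the buffer-per-asset-class shape is the point: the same borderline point can be "outside" for an 8 m subway entrance and "inside" for a 100 m hospital campus.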
+ ## Verify first when you wake
+
+ Run a Red Hook query (rich output, exercises everything) and check:
+
+ ```bash
+ .venv/bin/uvicorn web.main:app --host 127.0.0.1 --port 7860 --log-level info
+ # then visit http://127.0.0.1:7860/q/red%20hook%20houses
+ ```
+
+ 1. **Trace drilldown.** Click any specialist row in the run-trace
+    panel. You should see a structured output panel with formatted
+    JSON, a "Copy" button, and a status-coloured label
+    (Output / Silent reason / Error). Multiple rows can be expanded
+    simultaneously. *This is the load-bearing feature.* If clicking
+    doesn't expand the row, check the browser console; the build is
+    committed in `web/sveltekit/build/`.
+ 2. **TTM grouping.** The trace should show
+    `forecasting.granite-timeseries-ttm-r2 · 3 instances` (or 2 if
+    floodnet_forecast finds no usable sensor) as a single
+    auto-expanded parent with the TTM children nested under it. The
+    top-of-trace fired/silent/errors counters should still include
+    the TTM children — that's the recursion fix in `TraceUI.svelte`.
+ 3. **Register pins on the map.** Click a subway/school/hospital pin.
+    Popup should show name, kind, `inside_sandy_2012`, and
+    `[mta_entrance_…]` / `[doe_school_…]` / `[nyc_hospital_…]` /
+    `[nycha_dev_…]` doc-id, the same one cited in the briefing.
+ 4. **Buffered Sandy join.** Run the NYU Langone single-address query
+    (`570 First Ave Manhattan` or similar). The hospital row should
+    show `inside_sandy_2012=true` in its trace drilldown panel and
+    the briefing should cite `[nyc_hospital_…]` accordingly.
+ 5. **No Guardian card anywhere.** No `GuardianRefusal.svelte`, no
+    `.guardian-*` CSS class, no `RefusalCategory` import. Mellea
+    reroll banner is the only integrity-narration UI.
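The "no usable sensor" branch in item 2 is the calibration floor on `floodnet_forecast`: sensors with fewer than five historical flood events skip the forecast and report a silent reason instead. A hypothetical sketch of that guard — threshold value matches the text, but the function name, return shape, and event-counting rule are illustrative:

```python
MIN_EVENTS = 5  # below this, TTM output is dominated by quantization noise


def maybe_forecast(sensor_id: str, daily_events: list[float], forecast_fn):
    """Return (status, payload): 'fired' with a forecast dict, or 'silent' with a reason."""
    n = sum(1 for v in daily_events if v > 0)  # days with at least one flood event
    if n < MIN_EVENTS:
        return (
            "silent",
            f"sensor {sensor_id} has only {n} historical events; forecast omitted",
        )
    return "fired", forecast_fn(daily_events)


# Sparse sensor: guard trips, no forecast is produced.
status, payload = maybe_forecast(
    "bk-001", [0.0] * 30 + [1.0, 2.0], lambda series: {"horizon": 96}
)
# → status == "silent"
```

The trace-facing reason string is what makes the silence auditable rather than a dropped row.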
+ ## What's queued next (Monday morning, in priority order)
+
+ 1. **NYCHA polygon-fill on the map.** Add `geometry_geojson` field
+    to `app/registers/nycha.py:DevelopmentFinding` (serialise the
+    polygon as GeoJSON). The frontend `RipMap.svelte` already has
+    the `register-polygons` source + fill/line layers wired and
+    waiting for non-empty data. ~30 min. Done = NYCHA developments
+    render as graded fills (denser if more of the footprint inside
+    Sandy) at the Red Hook query.
+ 2. **TerraMind-NYC fine-tune morning routine.** From MONDAY.md:
+    refresh PC signed URLs on the AMD droplet, then proceed with the
+    eval-spec gates. That session is independent of overnight work.
+ 3. **MTA Sandy-recovery citation layer** (per MONDAY.md: parse the
+    "Hurricane Sandy: Three Years Later" report into per-station-id
+    facts → emit `[mta_recovery_<station_id>]` doc messages). 1–2 hr.
+ 4. **PLUTO + NYC Building Footprints** for the very-large-campus
+    register cases that the buffered-overlap doesn't catch
+    (Stuyvesant Town in particular — it's not in `nycha.geojson`
+    because it's privately owned post-Met-Life). Either a new
+    "large_residential_complex" register or an actual footprint join.
+ 5. **3 more TTM r2 specialists**: USGS streamgage stage, NWS rainfall
+    accumulation, citywide 311 sewer-backup rate. Each one reuses the
+    same singleton — same architectural template as
+    `floodnet_forecast`.
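Queued item 1 is a pure serialisation change: give each finding a GeoJSON string the map source can consume. A minimal sketch of the shape — the real `DevelopmentFinding` in `app/registers/nycha.py` is not in this diff, so the fields shown here (everything except `geometry_geojson` itself) are hypothetical, and the ring is assumed to be stored as lon/lat pairs:

```python
import json
from dataclasses import dataclass, field


@dataclass
class DevelopmentFinding:
    # Hypothetical mirror of app/registers/nycha.py; only geometry handling shown.
    name: str
    ring: list[tuple[float, float]]  # exterior ring, (lon, lat) pairs
    geometry_geojson: str = field(init=False)

    def __post_init__(self):
        coords = [list(p) for p in self.ring]
        if coords[0] != coords[-1]:
            coords.append(list(coords[0]))  # GeoJSON rings must be closed
        self.geometry_geojson = json.dumps(
            {"type": "Polygon", "coordinates": [coords]}
        )


f = DevelopmentFinding(
    "Red Hook Houses",
    [(-74.009, 40.676), (-74.006, 40.676), (-74.006, 40.678)],
)
```

The ring-closing step matters: MapLibre's GeoJSON source will reject or mis-render a Polygon whose first and last coordinates differ.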
+ ## What was deliberately kept out of scope tonight
+
+ Per the wiring-pass priorities document:
+
+ - USGS NWIS Bronx/Saw Mill/Hutchinson river forecasts.
+ - FEMA OpenFEMA NFIP claims tract-aggregated specialist.
+ - DEP CSO outfalls / Bluebelt / Green Infrastructure specialist.
+ - WCAG 2.2 AA full audit.
+ - Methodology paper draft (Saturday work).
+ - Historical-event mode (Saturday work).
+ - Build-in-Public posts.
+ - ASCE talk materials.
+ - Dark mode (explicit defer to v0.5).
+ - WeasyPrint server-side PDF (browser print is sufficient for demo).
+ - Per-specialist Python integration tests for the 4 register
+   specialists (e2e Playwright covers them).
+ ## Sharp edges to remember
+
+ - **`floodnet_forecast` silent floor.** Sensors with <5 historical
+   events skip the forecast entirely (output is dominated by
+   quantization noise around zero — exactly the kind of
+   pseudo-quantitative claim the four-tier discipline guards
+   against). Trace shows `silent` with reason
+   "sensor has only N historical events; forecast omitted".
+   Don't lower the threshold without revisiting the calibration.
+ - **Buffer choice in `app/registers/_footprint.py` is per-asset-class.**
+   100m hospital buffer catches NYU Langone but not the entire NYU
+   Langone Tisch Center (campus extends ~250m). Calibrated against
+   the three canonical addresses. Document any future change in the
+   same module's docstring.
+ - **NYCHA polygons not yet on the map.** Centroid-pin rendering is
+   shipped; polygon-fill needs the dataclass change above.
+ - **Trace UI `output` field carries the raw object.** Don't
+   re-stringify it in `q/[queryId]/+page.svelte` — the panel
+   formatter does that. The 240-char truncation that used to happen
+   in onStep is gone; if you're inspecting a giant payload, the
+   panel scrolls.
+ - **TTM grouping uses `status='fan'` as the auto-expand marker.**
+   The recursive `leafSteps` walker in `TraceUI.svelte` excludes
+   fan/merge nodes from counts but recurses into their children.
+   Don't add another structural-only status without updating the
+   recursion.
+ ## Files touched (overnight, by area)
+
+ - `app/registers/_footprint.py` (new)
+ - `app/registers/{mta_entrances,doe_schools,doh_hospitals}.py`
+ - `app/live/floodnet_forecast.py` (new)
+ - `app/fsm.py`, `app/reconcile.py`
+ - `web/sveltekit/src/lib/types/{trace,states,tier}.ts`
+ - `web/sveltekit/src/lib/components/trace/{TraceUI,TraceRow}.svelte`
+ - `web/sveltekit/src/lib/components/map/RipMap.svelte`
+ - `web/sveltekit/src/lib/styles.css`
+ - `web/sveltekit/src/routes/q/[queryId]/+page.svelte`
+ - `web/sveltekit/build/*` (rebuilt artefacts, committed)
+ - `web/static/agent.js` (legacy bundle: STEP_LABELS / SOURCE_LABELS
+   for `floodnet_forecast`)
+ - `MONDAY.md`, `pitch/cold_open.md`, `experiments/shared/licenses.md`
+ - Deletions: `Riprap.zip`,
+   `experiments/05_sam2_promptable/`,
+   `experiments/06_chronos_bolt_forecast/`,
+   `experiments/05_terramind_finetune/{micro.py, RESULTS.md}`
+   (renamed to `05a_terramind_finetune_micro/`),
+   `web/sveltekit/src/lib/components/states/GuardianRefusal.svelte`.
+ ## Tests run
+
+ - 18-test static Playwright suite passes after every UI change.
+ - Python smoke probes verified the buffered-footprint Sandy join
+   on the canonical addresses (NYU Langone, Stuyvesant HS, P.S. 89).
+ - Did NOT run `pytest tests/` (requires uvicorn + live SSE; the
+   morning verification routine above hits all the same code paths).
+ - Did NOT push to either remote — `git push && git push huggingface main`
+   when ready to deploy. HF rebuild ~10 min.
data/hospitals.geojson ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:164004fd98b88200014cb2da8cdb03f76bb456ec5d60820dd4e0da4da2e5f679
+ size 30022
data/nyc_ntas_2020.geojson ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bb5f1759872c8fa7ed18f7430c971b3b238a68927e0046d06d1ddb4ce90fc26b
+ size 4589872
experiments/00_endpoints/RESULTS.md ADDED
@@ -0,0 +1,51 @@
+ # Phase 0 — Endpoints smoke tests
+
+ 8/8 endpoints reachable from local dev machine. Run with:
+
+ ```bash
+ /Users/amsrahman/riprap-nyc/.venv/bin/python run_all.py
+ ```
+
+ | Endpoint | Status | Latency | Notes |
+ |----------|:------:|--------:|-------|
+ | Microsoft PC STAC (Sentinel-2 L2A search) | PASS | 1.2 s | keyless; 3 items in S2 Brooklyn bbox |
+ | NYC Open Data Socrata (311, PLUTO, Sandy) | PASS | 5.8 s | each dataset returns its row keys |
+ | USGS NWIS (Bronx River at NYBG) | PASS | 0.3 s | 2 series, 190 obs/24h |
+ | NOAA Tides (Battery 8518750) | PASS | 0.1 s | latest WL=1.056 ft |
+ | NOAA NWPS (gauges in NY+PA bbox) | PASS | 4.1 s | 750 gauges; **needs `srid=EPSG_4326`** |
+ | NWS API (NY active alerts) | PASS | 0.5 s | 10 active alerts (cold day) |
+ | FEMA OpenFEMA (FimaNfipClaims, NY) | PASS | 0.1 s | 1-row probe; aggregated only per project policy |
+ | HF Hub (small Apache-2.0 model) | PASS | 0.3 s | sentence-transformers/all-MiniLM-L6-v2 metadata |
+
+ ## Sharp edges discovered
+
+ 1. **NWPS silently empty without `srid=EPSG_4326`.** Default `srid` is
+    apparently a non-WGS84 system; bbox in geographic coords matches no
+    gauges. Endpoint returns `200 OK` with an empty array — no error
+    signal. Recorded in the smoke test comments so the next person
+    doesn't lose 20 minutes.
+
+ 2. **NYC Open Data Socrata is slow on cold connection.** ~6 s for three
+    sequential single-row fetches. Probably PoP-routing or DNS warmup.
+    Cache aggressively and batch.
+
+ 3. **NWS API requires User-Agent.** The smoke test sets one; without a
+    UA you'd get HTTP 403 (NWS docs say so but it's a quiet failure
+    mode in production).
+
+ 4. **OpenFEMA FimaNfipClaims schema is wide.** First-row keys include
+    `amountPaidOnBuildingClaim`, `amountPaidOnContentsClaim`,
+    `baseFloodElevation`, etc — these are the property-level fields we
+    are NOT allowed to surface. Specialists using OpenFEMA must
+    aggregate (e.g., `$select=count(*)&$filter=...&$apply=...`) before
+    ingesting, never store property-level rows.
+
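Edges 1 and 3 are both request-construction pitfalls that are cheap to guard against centrally. A hedged sketch — the `srid=EPSG_4326` parameter and the User-Agent requirement come from the findings above, but the function names, the base-params shape, and the `Accept` header are illustrative:

```python
def with_wgs84(params: dict) -> dict:
    # Edge 1: NWPS returns 200 OK with an empty array when srid is wrong,
    # so pin it explicitly on every gauge query rather than relying on defaults.
    return {**params, "srid": "EPSG_4326"}


def nws_headers(contact: str) -> dict:
    # Edge 3: api.weather.gov rejects User-Agent-less requests (HTTP 403),
    # and the docs ask that the UA include a contact address.
    return {
        "User-Agent": f"riprap-nyc ({contact})",
        "Accept": "application/geo+json",
    }
```

Centralising these in one helper module means a future specialist can't reintroduce the silent-empty-bbox failure by forgetting a query parameter.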
+ ## Cache contents
+
+ `.cache/*.json` — one per smoke test, holds the parsed first row /
+ metadata so subsequent dev iterations don't re-hit the endpoint.
+
+ ## Conclusion
+
+ All eight data sources are usable. Proceed to Phase 1 (Prithvi-EO
+ live water segmentation). No blocking issues.
experiments/00_endpoints/_runner.py ADDED
@@ -0,0 +1,51 @@
+ """Tiny harness shared by the 8 smoke tests.
+
+ Each test exposes a `probe()` callable that returns (ok, summary, payload).
+ Cache hits are kept in .cache/ as JSON or raw bytes; tests are idempotent.
+ """
+
+ from __future__ import annotations
+
+ import json
+ import sys
+ import time
+ import traceback
+ from pathlib import Path
+
+ CACHE = Path(__file__).parent / ".cache"
+ CACHE.mkdir(exist_ok=True)
+
+
+ def cache_path(key: str, ext: str = "json") -> Path:
+     return CACHE / f"{key}.{ext}"
+
+
+ def write_cache(key: str, obj, ext: str = "json") -> None:
+     p = cache_path(key, ext)
+     if ext == "json":
+         # Cap at 200 KB; a huge payload may truncate to invalid JSON,
+         # but the cache is best-effort dev convenience only.
+         p.write_text(json.dumps(obj, default=str)[:200_000])
+     else:
+         p.write_bytes(obj if isinstance(obj, bytes) else str(obj).encode())
+
+
+ def run(name: str, fn) -> tuple[bool, str, float]:
+     t0 = time.time()
+     try:
+         ok, summary, _payload = fn()  # payload is cached by the probe itself
+         dt = time.time() - t0
+         return ok, summary, dt
+     except Exception as e:
+         traceback.print_exc()
+         dt = time.time() - t0
+         return False, f"exception: {type(e).__name__}: {e}", dt
+
+
+ def cli(name: str, fn) -> int:
+     ok, summary, dt = run(name, fn)
+     badge = "PASS" if ok else "FAIL"
+     print(f"{badge} {name} ({dt:.2f}s) {summary}")
+     return 0 if ok else 1
+
+
+ if __name__ == "__main__":
+     sys.exit(0)
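A smoke test plugs into this harness by exposing a `probe()` and handing it to `cli`/`run`. The sketch below shows the shape; the probe body is fabricated (no real endpoint is hit), and `run` is inlined as a copy of the harness function so the example stands alone rather than importing `_runner`:

```python
import time
import traceback


def run(name, fn):
    # Inlined copy of _runner.run so this sketch is self-contained.
    t0 = time.time()
    try:
        ok, summary, _payload = fn()
        return ok, summary, time.time() - t0
    except Exception as e:
        traceback.print_exc()
        return False, f"exception: {type(e).__name__}: {e}", time.time() - t0


def probe():
    # A real smoke test would fetch from its endpoint and write_cache the
    # first row; this one fabricates a row so the (ok, summary, payload)
    # contract is visible.
    rows = [{"station": "8518750", "wl_ft": 1.056}]
    return bool(rows), f"{len(rows)} row(s), wl={rows[0]['wl_ft']} ft", rows


ok, summary, dt = run("noaa_tides", probe)
# → ok is True, summary == "1 row(s), wl=1.056 ft"
```

Because `run` catches everything, a probe that raises still produces a `(False, "exception: …", dt)` triple, which is what keeps `run_all.py` printing one PASS/FAIL line per endpoint instead of crashing mid-suite.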