seriffic committed on
Commit 6a82282 · 1 parent: 9e70214

Backend evolution: Phases 1-10 specialists + agentic FSM + Mellea + LiteLLM router


Telescoped from ~50 spine commits across May 1-4 into one coherent
backend snapshot — what was built between v0.1's 9-specialist
sequential pass and the post-overhaul demo state.

Major additions:
- Planner-routed FSM with 5 intents (live_now, single_address,
neighborhood, development_check, compare). Granite 4.1:3b picks
the intent + targets; Granite 4.1:8b reconciles.
- Mellea-validated reconciliation (rejection sampling with 4
grounding checks: numerics_grounded, no_placeholder_tokens,
citations_dense sentence-scoped, citations_resolve).
- LiteLLM Router for backend abstraction — Ollama default, vLLM
on AMD MI300X via env vars, automatic fallback.
- Phase 1: Prithvi-EO live water segmentation (lazy import).
- Phase 2: GLiNER structured extraction.
- Phase 3: Granite Embedding Reranker R2.
- Phase 4: TerraMind synthetic SAR (parked, lazy import).
- Phase 5: TerraMind micro-finetune (AMD MI300X experiment).
- Phase 6: Refusal classifier (Guardian, later removed).
- Phases 7-10: Register specialists — MTA entrances, NYCHA,
NYC DOE schools, NYS DOH hospitals — with buffered footprint
overlap.
- Live: NOAA tides, NWS alerts/observations, FloodNet TTM forecast.
- Areas: NTA polygons for neighborhood-mode.
- Context: DOB permits scraper for development_check.
- Granite TimeSeries TTM r2 nowcast for surge.

Plus: tests/, experiments/ source + RESULTS, audit/, pitch/cold_open,
ARCHITECTURE.md / METHODOLOGY.md / MONDAY.md / CLAUDE.md docs,
pyproject.toml + requirements-experiments.txt + riprap.py CLI.

Granular spine history is preserved in /Users/amsrahman/riprap-nyc;
this commit is the demo-week consolidated landing.

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitignore +19 -0
  2. ARCHITECTURE.md +696 -0
  3. CLAUDE.md +546 -0
  4. METHODOLOGY.md +264 -0
  5. MONDAY.md +224 -0
  6. agent.py +1 -1
  7. app/areas/__init__.py +0 -0
  8. app/areas/nta.py +224 -0
  9. app/assets/mta_entrances.py +0 -1
  10. app/context/dob_permits.py +258 -0
  11. app/context/gliner_extract.py +125 -0
  12. app/context/microtopo.py +70 -4
  13. app/context/noaa_tides.py +110 -0
  14. app/context/nws_alerts.py +71 -0
  15. app/context/nws_obs.py +108 -0
  16. app/context/nyc311.py +80 -4
  17. app/context/terramind_synthesis.py +357 -0
  18. app/flood_layers/dep_stormwater.py +39 -1
  19. app/flood_layers/ida_hwm.py +13 -0
  20. app/flood_layers/prithvi_live.py +299 -0
  21. app/flood_layers/sandy_inundation.py +29 -1
  22. app/fsm.py +762 -69
  23. app/geocode.py +102 -10
  24. app/intents/__init__.py +3 -0
  25. app/intents/development_check.py +324 -0
  26. app/intents/live_now.py +231 -0
  27. app/intents/neighborhood.py +492 -0
  28. app/intents/single_address.py +73 -0
  29. app/live/__init__.py +0 -0
  30. app/live/floodnet_forecast.py +184 -0
  31. app/live/ttm_forecast.py +363 -0
  32. app/llm.py +296 -0
  33. app/mellea_validator.py +501 -0
  34. app/planner.py +250 -0
  35. app/rag.py +80 -9
  36. app/reconcile.py +749 -82
  37. app/register_builder.py +4 -3
  38. app/registers/__init__.py +0 -0
  39. app/registers/_footprint.py +84 -0
  40. app/registers/doe_schools.py +200 -0
  41. app/registers/doh_hospitals.py +194 -0
  42. app/registers/mta_entrances.py +244 -0
  43. app/registers/nycha.py +270 -0
  44. app/score.py +326 -28
  45. audit/2026-05-03-evening-audit.md +183 -0
  46. audit/2026-05-04-morning-handoff.md +165 -0
  47. data/hospitals.geojson +3 -0
  48. data/nyc_ntas_2020.geojson +3 -0
  49. experiments/00_endpoints/RESULTS.md +51 -0
  50. experiments/00_endpoints/_runner.py +51 -0
.gitignore CHANGED
@@ -10,7 +10,26 @@ node_modules/
 .ruff_cache/
 .pytest_cache/
 
+# Session-local Claude Code context (per-machine, not for the public repo)
+CLAUDE.local.md
+.claude/
+
 # legacy / intermediate Prithvi artifacts (not shipped)
 data/hls_stack_*.tif
 data/prithvi_runs/
 data/*.legacy_*
+web/svelte/node_modules/
+web/sveltekit/node_modules/
+web/sveltekit/.svelte-kit/
+
+# Experiments — cached HF model downloads, training artifacts, intermediate
+# fixtures. RESULTS.md, NOTES.md, and source code stay tracked.
+experiments/**/.cache/
+experiments/**/restore/
+experiments/**/publish/
+experiments/**/*.tif
+experiments/**/*.png
+experiments/**/*.jpg
+experiments/**/*.parquet
+experiments/**/*.npy
+pitch/screenshots-*/
ARCHITECTURE.md ADDED
@@ -0,0 +1,696 @@
# Riprap — Architecture

> **What it is.** A web tool that takes any NYC address and produces a
> short, citation-grounded **flood-exposure briefing** — a tier (1–4)
> with a paragraph of evidence, where every numeric claim links back to
> the specific dataset, agency report, or model output it came from.
>
> **Who it's for.** Urban planners, journalists on deadline, NYCEM
> grant writers filing FEMA BRIC sub-applications, agency capital
> planners, researchers under FOIL/IRB constraints — *not* consumers
> shopping for flood insurance.
>
> **Why local foundation models.** A newsroom with FOIL'd documents
> can't paste them into a vendor LLM. We run Granite 4.1 (3 B-param
> chat model), Granite Embedding 278M (RAG), Prithvi-EO 2.0 (300 M-param
> Earth-observation model, offline pre-compute) and Granite TimeSeries
> TTM r2 (1.5 M-param zero-shot forecaster) inside one container. No
> vendor LLM is contacted at runtime.

---

## 1. A 60-second primer on NYC flooding

Skip if you already know this. Most architecture docs assume you do —
this one doesn't.

### 1.1 Three kinds of flood

NYC gets hit by three flood mechanisms that look completely different
on a map and are caused by different physics:

- **Coastal / surge flooding** — The ocean rises into the city.
  Driven by storm surge (wind pushing water against the coast),
  astronomical high tide, and wave run-up. Affects the **shoreline:**
  Brighton Beach, Coney Island, Red Hook, Lower Manhattan, the
  Rockaways, Staten Island east shore. **Hurricane Sandy 2012** is
  the canonical event — water came over the seawall and flooded
  subway tunnels, hospitals, and electrical substations — buildings
  that were dry that morning.
- **Pluvial / stormwater flooding** — Rain falls faster than the
  drainage system can carry it away. Affects **inland low points,
  basement apartments, and chronically under-sewered neighborhoods**:
  Hollis (Queens), Carroll Gardens (Brooklyn), Jamaica. **Hurricane
  Ida 2021** is the canonical event for NYC — most of the deaths
  were in basement apartments far from any coast. Optical satellites
  largely *can't see* this kind of flooding because the water drains
  fast and is often sub-surface.
- **Compound flooding** — Coastal + pluvial happening at the same
  time, with groundwater rising too. Currently the active research
  frontier (NPCC4 Ch. 3 calls it out explicitly). Most agencies model
  these mechanisms separately; reality combines them.

A good civic flood tool has to cover all three and be honest about
what each signal can and cannot see. Riprap surfaces evidence for all
three but **doesn't predict damage** — see scope below.

### 1.2 Empirical vs modeled vs proxy

Each piece of flood evidence falls into one of three classes, and the
distinction matters for how much weight to give it:

- **Empirical** — Something flooded a place and was measured. USGS
  high-water marks (people went out after Hurricane Ida and surveyed
  where water reached on building walls). The 2012 Sandy Inundation
  Zone (mapped by the city after the storm). FloodNet ultrasonic
  sensors that recorded an actual depth. **Highest-confidence**: this
  flood happened here.
- **Modeled scenarios** — Hydraulic models simulate "what if" cases.
  FEMA's regulatory floodplains (1 % and 0.2 % annual chance). NYC
  DEP's Stormwater Maps (modeled water depth under three rainfall
  scenarios with varying sea-level-rise assumptions). **Useful but
  scenario-bounded**: this could happen here under those conditions.
- **Proxy signals** — Indirect indicators of flooding. NYC 311
  complaints ("street flooding", "sewer backup") clustering around an
  address. Topographic indices (HAND, TWI) suggesting water *would*
  pool here based on terrain. **Useful but biased**: 311 reflects
  civic engagement as well as flooding; terrain says nothing about
  drainage capacity.

Riprap surfaces all three classes. The score weights them in that
order (empirical > modeled > proxy), with empirical hits granted a
**floor rule** — see [§5](#5-the-scoring-rubric).

### 1.3 Hydrology indices used in this app

Two terrain-derived numbers come up repeatedly. They're cheap to
compute from a Digital Elevation Model (DEM) and they're the
hydrological literature's canonical exposure proxies:

- **HAND (Height Above Nearest Drainage)** — Vertical distance from
  the address up to the nearest river/drainage channel. **<1 m** = at
  drainage level (water *will* reach here in flood). **>10 m** =
  hillslope (very dry). Nobre et al. 2011.
- **TWI (Topographic Wetness Index)** — `ln(catchment_area / tan slope)`.
  **High TWI** = water tends to accumulate here (large
  contributing area, gentle slope). Beven & Kirkby 1979.

Neither is a flood prediction; both are exposure indicators that say
"water *would* pool here based on terrain alone."

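Both indices can be sketched directly from their definitions. This is a minimal illustration, not the app's actual DEM pipeline; the function names here are hypothetical:

```python
import math

def twi(catchment_area_m2: float, slope_rad: float) -> float:
    """Topographic Wetness Index (Beven & Kirkby 1979): ln(a / tan(slope))."""
    # Clamp near-flat cells so the ratio stays finite.
    tan_slope = max(math.tan(slope_rad), 1e-6)
    return math.log(catchment_area_m2 / tan_slope)

def hand_class(hand_m: float) -> str:
    """Bucket a HAND value (Nobre et al. 2011) using the thresholds quoted above."""
    if hand_m < 1.0:
        return "at drainage level"   # water *will* reach here in a flood
    if hand_m > 10.0:
        return "hillslope"           # very dry
    return "intermediate"
```

A large contributing area on a gentle slope drives TWI up; the Brighton Beach worked example later in this document (HAND 0.7 m, TWI 11.3) lands in the "at drainage level" bucket.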
---

## 2. What Riprap actually produces

For a given address (or any of three modes — see [§4](#4-three-user-modes)),
Riprap returns:

1. **A tier 1–4** computed by a deterministic, published rubric
   ([§5](#5-the-scoring-rubric)). Tier 1 = "high exposure"; Tier 4 =
   "limited exposure"; Tier 0 = "no flagged exposure."
2. **A 4-section briefing paragraph** synthesised by Granite 4.1 with
   `[doc_id]` citations after every numeric claim. Sections:
   *Status*, *Empirical evidence*, *Modeled scenarios*, *Policy
   context*. A section is omitted entirely if no specialist fired for
   it (silence-over-confabulation contract).
3. **Evidence cards** — one per fired specialist, with the raw values
   and a link to the source dataset.
4. **Map overlay** — the address pinned, with the empirical and
   modeled flood extents that overlap it.
5. **Live "right now" signals** — active NWS flood alerts, current
   tide residual at the Battery, recent precipitation at the nearest
   ASOS, and a Granite TTM short-horizon forecast of the surge
   residual. **These do not modify the tier** (per IPCC AR6 WG II's
   distinction between exposure and event occurrence).

The full output is a JSON blob with all specialist outputs preserved,
so a journalist or planner can audit every number that appears in the
prose.

---

## 3. The Burr FSM and how the specialists chain

Riprap is a **state machine** — a Burr FSM (DAGWorks) — that walks
through a fixed list of "specialist" functions in order. Each
specialist either produces a structured fact or stays silent. At the
end, the reconciler reads all the produced facts and writes the
paragraph.

The full chain, in execution order:

```
            ┌─────────────────────────────┐
query ────► │ 1. geocode (DCP Geosearch)  │  address text → lat/lon, BBL, borough
            └────────────┬────────────────┘

            ┌─────────────────────────────────────────────┐
            │ STATIC EMPIRICAL + REGULATORY LAYERS        │
            │ (snapshot of city-published flood layers)   │
            ├─────────────────────────────────────────────┤
            │ 2. sandy           in 2012 Sandy zone? Y/N  │  empirical
            │ 3. dep_stormwater  in 3 modeled scenarios?  │  modeled
            │ 4. floodnet        live sensor history      │  empirical
            │ 5. nyc311          flood complaints in 200m │  proxy
            └────────────┬────────────────────────────────┘

            ┌─────────────────────────────────────────────┐
            │ LIVE "RIGHT NOW" LAYER                      │
            │ (out of static score; reported separately)  │
            ├─────────────────────────────────────────────┤
            │ 6. noaa_tides    Battery / Kings Pt level   │  live, 6-min
            │ 7. nws_alerts    active flood-relevant      │  live
            │ 8. nws_obs       nearest ASOS recent precip │  live
            │ 9. ttm_forecast  9.6h surge-residual nowcast│  Granite TTM r2
            └────────────┬────────────────────────────────┘

            ┌─────────────────────────────────────────────┐
            │ TERRAIN + EVENT-LEVEL EMPIRICAL LAYERS      │
            ├─────────────────────────────────────────────┤
            │ 10. microtopo  DEM + TWI + HAND at point    │  proxy
            │ 11. ida_hwm    USGS Ida 2021 HWM proximity  │  empirical
            │ 12. prithvi    Prithvi-EO Ida flood polys   │  empirical (model-derived)
            └────────────┬────────────────────────────────┘

            ┌─────────────────────────────────────────────┐
            │ 13. rag (Granite Embedding 278M)            │  retrieves policy paragraphs
            │     query corpus of 5 NYC agency PDFs       │  relevant to this address
            └────────────┬────────────────────────────────┘

            ┌─────────────────────────────────────────────┐
            │ 14. reconcile (Granite 4.1 :3b on Ollama)   │  document-grounded synthesis
            │     reads all "documents" produced by 1-13  │  → 4-section cited paragraph
            │     drops sentences with ungrounded numbers │  → audit trail
            └────────────┬────────────────────────────────┘

                 cited briefing
                 + tier badge + evidence cards + map
```

Each step is implemented as an `@action` in `app/fsm.py`. The Burr
runtime handles the state-passing between actions and emits a trace
record per step (timing, ok/err, summary fields) which the front-end
shows live as the FSM runs.

### 3.1 What every specialist does, plain language

| # | Specialist | Plain-language description | Class |
|---|------------|----------------------------|-------|
| 1 | **geocode** | Resolve the user's text ("116-50 Sutphin Blvd, Queens") to a (lat, lon) and a NYC tax-lot ID (BBL). Uses NYC Planning's free Geosearch API. | n/a |
| 2 | **sandy** | Did the address get flooded by Hurricane Sandy in 2012? Point-in-polygon over the official NYC Sandy Inundation Zone. | empirical |
| 3 | **dep_stormwater** | Three modeled stormwater-flooding scenarios from NYC DEP: Moderate-2050, Extreme-2080, Tidal-2050. Each tells you depth (none / 0.4–0.8 ft / etc.) at this point. | modeled |
| 4 | **floodnet** | NYC's ultrasonic flood-sensor network. How many sensors are within 600 m, and have any of them registered a flood event in the last 3 years? | empirical |
| 5 | **nyc311** | The 311 service-request archive. How many flood-related complaints (street flooding, sewer backup, catch-basin clogged) within 200 m of the address over the last 5 years? | proxy |
| 6 | **noaa_tides** *(live)* | Current tide observation at the nearest of three NOAA gauges (Battery / Kings Pt / Sandy Hook). Reports observed water level, predicted astronomical tide, and the **residual** (≈ surge). | live |
| 7 | **nws_alerts** *(live)* | Are there active NWS flood-relevant alerts at this point right now? Flash Flood Warnings, Coastal Flood Advisories, etc. | live |
| 8 | **nws_obs** *(live)* | Recent precipitation from the nearest airport ASOS station (KNYC / KLGA / KJFK / KEWR / KFRG). | live |
| 9 | **ttm_forecast** *(live)* | Granite TTM r2 zero-shot forecast of the surge **residual** at the Battery for the next ~9.6 h. NOAA already publishes the astronomical tide; TTM forecasts the part NOAA doesn't. | live (model-derived) |
| 10 | **microtopo** | LiDAR-derived terrain features at the point: elevation, HAND, TWI, local relief percentile. | proxy |
| 11 | **ida_hwm** | USGS Hurricane Ida 2021 high-water marks — actual measured water heights surveyed in the days after the storm. | empirical |
| 12 | **prithvi** | NASA/IBM Prithvi-EO 2.0 segmentation of Sentinel-2 imagery for the Ida pre/post pair. Pre-computed offline; serves point-in-polygon queries against the resulting 166 polygons. | empirical (model-derived) |
| 13 | **rag** | Granite Embedding 278M retrieves the most-relevant paragraphs from 5 NYC policy PDFs (Comptroller, NPCC4, MTA, NYCHA, ConEd) given the address's borough + which scenarios fired. | policy |
| 14 | **reconcile** | Granite 4.1 :3b reads all the documents produced by steps 1–13 and writes the cited briefing paragraph. See [§6](#6-document-grounded-reconciliation). | LLM synthesis |

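The "produces a structured fact or stays silent" contract behind this table can be sketched in plain Python. This is an illustrative pattern only, not the actual `app/fsm.py` signatures; `run_chain` and the `(doc_id, text)` shape are hypothetical:

```python
from typing import Callable, Dict, List, Optional, Tuple

# A specialist maps shared context to one document, or to nothing at all.
Specialist = Callable[[dict], Optional[Tuple[str, str]]]

def sandy(ctx: dict) -> Optional[Tuple[str, str]]:
    # Fires only on a hit; otherwise stays silent, so the reconciler
    # never sees a 'sandy' document it could confabulate around.
    if ctx.get("in_sandy_zone"):
        return ("sandy", "Address is INSIDE the 2012 Sandy Inundation Zone.")
    return None

def run_chain(specialists: List[Specialist], ctx: dict) -> Dict[str, str]:
    docs: Dict[str, str] = {}
    for spec in specialists:
        out = spec(ctx)
        if out is not None:          # silence-over-confabulation: no doc, no entry
            doc_id, text = out
            docs[doc_id] = text
    return docs
```

The reconciler then receives only the documents that actually fired, which is what makes omitting a briefing section safe.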
### 3.2 Worked example: 2940 Brighton 3rd St, Brooklyn

To make the chain concrete, here's what fires for a Brighton Beach
address:

| Step | What it returns |
|------|-----------------|
| geocode | `(40.5780, -73.9617)`, BBL `3-08660-0001`, Brooklyn |
| sandy | **YES** — inside the 2012 Sandy Inundation Zone |
| dep_stormwater | `dep_moderate_2050`: depth 0.4-0.8 ft; `dep_extreme_2080`: depth 0.8-2.0 ft |
| floodnet | 2 sensors within 600 m; 1 trigger event in last 3 yr (peak 14 cm) |
| nyc311 | 11 flood-related complaints in 200 m, 5-yr window |
| noaa_tides | Sandy Hook gauge, +0.49 ft residual *(today's reading)* |
| nws_alerts | 0 active alerts |
| nws_obs | KJFK ASOS, no recent precipitation |
| ttm_forecast | Forecast peak residual +0.6 ft in 4.2 h *(today's run)* |
| microtopo | Elevation 2.36 m, HAND 0.7 m, TWI 11.3, percentile 8 (very low) |
| ida_hwm | 0 USGS HWMs within 800 m (Ida hit Queens hardest, not Brighton) |
| prithvi | Inside an Ida-attributable polygon? **NO** (Ida was pluvial-inland) |
| rag | Top hits: NPCC4 Ch.3 (coastal), MTA Resilience (Coney Island D-train), Comptroller |
| reconcile | (see below) |
| **Tier** | **1 (High exposure)** with empirical floor applied |

The reconciler then writes:

```
**Status.** This Brighton Beach address sits **inside the 2012 Sandy
Inundation Zone** [sandy], on relatively low ground with HAND of 0.7 m
[microtopo].

**Empirical evidence.** NYC 311 records show **11 flood-related
complaints** within 200 m over the last 5 years [nyc311]; 2 FloodNet
sensors are within 600 m and one logged a 14 cm event in the last 3
years [floodnet].

**Modeled scenarios.** The address sits inside **DEP Moderate-2050**
with depth class 0.4-0.8 ft and **DEP Extreme-2080** with depth class
0.8-2.0 ft [dep_moderate_2050][dep_extreme_2080].

**Policy context.** **NPCC4 Ch. 3** documents accelerating
coastal-flood frequency along this stretch [rag_npcc4].
```

Note what *didn't* fire: no Ida HWM doc (Ida didn't flood here), no
Prithvi doc (no Ida-attributable polygon), no NWS alerts (clear day),
no TTM doc (forecast residual under threshold). The reconciler never
saw those headers and didn't invent them.

---

## 4. Three user modes

| Path | Mode | What it does |
|------|------|--------------|
| `/` | **Single address** | Geocode → run the full FSM → cited paragraph + map. Live demo path. |
| `/compare` | **Compare** | Two addresses side by side; parallel FSM runs (`asyncio.to_thread`, `OLLAMA_NUM_PARALLEL=2`). Useful for "this site vs the alternative". |
| `/register/{schools,nycha,mta_entrances}` | **Register** | Pre-computed bulk runs over NYC public-asset registries — 126 schools, 45 NYCHA developments, ~1,900 MTA subway entrances. Loaded from `data/registers/*.json` at boot. |

Single-address is the live path. Registers are pre-computed because
running 1,900 reconciler calls at request time is a non-starter; the
registers job runs offline (see `scripts/build_*_register.py`) and
the result is served from cache.

---

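The `/compare` fan-out can be sketched with `asyncio.to_thread`, which runs a blocking function in a worker thread. A minimal sketch, in which `run_fsm` is a hypothetical stand-in for the real single-address pipeline:

```python
import asyncio

def run_fsm(address: str) -> dict:
    # Stand-in for the blocking Burr FSM run (geocode -> specialists -> reconcile).
    return {"address": address, "tier": 2}

async def compare(addr_a: str, addr_b: str) -> tuple:
    # Two blocking runs in parallel worker threads; with
    # OLLAMA_NUM_PARALLEL=2 the two reconciler calls can overlap as well.
    result_a, result_b = await asyncio.gather(
        asyncio.to_thread(run_fsm, addr_a),
        asyncio.to_thread(run_fsm, addr_b),
    )
    return result_a, result_b
```

Because each FSM run is synchronous and I/O-heavy, threads are the cheapest way to overlap them without rewriting the specialists as coroutines.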
## 5. The scoring rubric

This is the part of the system that produces the tier 1–4. It is
**deterministic, published, and not done by the language model**.
See `METHODOLOGY.md` for the full citation list; here's the
high-level structure.

### 5.1 Three thematic sub-indices

Following Cutter et al. 2003 (SoVI hazards-of-place) and Tate 2012
(uncertainty analysis), indicators are grouped into thematic
sub-indices, equal-weighted within each group, normalized to [0, 1]:

| Sub-index | What it captures | Top weights |
|-----------|------------------|-------------|
| **Regulatory** | Inside FEMA / DEP / NPCC4 modeled or regulated zones | FEMA 1 %; DEP-2050; DEP Tidal |
| **Hydrological** | Terrain-based exposure (HAND, TWI, percentile, relief) | HAND (Nobre 2011); TWI half-weighted (urban DEM noise) |
| **Empirical** | Did flooding actually happen here (Sandy, Ida HWMs, 311) | Sandy + HWM<100m → also trigger floor |

The **composite** is the sum of the three sub-indices (range 0–3).
Tier breakpoints: ≥1.5 → Tier 1, ≥1.0 → Tier 2, ≥0.5 → Tier 3, >0 →
Tier 4, 0 → Tier 0.

### 5.2 Max-empirical floor

If **Sandy 2012 inundation** OR **a USGS Ida HWM within 100 m** fired,
the tier can come out no milder than **2 (Elevated)**, regardless of
the additive composite.

This recovers the *important* multiplicative behaviour Balica 2012
argues for (empirical observations should not be cancelled by
terrain or modeled scenarios) without giving up additive transparency.
The 100 m radius is chosen because USGS HWM positional uncertainty is
typically 5–30 m — 100 m gives ~3σ headroom for a confident "this
address was inundated" signal.

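Read literally, §5.1's breakpoints plus §5.2's floor reduce to a few lines. This is a sketch of the rubric as described, not the actual `app/score.py` code:

```python
def assign_tier(regulatory: float, hydrological: float, empirical: float,
                sandy_hit: bool, hwm_within_100m: bool) -> int:
    """Additive composite (each sub-index in [0, 1]) plus the max-empirical floor."""
    composite = regulatory + hydrological + empirical   # range 0-3
    if composite >= 1.5:
        tier = 1
    elif composite >= 1.0:
        tier = 2
    elif composite >= 0.5:
        tier = 3
    elif composite > 0:
        tier = 4
    else:
        tier = 0                                        # no flagged exposure
    # Floor rule: an empirical hit means the rating is at least Elevated.
    if (sandy_hit or hwm_within_100m) and (tier == 0 or tier > 2):
        tier = 2
    return tier
```

Note the floor only lifts a mild rating; a composite that already lands in Tier 1 stays Tier 1.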
### 5.3 Live signals stay out

NWS alerts, NOAA tide residual, and NWS hourly precipitation are
**not** in the static tier. Per IPCC AR6 WG II glossary and NPCC4
Ch. 3, exposure is a quasi-stationary property of place; event
occurrence is time-varying. They appear separately as live evidence
cards.

---

## 6. Document-grounded reconciliation

`app/reconcile.py` builds a list of OpenAI-style chat messages where
each specialist's emission is its own message, with a stable `doc_id`
carried on the role. Granite 4.1's Ollama chat template recognises
any `role: "document <doc_id>"` message and lifts it into a
`<documents>` block, prepending IBM's official grounded-generation
system message ("Write the response by strictly aligning with the
facts in the provided documents").

Example packet for the Brighton Beach address (abbreviated):

```python
[
    {"role": "system", "content": "<citation-discipline + 4-section skeleton>"},
    {"role": "document sandy", "content": "Address is INSIDE the 2012 Sandy zone. ..."},
    {"role": "document dep_extreme_2080", "content": "Depth class 0.8-2.0 ft. ..."},
    {"role": "document floodnet", "content": "2 sensors; peak 14 cm. ..."},
    {"role": "document nyc311", "content": "11 flood complaints in 200 m. ..."},
    {"role": "document microtopo", "content": "Elev 2.36 m, HAND 0.7 m, TWI 11.3. ..."},
    {"role": "document rag_npcc4", "content": "<retrieved paragraph>"},
    {"role": "user", "content": "Write the cited briefing now."},
]
```

The four-section structure (`**Status.** / **Empirical evidence.** /
**Modeled scenarios.** / **Policy context.**`) is enforced by the
`EXTRA_SYSTEM_PROMPT`. Sections without supporting documents are
omitted entirely.

### 6.1 Two reconciler models

- **`granite4.1:3b`** runs the planner and `live_now` (short outputs,
  routing decisions). Always streamed.
- **`granite4.1:8b`** runs the synthesis path for `single_address`,
  `neighborhood`, and `development_check` (long outputs, dense
  citations). Pre-warmed into VRAM in `entrypoint.sh` so the first
  query doesn't pay the model-load tax. Both fit warm on the T4 with
  `OLLAMA_MAX_LOADED_MODELS=2` and `OLLAMA_KEEP_ALIVE=24h`.

### 6.2 Mellea-validated rejection sampling

`app/mellea_validator.py` wraps the Granite-via-Ollama call in IBM
Research's [Mellea](https://github.com/generative-computing/mellea)
framework — instruct, validate, repair. The synthesis intents call
`reconcile_strict_streaming(...)`, which:

1. **Streams** each generation attempt's tokens to the user (via the
   FSM threadlocal `set_token_callback` for `single_address` or a
   `progress_q` for the polygon intents).
2. After each attempt, runs **four deterministic checks** on the
   accumulated paragraph:
   - **`numerics_grounded`** — every non-trivial number in the output
     appears verbatim in a source document.
   - **`no_placeholder_tokens`** — output contains no leaked
     `[source]` / `<document>` template markup.
   - **`citations_dense`** — every non-trivial number has a
     `[doc_id]` citation **somewhere in the same sentence** (sentence
     boundaries: `. ` / `.\n` / end-of-text).
   - **`citations_resolve`** — cited `doc_id`s are a subset of the
     input doc_ids.
3. If any check fails, fires a `mellea_attempt` SSE event with the
   failed-requirement names, then **rerolls** with a feedback prompt
   that names the specific failing sentences (the model usually
   responds well to surgical corrections). Loop budget: 3 attempts.

The frontend renders an inline banner above the briefing — amber on
reroll (with the failed-req list), green on first-try pass. The final
reconcile step in the trace shows the `passed: N/4 · rerolls: M`
metadata for full audit transparency.

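Stripped of the streaming and SSE plumbing, the instruct, validate, repair loop has this shape. A schematic sketch only: `generate` stands in for the streamed Granite call, `checks` for the four requirements, and the names are hypothetical:

```python
from typing import Callable, List, Tuple

def reconcile_strict(generate: Callable[[str], str],
                     checks: List[Tuple[str, Callable[[str], bool]]],
                     max_attempts: int = 3) -> Tuple[str, int, List[str]]:
    """Rejection sampling: reroll with targeted feedback until all checks pass."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        draft = generate(feedback)
        failed = [name for name, check in checks if not check(draft)]
        if not failed:
            return draft, attempt, []        # pass -> green banner
        # Name the failing requirements so the reroll can be surgical.
        feedback = "Fix these failed requirements: " + ", ".join(failed)
    return draft, max_attempts, failed       # budget exhausted -> best effort
```

The returned attempt count and failure list map onto the `passed: N/4 · rerolls: M` metadata shown in the trace.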
### 6.3 Number recognition is identifier-aware

The numeric guardrail uses `\b-?\d[\d,]*(?:\.\d+)?\b` so that
identifier codes embedded in prose (`QN1206` NTA codes, `BBL
3-00589-0003` parcels, `BIN`, `B12` community boards) are *not*
treated as numeric claims demanding citation. This was the dominant
false positive in early probing; without it, almost every neighborhood
briefing failed `citations_dense` because the opening sentence
typically reads "*X (NTA QN1206) in Queens…*".

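The word-boundary behaviour is easy to see in isolation. A hedged sketch of the check, not the actual `app/mellea_validator.py` implementation:

```python
import re

NUM = re.compile(r"\b-?\d[\d,]*(?:\.\d+)?\b")   # the guardrail's number pattern
CITE = re.compile(r"\[([a-z0-9_]+)\]")          # [doc_id] citations

def citations_dense(paragraph: str) -> bool:
    """Every sentence containing a number must also carry a [doc_id] citation."""
    sentences = re.split(r"\.\s+|\.\n", paragraph)
    return all(CITE.search(s) for s in sentences if NUM.search(s))

# \b needs a non-word character before the digit, so the digits inside
# 'QN1206' never start a match:
assert NUM.search("X (NTA QN1206) in Queens") is None
assert NUM.search("11 flood complaints in 200 m") is not None
```

A sentence like the neighborhood opener therefore carries no numeric claim at all, and `citations_dense` no longer demands a citation for it.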
### 6.4 Why no native Granite 4.x inline citations

We investigated using Granite's native `<|start_of_cite|>{document_id:
X}fact<|end_of_cite|>` mode. **It's deprecated in 4.x.** Verified:

- The official Ollama chat template for `granite4.x` has no citation
  branch (the 3.3 / 4.0-preview templates did).
- `granite_common` ships only `granite3/granite32` and
  `granite3/granite33` subdirs — no 4.x equivalent.
- `granite-io` has only `granite_3_2/` and `granite_3_3/` processor
  dirs.

The base 4.1 weights still contain the cite tokens (training residue),
so the model emits them as real tokens when nudged — but only as an
end-of-response list, not inline in prose. IBM's published 4.x
grounding path is a separate **Citation Generation LoRA** (built on
`granite-4.0-micro`, not 4.1) requiring HF transformers + LoRA
loading. Mellea's `OllamaBackend` explicitly raises
`NotImplementedError` for activated LoRAs. So our hand-rolled
`[doc_id]` regex + reroll **is** the right pattern for our setup
(Granite 4.1 via Ollama, inline placement).

---

## 7. The four foundation models

| Model | Params | Runtime | Role |
|-------|--------|---------|------|
| **Granite 4.1 :3b** | 3 B | Ollama (GPU on T4) | Planner (intent + specialist routing) + `live_now` reconciler. |
| **Granite 4.1 :8b** | 8 B | Ollama (GPU on T4) | Synthesis reconciler for `single_address`, `neighborhood`, `development_check`. Validated by Mellea (4 grounding requirements + reroll). |
| **Granite Embedding 278M** | 278 M | sentence-transformers (CPU) | RAG retrieval over 5 policy PDFs at query time. |
| **Prithvi-EO 2.0** | 300 M | TerraTorch (offline pre-compute) | Sen1Floods11 fine-tune; segmented Hurricane Ida 2021 pre/post Sentinel-2 polygons baked into `data/`. |
| **Granite TimeSeries TTM r2** | 1.5 M | granite-tsfm (CPU) | Zero-shot forecast of the Battery surge residual, ~9.6 h horizon. |

**Granite 4.1 ≠ Granite TimeSeries.** Granite 4.1 is IBM's chat-LLM
family. Granite TimeSeries TTM is a separate IBM Research product
line (Ekambaram et al. 2024, NeurIPS). Both happen to share the
"Granite" brand but have different architectures, training data, and
authors.

### 7.1 Why Prithvi runs offline

Prithvi-EO 2.0 with TerraTorch needs a GPU and minutes per HLS tile.
We segmented Hurricane Ida 2021 once (pre: 2021-08-25, post:
2021-09-02, ~12 h after peak), then filtered the output (>30 000 sq ft
to drop noise, <1 km² to drop tidal artifacts) into **166 polygons**
baked into `data/prithvi_ida_2021.geojson`. The runtime FSM does a
point-in-polygon test, not fresh inference. This is honest about
where foundation models earn their keep: **once, to produce a
defensible event-level signal — not per request**.

### 7.2 Why TTM r2 runs live

TTM r2 is **1.5 M params** — vastly smaller than Prithvi or Granite
4.1. Inference is millisecond-scale even on CPU. It forecasts only
the residual (surge component) at the Battery, which complements the
462
+ NOAA snapshot specialist; it does **not** try to forecast the
463
+ astronomical tide (NOAA already publishes that exactly).
464
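The residual framing can be sketched with plain arithmetic (all
numbers below are invented; the real series comes from NOAA's
6-minute observations and predictions):

```python
# Sketch of the residual decomposition the nowcast relies on:
# observed water level = astronomical tide (published by NOAA) + surge residual.
observed = [1.42, 1.55, 1.61]   # ft, hypothetical 6-min water levels
predicted = [1.30, 1.38, 1.40]  # ft, NOAA astronomical tide prediction

residual = [o - p for o, p in zip(observed, predicted)]
# The TTM nowcast would forecast future residual values from this
# series; here we just carry the last residual forward.
forecast_residual = residual[-1]
next_tide = 1.41  # ft, next published astronomical value
forecast_level = next_tide + forecast_residual
print(round(forecast_level, 2))  # → 1.62
```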

---

## 8. Live signals separation

Live data (steps 6–9 in the FSM diagram) is fundamentally different
from static layers and is handled separately:

- **Surface**: in evidence cards and a "Right now" section in the UI.
- **Score**: explicitly excluded. Tier is reproducible across queries
  unless source data changed.
- **Cadence**: NOAA tides update every 6 min; NWS alerts on push;
  NWS obs ~hourly; TTM is computed per query (cheap).
- **Failure mode**: graceful — if NOAA times out, no `noaa_tides`
  doc is emitted; the reconciler simply doesn't see it.
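The failure mode can be sketched as follows. Function names and
payloads are illustrative, not the real `app/context/` API; the point
is that a timed-out specialist yields `None` and therefore never
becomes a document:

```python
# Illustrative sketch, not the real specialist API.
def fetch_noaa_tides(timeout_s=5.0):
    """Pretend the NOAA request timed out; return None instead of raising."""
    try:
        raise TimeoutError("NOAA gateway timed out")  # simulated network failure
    except TimeoutError:
        return None  # graceful: no doc, nothing bubbles up to the FSM

def build_docs(specialist_results):
    # Only specialists that actually returned data become document messages.
    return [
        {"role": f"document {doc_id}", "content": str(payload)}
        for doc_id, payload in specialist_results.items()
        if payload is not None
    ]

results = {"noaa_tides": fetch_noaa_tides(), "sandy_zone": {"zone": "AE"}}
docs = build_docs(results)
print([d["role"] for d in docs])  # → ['document sandy_zone']
```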

This mirrors how First Street separates Flood Factor (static, 30-yr)
from event-day Flood Lab products, and how Fathom separates Global
Flood Map from real-time intelligence.

---

## 9. Repository layout

```
riprap-nyc/
  ARCHITECTURE.md       this file
  METHODOLOGY.md        scoring methodology + full citations
  README.md             HF Spaces frontmatter + user-facing summary
  Dockerfile            nvidia/cuda:12.4 base + Ollama + Granite
  entrypoint.sh         Ollama daemon + uvicorn launcher
  requirements.txt      runtime deps (FastAPI, geopandas,
                        sentence-transformers, ollama, burr, granite-tsfm)
  pyproject.toml        ruff + vulture config
  riprap.py             CLI driver for register builds
  agent.py              single-address CLI

  app/
    fsm.py              Burr FSM (14 actions; Mellea hooks via threadlocal)
    planner.py          Granite 4.1:3b intent router (5 intents)
    geocode.py          NYC DCP Geosearch + borough-hint filter
    reconcile.py        Granite 4.1 grounded reconciler + numeric guardrail
    mellea_validator.py streaming rejection sampler + 4 grounding checks
    rag.py              Granite Embedding 278M retrieval
    score.py            deterministic exposure rubric (3 sub-indices, floor)
    spatial.py          geopandas join helpers
    energy.py           per-query inference Wh accounting
    register_builder.py bulk-mode runner (offline)

    intents/            per-intent orchestration on top of fsm.py
      live_now.py           shoreline tide + alerts (cheap, non-strict)
      single_address.py     drives the linear FSM with strict reconcile
      neighborhood.py       polygon-aggregated specialists
      development_check.py  DOB permit overlap with flood polygons
      compare.py            two-address side-by-side
    areas/
      nta.py            NYC NTA 2020 polygon resolver

    flood_layers/
      sandy_inundation.py   NYC OD 5xsi-dfpx
      dep_stormwater.py     9i7c-xyvv (3 scenarios)
      ida_hwm.py            USGS STN Event 312
      prithvi_water.py      Ida pre/post diff polygons (offline-built)

    context/
      microtopo.py      DEM + TWI + HAND raster sampling
      nyc311.py         erm2-nwe9 buffer aggregation
      floodnet.py       api.floodnet.nyc Hasura GraphQL
      noaa_tides.py     live water level + residual
      nws_alerts.py     live alerts at point
      nws_obs.py        nearest ASOS hourly METAR

    live/
      ttm_forecast.py   Granite TTM r2 surge-residual nowcast

    assets/
      schools.py        DCP FacDB
      nycha.py          phvi-damg
      mta_entrances.py  i9wp-a4ja

  web/
    main.py             FastAPI (5 pages, JSON endpoints, 2 SSE streams)
    static/
      index.html        classic single-address report (compatibility)
      agent.html        primary UI: planner + live trace + briefing
      agent.js          EventSource client; sets properties on
                        <r-briefing> / <r-trace> / <r-sources-footer>
      report.html / .js auditable PDF-formatted export view
      compare.html / .js  two-address side-by-side
      register.html / .js bulk register browser
      style.css         IBM Plex Sans, Planning Labs idiom
      dist/             Svelte 5 custom-element bundle (committed —
                        HF Spaces doesn't run a Node build).
                        Built from web/svelte/ via `npm run build`.

    web/svelte/         Svelte 5 source. Build → web/static/dist/.
      package.json      vite + @sveltejs/vite-plugin-svelte
      vite.config.js    lib mode; customElement: true globally
      src/main.js       registers <r-briefing>, <r-trace>,
                        <r-sources-footer>; re-exports stores
      src/lib/stores.js highlightedDocId, citeIndex (writable)
      src/lib/Briefing.svelte
      src/lib/Trace.svelte
      src/lib/SourcesFooter.svelte

  scripts/              offline pre-compute + diagnostic probes
    run_prithvi_ida.py
    compute_hydrology_indices.py
    fetch_nyc_dem.py
    fetch_ida_hwms.py
    build_schools_register.py
    build_nycha_register.py
    build_mta_entrances_register.py
    probe_mellea.py     drives the SSE stream N times, dumps
                        per-attempt pass/fail to CSV

  corpus/               5 LFS-tracked NYC policy PDFs
  data/                 LFS-tracked baked fixtures
    sandy_inundation.geojson
    prithvi_ida_2021.geojson   166 Hurricane Ida polygons
    ida_2021_hwms_ny.geojson
    nyc_dem_30m.tif, twi.tif, hand.tif
    schools.geojson, nycha.geojson, mta_entrances.geojson
    dep/                Esri FileGDBs (DEP scenarios)
  registers/            pre-computed register outputs
```


---

## 10. Honest scope (what Riprap does NOT do)

- **Not a damage probability.** Riprap is exposure triage. We have no
  labeled flood-damage outcomes (claim records, insurance loss data),
  so we cannot calibrate. The tier is a literature-grounded prior,
  not a prediction.
- **Not a flood insurance rating.** For that, see FEMA Risk Rating 2.0
  (claims-driven GLM over decades of labeled outcomes).
- **Not a vulnerability assessment.** Engineering fragility (foundation
  type, electrical hardening, drainage condition), social capacity,
  and financial absorption are out of scope.
- **No sub-surface flooding.** Optical satellites can't see basement
  apartments or subway entrances — the dominant Hurricane Ida damage
  mode in NYC. Prithvi correctly emits no polygons for Hollis or
  Carroll Gardens. That silence is a feature, not a bug.
- **Vintage-bounded.** FEMA NFHL is years stale; DEP Stormwater Maps
  are 2021; corpus PDFs are point-in-time. All vintages are cited in
  the methodology panel.
- **Public infrastructure only.** ConEd substations, water-supply
  components, and other adversarially-sensitive registers are not
  published. NYC OD has the same redaction posture; we follow it.

---

## 11. Why local foundation models

1. **Data governance.** A newsroom with FOIL'd documents, an agency
   capital planner with internal data, or a researcher under IRB
   constraints can't paste organization context into a vendor LLM.
   All four models run inside this container; the org boundary
   holds. Public NYC and USGS services receive resolved address
   coordinates only; no LLM vendor does.
2. **Inference energy.** Granite 4.1:3b draws roughly **0.03 Wh per
   query** vs an estimated **~0.3 Wh per query** for GPT-4o-class
   frontier models ([Epoch AI, 2025](https://epoch.ai/gradient-updates/how-much-energy-does-chatgpt-use)),
   an order of magnitude less per-query inference energy. The
   methodology panel reports a per-query Wh estimate so users can
   verify.
3. **Reproducibility.** Apache-2.0 stack end to end; no commercial
   licenses required to reproduce the system.

---

## 12. Deployment

### 12.1 Hugging Face Spaces (production)

Docker SDK, base `nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04`
(Python 3.10), hardware `nvidia-t4-small` (1× T4, 16 GB VRAM,
4 vCPU, 15 GB RAM). Ollama + **both** Granite 4.1 variants
(`:3b` for routing, `:8b` for synthesis) baked into the image at
build time (~10 GB total). Granite Embedding 278M and Granite TTM r2
download to `$HF_HOME` on first request (~280 MB and ~30 MB).

`entrypoint.sh` starts Ollama, then **pre-warms `granite4.1:8b`** with
a one-token generation so the first user reconcile doesn't pay the
~30 s VRAM-load tax. `OLLAMA_KEEP_ALIVE=24h` holds both models resident
through the demo. `OLLAMA_FLASH_ATTENTION=1` and
`OLLAMA_KV_CACHE_TYPE=q8_0` cut KV memory on the 8b path.

Cold-start (first query after container restart) takes ~60–90 s while
weights load and TTM downloads. Warm queries:

- `live_now` ~3–6 s
- `single_address` / `neighborhood` / `development_check` ~30–60 s
  with Mellea (one streamed attempt + post-validation; one reroll
  adds ~25 s)

The Svelte bundle in `web/static/dist/` is committed, so HF Spaces
runs no Node build step — only the Python deps + Ollama install.

### 12.2 Local development

```bash
uv venv --python 3.12
source .venv/bin/activate
uv pip install -r requirements.txt
ollama pull granite4.1:3b
ollama pull granite4.1:8b
uvicorn web.main:app --reload --port 8000

# Frontend (only when changing components)
cd web/svelte && npm install && npm run build
```

The fixtures in `data/` and the policy PDFs in `corpus/` are
LFS-tracked. Granite Embedding and TTM download on first query.

### 12.3 Diagnostic probes

```bash
# Drive the live stream N times, dump per-attempt Mellea outcomes:
.venv/bin/python scripts/probe_mellea.py --query "Hollis" --runs 5
# Output: outputs/probe_*.csv with per-attempt pass/fail, paragraph,
# elapsed time, reroll count.
```

---

## 13. License

Apache-2.0. All foundation models (Granite 4.1, Granite Embedding,
Prithvi-EO 2.0, Granite TimeSeries TTM r2) and all input datasets
(NYC OpenData, USGS, NOAA, NWS, FloodNet NYC, NASA/MS Planetary
Computer for HLS Sentinel-2) are public. Visual idiom adapted from
[NYC Planning Labs](https://planninglabs.nyc/).
CLAUDE.md ADDED
# Riprap — Claude Code orientation

Citation-grounded NYC flood-exposure briefings. Granite 4.1 via a
LiteLLM Router (Ollama for local/T4, vLLM-on-ROCm for the AMD MI300X
demo path), Mellea-validated reconciliation, vanilla JS + Svelte 5
custom elements, FastAPI on T4 (HF Spaces).
**AMD hackathon demo: May 4–10, 2026.**

`ARCHITECTURE.md` is the source of truth for *what the system does*.
This file is for *how to work on it*.

---

## Critical constraints

- **HF Spaces base image is Python 3.10.** This pins:
  - `mellea<0.4` (0.4+ requires 3.11+) — no `find_citations` /
    `flag_hallucinated_content` intrinsics in production.
  - `transformers>=4.55,<5` + `huggingface_hub>=0.34,<1` — coexistence
    with `granite-tsfm 0.3.x` (which calls `transformers.utils.download_url`,
    removed in transformers 5.x).
  - Don't bump these without testing the full HF rebuild end-to-end.
  - Local venv is Python 3.12 — Mellea 0.4.x is installed there but
    its RAG intrinsics need a HuggingFace transformers backend (LoRA
    loading); they don't work over Ollama. Don't accidentally rely on
    them.

- **All LLM calls go through `app/llm.py`.** Never `import ollama`
  in new code. The shim exposes `chat(model, messages, options,
  stream, format)` with the same return shape as `ollama.chat`, and
  routes through a LiteLLM Router. Two backends are wired:
  - `RIPRAP_LLM_PRIMARY=ollama` (default) — local + HF Space path.
    Quant override: `RIPRAP_OLLAMA_8B_TAG=granite4.1:8b-q3_K_M`
    saves ~1 GB resident vs the default Q4_K_M.
  - `RIPRAP_LLM_PRIMARY=vllm` + `RIPRAP_LLM_BASE_URL` +
    `RIPRAP_LLM_API_KEY` — AMD MI300X demo path. Auto-fails over to
    Ollama if vLLM is unreachable. The same env vars work for local
    dev, HF Space → AMD, or AMD droplet → AMD self-host.

  An mlx-lm-backed third backend was prototyped (Apple-Silicon-native
  via `mlx_lm.server` with speculative decoding) but reverted — the
  install bumped torch internals in a way that broke `terratorch`'s
  Prithvi backbone with a `meta vs cpu` device mismatch. Stick with
  Ollama on local; switch to vLLM for the AMD demo. mlx-lm can be
  revisited once the EO toolchain isolates its torch state.
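The failover shape can be sketched as below. This is illustrative
only; the real shim (`app/llm.py`) delegates routing and fallback to
a LiteLLM Router rather than hand-rolling the dispatch, and the
function names here are made up:

```python
# Illustrative sketch of primary-with-fallback dispatch; not app/llm.py.
import os

def chat_with_fallback(messages, call_vllm, call_ollama):
    primary = os.environ.get("RIPRAP_LLM_PRIMARY", "ollama")
    if primary == "vllm":
        try:
            return call_vllm(messages)
        except ConnectionError:
            pass  # vLLM unreachable → fall back to local Ollama
    return call_ollama(messages)

# Simulated backends: vLLM is "down", Ollama answers.
def vllm_down(msgs):
    raise ConnectionError("droplet unreachable")

def ollama_ok(msgs):
    return {"message": {"role": "assistant", "content": "ok"}}

os.environ["RIPRAP_LLM_PRIMARY"] = "vllm"
resp = chat_with_fallback([{"role": "user", "content": "hi"}], vllm_down, ollama_ok)
print(resp["message"]["content"])  # → ok
```

The load-bearing property is that callers never see the failover;
they always get an `ollama.chat`-shaped response back.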

- **Ollama and vLLM use different chat templates.** Ollama's
  Modelfile recognises `role: "document <doc_id>"` and bundles those
  into a `<documents>` block. The HF tokenizer chat template (used
  by vLLM) silently drops non-standard roles. `app/llm.py` papers
  over this: it extracts document-role messages into
  `extra_body.documents` / `chat_template_kwargs.documents` for vLLM,
  while leaving them in `messages` for the Ollama fallback. It also
  normalizes vLLM's `[doc_id=X]` emissions back to `[X]` so Mellea
  checks and frontend chips see the same format from both paths.
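That normalization step is essentially one regex. A sketch (the exact
pattern in `app/llm.py` may differ):

```python
import re

def normalize_citations(text: str) -> str:
    # vLLM's HF chat template emits [doc_id=sandy_zone]; Ollama emits
    # [sandy_zone]. Downstream consumers (Mellea checks, frontend
    # chips) expect the bare form, so rewrite the former to the latter.
    return re.sub(r"\[doc_id=([^\]]+)\]", r"[\1]", text)

print(normalize_citations("AE zone [doc_id=sandy_zone], 14 reports [nyc311]"))
# → AE zone [sandy_zone], 14 reports [nyc311]
```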

- **The vLLM deployment serves only the 8B.** One served name per
  vLLM process, and we don't run two containers. The planner alias
  (`granite-3b`) is mapped to the same served name as the reconciler
  (`granite-4.1-8b`) when primary=vllm. On Ollama, 3B and 8B are
  distinct. Override per-alias with `RIPRAP_LLM_VLLM_3B_NAME` /
  `RIPRAP_LLM_VLLM_8B_NAME` if you stand up a second vLLM.

- **No LoRA / aLoRA / Granite Citation LoRA in production.** Even
  with vLLM available, we don't load LoRAs at runtime — Mellea's
  Ollama backend raises `NotImplementedError` for activated LoRAs,
  and we deliberately keep the call path identical across backends.
  Hand-rolled `[doc_id]` regex + reroll is the citation-discipline
  mechanism. See §6 of ARCHITECTURE.md.

- **Two committed JS bundles, two source dirs.** HF Spaces does not
  run Node, so we ship pre-built artefacts:
  - `web/sveltekit/build/` — **the new design-system UI** (SvelteKit +
    adapter-static, IBM Plex, four-tier glyphs, MapLibre). Sources in
    `web/sveltekit/src/`. Rebuild with `cd web/sveltekit && npm run
    build`. FastAPI serves it at `/`, `/q/sample`, `/q/<query>`.
  - `web/static/dist/riprap.js` — legacy custom-element bundle. Sources
    in `web/svelte/src/`. Rebuild with `cd web/svelte && npm run
    build`. FastAPI serves it at `/legacy`, `/single`, `/compare`,
    `/register/*` while the new UI is being filled in.

  Commit both build outputs after editing the corresponding sources.

- **Models baked into the Docker image.** Both `granite4.1:3b` and
  `granite4.1:8b` are pulled at build time (~10 GB), so HF rebuilds
  take ~10 min. `entrypoint.sh` pre-warms the 8b into VRAM after
  Ollama is up so the first reconcile doesn't pay a cold-load.

---

## Run / build / test

```bash
# Local server (default: routes to local Ollama)
cd /Users/amsrahman/riprap-nyc
.venv/bin/uvicorn web.main:app --host 127.0.0.1 --port 7860 --log-level info
# → http://127.0.0.1:7860/ (primary UI; agent.html is the canonical home)

# Local server pointed at AMD MI300X (vLLM primary, Ollama fallback)
RIPRAP_LLM_PRIMARY=vllm \
RIPRAP_LLM_BASE_URL=http://<droplet-ip>:8000/v1 \
RIPRAP_LLM_API_KEY=<token> \
.venv/bin/uvicorn web.main:app --host 127.0.0.1 --port 7860 --log-level info
# Pill in the top-right shows "AMD MI300X · Granite 4.1 / vLLM" when
# the primary is reachable; flips amber on Ollama fallback, red if
# everything is down. Backed by GET /api/backend.

# Frontend rebuilds (only when sources change)
cd web/sveltekit && npm run build   # writes web/sveltekit/build/ (new UI)
cd web/svelte && npm run build      # writes web/static/dist/riprap.js (legacy)

# Static checks (all should be clean)
.venv/bin/ruff check app/ web/ scripts/
.venv/bin/vulture app/ web/main.py --min-confidence 90
.venv/bin/radon cc app/ web/main.py -s -n C   # complexity hotspots

# Programmatic Mellea probe (server must be running)
.venv/bin/python scripts/probe_mellea.py --query "Hollis" --runs 5
# Output: outputs/probe_*.csv with per-attempt pass/fail, paragraph,
# elapsed time, reroll count.

# Smoke-test the streaming endpoint directly
curl -sN "http://127.0.0.1:7860/api/agent/stream?q=Hollis" --max-time 120

# Local-tuning env knobs (independent of backend):
#   OLLAMA_KEEP_ALIVE=24h          keep granite4.1:8b resident across requests
#   OLLAMA_NUM_PARALLEL=1          stop Ollama loading a 2nd copy under contention
#   RIPRAP_MELLEA_MAX_ATTEMPTS=2   cap rejection-sampling rerolls (default 2 local, 3 remote)
#   RIPRAP_TRIM_DOCS=1             drop doc messages whose specialist isn't in plan (default on)
#   RIPRAP_OLLAMA_8B_TAG=granite4.1:8b-q3_K_M   ~1 GB lighter than default Q4_K_M
```

**Don't restart uvicorn while a model is mid-generation** — Ollama will
keep the request alive but the FastAPI handler dies, leaving the user
staring at a dead stream. Pre-flight: `pkill -f "uvicorn web.main:app"`.

---

## Deploy

Single command for both remotes:

```bash
git push && git push huggingface main
```

GitHub remote = `origin` (msradam/riprap-nyc). HF Space remote =
`huggingface` (msradam/riprap-nyc on huggingface.co).

HF rebuild status:

```bash
curl -sf "https://huggingface.co/api/spaces/msradam/riprap-nyc/runtime" \
  | python3 -m json.tool
# stage: BUILDING | RUNNING_APP_STARTING | RUNNING
# sha: should match the latest local commit when RUNNING
```

Live URL: <https://msradam-riprap-nyc.hf.space>

---

## Repo map (high-signal files)

```
app/
  llm.py              LiteLLM Router shim. chat(model, messages, options,
                      stream, format) — drop-in for ollama.chat. Routes
                      to vLLM (AMD MI300X) when RIPRAP_LLM_PRIMARY=vllm,
                      with Ollama fallback. Extracts role="document <id>"
                      into extra_body.documents for vLLM's HF chat
                      template; normalizes [doc_id=X] -> [X]. backend_info()
                      powers the UI pill via web/main.py:/api/backend.
  fsm.py              Burr FSM. Threadlocal hooks: set_strict_mode,
                      set_token_callback, set_mellea_attempt_callback.
                      step_reconcile() routes to reconcile_strict_streaming
                      when strict mode is on.
  reconcile.py        EXTRA_SYSTEM_PROMPT (the 4-section skeleton + citation
                      discipline). build_documents() is the doc_id ride-along.
                      verify_paragraph() is the legacy non-strict guardrail.
  mellea_validator.py reconcile_strict_streaming() — the streaming rejection
                      sampler with 4 grounding checks (numerics_grounded,
                      no_placeholder_tokens, citations_dense,
                      citations_resolve). Reroll feedback names the specific
                      failing sentences.
  planner.py          Granite 4.1:3b intent router → live_now / single_address
                      / neighborhood / development_check / compare.
  intents/            Per-intent orchestration. Each run() takes
                      (plan, query, progress_q, strict). Strict path uses
                      reconcile_strict_streaming via either threadlocal
                      (single_address, fsm-based) or direct call
                      (neighborhood, dev_check).
  rag.py              Granite Embedding 278M retrieval over corpus/*.pdf.
  flood_layers/       Sandy zone, DEP scenarios, Ida HWMs, Prithvi polygons.
  context/            Microtopo (HAND/TWI), 311, FloodNet, NOAA, NWS, DOB.
  live/ttm_forecast.py  Granite TTM r2 surge residual nowcast.

web/
  main.py             FastAPI; SSE stream at /api/agent/stream emits
                      plan_token, plan, step, token, mellea_attempt,
                      final, error, done events. Also GET /api/backend
                      (live LLM-backend descriptor + reachability ping
                      for the UI pill). All other LLM traffic goes
                      through app/llm.py — don't add ollama.chat calls
                      here.
  static/
    agent.html        Primary UI. Mounts <r-briefing>, <r-trace>,
                      <r-sources-footer> (Svelte custom elements).
    agent.js          EventSource client. setBriefingText() sets the
                      .text property on <r-briefing>; pushTraceStep()
                      calls .pushStep() on <r-trace>. Form binding is
                      BEFORE ensureMap() so a WebGL throw doesn't
                      strand the Ask button.
    dist/riprap.js    Built Svelte bundle (committed).
    components/       OLD Lit components — kept for reference but
                      not loaded by agent.html anymore.
  svelte/src/lib/     Svelte 5 sources. customElement: true globally
                      via vite.config.js.
    stores.js         highlightedDocId, citeIndex (writable). The
                      cross-component chip ↔ source-row highlight
                      reacts via these.

scripts/
  probe_mellea.py     Drives the SSE stream N times, dumps CSV.
  run_prithvi_ida.py  Offline Prithvi-EO 2.0 segmentation (one-shot).
  build_*_register.py Bulk-mode register builders (offline).

corpus/               5 LFS-tracked NYC policy PDFs (NPCC4 etc).
data/                 LFS-tracked baked fixtures (Sandy, DEP, Prithvi
                      polygons, DEM/HAND/TWI rasters, registers).
```


---

## Project conventions

### Document message convention

Specialists emit data as chat messages with `role="document <doc_id>"`.
Granite 4.1's Ollama template recognises this prefix and bundles them
into a `<documents>` block + auto-injects IBM's grounded-generation
system message. Don't reinvent — `app/reconcile.py:build_documents()`
already wires it. `app/llm.py` additionally extracts the same messages
into `chat_template_kwargs.documents` so vLLM's HF tokenizer template
sees them too — both backends honour the same grounding contract from
identical caller code.
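A minimal sketch of the convention (simplified from
`build_documents()`, which has per-specialist logic; the payloads
here are invented):

```python
# Sketch of the role="document <doc_id>" message convention.
def build_documents(specialist_payloads):
    msgs = []
    for doc_id, payload in specialist_payloads.items():
        # Granite's Ollama template bundles these into <documents>;
        # app/llm.py re-extracts them for vLLM's HF template.
        msgs.append({"role": f"document {doc_id}", "content": payload})
    return msgs

msgs = build_documents({
    "sandy_zone": "Parcel is inside the 2012 Sandy inundation zone.",
    "noaa_tides": "Battery water level 1.62 ft above MLLW at 14:36.",
})
print([m["role"] for m in msgs])
# → ['document sandy_zone', 'document noaa_tides']
```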

### The four Mellea grounding requirements

1. **`numerics_grounded`** — every non-trivial number in the output
   appears verbatim in a source document.
2. **`no_placeholder_tokens`** — output contains no leaked
   `[source]` / `<document>` template fragments.
3. **`citations_dense`** — every non-trivial number has a `[doc_id]`
   citation **somewhere in the same sentence**. Sentence scope, not a
   character window. Identifier codes (`QN1206`, BBL parcels, `B12`)
   are skipped via `\b` word-boundary regex so they don't get treated
   as numeric claims.
4. **`citations_resolve`** — cited `doc_id`s ⊆ input `doc_id`s.
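A simplified sketch of the sentence-scoped `citations_dense` check.
The real implementation in `app/mellea_validator.py` uses `_NUM_RE`,
`_TRIVIAL_NUMS`, and `_SENT_END` and is more careful; the regexes and
exemption set below are illustrative:

```python
import re

TRIVIAL = {"1", "2", "311", "911"}             # illustrative exemption set
NUM_RE = re.compile(r"\b\d+(?:\.\d+)?\b")      # \b skips QN1206-style ids
CITE_RE = re.compile(r"\[[a-z0-9_]+\]")

def citations_dense(paragraph: str) -> bool:
    # Every sentence containing a non-trivial number must also
    # contain at least one [doc_id] citation.
    for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
        nums = [n for n in NUM_RE.findall(sentence) if n not in TRIVIAL]
        if nums and not CITE_RE.search(sentence):
            return False
    return True

print(citations_dense("The gauge read 1.62 ft [noaa_tides]. Call 311 for help."))
# → True
print(citations_dense("The gauge read 1.62 ft. Sources: [noaa_tides]."))
# → False
```

Note the second example fails even though a citation exists: it is in
the wrong sentence, which is exactly the distinction sentence scoping
enforces.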

If you change the regex or sentence boundary, **re-run the probe**:

```bash
.venv/bin/python scripts/probe_mellea.py --query "Hollis" --runs 5
.venv/bin/python scripts/probe_mellea.py --query "100 Gold St Manhattan" --runs 3
.venv/bin/python scripts/probe_mellea.py --query "what are they building in Gowanus and is it risky" --runs 3
```

### Threadlocal hooks in `app/fsm.py`

The FSM is sync code called from a threadpool executor. To plumb
streaming callbacks without changing every action signature, we use
threadlocals:

- `set_strict_mode(bool)` → `_current_strict_mode()` decides whether
  `step_reconcile` routes to Mellea or the legacy reconciler.
- `set_token_callback(fn)` → `_current_token_callback()` for streaming
  tokens out of the reconciler.
- `set_mellea_attempt_callback(fn)` → fires after each Mellea attempt
  with `(attempt_idx, passed, failed)`.

**Always reset in a `finally:`.** `app/intents/single_address.py:run()`
is the canonical example.
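The reset-in-`finally` discipline, reduced to a sketch (plumbing
simplified; the real hooks live in `app/fsm.py`):

```python
# Sketch of the threadlocal hook + mandatory reset pattern.
import threading

_local = threading.local()

def set_strict_mode(value):
    _local.strict = value

def current_strict_mode():
    return getattr(_local, "strict", False)

def run_intent():
    set_strict_mode(True)
    try:
        # ... drive the FSM; step_reconcile consults current_strict_mode()
        return current_strict_mode()
    finally:
        # Always reset, even on exceptions: the worker thread is reused
        # by the threadpool and must not leak strict mode into the next query.
        set_strict_mode(False)

print(run_intent(), current_strict_mode())  # → True False
```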

### SSE event vocabulary (`/api/agent/stream`)

| event | payload | when |
|-------|---------|------|
| `hello` | `{query}` | connection open |
| `plan_token` | `{delta}` | each token of the planner JSON |
| `plan` | `{intent, targets, specialists, rationale}` | planner finished |
| `step` | `{step, ok, started_at, elapsed_s, result?, err?}` | each FSM action |
| `token` | `{delta, attempt?}` | each Granite reconcile token |
| `mellea_attempt` | `{attempt, passed, failed}` | end of each Mellea attempt |
| `final` | full result dict (`paragraph`, `mellea`, `audit`, `tier`, `score`, ...) | reconcile done |
| `error` | `{err}` | exception in the runner |
| `done` | `{}` | stream closing |

Frontend resets the briefing buffer when `token.attempt` changes
(handles reroll cleanly).
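For ad-hoc debugging it can help to replay a captured stream offline.
A minimal, stdlib-only sketch of the wire format above (not
production code; real clients use `EventSource` in `agent.js` or the
probe script):

```python
import json

def parse_sse(raw: str):
    """Minimal parser for the /api/agent/stream wire format."""
    events = []
    for block in raw.strip().split("\n\n"):
        event, data = "message", ""
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data += line[len("data:"):].strip()
        events.append((event, json.loads(data) if data else {}))
    return events

raw = (
    'event: plan\ndata: {"intent": "neighborhood"}\n\n'
    'event: token\ndata: {"delta": "Hollis ", "attempt": 1}\n\n'
    'event: done\ndata: {}\n\n'
)
print([e for e, _ in parse_sse(raw)])  # → ['plan', 'token', 'done']
```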

### Frontend property convention

Svelte custom elements take props via JS property setters:

```js
const el = document.getElementById("paragraph"); // <r-briefing>
await customElements.whenDefined("r-briefing");
el.sourceLabels = SOURCE_LABELS;
el.text = "...streaming markdown...";
```

`<r-trace>` exposes imperative methods on the host:

```js
el.pushStep({ step: "geocode", ok: true, elapsed_s: 0.3, result: {...} });
el.clear();
```

`<r-sources-footer>` reads `citeIndex` from the shared store; the
Briefing populates it whenever its `bodyHtml` is computed.

---

## Decisions worth remembering

These are paths we explored and either chose or ruled out. Don't
re-litigate them without new information.

- **Lit → Svelte (May 2026).** Three Lit components were live first
  (`web/static/components/`) but the user wanted a full Svelte
  rewrite. Migrated to Svelte 5 custom-element bundle (drop-in
  replacement — same tag names, same property API). The Lit files
  are still on disk for reference but not loaded.

- **Granite 4.x native inline citations are deprecated.** We
  investigated the `<|start_of_cite|>...<|end_of_cite|>` mode. The
  official Ollama template removed it for 4.x; `granite_common` ships
  no `granite4/` package; `granite-io` has no 4.x processor.
  4.1 emits citation tokens only in an end-of-response list, never
  inline. IBM's expected 4.x citation path is a separate LoRA on
  granite-4.0-micro that produces post-hoc JSON — needs HF
  transformers, not Ollama. **Hand-rolled `[doc_id]` regex + reroll
  is the right pattern for our setup.**

- **Mellea 0.4 RAG intrinsics aren't reachable.** `find_citations`,
  `flag_hallucinated_content`, `check_context_relevance` all route
  through `GraniteCommonAdapter` → activated LoRA on the HF
  transformers backend. `mellea/backends/ollama.py:357-359` literally
  raises `NotImplementedError` for activated LoRAs. To use them we'd
  swap the serving layer, eat ~5 GB more RAM, and lose Ollama's
  optimizations. Not worth it for the demo.

- **CARTO Voyager basemap (not Stadia).** Tried Stadia Alidade
  Smooth — looks great, but they 401 without an API key and
  domain allowlist. Voyager is auth-free, retina-tiled, more
  editorial than Positron.

- **Speculative streaming Mellea.** `reconcile_strict_streaming`
  streams every attempt's tokens to the user (visible at t≈30 s
  instead of after t≈95 s of validation silence). An inline banner
  shows reroll status. Felt latency drops dramatically even when
  total wall-clock is the same.

- **Sentence-scoped `citations_dense` + identifier-aware `\b` regex.**
  The combo killed the chronic 3/4 reroll loop on neighborhood
  queries. Hollis: was 3/4 with 2 rerolls every run; now 4/4 with
  ≤1 reroll. Don't tighten the regex back to a fixed-width window
  without re-running the probe across all three intent types.

- **LiteLLM Router for backend abstraction (May 2026).** Considered
  hand-rolling an OpenAI-vs-Ollama dispatch ourselves. LiteLLM's
  Router gives us model aliasing + fallback for free, and Mellea
  has a litellm backend if we ever need it. The shim is ~250 lines
  total (`app/llm.py`); the entire production code path stayed in
  the `ollama.chat`-shaped call signature. Don't replace this with
  the openai SDK directly — the failover behaviour is load-bearing.

- **Granite 4.1 is dense decoder-only.** Earlier confusion: the
  hybrid Mamba variants are in **Granite 4.0-H**, not 4.1. vLLM
  0.17 serves 4.1 as a vanilla LLaMA-style model — no architecture
  risk, no special flags. If a future bump introduces a hybrid 4.x,
  re-verify vLLM compatibility before deploying.

- **vLLM HF chat template emits `[doc_id=X]`, Ollama Modelfile emits
  `[X]`.** The rest of Riprap (Mellea regex, frontend chip parser,
  citations footer) was written against `[X]`. `app/llm.py` runs a
  one-line regex normalize on every response and stream chunk. Don't
  remove it without changing every other consumer.

- **HF Space → AMD GPU as primary, T4 Ollama as fallback.** Considered
  using the HF Space's bundled Ollama as a remote inference server
  (proxy `/v1/chat/completions` from FastAPI to localhost:11434) so
  that local dev could use the T4. Rejected: T4 is slower than
  MI300X, the surface area is bigger, and the AMD path already covers
  the "fast remote inference" use case. The proxy idea is recoverable
  in ~25 lines of FastAPI if we ever want it.

---

## Common tasks playbook

### Add a new specialist

1. Add a module under `app/context/` or `app/flood_layers/`.
2. Add an action in `app/fsm.py` (`step_yourname`) with `@action(reads=[...], writes=[...])`.
3. Wire it into the FSM graph in the `Application.with_actions(...)` chain.
4. Add a doc message builder in `app/reconcile.py:build_documents()`.
5. Update `STEP_LABELS` in `web/static/agent.js` for the trace label.
6. Update `SOURCE_LABELS` / `SOURCE_URLS` / `SOURCE_VINTAGES` in
   `web/static/agent.js` for the chip + footer rendering.
7. Double-gate the new specialist: run the SSE probe against both
   `RIPRAP_LLM_PRIMARY=ollama` and `=vllm` and confirm the briefing
   cites the new doc_id with no Mellea regressions.

### Prototype a new specialist (experimental)

For exploratory work that isn't yet ready to land in `app/`:

1. Scaffold `experiments/<NN>_<name>/` with its own `RESULTS.md`,
   smoke tests, and cached fixtures. Don't import from `app/` except
   `app.llm.chat` — keeps the experiment portable.
2. License-check the model: confirm Apache-2.0 / MIT / BSD on the
   actual `LICENSE` file in the model repo (not the HF metadata
   field — they sometimes disagree). Add a row to
   `experiments/shared/licenses.md`.
3. Validate against both `RIPRAP_LLM_PRIMARY=ollama` and
   `=vllm` before proposing integration. Specialist behaviour must
   be backend-independent — never branch on backend in specialist
   code.
4. Only after the experiment passes both gates and produces a
   demo-safe trace UI rendering, propose a PR-style summary for
   integration into `app/`.
+ ### Change the briefing markdown structure
434
+
435
+ 1. Edit `EXTRA_SYSTEM_PROMPT` in `app/reconcile.py`.
436
+ 2. Edit `renderMarkdownPure()` in `web/svelte/src/lib/Briefing.svelte`
437
+ if you add new block syntax.
438
+ 3. Rebuild Svelte: `cd web/svelte && npm run build`.
439
+ 4. Re-run the probe to confirm Mellea still passes.
440
+
441
+ ### Tune the Mellea checks
442
+
443
+ `app/mellea_validator.py`:
444
+ - `_NUM_RE` — number recognition. Use `\b` boundaries to skip
445
+ identifiers.
446
+ - `_TRIVIAL_NUMS` — set of numbers exempt from citation requirement
447
+ (small integers, NYC service line numbers like 311/911).
448
+ - `_check_every_claim_cited()` — sentence-scoped; uses `_SENT_END`
449
+ for boundaries.
450
+ - `_failing_sentences_for_citations()` — feeds the reroll feedback
451
+ prompt with surgical corrections.
452
+
453
+ After any change here: probe across 3 intent types (above).
454
+
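A minimal sketch of the sentence-scoped check, with simplified stand-ins for the internals named above (the shipped regexes in `app/mellea_validator.py` are the authoritative ones):

```python
import re

# Illustrative stand-ins, not the shipped patterns.
_NUM_RE = re.compile(r"\b\d+(?:\.\d+)?%?")
_CITE_RE = re.compile(r"\[[a-z0-9_]+\]")
_SENT_END = re.compile(r"(?<=[.!?])\s+")
_TRIVIAL_NUMS = {"0", "1", "2", "311", "911"}

def failing_sentences(text):
    """Sentences that contain a non-trivial number but no [doc_id] citation."""
    bad = []
    for sent in _SENT_END.split(text):
        nums = [n for n in _NUM_RE.findall(sent) if n not in _TRIVIAL_NUMS]
        if nums and not _CITE_RE.search(sent):
            bad.append(sent)
    return bad
```

The failing sentences are what get quoted back into the reroll feedback prompt, which is why the check is sentence-scoped rather than whole-response.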
455
+ ### Add a new Svelte component
456
+
457
+ 1. Create `web/svelte/src/lib/MyComponent.svelte` with
458
+ `<svelte:options customElement={{ tag: "r-mycomp", props: {...} }} />`.
459
+ 2. Side-effect import it in `web/svelte/src/main.js`.
460
+ 3. Mount `<r-mycomp>` in `agent.html`.
461
+ 4. `cd web/svelte && npm run build`.
462
+ 5. Commit `web/static/dist/riprap.js` and `riprap.js.map`.
463
+
464
+ ---
465
+
466
+ ## Known sharp edges
467
+
468
+ - **`build_documents` complexity (radon F=101).** It's a giant
469
+ `if`/`elif` per specialist. Don't refactor pre-demo; touching it
470
+ risks subtle regressions in doc message ordering, which Granite is
471
+ sensitive to.
472
+
473
+ - **Static assets cache hard.** When iterating on Svelte or `agent.js`,
474
+ the user must hard-reload (⌘⇧R). Cache-busting query strings are
475
+ not in place.
476
+
477
+ - **Ollama keeps stale models loaded across rebuilds locally.** If
478
+ you change a Modelfile or pull a new tag, restart `ollama serve`
479
+ to be sure.
480
+
481
+ - **Burr FSM `iter_steps` mutates global state.** Don't run two
482
+ concurrent `single_address` queries against the same uvicorn
483
+ worker — strict-mode threadlocal makes it safer than it was, but
484
+ there's no per-request isolation.
485
+
486
+ - **Mellea 0.3 vs 0.4 API differences.** Local venv has 0.4 (3.12),
487
+ HF has 0.3 (3.10). `start_session`, `RejectionSamplingStrategy`,
488
+ `MelleaSession.instruct(strategy, requirements,
489
+ return_sampling_results)` are stable across both. Don't import
490
+ anything from `mellea.stdlib.components.intrinsic.*` — that
491
+ package only exists in 0.4 and won't import on HF.
492
+
493
+ - **HF Space sleeps after idle.** Free tier; first request after
494
+ sleep is a 30–90 s cold start. Ping the space before a demo.
495
+
496
+ - **vLLM cold compile / first-call slowdown.** First few requests
497
+ against a fresh `vllm serve` container can log surprisingly low
498
+ throughput (single-digit tokens/s prompt, ~4 tokens/s gen on a
499
+ MI300X) while ROCm kernels JIT-compile and the prefix cache
500
+ warms. Subsequent requests are 30–50× faster. If a benchmark
501
+ reads "vLLM is slow" on the first run, run it three more times
502
+ before believing it.
503
+
504
+ - **Backend pill auto-detection.** `app/llm.py:_default_hardware_label`
505
+ picks `AMD MI300X` when `RIPRAP_LLM_PRIMARY=vllm`, `NVIDIA T4`
506
+ when `SPACE_ID` is set (HF Spaces injects this), `Local` otherwise.
507
+ Override with `RIPRAP_HARDWARE_LABEL` / `RIPRAP_ENGINE_LABEL`
508
+ if you bring up a different GPU.
509
+
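The detection order can be sketched as follows, assuming the explicit override wins (a stand-in for `_default_hardware_label`, not the shipped code):

```python
import os

def default_hardware_label(env=os.environ):
    """Pick the backend pill label from environment, override first."""
    if env.get("RIPRAP_HARDWARE_LABEL"):
        return env["RIPRAP_HARDWARE_LABEL"]
    if env.get("RIPRAP_LLM_PRIMARY") == "vllm":
        return "AMD MI300X"
    if env.get("SPACE_ID"):           # injected by HF Spaces
        return "NVIDIA T4"
    return "Local"
```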
510
+ ---
511
+
512
+ ## Useful one-liners
513
+
514
+ ```bash
515
+ # Tail the local server log
516
+ tail -f /tmp/riprap-local.log
517
+
518
+ # Inspect the live HF Space's deployed SHA + stage
519
+ curl -sf "https://huggingface.co/api/spaces/msradam/riprap-nyc/runtime" | python3 -m json.tool
520
+
521
+ # Confirm both remotes have the same HEAD
522
+ git log --oneline -1 && git ls-remote huggingface main | head -1
523
+
524
+ # Force-re-pull Granite weights locally if Ollama seems wrong
525
+ ollama rm granite4.1:8b && ollama pull granite4.1:8b
526
+
527
+ # What backend is the running server on? (live reachability + label)
528
+ curl -s http://127.0.0.1:7860/api/backend | python3 -m json.tool
529
+
530
+ # Bring up vLLM on a fresh AMD ROCm droplet (one-shot)
531
+ docker run -d --name vllm \
532
+ --device=/dev/kfd --device=/dev/dri --group-add=video \
533
+ --ipc=host --shm-size=16g -p 8000:8000 \
534
+ -v /root/hf-cache:/root/.cache/huggingface \
535
+ -e GLOO_SOCKET_IFNAME=eth0 -e VLLM_HOST_IP=127.0.0.1 \
536
+ vllm/vllm-openai-rocm:v0.17.1 \
537
+ --model ibm-granite/granite-4.1-8b \
538
+ --host 0.0.0.0 --port 8000 --api-key "$TOKEN" \
539
+ --max-model-len 8192 --served-model-name granite-4.1-8b
540
+ # Without GLOO_SOCKET_IFNAME, gloo fails to bind 0.0.0.0 and the
541
+ # engine core never initialises.
542
+
543
+ # Check what doc_ids the briefing should contain for an intent
544
+ .venv/bin/python -c "from app.reconcile import build_documents; \
545
+ print([m['role'] for m in build_documents({'sandy':{'inside':True}, 'nyc311':{'n':5}})])"
546
+ ```
METHODOLOGY.md ADDED
@@ -0,0 +1,264 @@
1
+ # Riprap — Scoring Methodology
2
+
3
+ > Riprap produces a **flood-exposure tier (1–4) per NYC address**, not
4
+ > a calibrated damage probability. The tier is a deterministic
5
+ > literature-grounded composite of public-data signals; the language
6
+ > model writes the citing prose around it but does not score.
7
+
8
+ ## 1. Why this design
9
+
10
+ Closed-methodology scores (First Street, Jupiter, Fathom) are useful
11
+ products but uncitable in civic work — a NYCEM grant writer can't quote
12
+ "0.73" in a FEMA BRIC sub-application without a defensible audit trail.
13
+ At the same time, an LLM-emitted score would be non-reproducible and
14
+ uncalibrated, with documented LLM-as-judge pathologies (Zheng et al.
15
+ 2023; Wang et al. 2024). The honest middle: **a deterministic rubric a
16
+ planner can argue with**.
17
+
18
+ The tier is computed in `app/score.py` and mirrored in `web/static/app.js`.
19
+ Both implementations are kept in sync; the Python side is authoritative
20
+ for register builds and CLI exports.
21
+
22
+ ## 2. Methodology pedigree
23
+
24
+ The composite construction follows a well-trodden path in the
25
+ multi-indicator vulnerability/exposure literature:
26
+
27
+ - **Cutter, Boruff & Shirley (2003)**, *Social Science Quarterly* 84(2):
28
+ 242–261 — the SoVI hazards-of-place pattern: group indicators
29
+ thematically; sum factors with equal weights because there is no
30
+ defensible theoretical basis for differential weighting.
31
+ - **Tate (2012)**, *Natural Hazards* 63: 325–347 — explicit Monte Carlo
32
+ sensitivity analysis showing that hierarchical equal-weighted
33
+ composites are the most rank-stable. This is why we use equal weights
34
+ *within* sub-indices.
35
+ - **Balica, Wright & van der Meulen (2012)**, *Natural Hazards* 64:
36
+ 73–105 — Coastal City Flood Vulnerability Index, multiplicative
37
+ (Exposure × Susceptibility / Resilience). We adopt only the
38
+ override-behavior of multiplicative form, as a "max-empirical floor"
39
+ (§4 below), because we have no resilience term.
40
+ - **Kim et al. (2019)**, *Scientific Reports* 9:18564 — additive vs
41
+ geometric aggregation; additive is more transparent and reproducible
42
+ *if* sub-indices are pre-grouped thematically. Done.
43
+
44
+ NPCC4 (2024) Ch. 3 (Rosenzweig et al., *Annals of the New York Academy
45
+ of Sciences* 1539) and the NYC Hazard Mitigation Plan 2024 supply the
46
+ NYC-specific tiering hierarchy that informs which scenarios get higher
47
+ weights inside the Regulatory sub-index.
48
+
49
+ ## 3. Sub-index structure
50
+
51
+ Three thematic sub-indices, each normalized to [0, 1] by dividing the
52
+ weighted sum by the maximum possible weight in the group. The composite
53
+ is the simple sum of the three sub-indices (range 0–3).
54
+
55
+ ### 3.1 Regulatory sub-index
56
+
57
+ Binary "inside zone" indicators with weights ordered by agency tiering:
58
+
59
+ | Indicator | Weight | Citation |
60
+ |---------------------------------|-------:|----------|
61
+ | FEMA NFHL 1% (SFHA) | 1.00 | FEMA NFHL — regulatory mandate threshold |
62
+ | FEMA NFHL 0.2% | 0.50 | FEMA NFHL — tail scenario |
63
+ | NYC DEP Moderate-2050 + 2.5 ft | 0.75 | NYC DEP Stormwater Maps 2021; NPCC4 Ch.3 |
64
+ | NYC DEP Extreme-2080 + SLR | 0.50 | NYC DEP Stormwater Maps 2021 — explicitly tail |
65
+ | NYC DEP Tidal-2050 | 0.75 | NPCC4 Ch.3 coastal projection |
66
+
67
+ Why DEP-2050 outranks DEP-2080: NPCC4 designates the 2080 extreme
68
+ scenario as a **tail** projection. Closer-horizon coastal/pluvial
69
+ maps — those a current planner can act on — get the higher weight.
70
+
71
+ ### 3.2 Hydrological sub-index
72
+
73
+ Continuous terrain measures, banded into 4 levels (1.0 / 0.66 / 0.33 / 0):
74
+
75
+ | Indicator | Weight | Bands | Citation |
76
+ |---|---:|---|---|
77
+ | HAND (m) | 1.00 | <1, 1–3, 3–10, ≥10 | Nobre et al., 2011, *J. Hydrology* 404: 13–29 |
78
+ | TWI quartile | 0.50 | ≥12, 10–12, 8–10, <8 | Beven & Kirkby, 1979; Sørensen et al., 2006, *HESS* 10 |
79
+ | Elev pct (200 m, inv) | 0.50 | <10, 10–25, 25–50, ≥50 | Standard geomorphometric proxy |
80
+ | Elev pct (750 m, inv) | 0.50 | <10, 10–25, 25–50, ≥50 | Standard geomorphometric proxy |
81
+ | Basin relief (m) | 0.25 | ≥8, 4–8, 2–4, <2 | Supporting variable, Nobre 2011 |
82
+
83
+ TWI is half-weighted relative to HAND because TWI is documented as
84
+ noisier in flat urban DEMs (Sørensen 2006 explicitly states TWI is
85
+ site-specific and best percentile-binned). HAND remains the canonical
86
+ hydrology indicator (Aristizabal et al. 2023, *WRR* 59, NOAA NWM).
87
+
88
+ ### 3.3 Empirical sub-index
89
+
90
+ Mix of binary observed-extent flags and banded count signals:
91
+
92
+ | Indicator | Weight | Citation |
93
+ |----------------------------|-------:|----------|
94
+ | Sandy 2012 inundation | 1.00 + **floor** | NYC OD `5xsi-dfpx`; NYC HMP 2024 |
95
+ | USGS Ida HWM within 100 m | 1.00 + **floor** | USGS STN Event 312 |
96
+ | USGS Ida HWM within 800 m | 0.50 | USGS STN Event 312 |
97
+ | Prithvi-EO 2.0 Ida polygon | 0.75 | Jakubik et al., 2025 (NASA/IBM Prithvi-EO 2.0); semi-empirical |
98
+ | 311 complaint count band | 0.75 | NYC OD `erm2-nwe9`; NYC 311-as-flood-proxy literature |
99
+ | FloodNet trigger (3 yr) | 0.75 | FloodNet NYC; NPCC4 Ch.3 references |
100
+
101
+ The 311 and FloodNet weights are capped at 0.75 because both signals
102
+ have documented coverage and reporting bias — 311 reflects civic
103
+ engagement as well as flooding, FloodNet has uneven spatial coverage.
104
+ Sandy and HWMs are 1.0 because they're engineered ground-truth
105
+ observations.
106
+
107
+ Bands for 311 count (200 m buffer, 5-year window):
108
+
109
+ | Count | Value |
110
+ |---------|------:|
111
+ | ≥10 | 1.00 |
112
+ | 3–9 | 0.66 |
113
+ | 1–2 | 0.33 |
114
+ | 0 | 0 |
115
+
116
+ ## 4. Max-empirical floor
117
+
118
+ If **Sandy 2012 inundation** OR **a USGS Ida HWM within 100 m** fired,
119
+ the tier is capped at **2 (Elevated)** — it cannot be worse, regardless
120
+ of the additive composite.
121
+
122
+ This recovers the *important* multiplicative behaviour Balica 2012
123
+ argues for: empirical, ground-truth observations should not be
124
+ cancelled out by terrain or modeled scenarios. We implement it as a
125
+ floor (a `min(tier, 2)` after composition) rather than a full
126
+ multiplicative form so the composite remains additive and auditable.
127
+
128
+ The 100 m radius is chosen because USGS HWM positional uncertainty is
129
+ typically 5–30 m horizontal — 100 m gives ~3σ headroom for a confident
130
+ "this address was inundated" signal.
131
+
132
+ ## 5. Composite → tier mapping
133
+
134
+ The composite is the sum of the three normalized sub-indices (range 0–3):
135
+
136
+ | Composite | Tier | Label |
137
+ |-----------|-----:|----------------------|
138
+ | ≥ 1.50 | 1 | High exposure |
139
+ | ≥ 1.00 | 2 | Elevated exposure |
140
+ | ≥ 0.50 | 3 | Moderate exposure |
141
+ | > 0 | 4 | Limited exposure |
142
+ | 0 | 0 | No flagged exposure |
143
+
144
+ Then floor: `Sandy or HWM<100m → tier ≤ 2`.
145
+
146
+ ## 6. Live signals are NOT in the score
147
+
148
+ NWS active alerts, NOAA tide residual (surge), and NWS hourly precip
149
+ are **not** part of the static tier. Per **IPCC AR6 WG II** glossary
150
+ and **NPCC4** Ch. 3, exposure is a quasi-stationary property of place;
151
+ event occurrence is time-varying. Mixing the two would produce a tier
152
+ that flickers every six minutes and that residents could interpret as
153
+ neither "is my building exposed?" nor "is it flooding right now?".
154
+
155
+ Live signals are surfaced separately in the UI as a **"Current
156
+ conditions"** badge, with their own provenance (NOAA station ID, NWS
157
+ alert URL, ASOS station code), and they expire on their own cadence.
158
+ Static tier is unaffected.
159
+
160
+ This mirrors how First Street separates Flood Factor (static, 30-yr
161
+ horizon) from event-day Flood Lab products, and how Fathom separates
162
+ Global Flood Map from real-time intelligence.
163
+
164
+ ## 7. Honest scope
165
+
166
+ Riprap's tier is **not**:
167
+
168
+ - A flood-damage probability or expected loss.
169
+ - A flood-insurance rating. For that, see **FEMA Risk Rating 2.0**
170
+ (FEMA 2021), which uses claims-driven GLMs over decades of labeled
171
+ outcome data we do not have.
172
+ - A vulnerability assessment. Engineering fragility (foundation type,
173
+ electrical hardening, drainage), social capacity, and financial
174
+ absorption are out of scope.
175
+ - A prediction. Future-scenario layers (DEP 2050/2080, FEMA 0.2%) are
176
+ bounding scenarios, not forecasts.
177
+
178
+ It **is**:
179
+
180
+ - An exposure prior — a literature-grounded, deterministic, reproducible
181
+ index of how many publicly-documented flood signals overlap this
182
+ address.
183
+ - Auditable end-to-end: every term has a published source; every weight
184
+ has a rationale; the floor rule has a stated motivation; the tier
185
+ breakpoints are documented above.
186
+ - Forkable: a researcher who disagrees with any weight can edit
187
+ `app/score.py` and rerun. The UI methodology panel makes this
188
+ invitation explicit.
189
+
190
+ ## 8. Caveats foregrounded in UI copy
191
+
192
+ These appear next to the tier badge and in the methodology disclosure:
193
+
194
+ > **Riprap tiers are not flood-damage probabilities.** They reflect
195
+ > publicly-documented exposure indicators only.
196
+
197
+ > **311 counts are influenced by neighborhood reporting habits** and
198
+ > may under-represent flooding in lower-engagement areas
199
+ > (Agonafir et al. and the broader 311-as-civic-engagement literature).
200
+
201
+ > **DEP 2050/2080 and FEMA 0.2% are bounding scenarios, not forecasts.**
202
+ > The tier reads them as "if this scenario materialized, this address
203
+ > would be inside its footprint" — not "this is the expected future."
204
+
205
+ > **Compound flooding is not separately modeled.** Concurrence of rain
206
+ > + tide + groundwater is the residual research frontier (NPCC4 Ch. 3).
207
+
208
+ ## 9. Sensitivity / future work
209
+
210
+ - **Tate-style Monte Carlo perturbation** of weights to characterize
211
+ how sensitive each tier assignment is to weight choice. Not yet
212
+ implemented; would be a natural next research output.
213
+ - **Calibration exercise** if a labeled dataset emerges (FEMA assistance
214
+ records, building-level damage from Sandy/Ida insurance claims). Until
215
+ then, "calibrated" is a word we do not use.
216
+ - **Block- or NTA-level aggregation** for neighborhood-grade scoring,
217
+ with each indicator computed as an areal aggregate rather than a
218
+ point sample.
219
+
220
+ ## References
221
+
222
+ Aristizabal, F. et al. (2023). "Improving Continental Hydrologic
223
+ Modeling Using Height Above Nearest Drainage." *Water Resources
224
+ Research* 59.
225
+
226
+ Balica, S., Wright, N., & van der Meulen, F. (2012). "A Flood
227
+ Vulnerability Index for Coastal Cities and Its Use in Assessing
228
+ Climate Change Impacts." *Natural Hazards* 64: 73–105.
229
+
230
+ Beven, K. J., & Kirkby, M. J. (1979). "A Physically Based, Variable
231
+ Contributing Area Model of Basin Hydrology." *Hydrological Sciences
232
+ Bulletin* 24(1): 43–69.
233
+
234
+ Cutter, S. L., Boruff, B. J., & Shirley, W. L. (2003). "Social
235
+ Vulnerability to Environmental Hazards." *Social Science Quarterly*
236
+ 84(2): 242–261.
237
+
238
+ FEMA (2021). *NFIP Risk Rating 2.0 Methodology and Data Sources.*
239
+
240
+ Jakubik, J. et al. (2025). "Prithvi-EO 2.0: A Versatile Multi-Temporal
241
+ Foundation Model for Earth Observation Applications." NASA/IBM.
242
+
243
+ Kim, S. et al. (2019). "Assessment of Aggregation Frameworks for
244
+ Composite Indicators in Measuring Flood Vulnerability to Climate
245
+ Change." *Scientific Reports* 9:18564.
246
+
247
+ Nobre, A. D. et al. (2011). "Height Above the Nearest Drainage — A
248
+ Hydrologically Relevant New Terrain Model." *Journal of Hydrology*
249
+ 404(1–2): 13–29.
250
+
251
+ NYC HMP (2024). *NYC Hazard Mitigation Plan 2024.* NYC Emergency
252
+ Management.
253
+
254
+ NYC NPCC4 (2024). *4th NYC Climate Assessment.* New York City Panel
255
+ on Climate Change. Including Rosenzweig et al., Ch. 3, *Annals NYAS*
256
+ 1539.
257
+
258
+ Sørensen, R., Zinko, U., & Seibert, J. (2006). "On the Calculation of
259
+ the Topographic Wetness Index." *Hydrology and Earth System Sciences*
260
+ 10: 101–112.
261
+
262
+ Tate, E. (2012). "Social Vulnerability Indices: A Comparative
263
+ Assessment Using Uncertainty and Sensitivity Analysis." *Natural
264
+ Hazards* 63: 325–347.
MONDAY.md ADDED
@@ -0,0 +1,224 @@
1
+ # Monday handoff (May 4, 2026)
2
+
3
+ State of the repo at end of Sunday May 3 / overnight into May 4.
4
+ Demo is **Sunday May 10**.
5
+
6
+ ## Overnight pass (Sunday evening → Monday)
7
+
8
+ Eight priorities closed against `audit/2026-05-03-evening-audit.md`:
9
+
10
+ 1. `pitch/cold_open.md` restored (was accidentally deleted in 1cb5ee6).
11
+ 2. Granite Guardian / refusal-classification leftovers removed —
12
+ Mellea is the sole grounding mechanism, period.
13
+ 3. **Trace UI is now clickable.** Click any specialist row to reveal
14
+ its raw structured output (formatted JSON, copy button,
15
+ max-height + scroll). This is the auditability contract: every
16
+ claim in the briefing is traceable to the specialist that produced
17
+ it directly inside the UI, not just the citation appendix.
18
+ 4. Buffered-footprint overlap for the three Point-geometry register
19
+ specialists. NYU Langone / Stuyvesant HS / P.S. 89 now correctly
20
+ register `inside_sandy_2012=true`. Each output records its
21
+ `footprint_buffer_m`.
22
+ 5. Map renders register-asset pins (subway / school / hospital /
23
+ NYCHA-centroid) coloured by Sandy exposure with click popups
24
+ showing name + `[doc_id]`. NYCHA polygon-fill is queued for when
25
+ `geometry_geojson` lands in the dataclass.
26
+ 6. **`floodnet_forecast` specialist** — TTM r2 forecast on the
27
+ nearest FloodNet sensor's flood-event recurrence. Reuses the
28
+ (512, 96) singleton already loaded for `ttm_311_forecast` —
29
+ *no new model class loaded into memory*. The strongest single
30
+ TTM win for the NYU CUSP audience.
31
+ 7. Trace UI groups TTM specialists under one parent node
32
+ `forecasting.granite-timeseries-ttm-r2 [N instances]` so the
33
+ "one foundation model, multiple data streams" architectural story
34
+ is legible without reading per-row metadata.
35
+ 8. `experiments/` cleanup: dropped two empty dirs (`05_sam2_promptable`,
36
+ `06_chronos_bolt_forecast`), renamed `05_terramind_finetune` →
37
+ `05a_terramind_finetune_micro` to dedupe with the active NYC
38
+ fine-tune dir, removed `Riprap.zip` from repo root.
39
+
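The buffered-footprint check in item 4 amounts to: asset point inside the Sandy polygon, or within `footprint_buffer_m` of its boundary. The shipped code uses shapely (`point.buffer(r).intersects(poly)`); this dependency-free stand-in just shows the geometry, with degrees crudely scaled to metres:

```python
import math

M_PER_DEG = 111_320.0  # rough metres per degree near NYC (approximation)

def _seg_dist_m(p, a, b):
    """Approx. distance in metres from point p to segment a-b (lon/lat)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        t = 0.0
    else:
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    cx, cy = ax + t * dx, ay + t * dy
    return math.hypot(px - cx, py - cy) * M_PER_DEG

def _inside(p, ring):
    """Ray-casting point-in-polygon on a closed lon/lat ring."""
    px, py = p
    hit = False
    for (ax, ay), (bx, by) in zip(ring, ring[1:]):
        if (ay > py) != (by > py) and px < (bx - ax) * (py - ay) / (by - ay) + ax:
            hit = not hit
    return hit

def overlaps(point, ring, footprint_buffer_m):
    """True if point is in the polygon or within the buffer of its edge."""
    return _inside(point, ring) or min(
        _seg_dist_m(point, a, b) for a, b in zip(ring, ring[1:])
    ) <= footprint_buffer_m
```

This is why the buffer rescues the centroid-edge false negatives: a campus centroid just outside the polygon still lands within its `footprint_buffer_m` of the boundary.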
40
+ Commit chain: `a2143fc` … through `ed6ae9d`. Morning handoff doc
41
+ at `audit/2026-05-04-morning-handoff.md` summarises what to verify
42
+ and what's queued next.
43
+
44
+ ## Where Sunday ended
45
+
46
+ All four keep-list items resolved + 4 register specialists shipped + AMD
47
+ fine-tune prep green.
48
+
49
+ | Item | Status | Path |
50
+ |---|---|---|
51
+ | Pitch cold-open locked | ✓ | `pitch/cold_open.md` |
52
+ | TerraMind-NYC fine-tune eval spec | ✓ | `experiments/05_terramind_nyc_finetune/eval/eval_spec.md` |
53
+ | 200-query adversarial set + refusal eval | ✓ (planner pivot) | `experiments/06_granite_guardian/` |
54
+ | Subway-entrance specialist (Sheepshead Bay) | ✓ | `experiments/07_mta_entrances/` |
55
+ | NYCHA-developments specialist (Red Hook) | ✓ | `experiments/08_nycha_developments/` |
56
+ | DOE-schools specialist (Coney Island) | ✓ | `experiments/09_doe_schools/` |
57
+ | DOH-hospitals specialist (Coney Island) | ✓ | `experiments/10_doh_hospitals/` |
58
+ | FSM integration of all 4 register specialists | ✓ | `app/registers/`, `app/fsm.py`, `app/reconcile.py`, `web/static/agent.js` |
59
+ | AMD droplet TerraMind smoke + STAC manifest | ✓ | `129.212.182.52:/root/terramind_nyc/` |
60
+
61
+ End-to-end smoke on "Coney Island Brooklyn" produced citations
62
+ `[mta_entrance_56]`, `[nycha_dev_239]`, `[nycha_dev_166]` alongside
63
+ `[rag_mta]` and `[nyc311]` — family-prefix chip routing works.
64
+
65
+ Last commit: `86861be` (FSM integration of 4 register specialists).
66
+
67
+ ## Decisions locked
68
+
69
+ - **Refusal classification dropped entirely.** Planner-level
70
+ classifier hit FN=0% but FP=7% (gate was <5%). Granite Guardian
71
+ itself was already abandoned (laptop-infeasible). After the audit
72
+ surfaced that the planner shim was documented-but-never-wired,
73
+ the decision is now Option C: drop refusal handling. Cold-start
74
+ framing scopes the audience; Mellea rejection sampling enforces
75
+ grounding integrity; the four-tier glyph margin carries the
76
+ epistemic-honesty signal. The `GuardianRefusal.svelte` component
77
+ is deleted (was only ever rendered on a documentation page).
78
+ Demo's integrity beat is the **Mellea grounding-failure reroll on
79
+ the curated Hollis 0.19% → 19% case**. `experiments/06_granite_guardian/`
80
+ is preserved as a "considered and rejected" artifact for the
81
+ methodology paper.
82
+ - **AMD path: `129.212.182.52` is production**, not `165.245.134.44`.
83
+ CLAUDE.md says the latter; **fix CLAUDE.md to match reality**.
84
+ Production vLLM is on `.52`. The TerraMind container shares the
85
+ GPU with vLLM; both fit on one MI300X.
86
+ - **TerraMind manifest is 1028 paired chips**, 2021-05 → 2026-04,
87
+ NYC 5-borough hull +5 km, S2-cloud <30%, ≤3-day pair window. One
88
+ year (2022-05 → 2023-04) returned 0 due to PC API intermittency —
89
+ acceptable for the micro-fine-tune.
90
+
91
+ ## First thing Monday morning
92
+
93
+ 1. **Refresh Microsoft Planetary Computer signed URLs.** They have
94
+ ~1 hr TTL; the manifest from Sunday evening is stale by morning.
95
+ On the droplet:
96
+ ```bash
97
+ ssh root@129.212.182.52
98
+ docker exec -it terramind bash
99
+ cd /root/terramind_nyc
100
+ python build_manifest.py --refresh-only manifest_train.jsonl
101
+ python build_manifest.py --refresh-only manifest_holdout.jsonl
102
+ ```
103
+ (Recipe is in `/root/terramind_nyc/NOTES.md` on the droplet.)
104
+
105
+ 2. **Kick off TerraMind-NYC fine-tune.** Spec at
106
+ `experiments/05_terramind_nyc_finetune/eval/eval_spec.md`. Budget
107
+ is 30 GPU-hours; alarm at 25 (set on the droplet). Predicted
108
+ actual: ~0.16 GPU-hours at bs=8 / 3 epochs. Don't run anything
109
+ experimental until eval-spec gates pass on the held-out set.
110
+
111
+ 3. **Decide bucket** (A ship-in-demo / B publish-only / C revert):
112
+ - A: ship the fine-tuned checkpoint as a Riprap specialist.
113
+ - B: publish to HF as `msradam/TerraMind-1.0-NYC` with model card,
114
+ don't ship in demo. **Bucket B is fully acceptable** per the
115
+ spec — civic-tech publication discipline is the durable goal.
116
+ - C: discard checkpoint, no public artefact.
117
+
118
+ ## Working on Monday
119
+
120
+ - TerraMind-NYC fine-tune (above).
121
+ - **Mellea grounding-failure demo prep.** The pitch demo is the
122
+ Hollis 0.19% → 19% case where Granite emits a number with the
123
+ wrong order of magnitude and Mellea catches it. Demo script
124
+ needs to:
125
+ - Show the failed first attempt (banner: "Mellea reroll: numerics
126
+ grounding failed").
127
+ - Show the second attempt with the corrected number.
128
+ - Show the audit panel with the pass/fail per-requirement.
129
+ - Show wall-clock for the reroll (target: under 30 s end-to-end).
130
+ - Currently reproducible via `scripts/probe_mellea.py --query
131
+ "Hollis" --runs 5`. The demo script is the *visual* version.
132
+ - **MTA Sandy-recovery citation layer.** Parse the MTA "Hurricane
133
+ Sandy: Three Years Later" report into per-station-id facts so
134
+ the subway-entrance specialist can emit
135
+ `[mta_recovery_<station_id>]` doc messages alongside the
136
+ exposure ones.
137
+ - **NYCHA polygon-fill on the map.** Overnight session shipped
138
+ NYCHA developments as centroid pins on the map (graded by
139
+ `pct_inside_sandy ≥ 50%`). The next tightening is to add a
140
+ `geometry_geojson` field to `app/registers/nycha.py`'s
141
+ `DevelopmentFinding` dataclass and route through SSE so
142
+ `register-polygons` actually renders graded fills (the layer +
143
+ source are already present in `RipMap.svelte`).
144
+ - **PLUTO/Building-Footprints join** for Stuyvesant Town etc.
145
+ Overnight pass shipped buffered-point overlap (NYU Langone,
146
+ Stuyvesant HS, P.S. 89 now correctly flip to
147
+ `inside_sandy_2012=true`). The 100m hospital buffer / 50m school
148
+ buffer is honest but coarse; PLUTO + actual building footprints
149
+ is the next step for the very-large-campus assets.
150
+
151
+ ## Outstanding through Friday
152
+
153
+ In rough priority order:
154
+
155
+ 1. **More specialists**:
156
+ - FEMA OpenFEMA NFIP claims tract-aggregated (pending).
157
+ - NWS NWPS reach-level forecast + USGS NWIS Bronx / Saw Mill /
158
+ Hutchinson rivers.
159
+ - NYC DEP CSO outfalls + Bluebelt + Green Infrastructure
160
+ specialist (CSS-vs-MS4 distinction for ASCE).
161
+ - Three more TTM r2 specialists (USGS streamgage stage, NWS
162
+ rainfall accumulation, NYC 311 sewer-backup citywide rate).
163
+ **FloodNet forecast already shipped in the overnight pass.**
164
+ 2. **Visual identity refresh**: Carto Positron, IBM Plex, four-tier
165
+ epistemic palette, WeasyPrint PDF export, trace UI as `<details>`
166
+ tree.
167
+ 3. **WCAG 2.2 AA pass.**
168
+ 4. **Methodology paper draft** (6–8 page PDF). Goal: Saturday May 9.
169
+ 5. **Historical-event mode** — vintage-cutoff queries. Saturday.
170
+ 6. **Five Build-in-Public posts** through the week.
171
+ 7. **5-minute hackathon pitch + 3 demo queries.** Friday rehearsal.
172
+ 8. **ASCE talk materials** — May 13 (post-hackathon).
173
+
174
+ ## Sharp edges to remember
175
+
176
+ - **Static assets cache hard.** When iterating on Svelte or
177
+ agent.js, hard-reload (⌘⇧R). No cache-busting in place.
178
+ - **HF Space sleeps after idle.** Free tier; first request after
179
+ sleep is a 30–90 s cold start. Ping the space before any demo.
180
+ - **vLLM cold compile.** First few requests against a fresh
181
+ `vllm serve` log surprisingly low throughput while ROCm kernels
182
+ JIT. Run benchmarks 3+ times before believing them.
183
+ - **Sandy GeoJSON has self-intersection issues** that blow up
184
+ `unary_union`. Use `buffer(0)` (caught and fixed for NYCHA;
185
+ may surface again for any new polygon-overlap specialist).
186
+ - **DEP column is `Flooding_Category` (int16)**, not `depth_class`.
187
+ Documented in NYCHA RESULTS.md.
188
+ - **Centroid-edge join false-negatives** on NYU Langone / Stuyvesant
189
+ / P.S. 89 because their centroid points lie just outside the OEM
190
+ Sandy polygon despite real 2012 basement flooding. PLUTO
191
+ footprint join is the queued fix.
192
+ - **Don't restart uvicorn while a model is mid-generation.** Ollama
193
+ keeps the request alive but the FastAPI handler dies, leaving
194
+ the user staring at a dead stream.
195
+
196
+ ## Files to read in order on Monday morning
197
+
198
+ 1. This file.
199
+ 2. `experiments/05_terramind_nyc_finetune/eval/eval_spec.md` — the
200
+ contract for what training output triggers ship/publish/revert.
201
+ 3. `experiments/06_granite_guardian/RESULTS.md` — the Guardian →
202
+ planner pivot decision record (so you know why Guardian is in
203
+ the repo but not on the demo path).
204
+ 4. `experiments/07_mta_entrances/RESULTS.md` — the canonical
205
+ register-specialist pattern (the other three follow it).
206
+ 5. `CLAUDE.md` — fix the AMD droplet IP (165.245.134.44 →
207
+ 129.212.182.52) at the same time as the first edit of the day.
208
+
209
+ ## Status as of 2026-05-03 ~12:50 ET
210
+
211
+ - Both git remotes (origin + huggingface) up-to-date through
212
+ `86861be`.
213
+ - HF Space rebuild was *not* triggered on the FSM-integration
214
+ commit; do `git push huggingface main` when you want to deploy.
215
+ (You may want to wait until Monday afternoon so a broken HF
216
+ rebuild doesn't eat morning time.)
217
+ - Local Ollama has both `granite4.1:3b` and `granite4.1:8b` warm.
218
+ - AMD droplet `129.212.182.52` has the `terramind` container
219
+ running with TerraTorch 1.2.7 + pystac-client + planetary-
220
+ computer installed in system Python; HF cache populated.
221
+ - 200-query adversarial set + planner-pivot eval results
222
+ reproducible from `experiments/06_granite_guardian/` in ~3 min.
223
+ - Mellea probe still works: `scripts/probe_mellea.py --query
224
+ "Hollis" --runs 5`.
agent.py CHANGED
@@ -1,4 +1,4 @@
1
- """HeliOS-NYC agent CLI: address -> cited paragraph via the Burr FSM.
2
 
3
  Usage:
4
  python agent.py "180 Beach 35 St, Queens"
 
1
+ """Riprap agent CLI: address -> cited briefing via the Burr FSM.
2
 
3
  Usage:
4
  python agent.py "180 Beach 35 St, Queens"
app/areas/__init__.py ADDED
File without changes
app/areas/nta.py ADDED
@@ -0,0 +1,224 @@
+ """NYC Neighborhood Tabulation Area (NTA 2020) resolver.
+ 
+ NTAs are NYC Department of City Planning's official neighborhood unit:
+ ~262 polygons covering all 5 boroughs, including some park / airport
+ slivers. They are the canonical "neighborhood" unit for NYC civic data.
+ 
+ This module provides:
+ - load() → GeoDataFrame with all NTAs (cached)
+ - resolve(name) → list of matching NTAs by fuzzy name match, or by borough
+ - by_code(code) → exact lookup
+ - polygon_for(code) → shapely Polygon in EPSG:4326
+ """
+ from __future__ import annotations
+ 
+ import re
+ from functools import lru_cache
+ from pathlib import Path
+ from typing import Any
+ 
+ import geopandas as gpd
+ from shapely.geometry import Polygon
+ 
+ DATA_PATH = Path(__file__).resolve().parents[2] / "data" / "nyc_ntas_2020.geojson"
+ 
+ # Common alias map: user-typed strings → canonical NTA names. We don't need to
+ # be exhaustive here; the fuzzy matcher catches most cases. This handles the
+ # few hard ones where the official NTA name differs from local usage.
+ ALIASES = {
+     "the rockaways": "Rockaway Beach-Arverne-Edgemere",
+     "rockaway": "Rockaway Beach-Arverne-Edgemere",
+     "brighton": "Brighton Beach",
+     "lower east side": "Lower East Side",
+     "les": "Lower East Side",
+     "soho": "SoHo-Little Italy-Hudson Square",
+     "tribeca": "Tribeca-Civic Center",
+     "fidi": "Financial District-Battery Park City",
+     "downtown brooklyn": "Downtown Brooklyn-DUMBO-Boerum Hill",
+     "dumbo": "Downtown Brooklyn-DUMBO-Boerum Hill",
+     "park slope": "Park Slope",
+     "carroll gardens": "Carroll Gardens-Cobble Hill-Gowanus-Red Hook",
+     "red hook": "Carroll Gardens-Cobble Hill-Gowanus-Red Hook",
+     "gowanus": "Carroll Gardens-Cobble Hill-Gowanus-Red Hook",
+     "hollis": "Queens Village-Hollis-Bellerose",
+     "long island city": "Hunters Point-Sunnyside-West Maspeth",
+     "lic": "Hunters Point-Sunnyside-West Maspeth",
+     "astoria": "Astoria (Central)",
+     "flushing": "Flushing-Willets Point",
+     "harlem": "Central Harlem (North)",
+     "east harlem": "East Harlem (North)",
+     "washington heights": "Washington Heights (North)",
+     "midtown": "Midtown South-Flatiron-Union Square",
+     "upper east side": "Upper East Side-Carnegie Hill",
+     "ues": "Upper East Side-Carnegie Hill",
+     "upper west side": "Upper West Side-Lincoln Square",
+     "uws": "Upper West Side-Lincoln Square",
+     "coney island": "Coney Island-Sea Gate",
+ }
+ 
+ BOROUGH_NORMALIZE = {
+     "manhattan": "Manhattan", "mn": "Manhattan",
+     "brooklyn": "Brooklyn", "bk": "Brooklyn", "kings": "Brooklyn",
+     "queens": "Queens", "qn": "Queens",
+     "bronx": "Bronx", "the bronx": "Bronx", "bx": "Bronx",
+     "staten island": "Staten Island", "si": "Staten Island", "richmond": "Staten Island",
+ }
+ 
+ 
+ def _normalize(s: str) -> str:
+     return re.sub(r"[^a-z]+", "", (s or "").lower())
+ 
+ 
+ @lru_cache(maxsize=1)
+ def load() -> gpd.GeoDataFrame:
+     """Load the NTA 2020 GeoJSON; coerce CRS to EPSG:4326. Cached."""
+     g = gpd.read_file(DATA_PATH)
+     if g.crs is None:
+         g = g.set_crs("EPSG:4326")
+     elif g.crs.to_string() != "EPSG:4326":
+         g = g.to_crs("EPSG:4326")
+     return g
+ 
+ 
+ def by_code(code: str) -> dict | None:
+     g = load()
+     hit = g[g["nta2020"] == code]
+     if hit.empty:
+         return None
+     return _row_to_dict(hit.iloc[0])
+ 
+ 
+ def _row_to_dict(row) -> dict:
+     return {
+         "nta_code": row["nta2020"],
+         "nta_name": row["ntaname"],
+         "borough": row["boroname"],
+         "cdta": row.get("cdtaname"),
+         "geometry": row["geometry"],
+     }
+ 
+ 
+ def borough_match(query: str) -> str | None:
+     """If query matches a borough name (or common abbreviation), return the
+     canonical name. Otherwise return None."""
+     q = query.strip().lower()
+     return BOROUGH_NORMALIZE.get(q)
+ 
+ 
+ def resolve(query: str) -> list[dict[str, Any]]:
+     """Resolve a free-text query to NTA(s).
+ 
+     Strategy (in priority order):
+     1. Borough match → all NTAs in borough.
+     2. Alias map → exact NTA name match.
+     3. Case-insensitive EXACT name match (so 'Kew Gardens' wins over
+        'Kew Gardens Hills' when both exist).
+     4. Substring match on normalized NTA name. When multiple match,
+        prefer the one whose normalized name length is closest to the
+        query — avoids 'Kew Gardens' resolving to 'Kew Gardens Hills'.
+     5. CDTA-name substring fallback.
+     """
+     g = load()
+     q = (query or "").strip()
+     if not q:
+         return []
+     boro = borough_match(q)
+     if boro:
+         hits = g[g["boroname"] == boro]
+         return [_row_to_dict(r) for _, r in hits.iterrows()]
+ 
+     alias = ALIASES.get(q.lower())
+     if alias:
+         hits = g[g["ntaname"] == alias]
+         if not hits.empty:
+             return [_row_to_dict(r) for _, r in hits.iterrows()]
+ 
+     # Exact (case-insensitive) — preferred over substring
+     name_lower = g["ntaname"].fillna("").str.lower()
+     exact = g[name_lower == q.lower()]
+     if not exact.empty:
+         return [_row_to_dict(r) for _, r in exact.iterrows()]
+ 
+     qn = _normalize(q)
+     if not qn:
+         return []
+     name_norm = g["ntaname"].fillna("").map(_normalize)
+     contains = g[name_norm.str.contains(qn, na=False)].copy()
+     if not contains.empty:
+         contains["_diff"] = contains["ntaname"].fillna("").map(
+             lambda s: abs(len(_normalize(s)) - len(qn))
+         )
+         contains = contains.sort_values("_diff")
+         return [_row_to_dict(r) for _, r in contains.iterrows()]
+ 
+     cdta_norm = g["cdtaname"].fillna("").map(_normalize)
+     contains = g[cdta_norm.str.contains(qn, na=False)]
+     if not contains.empty:
+         return [_row_to_dict(r) for _, r in contains.iterrows()]
+ 
+     return []
+ 
+ 
+ def polygon_for(code: str) -> Polygon | None:
+     hit = by_code(code)
+     return hit["geometry"] if hit else None
+ 
+ 
+ def resolve_from_text(text: str) -> list[dict[str, Any]]:
+     """Scan free-text (e.g. a full natural-language query) for any known NTA
+     name, alias, or borough. Returns the first match. This is the fallback
+     when the planner failed to extract a clean target.
+ 
+     Strategy: walk ALIASES first (cheap), then iterate NTA names and look
+     for the longest match contained in the text. We prefer the longest
+     match so 'Carroll Gardens' wins over 'Gardens'.
+     """
+     t = (text or "").lower()
+     if not t:
+         return []
+     # Boroughs first (whole-word-ish — avoid false hits inside "queensland"
+     # etc.; padding the text covers start-of-string and end-of-string too).
+     for boro_key, canon in BOROUGH_NORMALIZE.items():
+         if f" {boro_key} " in f" {t} ":
+             hits = resolve(canon)
+             if hits:
+                 return hits
+     # Alias keys, longest first
+     for key in sorted(ALIASES.keys(), key=len, reverse=True):
+         if key in t:
+             hits = resolve(key)
+             if hits:
+                 return hits
+     # NTA names, longest first, with a word-boundary check: multi-word
+     # names match before shorter fragments, and a query containing only
+     # "Kew Gardens" can't accidentally resolve to "Kew Gardens Hills" —
+     # the longer name fires only when it appears whole, bounded by
+     # whitespace or punctuation, in the text.
+     g = load()
+     names = sorted(set(g["ntaname"].dropna().str.lower().tolist()), key=len, reverse=True)
+     matches = []
+     for name in names:
+         if not name or len(name) < 4:
+             continue
+         # Word-boundary-ish check: the name must appear bounded by
+         # whitespace or trailing punctuation within the padded text.
+         padded_t = f" {t} "
+         if f" {name} " in padded_t or f" {name}." in padded_t or f" {name}," in padded_t or f" {name}?" in padded_t:
+             matches.append(name)
+     if matches:
+         # Prefer the longest word-boundary match — most specific.
+         best = sorted(matches, key=len, reverse=True)[0]
+         hits = resolve(best)
+         if hits:
+             return hits
+     # Fallback: any substring (no boundary). Less precise, but catches
+     # casual queries like "show me red hook" where "red hook" is a
+     # neighborhood-name fragment within a longer NTA name.
+     for name in names:
+         if not name or len(name) < 4:
+             continue
+         if name in t:
+             hits = resolve(name)
+             if hits:
+                 return hits
+     return []
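
The closest-length ranking in `resolve()` step 4 can be illustrated standalone. This is a toy re-implementation (the names list and `rank_candidates` helper are illustrative, not part of the module), showing why normalizing and sorting by length difference keeps "Kew Gardens" from resolving to "Kew Gardens Hills":

```python
import re

def normalize(s: str) -> str:
    # Collapse to bare lowercase letters, as the resolver's _normalize does.
    return re.sub(r"[^a-z]+", "", (s or "").lower())

def rank_candidates(query: str, names: list[str]) -> list[str]:
    # Substring match on normalized names, then sort so the name whose
    # normalized length is closest to the query's comes first.
    qn = normalize(query)
    hits = [n for n in names if qn in normalize(n)]
    return sorted(hits, key=lambda n: abs(len(normalize(n)) - len(qn)))

names = ["Kew Gardens", "Kew Gardens Hills", "Forest Hills"]
print(rank_candidates("Kew Gardens", names))
# ['Kew Gardens', 'Kew Gardens Hills']
```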
app/assets/mta_entrances.py CHANGED
@@ -8,7 +8,6 @@ register is built for.
  """
  from __future__ import annotations
 
- import json
  from pathlib import Path
 
  import geopandas as gpd
app/context/dob_permits.py ADDED
@@ -0,0 +1,258 @@
+ """NYC DOB construction-permit specialist — "what are they building".
+ 
+ Pulls active NYC DOB Permit Issuance records (Socrata `ipu4-2q9a`)
+ inside a polygon, filtered to recent New Building (NB), major
+ Alteration (A1), and Demolition (DM) jobs. Each project is then
+ cross-referenced against the static flood layers (Sandy 2012, DEP
+ Stormwater scenarios) so the reconciler can write things like:
+ 
+     "12 active major construction projects in Gowanus. Of these,
+     8 sit inside the DEP Extreme-2080 stormwater scenario."
+ 
+ The dataset uses separate gis_latitude / gis_longitude columns rather
+ than a Socrata Point, so we bbox-filter via SoQL then do exact
+ point-in-polygon containment client-side with shapely.
+ """
+ from __future__ import annotations
+ 
+ import logging
+ from collections import Counter
+ from dataclasses import asdict, dataclass
+ from datetime import date, datetime, timedelta, timezone
+ from typing import Any
+ 
+ import geopandas as gpd
+ import httpx
+ from shapely.geometry import Point
+ 
+ log = logging.getLogger("riprap.dob_permits")
+ 
+ URL = "https://data.cityofnewyork.us/resource/ipu4-2q9a.json"
+ DOC_ID = "dob_permits"
+ CITATION = ("NYC DOB Permit Issuance (NYC OpenData ipu4-2q9a) — "
+             "issued/in-progress construction permits")
+ 
+ JOB_TYPE_LABELS = {
+     "NB": "new building",
+     "A1": "major alteration (use/occupancy)",
+     "A2": "minor alteration",
+     "A3": "minor work / interior",
+     "DM": "demolition",
+     "SG": "sign",
+     "PL": "plumbing",
+     "EQ": "equipment",
+ }
+ 
+ # Default filter: focus on "what are they building" — new construction,
+ # major alterations, demolitions. Skip minor mechanical permits.
+ DEFAULT_JOB_TYPES = ("NB", "A1", "DM")
+ 
+ 
+ @dataclass
+ class Permit:
+     job_id: str
+     job_type: str
+     job_type_label: str
+     permit_status: str
+     issuance_date: str
+     expiration_date: str | None
+     address: str
+     borough: str
+     bbl: str | None
+     lat: float
+     lon: float
+     owner_business: str | None
+     permittee_business: str | None
+     nta_name: str | None
+ 
+ 
+ def permits_in_bbox(min_lat: float, min_lon: float,
+                     max_lat: float, max_lon: float,
+                     job_types: tuple[str, ...] = DEFAULT_JOB_TYPES,
+                     since: date | None = None,
+                     limit: int = 5000) -> list[Permit]:
+     """Pull DOB permits intersecting a bounding box, recently issued, with
+     matching job types. We expand from polygon to bbox and rely on the
+     caller to do exact point-in-polygon filtering."""
+     if since is None:
+         since = date.today() - timedelta(days=540)  # ~18 months
+     # gis_latitude/gis_longitude are stored as text in this dataset; cast
+     # to number for the bbox compare. issuance_date is a floating timestamp
+     # surfaced as 'MM/DD/YYYY' string — cast explicitly to floating_timestamp
+     # so the comparator parses ISO dates correctly. BETWEEN is picky on text
+     # columns, so use explicit >= / <= operators.
+     where = (
+         f"job_type IN ({','.join(repr(t) for t in job_types)})"
+         f" AND issuance_date::floating_timestamp >= '{since.isoformat()}'"
+         f" AND gis_latitude::number >= {min_lat}"
+         f" AND gis_latitude::number <= {max_lat}"
+         f" AND gis_longitude::number >= {min_lon}"
+         f" AND gis_longitude::number <= {max_lon}"
+     )
+     r = httpx.get(URL, params={
+         "$select": ",".join([
+             "job__", "job_type", "permit_status", "issuance_date",
+             "expiration_date", "house__", "street_name", "borough",
+             "block", "lot",
+             "gis_latitude", "gis_longitude", "owner_s_business_name",
+             "permittee_s_business_name", "gis_nta_name",
+         ]),
+         "$where": where,
+         "$order": "issuance_date desc",
+         "$limit": str(limit),
+     }, timeout=60)
+     r.raise_for_status()
+     out: list[Permit] = []
+     for row in r.json():
+         try:
+             lat = float(row["gis_latitude"])
+             lon = float(row["gis_longitude"])
+         except (KeyError, ValueError, TypeError):
+             continue
+         addr = " ".join(filter(None, [
+             row.get("house__"),
+             (row.get("street_name") or "").title(),
+         ])).strip()
+         # DOB has no `bbl` column; compose from borough + block + lot.
+         # Borough codes: MAN=1, BX=2, BK=3, QN=4, SI=5.
+         boro_code = {"MANHATTAN": "1", "BRONX": "2", "BROOKLYN": "3",
+                      "QUEENS": "4", "STATEN ISLAND": "5"}.get(
+             (row.get("borough") or "").upper())
+         block = (row.get("block") or "").lstrip("0")
+         lot = (row.get("lot") or "").lstrip("0")
+         bbl = (f"{boro_code}-{block.zfill(5)}-{lot.zfill(4)}"
+                if boro_code and block and lot else None)
+         out.append(Permit(
+             job_id=row.get("job__", ""),
+             job_type=row.get("job_type", ""),
+             job_type_label=JOB_TYPE_LABELS.get(row.get("job_type", ""), row.get("job_type", "")),
+             permit_status=row.get("permit_status", ""),
+             issuance_date=(row.get("issuance_date") or "")[:10],
+             expiration_date=(row.get("expiration_date") or "")[:10] or None,
+             address=addr,
+             borough=(row.get("borough") or "").title(),
+             bbl=bbl,
+             lat=lat,
+             lon=lon,
+             owner_business=row.get("owner_s_business_name"),
+             permittee_business=row.get("permittee_s_business_name"),
+             nta_name=row.get("gis_nta_name"),
+         ))
+     return out
+ 
+ 
+ def permits_in_polygon(polygon, polygon_crs: str = "EPSG:4326",
+                        job_types: tuple[str, ...] = DEFAULT_JOB_TYPES,
+                        since: date | None = None) -> list[Permit]:
+     """Permits inside a polygon. Uses bbox prefilter + exact shapely
+     point-in-polygon."""
+     g = gpd.GeoDataFrame(geometry=[polygon], crs=polygon_crs).to_crs("EPSG:4326")
+     geom = g.iloc[0].geometry
+     minx, miny, maxx, maxy = geom.bounds
+     raw = permits_in_bbox(miny, minx, maxy, maxx, job_types=job_types, since=since)
+     out: list[Permit] = []
+     for p in raw:
+         pt = Point(p.lon, p.lat)
+         if geom.intersects(pt):  # intersects ⊇ contains; keeps boundary points
+             out.append(p)
+     # Dedupe by job_id (one job can have multiple permits as work proceeds)
+     seen: dict[str, Permit] = {}
+     for p in out:
+         # Keep the most-recently-issued permit per job
+         cur = seen.get(p.job_id)
+         if cur is None or (p.issuance_date or "") > (cur.issuance_date or ""):
+             seen[p.job_id] = p
+     return list(seen.values())
+ 
+ 
+ def cross_reference_flood(permits: list[Permit]) -> list[dict[str, Any]]:
+     """Tag each permit with which flood layers cover its point.
+     Adds: in_sandy (bool), dep_max_class (highest depth class hit across
+     DEP scenarios), dep_scenarios (list of scenario ids that fired)."""
+     if not permits:
+         return []
+     from app.flood_layers import dep_stormwater, sandy_inundation
+     pts = gpd.GeoDataFrame(
+         geometry=[Point(p.lon, p.lat) for p in permits],
+         crs="EPSG:4326",
+     ).to_crs("EPSG:2263")
+     pts["_pid"] = list(range(len(pts)))
+ 
+     sandy_flags = sandy_inundation.join(pts).reset_index(drop=True).tolist()
+ 
+     dep_hits = {scen: dep_stormwater.join(pts, scen)["depth_class"].astype(int).tolist()
+                 for scen in ("dep_extreme_2080", "dep_moderate_2050", "dep_moderate_current")}
+ 
+     out = []
+     for i, p in enumerate(permits):
+         scen_hits = {s: dep_hits[s][i] for s in dep_hits}
+         max_class = max(scen_hits.values(), default=0)
+         active_scens = [s for s, c in scen_hits.items() if c > 0]
+         out.append({
+             **asdict(p),
+             "in_sandy": bool(sandy_flags[i]),
+             "dep_max_class": max_class,
+             "dep_scenarios": active_scens,
+             "any_flood_layer_hit": bool(sandy_flags[i] or max_class > 0),
+         })
+     return out
+ 
+ 
+ def summary_for_polygon(polygon, polygon_crs: str = "EPSG:4326",
+                         since_days: int = 540,
+                         top_n: int = 8) -> dict:
+     """Full polygon-mode summary: list active permits, cross-reference each
+     with flood layers, return aggregate counts + a top-N projects-of-concern
+     list (those that hit at least one flood layer, ranked by max DEP class
+     + Sandy hit)."""
+     since = date.today() - timedelta(days=since_days)
+     permits = permits_in_polygon(polygon, polygon_crs=polygon_crs, since=since)
+     enriched = cross_reference_flood(permits)
+ 
+     by_type: Counter = Counter(e["job_type_label"] for e in enriched)
+     by_status: Counter = Counter(e["permit_status"] for e in enriched)
+     n_total = len(enriched)
+     n_sandy = sum(1 for e in enriched if e["in_sandy"])
+     n_dep_any = sum(1 for e in enriched if e["dep_max_class"] > 0)
+     n_dep_severe = sum(1 for e in enriched if e["dep_max_class"] >= 2)
+     n_any_flood = sum(1 for e in enriched if e["any_flood_layer_hit"])
+ 
+     # Rank: severity = (in_sandy * 3) + dep_max_class
+     def severity(e):
+         return (3 if e["in_sandy"] else 0) + e["dep_max_class"]
+     flagged = sorted(
+         [e for e in enriched if e["any_flood_layer_hit"]],
+         key=severity, reverse=True,
+     )[:top_n]
+ 
+     # Light projection of every permit for map pinning (no need to ship the
+     # full permit record for the not-flagged ones — the map only needs lat,
+     # lon, address, job_type_label, and the flood-flag fields).
+     all_pins = [
+         {
+             "lat": e["lat"],
+             "lon": e["lon"],
+             "address": e["address"],
+             "job_type": e["job_type"],
+             "in_sandy": e["in_sandy"],
+             "dep_max_class": e["dep_max_class"],
+             "any_flood": e["any_flood_layer_hit"],
+         }
+         for e in enriched
+     ]
+     return {
+         "since": since.isoformat(),
+         "n_total": n_total,
+         "n_in_sandy": n_sandy,
+         "n_in_dep_any": n_dep_any,
+         "n_in_dep_severe": n_dep_severe,
+         "n_any_flood": n_any_flood,
+         "by_job_type": dict(by_type.most_common()),
+         "by_permit_status": dict(by_status.most_common()),
+         "flagged_top": flagged,
+         "all_pins": all_pins,
+         "all_count": n_total,
+     }
+ 
+ 
+ def now_iso() -> str:
+     return datetime.now(timezone.utc).date().isoformat()
app/context/gliner_extract.py ADDED
@@ -0,0 +1,125 @@
+ """GLiNER (urchade/gliner_medium-v2.1) typed-entity extraction over the
+ RAG retriever's top paragraphs.
+ 
+ Adds structured fields to the reconciler's grounding context. For each
+ RAG chunk the specialist emits, GLiNER produces a list of typed spans
+ with one of five labels:
+ 
+     nyc_location            (e.g. "Coney Island")
+     dollar_amount           (e.g. "$5.6 million")
+     date_range              (e.g. "fiscal year 2025-2027")
+     agency                  (e.g. "NYC DEP")
+     infrastructure_project  (e.g. "Bluebelt expansion")
+ 
+ The doc_id for emission is `gliner_<source>` where `<source>` is the
+ RAG chunk's doc_id stripped of its `rag_` prefix. So `rag_comptroller`
+ becomes `gliner_comptroller`. The reconciler can then cite typed
+ fields with `[gliner_comptroller]`.
+ 
+ License: Apache-2.0 — `urchade/gliner_medium-v2.1` (NOT the
+ `gliner_base` variant, which is CC-BY-NC-4.0). See
+ experiments/shared/licenses.md.
+ """
+ 
+ from __future__ import annotations
+ 
+ import logging
+ import os
+ from dataclasses import dataclass
+ 
+ log = logging.getLogger("riprap.gliner")
+ 
+ ENTITY_LABELS = [
+     "nyc_location",
+     "dollar_amount",
+     "date_range",
+     "agency",
+     "infrastructure_project",
+ ]
+ 
+ DEFAULT_THRESHOLD = float(os.environ.get("RIPRAP_GLINER_THRESHOLD", "0.45"))
+ MODEL_NAME = os.environ.get("RIPRAP_GLINER_MODEL", "urchade/gliner_medium-v2.1")
+ ENABLE = os.environ.get("RIPRAP_GLINER_ENABLE", "1").lower() in ("1", "true", "yes")
+ 
+ _MODEL = None  # lazy
+ 
+ 
+ @dataclass
+ class Extraction:
+     label: str
+     text: str
+     score: float
+ 
+ 
+ def _ensure_model():
+     """Lazy GLiNER load. Returns None if disabled or load fails so
+     callers can silently fall back to no-op."""
+     global _MODEL
+     if not ENABLE:
+         return None
+     if _MODEL is not None:
+         return _MODEL
+     try:
+         from gliner import GLiNER
+         log.info("gliner: loading %s", MODEL_NAME)
+         _MODEL = GLiNER.from_pretrained(MODEL_NAME)
+     except Exception:
+         log.exception("gliner: load failed; specialist will no-op")
+         _MODEL = False  # sentinel
+     return _MODEL or None
+ 
+ 
+ def warm():
+     _ensure_model()
+ 
+ 
+ def _source_short(rag_doc_id: str) -> str:
+     """`rag_comptroller` -> `comptroller`. Anything not prefixed `rag_`
+     passes through unchanged."""
+     return rag_doc_id[4:] if rag_doc_id.startswith("rag_") else rag_doc_id
+ 
+ 
+ def extract_for_chunk(text: str, threshold: float = DEFAULT_THRESHOLD) -> list[Extraction]:
+     model = _ensure_model()
+     if model is None or not text:
+         return []
+     raw = model.predict_entities(text, ENTITY_LABELS, threshold=threshold)
+     return [Extraction(label=r["label"], text=r["text"],
+                        score=float(r["score"])) for r in raw]
+ 
+ 
+ def extract_for_rag_hits(hits: list[dict],
+                          threshold: float = DEFAULT_THRESHOLD,
+                          max_hits: int = 3) -> dict[str, dict]:
+     """Run GLiNER on the top-`max_hits` RAG hits. Returns a dict keyed by
+     short source id (e.g. "comptroller") with the structured payload
+     that the FSM stores into state["gliner"] and that
+     reconcile.build_documents() consumes."""
+     out: dict[str, dict] = {}
+     if not hits:
+         return out
+     for h in hits[:max_hits]:
+         source = _source_short(h.get("doc_id", "rag_unknown"))
+         ents = extract_for_chunk(h.get("text", ""), threshold=threshold)
+         if not ents:
+             continue
+         # Dedup verbatim repeats (common in agency PDFs that repeat
+         # "DEP" 13 times in a methodology section).
+         seen = set()
+         deduped: list[Extraction] = []
+         for e in ents:
+             key = (e.label, e.text.lower())
+             if key in seen:
+                 continue
+             seen.add(key)
+             deduped.append(e)
+         out[source] = {
+             "rag_doc_id": h.get("doc_id"),
+             "title": h.get("title"),
+             "paragraph_excerpt": h.get("text", "")[:240]
+                 + ("…" if len(h.get("text", "")) > 240 else ""),
+             "n_entities": len(deduped),
+             "entities": [{"label": e.label, "text": e.text,
+                           "score": round(e.score, 3)} for e in deduped],
+         }
+     return out
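
The dedup step inside `extract_for_rag_hits` keys on `(label, lowercased text)`, so "DEP" and "dep" under the same label collapse to one entry while the same text under a different label survives. A standalone sketch (`dedupe_entities` and the sample list are illustrative):

```python
def dedupe_entities(ents: list[dict]) -> list[dict]:
    # Keep the first occurrence of each (label, lowercased text) pair,
    # as the specialist does for repeated agency mentions.
    seen, out = set(), []
    for e in ents:
        key = (e["label"], e["text"].lower())
        if key in seen:
            continue
        seen.add(key)
        out.append(e)
    return out

ents = [{"label": "agency", "text": "DEP"},
        {"label": "agency", "text": "dep"},            # case-insensitive repeat
        {"label": "dollar_amount", "text": "$5.6 million"}]
print(len(dedupe_entities(ents)))  # 2
```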
app/context/microtopo.py CHANGED
@@ -123,10 +123,9 @@ def _row_col(transform, lat: float, lon: float) -> tuple[int, int]:
      """Inverse-affine: WGS84 (lon,lat) -> raster (row, col).
      Mirrors rasterio.transform.rowcol but without holding a dataset handle.
      """
-     # affine: x = a*col + b*row + c ; y = d*col + e*row + f
-     # invert: col = (a_inv * (x - c)) approx — we have a diagonal affine
-     a, b, c, d, e, f = transform.a, transform.b, transform.c, transform.d, transform.e, transform.f
-     # diagonal case (b=d=0, common for north-up rasters):
+     # Diagonal affine (north-up raster): x = a*col + c, y = e*row + f.
+     a, c = transform.a, transform.c
+     e, f = transform.e, transform.f
      col = int(round((lon - c) / a))
      row = int(round((lat - f) / e))
      return row, col
@@ -206,3 +205,70 @@ def microtopo_at(lat: float, lon: float, radius_m: int = 750) -> Microtopo | None:
          twi=twi_v,
          hand_m=hand_v,
      )
+ 
+ 
+ def microtopo_for_polygon(polygon, polygon_crs: str = "EPSG:4326") -> dict | None:
+     """Polygon-mode aggregation: distributional summary of the DEM/HAND/TWI
+     rasters clipped to the polygon. Returns medians + fraction of cells
+     in flood-prone bands. Used for neighborhood-mode queries."""
+     state = _load_dem()
+     if state is None:
+         return None
+     try:
+         import rasterio
+         from rasterio.mask import mask as rio_mask
+     except Exception:
+         return None
+     import geopandas as gpd
+ 
+     poly = gpd.GeoDataFrame(geometry=[polygon], crs=polygon_crs).to_crs("EPSG:4326")
+     geom = [poly.iloc[0].geometry.__geo_interface__]
+ 
+     def _stats(path: Path) -> dict | None:
+         if not path.exists():
+             return None
+         try:
+             with rasterio.open(path) as src:
+                 clipped, _ = rio_mask(src, geom, crop=True, filled=False)
+             arr = clipped[0]
+             vals = arr.compressed() if hasattr(arr, "compressed") else arr.flatten()
+             vals = vals[np.isfinite(vals)]
+             if vals.size == 0:
+                 return None
+             return {
+                 "n_cells": int(vals.size),
+                 "min": float(np.min(vals)),
+                 "median": float(np.median(vals)),
+                 "p10": float(np.percentile(vals, 10)),
+                 "p90": float(np.percentile(vals, 90)),
+                 "max": float(np.max(vals)),
+                 "raw": vals,
+             }
+         except Exception as e:
+             log.warning("polygon raster mask failed for %s: %r", path.name, e)
+             return None
+ 
+     elev = _stats(DEM_PATH)
+     hand = _stats(HAND_PATH)
+     twi = _stats(TWI_PATH)
+     if elev is None:
+         return None
+ 
+     # Fraction of polygon cells in canonical flood-prone bands
+     frac_hand_lt1 = (
+         round(float((hand["raw"] < 1.0).mean()), 4) if hand else None
+     )
+     frac_twi_gt10 = (
+         round(float((twi["raw"] > 10.0).mean()), 4) if twi else None
+     )
+     return {
+         "n_cells": elev["n_cells"],
+         "elev_min_m": round(elev["min"], 2),
+         "elev_median_m": round(elev["median"], 2),
+         "elev_p10_m": round(elev["p10"], 2),
+         "elev_max_m": round(elev["max"], 2),
+         "hand_median_m": round(hand["median"], 2) if hand else None,
+         "twi_median": round(twi["median"], 2) if twi else None,
+         "frac_hand_lt1": frac_hand_lt1,
+         "frac_twi_gt10": frac_twi_gt10,
+     }
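
The diagonal-affine inversion in `_row_col` reduces to two scalar divisions: with `lon = a*col + c` and `lat = e*row + f` (where `e` is negative because rows grow southward), inverting gives `col = (lon - c) / a` and `row = (lat - f) / e`. A standalone worked example — the grid parameters here are made up for illustration, not taken from the project's DEM:

```python
def row_col(a, c, e, f, lat, lon):
    # Diagonal (north-up) affine: lon = a*col + c, lat = e*row + f,
    # so the inverse is a pair of scalar divisions plus rounding to
    # the nearest cell index.
    col = int(round((lon - c) / a))
    row = int(round((lat - f) / e))
    return row, col

# Hypothetical grid: 0.0003° cells, top-left corner at (40.9 N, -74.3 E);
# e is negative since row 0 is the northernmost row.
print(row_col(a=0.0003, c=-74.3, e=-0.0003, f=40.9,
              lat=40.7006, lon=-74.0142))  # (665, 953)
```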
app/context/noaa_tides.py ADDED
@@ -0,0 +1,110 @@
1
+ """NOAA CO-OPS Tides & Currents — live coastal water level.
2
+
3
+ api.tidesandcurrents.noaa.gov, no auth, 6-min cadence.
4
+
5
+ We pick the nearest of three NYC-region stations to the queried address:
6
+ - 8518750 The Battery, NY
7
+ - 8516945 Kings Point, NY (Long Island Sound entrance)
8
+ - 8531680 Sandy Hook, NJ (NY Harbor approach)
9
+
10
+ The verified-water-level API returns instantaneous water elevation
11
+ relative to MLLW (Mean Lower Low Water — the local tidal datum). To
12
+ distinguish "high tide" from "storm surge" we also fetch the published
13
+ predicted tide and report the residual.
14
+ """
15
+ from __future__ import annotations
16
+
17
+ from dataclasses import dataclass
18
+ from math import asin, cos, radians, sin, sqrt
19
+
20
+ import httpx
21
+
22
+ DOC_ID = "noaa_tides"
23
+ CITATION = "NOAA CO-OPS Tides & Currents (api.tidesandcurrents.noaa.gov)"
24
+ URL = "https://api.tidesandcurrents.noaa.gov/api/prod/datagetter"
25
+
26
+ STATIONS = [
27
+ # (id, name, lat, lon)
28
+ # NYC harbor + Long Island Sound
29
+ ("8518750", "The Battery, NY", 40.7006, -74.0142),
30
+ ("8516945", "Kings Point, NY", 40.8103, -73.7649),
31
+ ("8531680", "Sandy Hook, NJ", 40.4669, -74.0094),
32
+ # Hudson tidal corridor (head-of-tide is Troy / Albany; Hudson is tidal
33
+ # all the way up to the Federal Lock at Troy)
34
+ ("8518995", "Albany, NY (Hudson)", 42.6469, -73.7464),
35
+ ("8518962", "Turkey Point Hudson, NY", 41.7569, -73.9433),
36
+ ("8519483", "West Point, NY", 41.3845, -73.9536),
37
+ ]
38
+
39
+
40
+ @dataclass
41
+ class TideReading:
42
+ station_id: str
43
+ station_name: str
44
+ distance_km: float
45
+ observed_ft: float | None # current water level above MLLW
46
+    predicted_ft: float | None   # astronomical prediction at same instant
+    residual_ft: float | None    # observed - predicted (≈ storm surge)
+    obs_time: str | None
+    error: str | None = None
+
+
+def _haversine_km(lat1, lon1, lat2, lon2) -> float:
+    R = 6371.0
+    p1, p2 = radians(lat1), radians(lat2)
+    dp = radians(lat2 - lat1); dl = radians(lon2 - lon1)
+    a = sin(dp/2)**2 + cos(p1)*cos(p2)*sin(dl/2)**2
+    return 2 * R * asin(sqrt(a))
+
+
+def _nearest_station(lat: float, lon: float):
+    return min(STATIONS, key=lambda s: _haversine_km(lat, lon, s[2], s[3]))
+
+
+def _fetch(station_id: str, product: str) -> dict:
+    r = httpx.get(URL, params={
+        "date": "latest", "station": station_id, "product": product,
+        "datum": "MLLW", "units": "english", "time_zone": "lst_ldt",
+        "format": "json",
+    }, timeout=8.0)
+    r.raise_for_status()
+    return r.json()
+
+
+def reading_at(lat: float, lon: float) -> TideReading:
+    sid, name, slat, slon = _nearest_station(lat, lon)
+    dist_km = round(_haversine_km(lat, lon, slat, slon), 1)
+    out = TideReading(station_id=sid, station_name=name, distance_km=dist_km,
+                      observed_ft=None, predicted_ft=None, residual_ft=None,
+                      obs_time=None)
+    try:
+        obs = _fetch(sid, "water_level").get("data") or []
+        pred = _fetch(sid, "predictions").get("predictions") or []
+        if obs:
+            out.observed_ft = round(float(obs[0]["v"]), 2)
+            out.obs_time = obs[0].get("t")
+        if pred:
+            out.predicted_ft = round(float(pred[0]["v"]), 2)
+        if out.observed_ft is not None and out.predicted_ft is not None:
+            out.residual_ft = round(out.observed_ft - out.predicted_ft, 2)
+    except Exception as e:
+        out.error = str(e)
+    return out
+
+
+def summary_for_point(lat: float, lon: float) -> dict:
+    r = reading_at(lat, lon)
+    # Look up station coords for the map marker.
+    sta = next((s for s in STATIONS if s[0] == r.station_id), None)
+    return {
+        "station_id": r.station_id,
+        "station_name": r.station_name,
+        "station_lat": sta[2] if sta else None,
+        "station_lon": sta[3] if sta else None,
+        "distance_km": r.distance_km,
+        "observed_ft_mllw": r.observed_ft,
+        "predicted_ft_mllw": r.predicted_ft,
+        "residual_ft": r.residual_ft,
+        "obs_time": r.obs_time,
+        "error": r.error,
+    }
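The nearest-station pick above is a plain haversine minimization over a small static list. A standalone sketch of the same pattern; the two station entries and their coordinates are approximate, for illustration only:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance on a 6371 km mean-radius sphere.
    R = 6371.0
    p1, p2 = radians(lat1), radians(lat2)
    dp = radians(lat2 - lat1)
    dl = radians(lon2 - lon1)
    a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
    return 2 * R * asin(sqrt(a))

# Two NOAA CO-OPS stations, (id, name, lat, lon) — coordinates approximate.
STATIONS = [
    ("8518750", "The Battery, NY", 40.7006, -74.0142),
    ("8516945", "Kings Point, NY", 40.8103, -73.7649),
]

def nearest_station(lat, lon):
    return min(STATIONS, key=lambda s: haversine_km(lat, lon, s[2], s[3]))

# A point in Lower Manhattan resolves to The Battery.
sid, name, *_ = nearest_station(40.703, -74.012)
```

Linear scan is fine here: with under a dozen stations per picker there is nothing to gain from a spatial index.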
app/context/nws_alerts.py ADDED
@@ -0,0 +1,71 @@
+"""NWS API — active alerts at a point.
+
+api.weather.gov/alerts/active?point={lat},{lon}, no auth, JSON.
+A User-Agent header is required (NWS rate-limits anonymous traffic).
+
+We surface only flood-relevant categories so the doc the reconciler
+sees is short and on-topic.
+"""
+from __future__ import annotations
+
+from typing import Any
+
+import httpx
+
+DOC_ID = "nws_alerts"
+CITATION = "NWS public alert API (api.weather.gov/alerts)"
+
+USER_AGENT = "Riprap-NYC/0.1 (civic-flood-tool; +https://huggingface.co/spaces/msradam/riprap-nyc)"
+
+_FLOOD_EVENT_KEYWORDS = (
+    "flood", "flash flood", "coastal flood", "high surf", "storm surge",
+    "hurricane", "tropical storm", "tornado warning",  # high-impact context
+    "rip current",
+)
+
+
+def _is_flood_relevant(event_name: str) -> bool:
+    e = (event_name or "").lower()
+    return any(k in e for k in _FLOOD_EVENT_KEYWORDS)
+
+
+def alerts_at(lat: float, lon: float) -> list[dict[str, Any]]:
+    r = httpx.get(
+        "https://api.weather.gov/alerts/active",
+        params={"point": f"{lat:.4f},{lon:.4f}"},
+        headers={"User-Agent": USER_AGENT, "Accept": "application/geo+json"},
+        timeout=8.0,
+    )
+    r.raise_for_status()
+    out = []
+    for f in r.json().get("features", []):
+        p = f.get("properties", {}) or {}
+        event = p.get("event") or ""
+        if not _is_flood_relevant(event):
+            continue
+        out.append({
+            "id": p.get("id"),
+            "event": event,
+            "severity": p.get("severity"),
+            "urgency": p.get("urgency"),
+            "certainty": p.get("certainty"),
+            "headline": p.get("headline"),
+            "sent": p.get("sent"),
+            "effective": p.get("effective"),
+            "expires": p.get("expires"),
+            "sender_name": p.get("senderName"),
+            "areaDesc": p.get("areaDesc"),
+        })
+    return out
+
+
+def summary_for_point(lat: float, lon: float) -> dict:
+    try:
+        active = alerts_at(lat, lon)
+    except Exception as e:
+        return {"n_active": 0, "alerts": [], "error": str(e)}
+    return {
+        "n_active": len(active),
+        "alerts": active,
+        "error": None,
+    }
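The relevance filter is case-insensitive substring matching, so an NWS event name like "Coastal Flood Advisory" matches several keywords at once (the bare "flood" entry already subsumes the compound flood keywords; they are kept for readability). A minimal standalone sketch of the filter:

```python
FLOOD_EVENT_KEYWORDS = (
    "flood", "flash flood", "coastal flood", "high surf", "storm surge",
    "hurricane", "tropical storm", "tornado warning", "rip current",
)

def is_flood_relevant(event_name: str) -> bool:
    # Case-insensitive substring match; None-safe via `or ""`.
    e = (event_name or "").lower()
    return any(k in e for k in FLOOD_EVENT_KEYWORDS)

# "Red Flag Warning" (fire weather) is correctly excluded; a missing
# event name never raises.
kept = [e for e in ("Coastal Flood Advisory", "Red Flag Warning",
                    "Flash Flood Warning", None)
        if is_flood_relevant(e)]
```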
app/context/nws_obs.py ADDED
@@ -0,0 +1,108 @@
+"""NWS station observations — latest hourly METAR for the nearest NYC airport.
+
+api.weather.gov/stations/{id}/observations/latest.
+
+Five NYC-region ASOS stations cover the city, plus four Hudson
+Corridor stations; we pick the nearest. Most useful field for flood
+context is hourly precipitation (the `precipitationLastHour`
+quantity, mm). The latest observation is typically <60 min old.
+"""
+from __future__ import annotations
+
+from dataclasses import dataclass
+from math import asin, cos, radians, sin, sqrt
+
+import httpx
+
+DOC_ID = "nws_obs"
+CITATION = "NWS station observations API (api.weather.gov/stations)"
+
+USER_AGENT = "Riprap-NYC/0.1 (civic-flood-tool; +https://huggingface.co/spaces/msradam/riprap-nyc)"
+
+# NYC + Hudson Corridor ASOS stations. Picker is haversine-nearest, so adding
+# upstate stations enables Albany / Poughkeepsie / Newburgh queries without
+# breaking NYC behaviour (NYC stations stay closer for NYC lat/lon).
+STATIONS = [
+    # NYC region
+    ("KNYC", "Central Park, NY", 40.7794, -73.9692),
+    ("KLGA", "LaGuardia Airport, NY", 40.7794, -73.8800),
+    ("KJFK", "JFK Airport, NY", 40.6413, -73.7781),
+    ("KEWR", "Newark Liberty, NJ", 40.6925, -74.1687),
+    ("KFRG", "Republic Farmingdale, NY", 40.7288, -73.4134),
+    # Hudson Corridor (south → north)
+    ("KHPN", "White Plains, NY", 41.0670, -73.7076),
+    ("KSWF", "Newburgh-Stewart, NY", 41.5042, -74.1048),
+    ("KPOU", "Poughkeepsie, NY", 41.6262, -73.8842),
+    ("KALB", "Albany Intl, NY", 42.7475, -73.8025),
+]
+
+
+@dataclass
+class Obs:
+    station_id: str
+    station_name: str
+    distance_km: float
+    obs_time: str | None
+    temp_c: float | None
+    precip_last_hour_mm: float | None
+    precip_last_3h_mm: float | None
+    precip_last_6h_mm: float | None
+    error: str | None = None
+
+
+def _haversine_km(lat1, lon1, lat2, lon2) -> float:
+    R = 6371.0
+    p1, p2 = radians(lat1), radians(lat2)
+    dp = radians(lat2 - lat1); dl = radians(lon2 - lon1)
+    a = sin(dp/2)**2 + cos(p1)*cos(p2)*sin(dl/2)**2
+    return 2 * R * asin(sqrt(a))
+
+
+def _val_mm(props, key) -> float | None:
+    """NWS returns {value: ..., unitCode: 'wmoUnit:mm'} per quantity.
+    Return the value rounded to 2 dp, in whatever unit the API reports
+    (mm for precipitation, °C for temperature); if value is null,
+    return None."""
+    q = (props or {}).get(key) or {}
+    v = q.get("value")
+    if v is None:
+        return None
+    return round(float(v), 2)
+
+
+def obs_at(lat: float, lon: float) -> Obs:
+    sid, name, slat, slon = min(STATIONS, key=lambda s: _haversine_km(lat, lon, s[2], s[3]))
+    dist_km = round(_haversine_km(lat, lon, slat, slon), 1)
+    out = Obs(station_id=sid, station_name=name, distance_km=dist_km,
+              obs_time=None, temp_c=None,
+              precip_last_hour_mm=None, precip_last_3h_mm=None,
+              precip_last_6h_mm=None)
+    try:
+        r = httpx.get(
+            f"https://api.weather.gov/stations/{sid}/observations/latest",
+            headers={"User-Agent": USER_AGENT, "Accept": "application/geo+json"},
+            timeout=8.0,
+        )
+        r.raise_for_status()
+        p = r.json().get("properties", {}) or {}
+        out.obs_time = p.get("timestamp")
+        out.temp_c = _val_mm(p, "temperature")
+        out.precip_last_hour_mm = _val_mm(p, "precipitationLastHour")
+        out.precip_last_3h_mm = _val_mm(p, "precipitationLast3Hours")
+        out.precip_last_6h_mm = _val_mm(p, "precipitationLast6Hours")
+    except Exception as e:
+        out.error = str(e)
+    return out
+
+
+def summary_for_point(lat: float, lon: float) -> dict:
+    o = obs_at(lat, lon)
+    return {
+        "station_id": o.station_id,
+        "station_name": o.station_name,
+        "distance_km": o.distance_km,
+        "obs_time": o.obs_time,
+        "temp_c": o.temp_c,
+        "precip_last_hour_mm": o.precip_last_hour_mm,
+        "precip_last_3h_mm": o.precip_last_3h_mm,
+        "precip_last_6h_mm": o.precip_last_6h_mm,
+        "error": o.error,
+    }
app/context/nyc311.py CHANGED
@@ -35,6 +35,8 @@ class Complaint:
     created_date: str
     address: str | None
     status: str | None
+    lat: float | None = None
+    lon: float | None = None
 
 
 def complaints_near(lat: float, lon: float, radius_m: float = 200,
@@ -46,12 +48,69 @@ def complaints_near(lat: float, lon: float, radius_m: float = 200,
         ts = since.replace(tzinfo=None).isoformat(timespec="seconds")
         where += f" AND created_date >= '{ts}'"
     r = httpx.get(URL, params={
-        "$select": "unique_key, descriptor, created_date, incident_address, status",
+        "$select": "unique_key, descriptor, created_date, incident_address, "
+                   "status, latitude, longitude",
         "$where": where,
         "$order": "created_date desc",
         "$limit": str(limit),
     }, timeout=30)
     r.raise_for_status()
+    out = []
+    for row in r.json():
+        lat = row.get("latitude")
+        lon = row.get("longitude")
+        try:
+            lat = float(lat) if lat is not None else None
+            lon = float(lon) if lon is not None else None
+        except Exception:
+            lat, lon = None, None
+        out.append(Complaint(
+            unique_key=row.get("unique_key", ""),
+            descriptor=row.get("descriptor", ""),
+            created_date=row.get("created_date", ""),
+            address=row.get("incident_address"),
+            status=row.get("status"),
+            lat=lat, lon=lon,
+        ))
+    return out
+
+
+def summary_for_point(lat: float, lon: float, radius_m: float = 200,
+                      years: int = 5) -> dict:
+    since = datetime.now(timezone.utc) - timedelta(days=365 * years)
+    cs = complaints_near(lat, lon, radius_m, since=since, limit=2000)
+    return _summarize(cs, years=years, radius_m=radius_m)
+
+
+def complaints_in_polygon(polygon, polygon_crs: str = "EPSG:4326",
+                          since: datetime | None = None,
+                          limit: int = 5000,
+                          simplify_tolerance: float = 0.0005) -> list[Complaint]:
+    """Pull flood-related complaints inside an arbitrary polygon via
+    Socrata's `within_polygon(location, 'MULTIPOLYGON(...)')` predicate.
+
+    NYC NTA polygons can have thousands of vertices and exceed Socrata's
+    URL length limit (414). We simplify in EPSG:4326 with a default
+    ~50 m tolerance, which collapses vertex count ~10-20× without
+    materially changing the contained-points result.
+
+    Polygon must be EPSG:4326 (lat/lon) for the Socrata query.
+    """
+    import geopandas as gpd
+    g = gpd.GeoDataFrame(geometry=[polygon], crs=polygon_crs).to_crs("EPSG:4326")
+    geom = g.iloc[0].geometry.simplify(simplify_tolerance, preserve_topology=True)
+    wkt = geom.wkt
+    where = f"{_DESC_CLAUSE} AND within_polygon(location, '{wkt}')"
+    if since:
+        ts = since.replace(tzinfo=None).isoformat(timespec="seconds")
+        where += f" AND created_date >= '{ts}'"
+    r = httpx.get(URL, params={
+        "$select": "unique_key, descriptor, created_date, incident_address, status",
+        "$where": where,
+        "$order": "created_date desc",
+        "$limit": str(limit),
+    }, timeout=60)
+    r.raise_for_status()
     return [
         Complaint(
             unique_key=row.get("unique_key", ""),
@@ -64,12 +123,28 @@ def complaints_near(lat: float, lon: float, radius_m: float = 200,
     ]
 
 
-def summary_for_point(lat: float, lon: float, radius_m: float = 200,
-                      years: int = 5) -> dict:
+def summary_for_polygon(polygon, polygon_crs: str = "EPSG:4326",
+                        years: int = 5) -> dict:
+    """Polygon-mode aggregation: counts of flood-related 311 complaints
+    inside the polygon over the trailing window."""
     since = datetime.now(timezone.utc) - timedelta(days=365 * years)
-    cs = complaints_near(lat, lon, radius_m, since=since, limit=2000)
+    cs = complaints_in_polygon(polygon, polygon_crs=polygon_crs, since=since)
+    return _summarize(cs, years=years, radius_m=None)
+
+
+def _summarize(cs: list[Complaint], years: int, radius_m: float | None) -> dict:
     by_year: Counter = Counter(c.created_date[:4] for c in cs if c.created_date)
     by_descriptor: Counter = Counter(c.descriptor for c in cs)
+    # Cap at 60 most-recent points for the map layer — keeps the SSE
+    # payload small while still showing meaningful clustering.
+    points = [
+        {"lat": c.lat, "lon": c.lon,
+         "descriptor": c.descriptor,
+         "date": c.created_date[:10],
+         "address": c.address}
+        for c in cs[:60]
+        if c.lat is not None and c.lon is not None
+    ]
     return {
         "n": len(cs),
         "radius_m": radius_m,
@@ -82,4 +157,5 @@ def summary_for_point(lat: float, lon: float, radius_m: float = 200,
              "address": c.address}
            for c in cs[:5]
        ],
+        "points": points,
    }
app/context/terramind_synthesis.py ADDED
@@ -0,0 +1,357 @@
+"""TerraMind v1 base as a real-time FSM node — DEM → ESRI LULC.
+
+Per user query: take the geocoded (lat, lon), pull a DEM patch from
+Riprap's existing NYC-wide LiDAR raster (already used by the microtopo
+specialist — no STAC dependency), run TerraMind to generate a
+plausible categorical land-cover map from the terrain context, and
+emit class fractions the reconciler can cite as a synthetic-prior
+context layer alongside the empirical and modeled flood evidence.
+
+Why DEM → LULC (and not DEM → S2L2A as initially prototyped):
+- LULC is *categorical* and *interpretable*. The output is one of
+  10 ESRI Land Cover classes per pixel; class fractions like "78%
+  Built Area" go straight into the briefing as cite-able claims.
+- S2L2A is 12-channel reflectance — uninterpretable downstream
+  without a separate segmentation head.
+- LULC is *comparable to ground truth*: NYC PLUTO land-use class
+  is already in the data layer; future calibration possible.
+
+Class label mapping is *tentative* against the ESRI 2020-2022 schema
+(which TerraMesh's LULC tokenizer was trained on). The doc body
+discloses the mapping as tentative and the reconciler is instructed
+to use hedged framing ("the synthetic land-cover prior identifies …
+likely class …") rather than asserting hard labels.
+
+Why this shape:
+- **No STAC dependency.** Microsoft Planetary Computer search has
+  been intermittent during this hackathon; the DEM raster is local
+  and always available.
+- **Real-time.** < 0.3 s synthesis + < 0.5 s DEM patch read on M3
+  CPU once warm.
+- **Honesty discipline.** Synthetic-prior tier, fourth epistemic
+  class alongside empirical / modeled / proxy.
+
+License: Apache-2.0 — `ibm-esa-geospatial/TerraMind-1.0-base`.
+"""
+
+from __future__ import annotations
+
+import logging
+import os
+import random
+import threading
+import time
+from typing import Any
+
+log = logging.getLogger("riprap.terramind")
+
+ENABLE = os.environ.get("RIPRAP_TERRAMIND_ENABLE", "1").lower() in ("1", "true", "yes")
+DEFAULT_STEPS = int(os.environ.get("RIPRAP_TERRAMIND_STEPS", "10"))
+DEFAULT_SEED = int(os.environ.get("RIPRAP_TERRAMIND_SEED", "42"))
+CHIP_PX = int(os.environ.get("RIPRAP_TERRAMIND_CHIP_PX", "224"))
+CHIP_M = CHIP_PX * 30  # NYC DEM is at 30 m -> 6.72 km square
+HALF_M = CHIP_M / 2
+
+_MODEL = None
+_INIT_LOCK = threading.Lock()
+
+# Tentative ESRI 2020-2022 Land Cover class mapping for TerraMind v1's
+# LULC tokenizer output (10 channels, argmax over channel axis -> class
+# index 0-9). The README/docs don't expose the exact mapping and the
+# tokenizer source confirms only "ESRI LULC" without a label table, so
+# the names below are best-effort. The doc body discloses tentativeness.
+LULC_CLASSES = [
+    "water",               # 0
+    "trees",               # 1
+    "grass",               # 2
+    "flooded_vegetation",  # 3
+    "crops",               # 4
+    "scrub_shrub",         # 5
+    "built_area",          # 6
+    "bare_ground",         # 7
+    "snow_ice",            # 8
+    "clouds_or_no_data",   # 9
+]
+
+
+def _has_required_deps() -> tuple[bool, str | None]:
+    """Probe deps. Distinguishes a *truly missing* package
+    (ModuleNotFoundError) from a *transient race* (other ImportError —
+    typically sklearn's "partially initialized module" from concurrent
+    imports inside the parallel-fanout block).
+
+    Truly missing returns (False, names). Transient race returns
+    (True, None) — let the caller try again, the import will resolve
+    on the next attempt once the racing thread finishes.
+    """
+    missing = []
+    for name in ("terratorch", "rasterio"):
+        try:
+            __import__(name)
+        except ModuleNotFoundError:
+            missing.append(name)
+        except ImportError:
+            # sklearn-style partial-init race; treat as available and
+            # let _ensure_model retry. Logged but not surfaced as missing.
+            log.debug("terramind: import race on %s, will retry on demand", name)
+    return (not missing, ", ".join(missing) if missing else None)
+
+
+_DEPS_OK, _DEPS_MISSING = _has_required_deps()
+
+
+def _ensure_model():
+    """Lazy load with a lock so the parallel-block worker can't double-init."""
+    global _MODEL
+    if _MODEL is not None:
+        return _MODEL
+    with _INIT_LOCK:
+        if _MODEL is not None:
+            return _MODEL
+        # Heavy import deferred to first call so module import stays cheap
+        # and HF Spaces (no terratorch) doesn't pay it at all.
+        import terratorch.models.backbones.terramind.model.terramind_register  # noqa
+        from terratorch.registry import FULL_MODEL_REGISTRY
+        log.info("terramind: loading v1 base generate (DEM -> LULC)")
+        m = FULL_MODEL_REGISTRY.build(
+            "terratorch_terramind_v1_base_generate",
+            modalities=["DEM"],
+            output_modalities=["LULC"],
+            pretrained=True,
+            timesteps=DEFAULT_STEPS,
+        )
+        m.eval()
+        _MODEL = m
+        log.info("terramind: model ready")
+        return _MODEL
+
+
+def warm():
+    """Call at app boot to amortize the ~6 s checkpoint load + first-call
+    JIT. No-op when deps are absent."""
+    if ENABLE and _DEPS_OK:
+        try:
+            _ensure_model()
+        except Exception:
+            log.exception("terramind: warm() failed; specialist will no-op")
+
+
+def _read_dem_patch(lat: float, lon: float):
+    """Read a CHIP_PX×CHIP_PX DEM patch centered on (lat, lon) from the
+    local NYC-wide LiDAR raster. Returns (array, bounds_4326) where
+    bounds_4326 is (minlon, minlat, maxlon, maxlat) so the synthesised
+    LULC can be georeferenced onto the same extent for map rendering.
+    Returns None if outside the raster's extent."""
+    from pathlib import Path
+
+    import numpy as np
+    import rasterio
+    from rasterio.windows import from_bounds
+    dem_path = (Path(__file__).resolve().parents[2]
+                / "data" / "nyc_dem_30m.tif")
+    if not dem_path.exists():
+        return None
+    with rasterio.open(dem_path) as src:
+        # The DEM is in EPSG:4326 (geographic) in our cache — convert
+        # the chip extent in the same CRS by building a rough degree
+        # bbox from a meters-square half-side at NYC latitude.
+        # 1 degree lat ≈ 111 km, 1 degree lon ≈ 85 km at 40.7°N.
+        d_lat = (HALF_M / 111_000.0)
+        d_lon = (HALF_M / 85_000.0)
+        win = from_bounds(lon - d_lon, lat - d_lat,
+                          lon + d_lon, lat + d_lat,
+                          src.transform)
+        arr = src.read(1, window=win, boundless=True, fill_value=0).astype("float32")
+    if arr.size == 0 or arr.shape[0] < 8 or arr.shape[1] < 8:
+        return None
+    # Resize to CHIP_PX × CHIP_PX via torch interpolation. The exact
+    # pixel-perfect alignment doesn't matter for a synthetic prior; the
+    # model just needs a real terrain patch to condition on.
+    import torch
+    t = torch.from_numpy(arr).unsqueeze(0).unsqueeze(0)
+    t = torch.nn.functional.interpolate(t, size=(CHIP_PX, CHIP_PX),
+                                        mode="bilinear", align_corners=False)
+    out = t.squeeze(0).numpy()  # (1, CHIP_PX, CHIP_PX)
+    # Replace NaN sentinel values with median elevation so the model
+    # doesn't see NaN tokens.
+    if np.isnan(out).any():
+        med = float(np.nanmedian(out))
+        out = np.nan_to_num(out, nan=med)
+    bounds_4326 = (lon - d_lon, lat - d_lat, lon + d_lon, lat + d_lat)
+    return out, bounds_4326
+
+
+# Map class index -> visual color for the categorical fill on the
+# MapLibre layer. Colors picked to be visually distinct from the
+# existing red (Sandy) / blue (DEP) / cyan (Prithvi) / orange (Ida HWM).
+LULC_FILL_COLORS = {
+    "water": "#0284c7",               # not used (we keep water clear so
+                                      # the underlying basemap shows)
+    "trees": "#16a34a",               # green
+    "grass": "#86efac",               # pale green
+    "flooded_vegetation": "#a3e635",  # lime
+    "crops": "#fde047",               # yellow
+    "scrub_shrub": "#bef264",
+    "built_area": "#9ca3af",          # neutral gray
+    "bare_ground": "#d6d3d1",         # warm light gray
+    "snow_ice": "#f3f4f6",
+    "clouds_or_no_data": "#000000",   # not used (kept transparent)
+}
+# Classes we don't render at all (transparent) — water is best left
+# uncolored so the basemap shoreline reads through; clouds/no-data is
+# semantically meaningless to fill.
+LULC_HIDE_CLASSES = {"water", "clouds_or_no_data"}
+
+
+def _polygonize_lulc(class_idx, bounds_4326: tuple) -> dict:
+    """Vectorize the per-pixel argmax classification into one MultiPolygon
+    per class label, then dump as a single GeoJSON FeatureCollection in
+    EPSG:4326. Each feature carries `label` + `class_idx` properties so
+    the frontend can colour by category.
+    """
+    import json
+
+    import geopandas as gpd
+    from rasterio.features import shapes
+    from rasterio.transform import from_bounds as transform_from_bounds
+    from shapely.geometry import shape
+
+    minlon, minlat, maxlon, maxlat = bounds_4326
+    h, w = class_idx.shape
+    transform = transform_from_bounds(minlon, minlat, maxlon, maxlat, w, h)
+    feats = []
+    for i, label in enumerate(LULC_CLASSES):
+        if label in LULC_HIDE_CLASSES:
+            continue
+        mask = (class_idx == i).astype("uint8")
+        if mask.sum() < 8:  # skip tiny noise
+            continue
+        polys = []
+        for geom, value in shapes(mask, mask=mask.astype(bool),
+                                  transform=transform):
+            if value != 1:
+                continue
+            polys.append(shape(geom))
+        if not polys:
+            continue
+        # Dissolve via geopandas + simplify lightly. The chip is 30 m
+        # per pixel and we don't need pixel-edge fidelity at urban zoom.
+        gdf = gpd.GeoDataFrame({"geometry": polys}, crs="EPSG:4326")
+        gdf["geometry"] = gdf.geometry.simplify(1e-4, preserve_topology=True)
+        for geom in gdf.geometry:
+            feats.append({
+                "type": "Feature",
+                "geometry": json.loads(gpd.GeoSeries([geom],
+                    crs="EPSG:4326").to_json())["features"][0]["geometry"],
+                "properties": {"label": label, "class_idx": i,
+                               "fill_color": LULC_FILL_COLORS.get(label, "#9ca3af")},
+            })
+    return {"type": "FeatureCollection", "features": feats}
+
+
+def fetch(lat: float, lon: float, timeout_s: float = 60.0) -> dict[str, Any]:
+    """Run the specialist. Returns:
+        { ok: bool,
+          skipped: str | None,
+          synthetic_modality: bool,
+          tim_chain: list[str],
+          diffusion_steps: int, diffusion_seed: int,
+          dem_mean_m: float,
+          class_fractions: dict[str, float],  # tentative ESRI labels
+          dominant_class: str,                # highest-fraction label
+          dominant_pct: float,
+          n_classes_observed: int,
+          chip_shape: list[int],
+          elapsed_s: float,
+          err: str | None }
+
+    Designed never to raise. Failures show up as ok=False with reason.
+    """
+    if not ENABLE:
+        return {"ok": False, "skipped": "RIPRAP_TERRAMIND_ENABLE=0"}
+    if not _DEPS_OK:
+        return {"ok": False, "skipped": f"deps unavailable: {_DEPS_MISSING}"}
+    t0 = time.time()
+    try:
+        import numpy as np
+        patch = _read_dem_patch(lat, lon)
+        if patch is None:
+            return {"ok": False, "skipped": "no DEM coverage at this point"}
+        dem, bounds_4326 = patch
+        dem_mean = float(dem.mean())
+
+        import torch
+        random.seed(DEFAULT_SEED)
+        torch.manual_seed(DEFAULT_SEED)
+
+        model = _ensure_model()
+        dem_t = torch.from_numpy(dem).unsqueeze(0).float()  # (1, 1, H, W)
+        if time.time() - t0 > timeout_s:
+            return {"ok": False, "skipped": "terramind exceeded budget"}
+
+        with torch.no_grad():
+            out = model({"DEM": dem_t}, timesteps=DEFAULT_STEPS,
+                        verbose=False)
+        lulc = out["LULC"]
+        if hasattr(lulc, "detach"):
+            lulc = lulc.detach().cpu().numpy()
+        if lulc.ndim == 4:
+            lulc = lulc[0]  # (n_classes, H, W)
+        # Argmax over class channel -> per-pixel class index, then
+        # fraction by class. This is the cite-able structured output.
+        class_idx = lulc.argmax(axis=0)  # (H, W)
+        unique, counts = np.unique(class_idx, return_counts=True)
+        total = float(class_idx.size)
+        fractions: dict[str, float] = {}
+        for u, c in zip(unique, counts, strict=False):
+            label = (LULC_CLASSES[int(u)] if 0 <= int(u) < len(LULC_CLASSES)
+                     else f"class_{int(u)}")
+            fractions[label] = round(100.0 * c / total, 2)
+        # Sort dominant -> tail for deterministic doc body ordering.
+        ordered = dict(sorted(fractions.items(),
+                              key=lambda kv: kv[1], reverse=True))
+        dominant_class = next(iter(ordered)) if ordered else "unknown"
+        dominant_pct = ordered.get(dominant_class, 0.0)
+        # Class indices map to TerraMesh's LULC tokenizer codebook; the
+        # exact label-to-index mapping isn't published. Surface a tentative
+        # name plus the raw index so a reader can see we're not asserting
+        # ground truth.
+        dominant_idx = next((i for i, lbl in enumerate(LULC_CLASSES)
+                             if lbl == dominant_class), -1)
+        dominant_display = (
+            f"class_{dominant_idx} (tentative: {dominant_class})"
+            if dominant_idx >= 0 else dominant_class
+        )
+
+        # Polygonize the categorical raster for the map layer.
+        # Best-effort — failure here doesn't fail the specialist.
+        try:
+            polygons_geojson = _polygonize_lulc(class_idx, bounds_4326)
+        except Exception:
+            log.exception("terramind: polygonize failed; skipping map layer")
+            polygons_geojson = None
+
+        return {
+            "ok": True,
+            "synthetic_modality": True,
+            "tim_chain": ["DEM", "LULC_synthetic"],
+            "diffusion_steps": DEFAULT_STEPS,
+            "diffusion_seed": DEFAULT_SEED,
+            "dem_mean_m": round(dem_mean, 2),
+            "class_fractions": ordered,
+            "dominant_class": dominant_class,
+            "dominant_class_display": dominant_display,
+            "dominant_pct": dominant_pct,
+            "n_classes_observed": len(ordered),
+            "chip_shape": list(lulc.shape),
+            "bounds_4326": list(bounds_4326),
+            "polygons_geojson": polygons_geojson,
+            "label_schema": "ESRI 2020-2022 Land Cover (tentative — "
+                            "TerraMind tokenizer source confirms ESRI but "
+                            "not exact label-to-index mapping)",
+            "elapsed_s": round(time.time() - t0, 2),
+        }
+    except Exception as e:
+        log.exception("terramind: fetch failed")
+        return {"ok": False, "err": f"{type(e).__name__}: {e}",
+                "elapsed_s": round(time.time() - t0, 2)}
app/flood_layers/dep_stormwater.py CHANGED
@@ -8,7 +8,6 @@ Four scenarios, all in EPSG:2263. Polygons are categorized by depth class:
 from __future__ import annotations
 
 from functools import lru_cache
-from pathlib import Path
 
 import geopandas as gpd
 
@@ -71,3 +70,42 @@ def join(assets: gpd.GeoDataFrame, scenario: str) -> gpd.GeoDataFrame:
 
 def label(scenario: str) -> str:
     return SCENARIOS[scenario]["label"]
+
+
+def coverage_for_polygon(polygon, scenario: str,
+                         polygon_crs: str = "EPSG:4326") -> dict:
+    """Polygon-level summary: what fraction of the input polygon falls into
+    each depth class for a given DEP scenario? Used in neighborhood mode.
+
+    Returns:
+        {
+          'scenario': scenario id,
+          'label': human-readable scenario name,
+          'fraction_any': fraction of polygon inside any flooded class,
+          'fraction_class': {1: f, 2: f, 3: f} fraction in each class,
+          'polygon_area_m2': total polygon area,
+        }
+    """
+    z = load(scenario)
+    poly_gdf = gpd.GeoDataFrame(geometry=[polygon], crs=polygon_crs).to_crs(NYC_CRS)
+    poly_geom = poly_gdf.iloc[0].geometry
+    poly_ft2 = float(poly_geom.area)
+    sqft_to_m2 = 0.092903
+    fraction_class = {1: 0.0, 2: 0.0, 3: 0.0}
+    if poly_ft2:
+        for cat in (1, 2, 3):
+            sub = z[z["Flooding_Category"] == cat]
+            if sub.empty:
+                continue
+            inter = sub.geometry.intersection(poly_geom)
+            inter = inter[~inter.is_empty]
+            ft2 = float(inter.area.sum()) if len(inter) else 0.0
+            fraction_class[cat] = round(ft2 / poly_ft2, 4)
+    fraction_any = round(sum(fraction_class.values()), 4)
+    return {
+        "scenario": scenario,
+        "label": label(scenario),
+        "fraction_any": fraction_any,
+        "fraction_class": fraction_class,
+        "polygon_area_m2": round(poly_ft2 * sqft_to_m2, 1),
+    }
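Once the per-class intersection areas are measured (in EPSG:2263 square feet), the summary is pure arithmetic: per-class fraction of the polygon, the sum as the any-class fraction (valid because the depth classes don't overlap), and a square-feet to square-meters conversion. A sketch of just that arithmetic, decoupled from geopandas (helper name is illustrative):

```python
SQFT_TO_M2 = 0.092903  # same constant as coverage_for_polygon

def coverage_summary(poly_ft2: float, class_ft2: dict[int, float]) -> dict:
    # class_ft2 maps depth class -> intersected area in sq ft.
    fraction_class = ({c: round(ft2 / poly_ft2, 4)
                       for c, ft2 in class_ft2.items()}
                      if poly_ft2 else {})
    return {
        "fraction_any": round(sum(fraction_class.values()), 4),
        "fraction_class": fraction_class,
        "polygon_area_m2": round(poly_ft2 * SQFT_TO_M2, 1),
    }

# 10,000 sq ft polygon: 1,200 sq ft in class 1, 300 in class 2.
s = coverage_summary(10_000.0, {1: 1_200.0, 2: 300.0, 3: 0.0})
```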
app/flood_layers/ida_hwm.py CHANGED
@@ -33,6 +33,7 @@ class HWMSummary:
     nearest_site: str | None
     nearest_elev_ft: float | None
     sample_sites: list[str]
+    points: list[dict] | None = None  # per-mark for the map layer


 def _haversine_m(lat1, lon1, lat2, lon2):
@@ -71,6 +72,17 @@ def summary_for_point(lat: float, lon: float, radius_m: int = 1000) -> HWMSummar
              if f["properties"].get("height_above_gnd") is not None]
     sites = [f["properties"].get("site_description") for _, f in in_radius]
     sites = [s for s in sites if s][:5]
+    points = []
+    for d, f in in_radius[:50]:  # cap so SSE payload stays small
+        flon, flat = f["geometry"]["coordinates"]
+        p = f["properties"]
+        points.append({
+            "lat": flat, "lon": flon,
+            "site": p.get("site_description"),
+            "elev_ft": p.get("elev_ft"),
+            "height_above_gnd_ft": p.get("height_above_gnd"),
+            "distance_m": round(d, 1),
+        })
     return HWMSummary(
         n_within_radius=len(in_radius),
         radius_m=radius_m,
@@ -80,4 +92,5 @@ def summary_for_point(lat: float, lon: float, radius_m: int = 1000) -> HWMSummar
         nearest_site=nf["properties"].get("site_description") if nf else None,
         nearest_elev_ft=nf["properties"].get("elev_ft") if nf else None,
         sample_sites=sites,
+        points=points,
     )
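The hunk above ranks high-water marks by distance via `_haversine_m`, whose body sits outside the diff context. For reference, a standard spherical-Earth haversine (a sketch of the usual formula, not necessarily the repo's exact implementation) looks like:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    # Great-circle distance in meters on a spherical Earth (R = 6371 km).
    r = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))
```

At NYC latitudes this is accurate to well under 1% over the sub-kilometer radii used here, which is why a flat-Earth approximation was not needed.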
app/flood_layers/prithvi_live.py ADDED
@@ -0,0 +1,299 @@
+"""Prithvi-EO 2.0 (Sen1Floods11 fine-tune) live water segmentation.
+
+A per-query specialist: pulls the most recent low-cloud Sentinel-2 L2A
+scene over the address from Microsoft Planetary Computer, runs the
+IBM-NASA flood-mapping fine-tune, and reports % water within 500 m.
+
+Distinct from `app/flood_layers/prithvi_water.py`, which serves the
+offline-precomputed 2021 Ida polygons. This one is *fresh observation*
+each query — different doc_id (`prithvi_live`), different epistemic
+claim, additive to the static layer.
+
+Network calls (STAC search + COG band reads) and a 300M-param model
+forward pass make this the slowest specialist after the LLM. Gated by
+RIPRAP_PRITHVI_LIVE_ENABLE so deployments without the deps installed
+silently skip it. Cloud-cover refuses out at 30%+ to honor the
+Sen1Floods11 training distribution.
+
+License: Apache-2.0 (verified — `ibm-nasa-geospatial/Prithvi-EO-2.0-
+300M-TL-Sen1Floods11`). See experiments/shared/licenses.md.
+"""
+
+from __future__ import annotations
+
+import logging
+import os
+import threading
+import time
+from typing import Any
+
+log = logging.getLogger("riprap.prithvi_live")
+
+ENABLE = os.environ.get("RIPRAP_PRITHVI_LIVE_ENABLE", "1").lower() in ("1", "true", "yes")
+SEARCH_DAYS = int(os.environ.get("RIPRAP_PRITHVI_LIVE_SEARCH_DAYS", "120"))
+MAX_CLOUD_PCT = float(os.environ.get("RIPRAP_PRITHVI_LIVE_MAX_CLOUD", "30"))
+DEVICE = os.environ.get("RIPRAP_PRITHVI_LIVE_DEVICE", "cpu")
+REPO = "ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11"
+
+# Sen1Floods11 expects 6 bands in this exact order.
+BANDS = ["B02", "B03", "B04", "B8A", "B11", "B12"]
+IMG_SIZE = 512  # Sen1Floods11 training crop
+CHIP_PX = 1024
+CHIP_M = CHIP_PX * 10
+HALF_M = CHIP_M / 2
+CENTER_RADIUS_M = 500
+PIXEL_M = 10
+
+_MODEL = None
+_RUN_MODEL = None
+_INIT_LOCK = threading.Lock()  # serializes lazy load if multiple threads
+                               # hit fetch() before _MODEL is populated
+
+
+def _has_required_deps() -> tuple[bool, str | None]:
+    """Heavy-EO deps (terratorch / planetary_computer / rioxarray /
+    pystac-client / xarray / einops) live in requirements-experiments.txt
+    only — they don't fit Riprap's HF Spaces' Py3.10 dep cone alongside
+    transformers<5 / hf_hub<1 / granite-tsfm<0.3.4 / mellea<0.4.
+
+    Probe each importable name once at module load. If any are missing,
+    fetch() returns a clean `skipped: deps_unavailable` outcome instead
+    of crashing with a noisy ModuleNotFoundError in the trace. Local
+    dev + AMD path have these installed and the specialist runs."""
+    missing = []
+    for name in ("terratorch", "planetary_computer", "pystac_client",
+                 "rioxarray", "xarray", "einops"):
+        try:
+            __import__(name)
+        except ImportError:
+            missing.append(name)
+    if missing:
+        return False, ", ".join(missing)
+    return True, None
+
+
+_DEPS_OK, _DEPS_MISSING = _has_required_deps()
+
+
+def warm():
+    """Optional pre-load. The FSM action is lazy too — calling warm()
+    here just amortizes the first-query cost at app boot."""
+    if not ENABLE:
+        return
+    try:
+        _ensure_model()
+    except Exception:
+        log.exception("prithvi_live: warm() failed; specialist will no-op")
+
+
+def _ensure_model():
+    global _MODEL, _RUN_MODEL
+    if _MODEL is not None:
+        return _MODEL, _RUN_MODEL
+    with _INIT_LOCK:
+        if _MODEL is not None:  # double-check inside the lock
+            return _MODEL, _RUN_MODEL
+        import importlib.util
+
+        from huggingface_hub import hf_hub_download
+        from terratorch.cli_tools import LightningInferenceModel
+        config_path = hf_hub_download(REPO, "config.yaml")
+        checkpoint = hf_hub_download(REPO, "Prithvi-EO-V2-300M-TL-Sen1Floods11.pt")
+        log.info("prithvi_live: loading model")
+        m = LightningInferenceModel.from_config(config_path, checkpoint)
+        m.model.eval()
+        if DEVICE == "cuda":
+            try:
+                import torch
+                if torch.cuda.is_available():
+                    m.model.cuda()
+            except Exception:
+                log.exception("prithvi_live: cuda move failed")
+
+        inference_py = hf_hub_download(REPO, "inference.py")
+        spec = importlib.util.spec_from_file_location("_prithvi_inference",
+                                                      inference_py)
+        mod = importlib.util.module_from_spec(spec)
+        spec.loader.exec_module(mod)
+        _MODEL = m
+        _RUN_MODEL = mod.run_model
+    return _MODEL, _RUN_MODEL
+
+
+def _search_recent_scene(lat: float, lon: float):
+    """Most recent low-cloud S2 L2A item near (lat, lon) in the last
+    SEARCH_DAYS days, or None."""
+    import datetime as dt
+
+    import planetary_computer as pc
+    from pystac_client import Client
+    end = dt.datetime.utcnow().date()
+    start = end - dt.timedelta(days=SEARCH_DAYS)
+    client = Client.open(
+        "https://planetarycomputer.microsoft.com/api/stac/v1",
+        modifier=pc.sign_inplace,
+    )
+    delta = 0.02
+    search = client.search(
+        collections=["sentinel-2-l2a"],
+        bbox=[lon - delta, lat - delta, lon + delta, lat + delta],
+        datetime=f"{start}/{end}",
+        query={"eo:cloud_cover": {"lt": MAX_CLOUD_PCT}},
+        max_items=20,
+    )
+    items = sorted(
+        search.items(),
+        key=lambda it: (it.properties.get("eo:cloud_cover", 100),
+                        -(it.datetime.timestamp() if it.datetime else 0)),
+    )
+    return items[0] if items else None
+
+
+def _build_chip(item, lat: float, lon: float):
+    """Returns (img, ref_da, epsg) — img is the (6, H, W) center-cropped
+    float32 array; ref_da is the rioxarray DataArray of the reference
+    band BEFORE the center crop (kept so we can compute the affine
+    transform for polygonization in EPSG:4326)."""
+    import numpy as np
+    import rioxarray  # noqa: F401
+    import xarray as xr
+    from pyproj import Transformer
+    if "proj:epsg" in item.properties:
+        epsg = int(item.properties["proj:epsg"])
+    else:
+        code = item.properties.get("proj:code", "")
+        if code.startswith("EPSG:"):
+            epsg = int(code.split(":", 1)[1])
+        else:
+            raise RuntimeError("STAC item missing proj:epsg / proj:code")
+    fwd = Transformer.from_crs("EPSG:4326", f"EPSG:{epsg}", always_xy=True)
+    cx, cy = fwd.transform(lon, lat)
+    xmin, xmax = cx - HALF_M, cx + HALF_M
+    ymin, ymax = cy - HALF_M, cy + HALF_M
+    ref = rioxarray.open_rasterio(item.assets[BANDS[0]].href, masked=False).squeeze(drop=True)
+    ref = ref.rio.clip_box(minx=xmin, miny=ymin, maxx=xmax, maxy=ymax)
+    ref = ref.isel(y=slice(0, CHIP_PX), x=slice(0, CHIP_PX))
+    arrs = [ref.astype("float32")]
+    for b in BANDS[1:]:
+        da = rioxarray.open_rasterio(item.assets[b].href, masked=False).squeeze(drop=True)
+        da = da.rio.clip_box(minx=xmin, miny=ymin, maxx=xmax, maxy=ymax)
+        if da.shape != ref.shape:
+            da = da.rio.reproject_match(ref)
+        arrs.append(da.astype("float32"))
+    stacked = xr.concat(arrs, dim="band", join="override").assign_coords(band=BANDS)
+    img = stacked.values  # (6, H, W)
+    # Center crop to IMG_SIZE x IMG_SIZE.
+    _, h, w = img.shape
+    sy, sx = (h - IMG_SIZE) // 2, (w - IMG_SIZE) // 2
+    img = img[:, sy:sy + IMG_SIZE, sx:sx + IMG_SIZE]
+    if img.mean() > 1:
+        img = img / 10000.0
+    return np.nan_to_num(img.astype("float32")), ref, epsg
+
+
+def _polygonize_mask(pred, ref_da, epsg: int) -> dict | None:
+    """Vectorize the binary water mask into an EPSG:4326 GeoJSON
+    FeatureCollection so the frontend can paint it on the MapLibre
+    map. Returns None on failure (best-effort — never raises into the
+    caller path)."""
+    try:
+        import json
+
+        import geopandas as gpd
+        from rasterio.features import shapes
+        from rasterio.transform import from_origin
+        from shapely.geometry import shape
+        # Reconstruct the affine transform of the center-cropped pred.
+        # ref_da has 1024 px at 10 m; we cropped to the central 512.
+        xs = ref_da.x.values
+        ys = ref_da.y.values
+        if len(xs) < IMG_SIZE or len(ys) < IMG_SIZE:
+            return None
+        # rioxarray gives pixel-centered coords; offset by half a pixel
+        # to the upper-left to build a from_origin transform.
+        sy = (len(ys) - IMG_SIZE) // 2
+        sx = (len(xs) - IMG_SIZE) // 2
+        # ys are descending (top-to-bottom); take the top of the crop.
+        top_y = float(ys[sy]) + (PIXEL_M / 2.0)
+        left_x = float(xs[sx]) - (PIXEL_M / 2.0)
+        transform = from_origin(left_x, top_y, PIXEL_M, PIXEL_M)
+        # Polygonize only the water class (1).
+        mask = (pred == 1).astype("uint8")
+        polys = []
+        for geom, value in shapes(mask, mask=mask.astype(bool),
+                                  transform=transform):
+            if value != 1:
+                continue
+            polys.append(shape(geom))
+        if not polys:
+            return {"type": "FeatureCollection", "features": []}
+        gdf = gpd.GeoDataFrame({"geometry": polys},
+                               crs=f"EPSG:{epsg}").to_crs("EPSG:4326")
+        # Simplify slightly to keep the SSE payload small (10 m raster
+        # over 5 km square = up to ~10 k tiny squares; simplification
+        # collapses adjacent water pixels into smooth polygons).
+        gdf["geometry"] = gdf.geometry.simplify(0.00005, preserve_topology=True)
+        return json.loads(gdf.to_json())
+    except Exception:
+        log.exception("prithvi_live: polygonize failed")
+        return None
+
+
+def fetch(lat: float, lon: float, timeout_s: float = 60.0) -> dict[str, Any]:
+    """Run the specialist. Returns a dict with at minimum:
+        { "ok": bool,
+          "skipped": str | None,   # reason if no observation
+          "item_id": str | None,
+          "item_datetime": str | None,
+          "cloud_cover": float | None,
+          "pct_water_within_500m": float | None,
+          "pct_water_full": float | None }
+    Designed to never raise; failures show up as ok=False with an `err`.
+    """
+    if not ENABLE:
+        return {"ok": False, "skipped": "RIPRAP_PRITHVI_LIVE_ENABLE=0"}
+    if not _DEPS_OK:
+        # Clean "not deployed here" signal instead of a ModuleNotFoundError
+        # surfaced as an exception. Same trace-card layout as ENABLE=0.
+        return {"ok": False,
+                "skipped": f"deps unavailable on this deployment: "
+                           f"{_DEPS_MISSING}"}
+    t0 = time.time()
+    try:
+        item = _search_recent_scene(lat, lon)
+        if item is None:
+            return {"ok": False, "skipped": f"no <{MAX_CLOUD_PCT}% cloud "
+                                            f"S2 in last {SEARCH_DAYS}d"}
+        cc = float(item.properties.get("eo:cloud_cover", -1))
+        if time.time() - t0 > timeout_s:
+            return {"ok": False, "skipped": "stac search exceeded budget"}
+        img, ref_da, epsg = _build_chip(item, lat, lon)
+        if time.time() - t0 > timeout_s:
+            return {"ok": False, "skipped": "chip build exceeded budget"}
+        model, run_model = _ensure_model()
+        x = img[None, :, None, :, :]  # (1, 6, 1, H, W)
+        pred_t = run_model(x, None, None, model.model, model.datamodule, IMG_SIZE)
+        import numpy as np
+        pred = pred_t[0].cpu().numpy().astype("uint8")
+        pct_full = float(100.0 * pred.mean())
+        yy, xx = np.indices(pred.shape)
+        cy, cx = pred.shape[0] // 2, pred.shape[1] // 2
+        radius_px = CENTER_RADIUS_M / PIXEL_M
+        circle = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius_px ** 2
+        pct_500 = float(100.0 * pred[circle].mean()) if circle.sum() else 0.0
+        # Polygonize the water mask into EPSG:4326 GeoJSON for the map.
+        polygons_geojson = _polygonize_mask(pred, ref_da, epsg)
+        return {
+            "ok": True,
+            "item_id": item.id,
+            "item_datetime": str(item.datetime),
+            "cloud_cover": cc,
+            "pct_water_full": pct_full,
+            "pct_water_within_500m": pct_500,
+            "polygons_geojson": polygons_geojson,
+            "elapsed_s": round(time.time() - t0, 2),
+        }
+    except Exception as e:
+        log.exception("prithvi_live: fetch failed")
+        return {"ok": False, "err": f"{type(e).__name__}: {e}",
+                "elapsed_s": round(time.time() - t0, 2)}
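The `fetch()` docstring above pins down a dict-only contract: `ok` / `skipped` / `err`, never a raised exception. A hypothetical caller, illustrating how a trace card might branch on that contract (the `describe` helper is illustrative, not part of the repo):

```python
def describe(result: dict) -> str:
    # One-line trace-card summary for any fetch() outcome.
    if result.get("ok"):
        return (f"scene {result['item_id']} ({result['cloud_cover']:.0f}% cloud): "
                f"{result['pct_water_within_500m']:.1f}% water within 500 m")
    if result.get("skipped"):
        # Disabled, deps missing, no clean scene, or budget exceeded.
        return f"no observation: {result['skipped']}"
    return f"specialist error: {result.get('err')}"
```

Because every outcome is a plain dict, the FSM step that wraps this specialist never needs its own try/except around the call itself.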
app/flood_layers/sandy_inundation.py CHANGED
@@ -17,7 +17,7 @@ def load() -> gpd.GeoDataFrame:
     return g[["geometry"]]


-def join(assets: gpd.GeoDataFrame) -> "gpd.pd.Series":
+def join(assets: gpd.GeoDataFrame) -> gpd.pd.Series:
     """Return a boolean Series indexed like assets: True if inside Sandy zone."""
     z = load()
     # spatial join avoids fragile unary union over messy public polygons
@@ -32,3 +32,31 @@ def join(assets: gpd.GeoDataFrame) -> gpd.pd.Series:
     s[:] = False
     s.iloc[list(flagged)] = True
     return s.reset_index(drop=True)
+
+
+def coverage_for_polygon(polygon, polygon_crs: str = "EPSG:4326") -> dict:
+    """Polygon-level summary: what fraction of the input polygon overlaps
+    the 2012 Sandy inundation extent? Used in neighborhood-mode queries.
+
+    Returns:
+        {
+          'overlap_area_m2': absolute overlap in m2,
+          'polygon_area_m2': total polygon area in m2,
+          'fraction': overlap / polygon_area, range [0, 1],
+          'inside': True if any overlap exists,
+        }
+    """
+    z = load().to_crs("EPSG:2263")  # NY State Plane Long Island, units = ft
+    poly_gdf = gpd.GeoDataFrame(geometry=[polygon], crs=polygon_crs).to_crs("EPSG:2263")
+    poly_geom = poly_gdf.iloc[0].geometry
+    inter = z.intersection(poly_geom)
+    inter = inter[~inter.is_empty]
+    overlap_ft2 = float(inter.area.sum()) if len(inter) else 0.0
+    poly_ft2 = float(poly_geom.area)
+    sqft_to_m2 = 0.092903
+    return {
+        "overlap_area_m2": round(overlap_ft2 * sqft_to_m2, 1),
+        "polygon_area_m2": round(poly_ft2 * sqft_to_m2, 1),
+        "fraction": round(overlap_ft2 / poly_ft2, 4) if poly_ft2 else 0.0,
+        "inside": overlap_ft2 > 0,
+    }
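`coverage_for_polygon` computes areas in EPSG:2263 (units of feet) and converts with `sqft_to_m2 = 0.092903`. The arithmetic of the returned dict can be exercised without any geometry; a minimal sketch mirroring just the area and fraction fields (assumed shape, not repo code):

```python
SQFT_TO_M2 = 0.092903  # same constant as coverage_for_polygon

def coverage_fraction(overlap_ft2: float, poly_ft2: float) -> dict:
    # Mirrors the m2 conversion and guarded division of the real function,
    # including the zero-area polygon edge case.
    return {
        "overlap_area_m2": round(overlap_ft2 * SQFT_TO_M2, 1),
        "polygon_area_m2": round(poly_ft2 * SQFT_TO_M2, 1),
        "fraction": round(overlap_ft2 / poly_ft2, 4) if poly_ft2 else 0.0,
        "inside": overlap_ft2 > 0,
    }
```

Doing the intersection in a projected, equal-unit CRS (State Plane feet) rather than EPSG:4326 degrees is what makes the fraction a true area ratio.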
app/fsm.py CHANGED
@@ -1,12 +1,14 @@
-"""HeliOS-NYC Burr FSM for address-query flood risk.
+"""Riprap Burr FSM linear specialist pipeline for one address.
 
-Linear pipeline; each action degrades gracefully (empty result -> no doc).
-The reconciler (Granite 4.1) only sees documents from specialists that
-actually produced data.
+Each action either produces a structured fact (which becomes a document
+the reconciler can cite) or stays silent on failure. The reconciler
+(Granite 4.1) only sees documents from specialists that actually
+produced data — the silence-over-confabulation contract.
 """
 from __future__ import annotations
 
 import logging
+import threading as _threading
 import time
 from typing import Any
 
@@ -14,15 +16,93 @@ import geopandas as gpd
 from burr.core import ApplicationBuilder, State, action
 from shapely.geometry import Point
 
-from app.context import floodnet, microtopo, nyc311
+from app.context import floodnet, microtopo, noaa_tides, nws_alerts, nws_obs, nyc311
 from app.energy import estimate as energy_estimate
 from app.flood_layers import dep_stormwater, ida_hwm, prithvi_water, sandy_inundation
 from app.geocode import geocode_one
+from app.live import floodnet_forecast as fn_forecast
+from app.live import ttm_forecast
 from app.rag import retrieve as rag_retrieve
 from app.reconcile import reconcile as run_reconcile
+from app.registers import doe_schools as r_schools
+from app.registers import doh_hospitals as r_hospitals
+from app.registers import mta_entrances as r_mta
+from app.registers import nycha as r_nycha
 
-log = logging.getLogger("helios_nyc.fsm")
+log = logging.getLogger("riprap.fsm")
 
+# NYC five-borough bbox. Specialists whose data sources are NYC-only
+# (Sandy 2012, NYC DEP Stormwater, FloodNet, NYC 311, NYC microtopo
+# raster, NYC Hurricane Ida Prithvi polygons) skip with an explicit
+# "out of NYC scope" reason when geocode lands outside this envelope.
+# Live specialists (NWS / NOAA / TTM) and the NY-State Ida HWMs run
+# unconditionally.
+_NYC_S, _NYC_W, _NYC_N, _NYC_E = 40.49, -74.27, 40.92, -73.69
+
+
+def _in_nyc(lat, lon) -> bool:
+    if lat is None or lon is None:
+        return False
+    return _NYC_S <= lat <= _NYC_N and _NYC_W <= lon <= _NYC_E
+
+
+# Thread-local hook so the streaming endpoint can subscribe to per-token
+# Granite output during reconcile, without threading a callback through
+# every Burr action signature.
+_FSM_LOCAL = _threading.local()
+
+
+def set_token_callback(on_token):
+    """Install a per-thread on_token(delta) callable for the next reconcile.
+    Pass None to clear."""
+    _FSM_LOCAL.on_token = on_token
+
+
+def _current_token_callback():
+    return getattr(_FSM_LOCAL, "on_token", None)
+
+
+def set_mellea_attempt_callback(fn):
+    _FSM_LOCAL.on_mellea_attempt = fn
+
+
+def _current_mellea_attempt_callback():
+    return getattr(_FSM_LOCAL, "on_mellea_attempt", None)
+
+
+def set_strict_mode(strict: bool):
+    """Per-thread flag — when True the linear FSM's reconcile step routes
+    through Mellea-validated rejection sampling instead of the standard
+    streaming reconciler. Disables token streaming for that step."""
+    _FSM_LOCAL.strict = bool(strict)
+
+
+def _current_strict_mode() -> bool:
+    return bool(getattr(_FSM_LOCAL, "strict", False))
+
+
+def set_planned_specialists(spec_names):
+    """Install a per-thread set of specialist names from the planner.
+
+    Used by step_reconcile to trim doc messages: documents whose family
+    prefix doesn't match any planned specialist are dropped before the
+    Mellea call. Cuts ~30-50% of prompt tokens on local Ollama, where
+    the FSM otherwise hands the reconciler every specialist's output
+    even if the planner only asked for a subset."""
+    _FSM_LOCAL.planned_specialists = set(spec_names) if spec_names else None
+
+
+def _current_planned_specialists():
+    return getattr(_FSM_LOCAL, "planned_specialists", None)
+
+
+# Canonical Burr: one action per specialist, sequential transitions.
+# A previous version of this module wrapped 16 specialists in a single
+# fan-out action that ran them concurrently in a ThreadPoolExecutor;
+# that path was removed because it sometimes hung after the fan-out
+# completed (Burr-internal post-action cleanup with custom executors)
+# and made the trace UI's per-step timing harder to reason about.
+# Parallelism, when wanted, belongs at the inference layer
+# (vLLM / Ollama NUM_PARALLEL), not the FSM.
 
 def _step(state: State, name: str) -> dict[str, Any]:
     """Append a step record to the trace; returns the dict so the action
@@ -67,7 +147,10 @@ def step_sandy(state: State) -> State:
     try:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
-            return state.update(trace=trace)
+            return state.update(sandy=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(sandy=None, trace=trace)
         pt = gpd.GeoDataFrame(geometry=[Point(state["lon"], state["lat"])], crs="EPSG:4326").to_crs("EPSG:2263")
         flag = bool(sandy_inundation.join(pt).iloc[0])
         rec["ok"] = True; rec["result"] = {"inside": flag}
@@ -75,7 +158,7 @@ def step_sandy(state: State) -> State:
     except Exception as e:
         rec["ok"] = False; rec["err"] = str(e)
         log.exception("sandy failed")
-        return state.update(trace=trace)
+        return state.update(sandy=None, trace=trace)
     finally:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
@@ -86,7 +169,10 @@ def step_dep(state: State) -> State:
     try:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
-            return state.update(trace=trace)
+            return state.update(dep=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(dep=None, trace=trace)
         pt = gpd.GeoDataFrame(geometry=[Point(state["lon"], state["lat"])], crs="EPSG:4326").to_crs("EPSG:2263")
         out: dict[str, Any] = {}
         for scen in ["dep_extreme_2080", "dep_moderate_2050", "dep_moderate_current"]:
@@ -101,7 +187,7 @@ def step_dep(state: State) -> State:
     except Exception as e:
         rec["ok"] = False; rec["err"] = str(e)
         log.exception("dep failed")
-        return state.update(trace=trace)
+        return state.update(dep=None, trace=trace)
     finally:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
@@ -112,7 +198,10 @@ def step_floodnet(state: State) -> State:
     try:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
-            return state.update(trace=trace)
+            return state.update(floodnet=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(floodnet=None, trace=trace)
         s = floodnet.summary_for_point(state["lat"], state["lon"], radius_m=600)
         s["radius_m"] = 600
         rec["ok"] = True
@@ -122,7 +211,7 @@ def step_floodnet(state: State) -> State:
     except Exception as e:
         rec["ok"] = False; rec["err"] = str(e)
         log.exception("floodnet failed")
-        return state.update(trace=trace)
+        return state.update(floodnet=None, trace=trace)
     finally:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
@@ -133,14 +222,17 @@ def step_311(state: State) -> State:
     try:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
-            return state.update(trace=trace)
+            return state.update(nyc311=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(nyc311=None, trace=trace)
         s = nyc311.summary_for_point(state["lat"], state["lon"], radius_m=200, years=5)
         rec["ok"] = True; rec["result"] = {"n": s["n"]}
         return state.update(nyc311=s, trace=trace)
     except Exception as e:
         rec["ok"] = False; rec["err"] = str(e)
         log.exception("311 failed")
-        return state.update(trace=trace)
+        return state.update(nyc311=None, trace=trace)
     finally:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
@@ -151,7 +243,7 @@ def step_ida_hwm(state: State) -> State:
     try:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
-            return state.update(trace=trace)
+            return state.update(ida_hwm=None, trace=trace)
         s = ida_hwm.summary_for_point(state["lat"], state["lon"], radius_m=800)
         if s is None:
             rec["ok"] = False; rec["err"] = "HWM data missing"
@@ -178,6 +270,9 @@ def step_prithvi(state: State) -> State:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
             return state.update(prithvi_water=None, trace=trace)
+        if not _in_nyc(state["lat"], state["lon"]):
+            rec["ok"] = False; rec["err"] = "out of NYC scope"
+            return state.update(prithvi_water=None, trace=trace)
         s = prithvi_water.summary_for_point(state["lat"], state["lon"])
         if s is None:
             rec["ok"] = False; rec["err"] = "Prithvi mask missing"
@@ -197,13 +292,375 @@ def step_prithvi(state: State) -> State:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
 
 @action(reads=["lat", "lon"], writes=["microtopo", "trace"])
 def step_microtopo(state: State) -> State:
     rec, trace = _step(state, "microtopo_lidar")
     try:
         if state.get("lat") is None:
             rec["ok"] = False; rec["err"] = "no coords"
-            return state.update(trace=trace)
         m = microtopo.microtopo_at(state["lat"], state["lon"])
         if m is None:
             rec["ok"] = False; rec["err"] = "DEM fetch failed"
@@ -223,12 +680,19 @@ def step_microtopo(state: State) -> State:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
 
-@action(reads=["geocode", "sandy", "dep", "floodnet", "nyc311", "microtopo", "ida_hwm", "prithvi_water"],
         writes=["rag", "trace"])
 def step_rag(state: State) -> State:
     rec, trace = _step(state, "rag_granite_embedding")
     try:
         geo = state.get("geocode") or {}
         sandy = state.get("sandy")
         dep = state.get("dep") or {}
         # Build a context-rich query so retrieval pulls policy paragraphs
@@ -240,7 +704,7 @@ def step_rag(state: State) -> State:
             bits.append(f"in {geo['borough']}")
         if sandy:
             bits.append("inside Hurricane Sandy 2012 inundation zone")
-        for k, v in dep.items():
             if v.get("depth_class", 0) > 0:
                 bits.append(f"in {v['depth_label']} pluvial scenario")
         bits.append("flood resilience plan, vulnerability, hardening, mitigation")
@@ -258,11 +722,59 @@ def step_rag(state: State) -> State:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
 
 @action(reads=["geocode", "sandy", "dep", "floodnet", "nyc311", "microtopo",
-               "ida_hwm", "prithvi_water", "rag"],
-        writes=["paragraph", "audit", "trace"])
 def step_reconcile(state: State) -> State:
-    rec, trace = _step(state, "reconcile_granite41")
     try:
         snap = {
             "geocode": state.get("geocode"),
@@ -273,52 +785,141 @@ def step_reconcile(state: State) -> State:
             "microtopo": state.get("microtopo"),
             "ida_hwm": state.get("ida_hwm"),
             "prithvi_water": state.get("prithvi_water"),
             "rag": state.get("rag"),
         }
-        para, audit = run_reconcile(snap, return_audit=True)
         rec["ok"] = True
-        rec["result"] = {
-            "paragraph_chars": len(para),
-            "dropped_sentences": len(audit["dropped"]),
-        }
-        return state.update(paragraph=para, audit=audit, trace=trace)
     except Exception as e:
         rec["ok"] = False; rec["err"] = str(e)
         log.exception("reconcile failed")
-        return state.update(paragraph="", audit={"raw": "", "dropped": []}, trace=trace)
     finally:
         rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
 
 
 def build_app(query: str):
-    return (
         ApplicationBuilder()
-        .with_actions(
-            geocode=step_geocode,
-            sandy=step_sandy,
-            dep=step_dep,
-            floodnet=step_floodnet,
-            nyc311=step_311,
-            microtopo=step_microtopo,
-            ida_hwm=step_ida_hwm,
-            prithvi=step_prithvi,
-            rag=step_rag,
-            reconcile=step_reconcile,
-        )
-        .with_transitions(
-            ("geocode", "sandy"),
-            ("sandy", "dep"),
-            ("dep", "floodnet"),
-            ("floodnet", "nyc311"),
-            ("nyc311", "microtopo"),
-            ("microtopo", "ida_hwm"),
-            ("ida_hwm", "prithvi"),
-            ("prithvi", "rag"),
-            ("rag", "reconcile"),
-        )
         .with_state(query=query, trace=[])
         .with_entrypoint("geocode")
-        .build()
     )
 
 
@@ -345,37 +946,115 @@ def run(query: str) -> dict[str, Any]:
         "microtopo": final_state.get("microtopo"),
         "ida_hwm": final_state.get("ida_hwm"),
         "prithvi_water": final_state.get("prithvi_water"),
         "rag": final_state.get("rag"),
         "paragraph": final_state.get("paragraph"),
         "audit": final_state.get("audit"),
         "energy": _summarize_energy(trace),
         "trace": trace,
     }
 
 
 def iter_steps(query: str):
-    """Yield (action_name, partial_state_dict) after each Burr action.
 
-    Used by the web UI for SSE streaming each yield is a "step lit up"
-    moment. The final yield carries the reconciled paragraph.
     """
     app = build_app(query)
-    last_trace_len = 0
-    for action_obj, result, state in app.iterate(halt_after=["reconcile"]):
-        trace = list(state.get("trace", []))
-        # Yield only the new trace records since the prior step
-        new_records = trace[last_trace_len:]
-        last_trace_len = len(trace)
-        for rec in new_records:
             yield {
                 "kind": "step",
-                "step": rec["step"],
-                "ok": rec.get("ok"),
-                "elapsed_s": rec.get("elapsed_s"),
-                "result": rec.get("result"),
-                "err": rec.get("err"),
             }
-    # final
     trace = state.get("trace", [])
     yield {
         "kind": "final",
@@ -387,8 +1066,22 @@ def iter_steps(query: str):
         "microtopo": state.get("microtopo"),
         "ida_hwm": state.get("ida_hwm"),
         "prithvi_water": state.get("prithvi_water"),
         "rag": state.get("rag"),
         "paragraph": state.get("paragraph"),
         "audit": state.get("audit"),
         "energy": _summarize_energy(trace),
     }
248
  if s is None:
249
  rec["ok"] = False; rec["err"] = "HWM data missing"
 
270
  if state.get("lat") is None:
271
  rec["ok"] = False; rec["err"] = "no coords"
272
  return state.update(prithvi_water=None, trace=trace)
273
+ if not _in_nyc(state["lat"], state["lon"]):
274
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
275
+ return state.update(prithvi_water=None, trace=trace)
276
  s = prithvi_water.summary_for_point(state["lat"], state["lon"])
277
  if s is None:
278
  rec["ok"] = False; rec["err"] = "Prithvi mask missing"
 
292
  rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
293
 
294
 
295
+ @action(reads=["lat", "lon"], writes=["prithvi_live", "trace"])
296
+ def step_prithvi_live(state: State) -> State:
297
+ """Live Sentinel-2 water segmentation via Prithvi-EO 2.0.
298
+
299
+ Network + 300M-param forward pass per query, so it's the slowest
300
+ specialist by far. Gracefully no-ops via the underlying module if
301
+ `RIPRAP_PRITHVI_LIVE_ENABLE=0` or if STAC / model load fails.
302
+ """
303
+ rec, trace = _step(state, "prithvi_eo_live")
304
+ try:
305
+ if state.get("lat") is None:
306
+ rec["ok"] = False; rec["err"] = "no coords"
307
+ return state.update(prithvi_live=None, trace=trace)
308
+ if not _in_nyc(state["lat"], state["lon"]):
309
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
310
+ return state.update(prithvi_live=None, trace=trace)
311
+ from app.flood_layers import prithvi_live
312
+ s = prithvi_live.fetch(state["lat"], state["lon"])
313
+ rec["ok"] = bool(s.get("ok"))
314
+ if not s.get("ok"):
315
+ rec["err"] = s.get("err") or s.get("skipped") or "no observation"
316
+ else:
317
+ rec["result"] = {
318
+ "scene_date": (s.get("item_datetime") or "")[:10],
319
+ "cloud_cover": s.get("cloud_cover"),
320
+ "pct_water_500m": s.get("pct_water_within_500m"),
321
+ "pct_water_5km": s.get("pct_water_full"),
322
+ }
323
+ return state.update(prithvi_live=s, trace=trace)
324
+ except Exception as e:
325
+ rec["ok"] = False; rec["err"] = str(e)
326
+ log.exception("prithvi_live failed")
327
+ return state.update(prithvi_live=None, trace=trace)
328
+ finally:
329
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
330
+
331
+
332
+ @action(reads=["lat", "lon"], writes=["ttm_311_forecast", "trace"])
333
+ def step_ttm_311_forecast(state: State) -> State:
334
+ """TTM r2 zero-shot forecast on weekly 311 flood-complaint counts
335
+ at this specific address (200 m radius). 52 weeks of context →
336
+ 4 weeks of forecast. Per-query, per-address, citable."""
337
+ rec, trace = _step(state, "ttm_311_forecast")
338
+ try:
339
+ if state.get("lat") is None:
340
+ rec["ok"] = False; rec["err"] = "no coords"
341
+ return state.update(ttm_311_forecast=None, trace=trace)
342
+ if not _in_nyc(state["lat"], state["lon"]):
343
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
344
+ return state.update(ttm_311_forecast=None, trace=trace)
345
+ s = ttm_forecast.weekly_311_forecast_for_point(state["lat"], state["lon"])
346
+ rec["ok"] = bool(s.get("available"))
347
+ if not rec["ok"]:
348
+ rec["err"] = s.get("reason", "unavailable")
349
+ else:
350
+ rec["result"] = {
351
+ "history_total": s.get("history_total_complaints"),
352
+ "history_recent_mean": s.get("history_recent_3mo_mean"),
353
+ "forecast_mean": s.get("forecast_mean_per_week"),
354
+ "forecast_peak": s.get("forecast_peak_per_week"),
355
+ "accelerating": s.get("accelerating"),
356
+ }
357
+ return state.update(ttm_311_forecast=s, trace=trace)
358
+ except Exception as e:
359
+ rec["ok"] = False; rec["err"] = str(e)
360
+ log.exception("ttm_311_forecast failed")
361
+ return state.update(ttm_311_forecast=None, trace=trace)
362
+ finally:
363
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
364
+
365
+
366
+ @action(reads=["lat", "lon"], writes=["terramind", "trace"])
367
+ def step_terramind(state: State) -> State:
368
+ """TerraMind v1 base — DEM → S2L2A synthesis as a per-query
369
+ cognitive-engine node. ~3-7s on M3 CPU. Output is a
370
+ *synthetic-prior* — explicitly fourth epistemic class alongside
371
+ empirical / modeled / proxy. Frame the doc body and reconciler
372
+ narration as 'plausible synthesis from terrain context', never
373
+ 'imaged' or 'reconstructed'."""
374
+ rec, trace = _step(state, "terramind_synthesis")
375
+ try:
376
+ if state.get("lat") is None:
377
+ rec["ok"] = False; rec["err"] = "no coords"
378
+ return state.update(terramind=None, trace=trace)
379
+ if not _in_nyc(state["lat"], state["lon"]):
380
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
381
+ return state.update(terramind=None, trace=trace)
382
+ from app.context import terramind_synthesis
383
+ s = terramind_synthesis.fetch(state["lat"], state["lon"])
384
+ rec["ok"] = bool(s.get("ok"))
385
+ if not s.get("ok"):
386
+ rec["err"] = s.get("err") or s.get("skipped") or "terramind unavailable"
387
+ else:
388
+ rec["result"] = {
389
+ "tim_chain": s.get("tim_chain"),
390
+ "diffusion_steps": s.get("diffusion_steps"),
391
+ "dem_mean_m": s.get("dem_mean_m"),
392
+ "synth_chip_shape": s.get("synth_chip_shape"),
393
+ "elapsed_s": s.get("elapsed_s"),
394
+ }
395
+ return state.update(terramind=s, trace=trace)
396
+ except Exception as e:
397
+ rec["ok"] = False; rec["err"] = str(e)
398
+ log.exception("terramind failed")
399
+ return state.update(terramind=None, trace=trace)
400
+ finally:
401
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
402
+
403
+
404
+ @action(reads=["lat", "lon"], writes=["noaa_tides", "trace"])
405
+ def step_noaa_tides(state: State) -> State:
406
+ rec, trace = _step(state, "noaa_tides")
407
+ try:
408
+ if state.get("lat") is None:
409
+ rec["ok"] = False; rec["err"] = "no coords"
410
+ return state.update(noaa_tides=None, trace=trace)
411
+ s = noaa_tides.summary_for_point(state["lat"], state["lon"])
412
+ rec["ok"] = s.get("error") is None
413
+ rec["result"] = {
414
+ "station": s.get("station_id"),
415
+ "observed_ft_mllw": s.get("observed_ft_mllw"),
416
+ "residual_ft": s.get("residual_ft"),
417
+ }
418
+ if s.get("error"): rec["err"] = s["error"]
419
+ return state.update(noaa_tides=s, trace=trace)
420
+ except Exception as e:
421
+ rec["ok"] = False; rec["err"] = str(e)
422
+ log.exception("noaa_tides failed")
423
+ return state.update(noaa_tides=None, trace=trace)
424
+ finally:
425
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
426
+
427
+
428
+ @action(reads=["lat", "lon"], writes=["nws_alerts", "trace"])
429
+ def step_nws_alerts(state: State) -> State:
430
+ rec, trace = _step(state, "nws_alerts")
431
+ try:
432
+ if state.get("lat") is None:
433
+ rec["ok"] = False; rec["err"] = "no coords"
434
+ return state.update(nws_alerts=None, trace=trace)
435
+ s = nws_alerts.summary_for_point(state["lat"], state["lon"])
436
+ rec["ok"] = s.get("error") is None
437
+ rec["result"] = {"n_active": s.get("n_active")}
438
+ if s.get("error"): rec["err"] = s["error"]
439
+ return state.update(nws_alerts=s, trace=trace)
440
+ except Exception as e:
441
+ rec["ok"] = False; rec["err"] = str(e)
442
+ log.exception("nws_alerts failed")
443
+ return state.update(nws_alerts=None, trace=trace)
444
+ finally:
445
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
446
+
447
+
448
+ @action(reads=["lat", "lon"], writes=["nws_obs", "trace"])
449
+ def step_nws_obs(state: State) -> State:
450
+ rec, trace = _step(state, "nws_obs")
451
+ try:
452
+ if state.get("lat") is None:
453
+ rec["ok"] = False; rec["err"] = "no coords"
454
+ return state.update(nws_obs=None, trace=trace)
455
+ s = nws_obs.summary_for_point(state["lat"], state["lon"])
456
+ rec["ok"] = s.get("error") is None
457
+ rec["result"] = {
458
+ "station": s.get("station_id"),
459
+ "p1h_mm": s.get("precip_last_hour_mm"),
460
+ "p6h_mm": s.get("precip_last_6h_mm"),
461
+ }
462
+ if s.get("error"): rec["err"] = s["error"]
463
+ return state.update(nws_obs=s, trace=trace)
464
+ except Exception as e:
465
+ rec["ok"] = False; rec["err"] = str(e)
466
+ log.exception("nws_obs failed")
467
+ return state.update(nws_obs=None, trace=trace)
468
+ finally:
469
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
470
+
471
+
472
+ @action(reads=["lat", "lon"], writes=["ttm_forecast", "trace"])
473
+ def step_ttm_forecast(state: State) -> State:
474
+ """Granite TTM r2 zero-shot forecast of the Battery surge residual."""
475
+ rec, trace = _step(state, "ttm_forecast")
476
+ try:
477
+ if state.get("lat") is None:
478
+ rec["ok"] = False; rec["err"] = "no coords"
479
+ return state.update(ttm_forecast=None, trace=trace)
480
+ s = ttm_forecast.summary_for_point(state["lat"], state["lon"])
481
+ if not s.get("available"):
482
+ rec["ok"] = False
483
+ rec["err"] = s.get("reason", "TTM unavailable")
484
+ return state.update(ttm_forecast=None, trace=trace)
485
+ rec["ok"] = True
486
+ rec["result"] = {
487
+ "context": s["context_length"],
488
+ "horizon": s["horizon_steps"],
489
+ "forecast_peak_ft": s["forecast_peak_ft"],
490
+ "forecast_peak_min_ahead": s["forecast_peak_minutes_ahead"],
491
+ "interesting": s["interesting"],
492
+ }
493
+ return state.update(ttm_forecast=s, trace=trace)
494
+ except Exception as e:
495
+ rec["ok"] = False; rec["err"] = str(e)
496
+ log.exception("ttm_forecast failed")
497
+ return state.update(ttm_forecast=None, trace=trace)
498
+ finally:
499
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
500
+
501
+
502
+ @action(reads=["lat", "lon"], writes=["floodnet_forecast", "trace"])
503
+ def step_floodnet_forecast(state: State) -> State:
504
+ """TTM r2 forecast of flood-event recurrence at the nearest FloodNet
505
+ sensor. Reuses the same (512, 96) singleton as ttm_311_forecast — no
506
+ additional model loaded into memory. Silent when the sensor has too
507
+ few historical events for a defensible forecast."""
508
+ rec, trace = _step(state, "floodnet_forecast")
509
+ try:
510
+ if state.get("lat") is None:
511
+ rec["ok"] = False; rec["err"] = "no coords"
512
+ return state.update(floodnet_forecast=None, trace=trace)
513
+ if not _in_nyc(state["lat"], state["lon"]):
514
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
515
+ return state.update(floodnet_forecast=None, trace=trace)
516
+ s = fn_forecast.summary_for_point(state["lat"], state["lon"])
517
+ rec["ok"] = bool(s.get("available"))
518
+ if not rec["ok"]:
519
+ rec["err"] = s.get("reason", "unavailable")
520
+ else:
521
+ rec["result"] = {
522
+ "sensor_id": s.get("sensor_id"),
523
+ "distance_m": s.get("distance_from_query_m"),
524
+ "history_28d": s.get("history_recent_28d_events"),
525
+ "forecast_28d": s.get("forecast_28d_expected_events"),
526
+ "accelerating": s.get("accelerating"),
527
+ }
528
+ return state.update(floodnet_forecast=s if rec["ok"] else None,
529
+ trace=trace)
530
+ except Exception as e:
531
+ rec["ok"] = False; rec["err"] = str(e)
532
+ log.exception("floodnet_forecast failed")
533
+ return state.update(floodnet_forecast=None, trace=trace)
534
+ finally:
535
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
536
+
537
+
538
+ @action(reads=["lat", "lon"], writes=["mta_entrances", "trace"])
539
+ def step_mta_entrances(state: State) -> State:
540
+ rec, trace = _step(state, "mta_entrance_exposure")
541
+ try:
542
+ if state.get("lat") is None:
543
+ rec["ok"] = False; rec["err"] = "no coords"
544
+ return state.update(mta_entrances=None, trace=trace)
545
+ if not _in_nyc(state["lat"], state["lon"]):
546
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
547
+ return state.update(mta_entrances=None, trace=trace)
548
+ s = r_mta.summary_for_point(state["lat"], state["lon"])
549
+ if not s.get("available"):
550
+ rec["ok"] = False; rec["err"] = "no entrances within radius"
551
+ return state.update(mta_entrances=None, trace=trace)
552
+ rec["ok"] = True
553
+ rec["result"] = {
554
+ "n_entrances": s["n_entrances"],
555
+ "n_inside_sandy_2012": s["n_inside_sandy_2012"],
556
+ "n_in_dep_extreme_2080": s["n_in_dep_extreme_2080"],
557
+ }
558
+ return state.update(mta_entrances=s, trace=trace)
559
+ except Exception as e:
560
+ rec["ok"] = False; rec["err"] = str(e)
561
+ log.exception("mta_entrances failed")
562
+ return state.update(mta_entrances=None, trace=trace)
563
+ finally:
564
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
565
+
566
+
567
+ @action(reads=["lat", "lon"], writes=["nycha_developments", "trace"])
568
+ def step_nycha(state: State) -> State:
569
+ rec, trace = _step(state, "nycha_development_exposure")
570
+ try:
571
+ if state.get("lat") is None:
572
+ rec["ok"] = False; rec["err"] = "no coords"
573
+ return state.update(nycha_developments=None, trace=trace)
574
+ if not _in_nyc(state["lat"], state["lon"]):
575
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
576
+ return state.update(nycha_developments=None, trace=trace)
577
+ s = r_nycha.summary_for_point(state["lat"], state["lon"])
578
+ if not s.get("available"):
579
+ rec["ok"] = False; rec["err"] = "no NYCHA developments within radius"
580
+ return state.update(nycha_developments=None, trace=trace)
581
+ rec["ok"] = True
582
+ rec["result"] = {
583
+ "n_developments": s["n_developments"],
584
+ "n_majority_inside_sandy_2012": s["n_majority_inside_sandy_2012"],
585
+ "n_with_dep_2080_overlap": s["n_with_dep_2080_overlap"],
586
+ }
587
+ return state.update(nycha_developments=s, trace=trace)
588
+ except Exception as e:
589
+ rec["ok"] = False; rec["err"] = str(e)
590
+ log.exception("nycha failed")
591
+ return state.update(nycha_developments=None, trace=trace)
592
+ finally:
593
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
594
+
595
+
596
+ @action(reads=["lat", "lon"], writes=["doe_schools", "trace"])
597
+ def step_doe_schools(state: State) -> State:
598
+ rec, trace = _step(state, "doe_school_exposure")
599
+ try:
600
+ if state.get("lat") is None:
601
+ rec["ok"] = False; rec["err"] = "no coords"
602
+ return state.update(doe_schools=None, trace=trace)
603
+ if not _in_nyc(state["lat"], state["lon"]):
604
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
605
+ return state.update(doe_schools=None, trace=trace)
606
+ s = r_schools.summary_for_point(state["lat"], state["lon"])
607
+ if not s.get("available"):
608
+ rec["ok"] = False; rec["err"] = "no schools within radius"
609
+ return state.update(doe_schools=None, trace=trace)
610
+ rec["ok"] = True
611
+ rec["result"] = {
612
+ "n_schools": s["n_schools"],
613
+ "n_inside_sandy_2012": s["n_inside_sandy_2012"],
614
+ "n_in_dep_extreme_2080": s["n_in_dep_extreme_2080"],
615
+ }
616
+ return state.update(doe_schools=s, trace=trace)
617
+ except Exception as e:
618
+ rec["ok"] = False; rec["err"] = str(e)
619
+ log.exception("doe_schools failed")
620
+ return state.update(doe_schools=None, trace=trace)
621
+ finally:
622
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
623
+
624
+
625
+ @action(reads=["lat", "lon"], writes=["doh_hospitals", "trace"])
626
+ def step_doh_hospitals(state: State) -> State:
627
+ rec, trace = _step(state, "doh_hospital_exposure")
628
+ try:
629
+ if state.get("lat") is None:
630
+ rec["ok"] = False; rec["err"] = "no coords"
631
+ return state.update(doh_hospitals=None, trace=trace)
632
+ if not _in_nyc(state["lat"], state["lon"]):
633
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
634
+ return state.update(doh_hospitals=None, trace=trace)
635
+ s = r_hospitals.summary_for_point(state["lat"], state["lon"])
636
+ if not s.get("available"):
637
+ rec["ok"] = False; rec["err"] = "no hospitals within radius"
638
+ return state.update(doh_hospitals=None, trace=trace)
639
+ rec["ok"] = True
640
+ rec["result"] = {
641
+ "n_hospitals": s["n_hospitals"],
642
+ "n_inside_sandy_2012": s["n_inside_sandy_2012"],
643
+ "n_in_dep_extreme_2080": s["n_in_dep_extreme_2080"],
644
+ }
645
+ return state.update(doh_hospitals=s, trace=trace)
646
+ except Exception as e:
647
+ rec["ok"] = False; rec["err"] = str(e)
648
+ log.exception("doh_hospitals failed")
649
+ return state.update(doh_hospitals=None, trace=trace)
650
+ finally:
651
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
652
+
653
+
654
  @action(reads=["lat", "lon"], writes=["microtopo", "trace"])
655
  def step_microtopo(state: State) -> State:
656
  rec, trace = _step(state, "microtopo_lidar")
657
  try:
658
  if state.get("lat") is None:
659
  rec["ok"] = False; rec["err"] = "no coords"
660
+ return state.update(microtopo=None, trace=trace)
661
+ if not _in_nyc(state["lat"], state["lon"]):
662
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
663
+ return state.update(microtopo=None, trace=trace)
664
  m = microtopo.microtopo_at(state["lat"], state["lon"])
665
  if m is None:
666
  rec["ok"] = False; rec["err"] = "DEM fetch failed"
 
680
  rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
681
 
682
 
683
+
684
+
685
+ @action(reads=["geocode", "sandy", "dep", "floodnet", "nyc311", "microtopo",
686
+ "ida_hwm", "prithvi_water", "noaa_tides", "nws_alerts", "nws_obs",
687
+ "ttm_forecast"],
688
  writes=["rag", "trace"])
689
  def step_rag(state: State) -> State:
690
  rec, trace = _step(state, "rag_granite_embedding")
691
  try:
692
  geo = state.get("geocode") or {}
693
+ if not _in_nyc(geo.get("lat"), geo.get("lon")):
694
+ rec["ok"] = False; rec["err"] = "out of NYC scope"
695
+ return state.update(rag=[], trace=trace)
696
  sandy = state.get("sandy")
697
  dep = state.get("dep") or {}
698
  # Build a context-rich query so retrieval pulls policy paragraphs
 
704
  bits.append(f"in {geo['borough']}")
705
  if sandy:
706
  bits.append("inside Hurricane Sandy 2012 inundation zone")
707
+ for v in dep.values():
708
  if v.get("depth_class", 0) > 0:
709
  bits.append(f"in {v['depth_label']} pluvial scenario")
710
  bits.append("flood resilience plan, vulnerability, hardening, mitigation")
 
722
  rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
723
 
724
 
725
+ @action(reads=["rag"], writes=["gliner", "trace"])
726
+ def step_gliner(state: State) -> State:
727
+ """GLiNER typed-entity extraction over the top RAG paragraphs.
728
+
729
+ Adds structured fields (`agency`, `dollar_amount`,
730
+ `infrastructure_project`, `nyc_location`, `date_range`) the
731
+ reconciler can cite with `[gliner_<source>]`. Silent no-op when
732
+ disabled via RIPRAP_GLINER_ENABLE=0 or when the model failed to
733
+ load — preserves the existing FSM contract.
734
+ """
735
+ rec, trace = _step(state, "gliner_extract")
736
+ try:
737
+ from app.context.gliner_extract import extract_for_rag_hits
738
+ hits = state.get("rag") or []
739
+ if not hits:
740
+ rec["ok"] = True
741
+ rec["result"] = {"sources": 0, "skipped": "no rag hits"}
742
+ return state.update(gliner={}, trace=trace)
743
+ out = extract_for_rag_hits(hits)
744
+ rec["ok"] = True
745
+ rec["result"] = {
746
+ "sources": len(out),
747
+ "totals_by_label": _label_counts(out),
748
+ }
749
+ return state.update(gliner=out, trace=trace)
750
+ except Exception as e:
751
+ rec["ok"] = False
752
+ rec["err"] = str(e)
753
+ log.exception("gliner failed")
754
+ return state.update(gliner={}, trace=trace)
755
+ finally:
756
+ rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
757
+
758
+
759
+ def _label_counts(gliner_out: dict[str, dict]) -> dict[str, int]:
760
+ counts: dict[str, int] = {}
761
+ for src in gliner_out.values():
762
+ for e in src.get("entities", []):
763
+ counts[e["label"]] = counts.get(e["label"], 0) + 1
764
+ return counts
765
+
766
+
767
  @action(reads=["geocode", "sandy", "dep", "floodnet", "nyc311", "microtopo",
768
+ "ida_hwm", "prithvi_water", "prithvi_live", "terramind",
769
+ "noaa_tides", "nws_alerts", "nws_obs", "ttm_forecast",
770
+ "ttm_311_forecast", "floodnet_forecast", "mta_entrances",
771
+ "nycha_developments", "doe_schools", "doh_hospitals",
772
+ "rag", "gliner"],
773
+ writes=["paragraph", "audit", "mellea", "trace"])
774
  def step_reconcile(state: State) -> State:
775
+ is_strict = _current_strict_mode()
776
+ rec, trace = _step(state, "mellea_reconcile_address" if is_strict else "reconcile_granite41")
777
+ mellea_meta = None
778
  try:
779
  snap = {
780
  "geocode": state.get("geocode"),
 
785
  "microtopo": state.get("microtopo"),
786
  "ida_hwm": state.get("ida_hwm"),
787
  "prithvi_water": state.get("prithvi_water"),
788
+ "noaa_tides": state.get("noaa_tides"),
789
+ "nws_alerts": state.get("nws_alerts"),
790
+ "nws_obs": state.get("nws_obs"),
791
+ "ttm_forecast": state.get("ttm_forecast"),
792
+ "ttm_311_forecast": state.get("ttm_311_forecast"),
793
+ "floodnet_forecast": state.get("floodnet_forecast"),
794
  "rag": state.get("rag"),
795
+ "gliner": state.get("gliner"),
796
+ "prithvi_live": state.get("prithvi_live"),
797
+ "terramind": state.get("terramind"),
798
+ "mta_entrances": state.get("mta_entrances"),
799
+ "nycha_developments": state.get("nycha_developments"),
800
+ "doe_schools": state.get("doe_schools"),
801
+ "doh_hospitals": state.get("doh_hospitals"),
802
  }
803
+ if is_strict:
804
+ from app.mellea_validator import DEFAULT_LOOP_BUDGET, reconcile_strict_streaming
805
+ from app.reconcile import EXTRA_SYSTEM_PROMPT, build_documents, trim_docs_to_plan
806
+ doc_msgs = build_documents(snap)
807
+ doc_msgs = trim_docs_to_plan(doc_msgs, _current_planned_specialists())
808
+ if not doc_msgs:
809
+ para = "No grounded data available for this address."
810
+ audit = {"raw": para, "dropped": []}
811
+ else:
812
+ token_cb = _current_token_callback()
813
+ attempt_cb = _current_mellea_attempt_callback()
814
+ mres = reconcile_strict_streaming(
815
+ doc_msgs, EXTRA_SYSTEM_PROMPT,
816
+ user_prompt="Write the cited paragraph now.",
817
+ loop_budget=DEFAULT_LOOP_BUDGET,
818
+ on_token=(lambda d, _ai: token_cb(d)) if token_cb else None,
819
+ on_attempt_end=attempt_cb,
820
+ )
821
+ para = mres["paragraph"]
822
+ audit = {"raw": para, "dropped": []}
823
+ mellea_meta = {
824
+ "rerolls": mres["rerolls"],
825
+ "n_attempts": mres["n_attempts"],
826
+ "requirements_passed": mres["requirements_passed"],
827
+ "requirements_failed": mres["requirements_failed"],
828
+ "requirements_total": mres["requirements_total"],
829
+ "model": mres["model"],
830
+ "loop_budget": mres["loop_budget"],
831
+ }
832
+ rec["result"] = {
833
+ "rerolls": (mellea_meta or {}).get("rerolls"),
834
+ "passed": (f"{len((mellea_meta or {}).get('requirements_passed') or [])}/"
835
+ f"{(mellea_meta or {}).get('requirements_total') or 0}"),
836
+ "paragraph_chars": len(para),
837
+ }
838
+ else:
839
+ para, audit = run_reconcile(snap, return_audit=True,
840
+ on_token=_current_token_callback())
841
+ rec["result"] = {
842
+ "paragraph_chars": len(para),
843
+ "dropped_sentences": len(audit["dropped"]),
844
+ }
845
  rec["ok"] = True
846
+ return state.update(paragraph=para, audit=audit,
847
+ mellea=mellea_meta, trace=trace)
848
  except Exception as e:
849
  rec["ok"] = False; rec["err"] = str(e)
850
  log.exception("reconcile failed")
851
+ return state.update(paragraph="", audit={"raw": "", "dropped": []},
852
+ mellea=None, trace=trace)
853
  finally:
854
  rec["elapsed_s"] = round(time.time() - rec["started_at"], 2)
855
 
856
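The strict path above delegates to Mellea's rejection sampling: generate a candidate paragraph, run every grounding requirement against it, and retry up to the loop budget. The control flow reduces to the loop below — a hypothetical sketch, not the Mellea API (function and check names are illustrative):

```python
from typing import Callable

def rejection_sample(generate: Callable[[], str],
                     checks: dict[str, Callable[[str], bool]],
                     loop_budget: int = 4) -> dict:
    """Return the first candidate that passes every requirement,
    else the last attempt along with the checks it failed."""
    attempt = ""
    failed: list[str] = []
    for n in range(1, loop_budget + 1):
        attempt = generate()
        failed = [name for name, ok in checks.items() if not ok(attempt)]
        if not failed:
            return {"paragraph": attempt, "n_attempts": n,
                    "requirements_failed": []}
    return {"paragraph": attempt, "n_attempts": loop_budget,
            "requirements_failed": failed}

# Toy usage: a "model" whose second draft satisfies both checks.
drafts = iter(["flood risk is TODO", "flood risk is low [sandy_2012]"])
res = rejection_sample(
    lambda: next(drafts),
    {"no_placeholder_tokens": lambda t: "TODO" not in t,
     "citations_present": lambda t: "[" in t},
)
print(res["n_attempts"])  # → 2
```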
 
857
+ import os as _os # noqa: E402
858
+
859
+ # Specialists that involve large spatial joins (every NYCHA development
860
+ # overlapped against multiple flood layers, every DOE school footprint
861
+ # joined to DEM/HAND, etc.) or per-query model inference (Prithvi-EO live
862
+ # STAC + ViT, TerraMind diffusion). They're ~1-3 minutes apiece on a
863
+ # laptop on the FIRST call (the lru_caches inside the registers warm up
864
+ # afterwards). The previous parallel-fan-out FSM hid that cost behind
865
+ # the longest single specialist; the linear FSM exposes it.
866
+ #
867
+ # Default OFF on local-Ollama so the demo briefing returns in well under
868
+ # 90 s. Enable explicitly with RIPRAP_HEAVY_SPECIALISTS=1 (e.g. on the
869
+ # AMD-vLLM path, where the reconciler's ~5 s leaves room for the joins).
870
+ _HEAVY_SPECIALISTS_ENABLED = _os.environ.get(
871
+ "RIPRAP_HEAVY_SPECIALISTS",
872
+ "0" if _os.environ.get("RIPRAP_LLM_PRIMARY", "ollama").lower() == "ollama" else "1",
873
+ ).lower() in ("1", "true", "yes")
874
+
875
+
876
  def build_app(query: str):
877
+ """Linear, single-action-per-step Burr application.
878
+
879
+ Order: cheap-first geo + flood layers, then live live network signals,
880
+ then RAG → reconcile. Heavy specialists (NYCHA / DOE / DOH register
881
+ joins, Prithvi-EO live STAC, TerraMind diffusion) are gated behind
882
+ RIPRAP_HEAVY_SPECIALISTS — see the module-level note above.
883
+ """
884
+ builder = (
885
  ApplicationBuilder()
 
886
  .with_state(query=query, trace=[])
887
  .with_entrypoint("geocode")
888
+ )
889
+
890
+ actions: dict[str, Any] = {
891
+ "geocode": step_geocode,
892
+ "sandy": step_sandy,
893
+ "dep": step_dep,
894
+ "floodnet": step_floodnet,
895
+ "nyc311": step_311,
896
+ "noaa_tides": step_noaa_tides,
897
+ "nws_alerts": step_nws_alerts,
898
+ "nws_obs": step_nws_obs,
899
+ "ttm_forecast": step_ttm_forecast,
900
+ "ttm_311_forecast": step_ttm_311_forecast,
901
+ "floodnet_forecast": step_floodnet_forecast,
902
+ "microtopo": step_microtopo,
903
+ "ida_hwm": step_ida_hwm,
904
+ "mta_entrances": step_mta_entrances,
905
+ "prithvi": step_prithvi, # baked GeoJSON polygons for Ida; cheap
906
+ }
907
+ if _HEAVY_SPECIALISTS_ENABLED:
908
+ actions["nycha"] = step_nycha
909
+ actions["doe_schools"] = step_doe_schools
910
+ actions["doh_hospitals"] = step_doh_hospitals
911
+ actions["prithvi_live"] = step_prithvi_live
912
+ actions["terramind"] = step_terramind
913
+ actions["rag"] = step_rag
914
+ actions["gliner"] = step_gliner
915
+ actions["reconcile"] = step_reconcile
916
+
917
+ # Sequential transitions — pair every adjacent action in the dict order.
918
+ keys = list(actions.keys())
919
+ transitions = list(zip(keys, keys[1:]))
920
+
921
+ return (
922
+ builder.with_actions(**actions).with_transitions(*transitions).build()
923
  )
924
 
925
 
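The transition wiring reduces to pairing each action name with its successor in dict order; the `zip` idiom in isolation, with a shortened action list for illustration:

```python
keys = ["geocode", "sandy", "dep", "rag", "reconcile"]
transitions = list(zip(keys, keys[1:]))
print(transitions[0], transitions[-1])  # → ('geocode', 'sandy') ('rag', 'reconcile')
```

Because the pairing is positional, gating a specialist out of the `actions` dict (as `RIPRAP_HEAVY_SPECIALISTS` does) automatically reroutes the chain around it with no transition-table edits.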
 
946
  "microtopo": final_state.get("microtopo"),
947
  "ida_hwm": final_state.get("ida_hwm"),
948
  "prithvi_water": final_state.get("prithvi_water"),
949
+ "terramind": final_state.get("terramind"),
950
+ "noaa_tides": final_state.get("noaa_tides"),
951
+ "nws_alerts": final_state.get("nws_alerts"),
952
+ "nws_obs": final_state.get("nws_obs"),
953
+ "ttm_forecast": final_state.get("ttm_forecast"),
954
+ "ttm_311_forecast": final_state.get("ttm_311_forecast"),
955
+ "floodnet_forecast": final_state.get("floodnet_forecast"),
956
+ "mta_entrances": final_state.get("mta_entrances"),
957
+ "nycha_developments": final_state.get("nycha_developments"),
958
+ "doe_schools": final_state.get("doe_schools"),
959
+ "doh_hospitals": final_state.get("doh_hospitals"),
960
  "rag": final_state.get("rag"),
961
  "paragraph": final_state.get("paragraph"),
962
  "audit": final_state.get("audit"),
963
+ "mellea": final_state.get("mellea"),
964
  "energy": _summarize_energy(trace),
965
  "trace": trace,
966
  }
967
 
968
 
969
  def iter_steps(query: str):
970
+ """Yield SSE-friendly events as the FSM runs.
971
 
972
+ Each Burr action emits exactly one trace record on completion; we
973
+ yield it as a `step` event the moment the iterate loop returns from
974
+ that action. Reconciler tokens stream through the threadlocal
975
+ `set_token_callback` (installed before this generator is iterated),
976
+ not through this queue.
977
+
978
+ Burr's `app.iterate(halt_after=["reconcile"])` runs synchronously,
979
+ yielding `(action, result, state)` after every action. We drive it
980
+ in a background thread so the per-action SSE events reach the
981
+ client as soon as each action returns, while the reconciler's
982
+ token callback fires concurrently from the same thread.
983
  """
984
+ import queue
985
+
986
+ q: queue.Queue[tuple[str, Any] | None] = queue.Queue()
987
+ seen_keys: set[tuple[str, float]] = set()
988
+
989
+ def _push_step(rec: dict) -> None:
990
+ key = (rec.get("step", ""), rec.get("started_at", 0.0))
991
+ if key in seen_keys:
992
+ return
993
+ seen_keys.add(key)
994
+ q.put(("step", rec))
995
+
996
  app = build_app(query)
997
+ final_state_holder: dict[str, Any] = {}
998
+
999
+ # Threadlocals are per-thread; the request thread (single_address.run
1000
+ # / neighborhood.run) sets the strict-mode flag, planner specialist
1001
+ # set, and token / Mellea-attempt callbacks, but Burr's app.iterate
1002
+ # runs in this generator's thread. Snapshot the request-thread state
1003
+ # and re-install on the iterate thread so step_reconcile sees them.
1004
+ _captured_strict = _current_strict_mode()
1005
+ _captured_planned = _current_planned_specialists()
1006
+ _captured_token_cb = _current_token_callback()
1007
+ _captured_mellea_cb = _current_mellea_attempt_callback()
1008
+
1009
+ def _run_iterate():
1010
+ set_strict_mode(_captured_strict)
1011
+ set_planned_specialists(_captured_planned)
1012
+ set_token_callback(_captured_token_cb)
1013
+ set_mellea_attempt_callback(_captured_mellea_cb)
1014
+ try:
1015
+ for _action_obj, _result, state in app.iterate(halt_after=["reconcile"]):
1016
+ final_state_holder["state"] = state
1017
+ # Each action appends one record to state.trace; emit the
1018
+ # most recent so the SSE client gets the step event the
1019
+ # moment Burr returns from that action.
1020
+ trace = state.get("trace") or []
1021
+ if trace:
1022
+ _push_step(trace[-1])
1023
+ except Exception as e:
1024
+ log.exception("iterate raised")
1025
+ q.put(("error", {"err": f"{type(e).__name__}: {e}"}))
1026
+ finally:
1027
+ set_strict_mode(False)
1028
+ set_planned_specialists(None)
1029
+ set_token_callback(None)
1030
+ set_mellea_attempt_callback(None)
1031
+ q.put(None) # sentinel
1032
+
1033
+ runner = _threading.Thread(target=_run_iterate, name="riprap-fsm",
1034
+ daemon=True)
1035
+ runner.start()
1036
+
1037
+ while True:
1038
+ item = q.get()
1039
+ if item is None:
1040
+ break
1041
+ kind, payload = item
1042
+ if kind == "step":
1043
  yield {
1044
  "kind": "step",
1045
+ "step": payload.get("step"),
1046
+ "ok": payload.get("ok"),
1047
+ "elapsed_s": payload.get("elapsed_s"),
1048
+ "result": payload.get("result"),
1049
+ "err": payload.get("err"),
1050
  }
1051
+ elif kind == "error":
1052
+ yield {"kind": "error", **payload}
1053
+
1054
+ runner.join(timeout=5)
1055
+ state = final_state_holder.get("state")
1056
+ if state is None:
1057
+ return
1058
  trace = state.get("trace", [])
1059
  yield {
1060
  "kind": "final",
 
1066
  "microtopo": state.get("microtopo"),
1067
  "ida_hwm": state.get("ida_hwm"),
1068
  "prithvi_water": state.get("prithvi_water"),
1069
+ "prithvi_live": state.get("prithvi_live"),
1070
+ "terramind": state.get("terramind"),
1071
+ "noaa_tides": state.get("noaa_tides"),
1072
+ "nws_alerts": state.get("nws_alerts"),
1073
+ "nws_obs": state.get("nws_obs"),
1074
+ "ttm_forecast": state.get("ttm_forecast"),
1075
+ "ttm_311_forecast": state.get("ttm_311_forecast"),
1076
+ "floodnet_forecast": state.get("floodnet_forecast"),
1077
+ "mta_entrances": state.get("mta_entrances"),
1078
+ "nycha_developments": state.get("nycha_developments"),
1079
+ "doe_schools": state.get("doe_schools"),
1080
+ "doh_hospitals": state.get("doh_hospitals"),
1081
  "rag": state.get("rag"),
1082
+ "gliner": state.get("gliner"),
1083
  "paragraph": state.get("paragraph"),
1084
  "audit": state.get("audit"),
1085
+ "mellea": state.get("mellea"),
1086
  "energy": _summarize_energy(trace),
1087
  }
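The producer/consumer handoff above (a worker thread feeds a queue, the generator drains it, `None` acts as the sentinel, and thread-local request state is snapshotted on the request thread then re-installed on the worker) can be sketched in isolation. All names below are illustrative stand-ins, not the app's actual helpers:

```python
import queue
import threading

_tls = threading.local()  # thread-local storage: each thread sees its own value

def set_strict_mode(v: bool) -> None:
    _tls.strict = v

def current_strict_mode() -> bool:
    return getattr(_tls, "strict", False)  # worker threads start unset

def stream_steps(steps):
    """Run steps on a worker thread; yield ("step", rec) events as they land."""
    q: queue.Queue = queue.Queue()
    captured = current_strict_mode()  # snapshot on the request thread

    def worker():
        set_strict_mode(captured)  # re-install so the worker sees the flag
        try:
            for name in steps:
                q.put(("step", {"step": name, "strict": current_strict_mode()}))
        finally:
            q.put(None)  # sentinel: tells the consumer the producer is done

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is None:
            break
        yield item

set_strict_mode(True)
events = list(stream_steps(["geocode", "reconcile"]))
```

Without the re-install step the worker would read `strict=False`, which is exactly the failure mode the comment in the source guards against.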
app/geocode.py CHANGED
@@ -1,21 +1,45 @@
1
- """NYC address geocoding via the city's public Geosupport service (no key).
2
 
3
- Uses NYC Department of City Planning's Geoclient-replacement via the open
4
- Geosearch API (geosearch.planninglabs.nyc) no auth required, NYC-only,
5
- runs against the public service. Stays inside the "open civic data" lane.
 
 
6
 
7
  Includes a borough-hint post-filter so Queens hyphenated-style addresses
8
- (e.g. "153-09 90 Ave, Jamaica, Queens") preferentially resolve to the
9
  borough the user named.
10
  """
11
  from __future__ import annotations
12
 
 
13
  import re
14
  from dataclasses import dataclass
15
 
16
  import httpx
17
 
 
 
18
  URL = "https://geosearch.planninglabs.nyc/v2/search"
19
 
20
  _BOROUGHS = ("Manhattan", "Bronx", "Brooklyn", "Queens", "Staten Island")
21
 
@@ -79,15 +103,83 @@ def geocode(text: str, limit: int = 5) -> list[GeocodeHit]:
79
  return out
80
 
81
 
82
  def geocode_one(text: str) -> GeocodeHit | None:
83
- """Return the best NYC match for `text`. If the user mentions a
84
- borough or neighborhood we recognize, filter candidates to that
85
- borough before picking the top hit. Avoids `183-12 Liberty Avenue,
86
- Queens` resolving to a Brooklyn match the API surfaced first."""
87
  hint = _detect_borough(text)
88
  hits = geocode(text, limit=8)
89
  if hint:
90
  in_boro = [h for h in hits if h.borough and h.borough.lower() == hint.lower()]
91
  if in_boro:
92
  return in_boro[0]
93
- return hits[0] if hits else None
1
+ """Address geocoding: NYC primary + national fallback.
2
 
3
+ NYC primary: NYC DCP Geosearch (geosearch.planninglabs.nyc), no auth,
4
+ NYC-only. It will fuzzy-match upstate addresses to NYC streets — e.g.
5
+ '257 Washington Ave, Albany NY' silently maps to Clinton Hill, Brooklyn.
6
+ We detect this via a non-NYC region or non-NYC ZIP and fall back to
7
+ OpenStreetMap Nominatim (no key, free, rate-limited per usage policy).
8
 
9
  Includes a borough-hint post-filter so Queens hyphenated-style addresses
10
+ (e.g. '153-09 90 Ave, Jamaica, Queens') preferentially resolve to the
11
  borough the user named.
12
  """
13
  from __future__ import annotations
14
 
15
+ import logging
16
  import re
17
  from dataclasses import dataclass
18
 
19
  import httpx
20
 
21
+ log = logging.getLogger("riprap.geocode")
22
+
23
  URL = "https://geosearch.planninglabs.nyc/v2/search"
24
+ NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"
25
+ NOMINATIM_UA = "Riprap-NYC/0.1 (civic-flood-tool; +https://huggingface.co/spaces/msradam/riprap-nyc)"
26
+
27
+ # NYC-bbox guard: lat 40.49–40.92, lon -74.27 to -73.69. Anything outside
28
+ # this is probably not NYC; treat NYC Geosearch hits outside it as bogus.
29
+ NYC_BBOX = (40.49, -74.27, 40.92, -73.69)
30
+
31
+ # NYC ZIP prefixes: 100–102 (Manhattan), 103 (Staten Island), 104 (Bronx),
31
+ # 110–111 / 113–114 / 116 (Queens), 112 (Brooklyn). Upstate NY is 12x–14x.
32
+ # We use this only as a HINT to escalate to Nominatim, not as a hard filter.
35
+ _UPSTATE_ZIP_RE = re.compile(r"\b1[2-4]\d{3}\b")
36
+ _NON_NYC_HINTS = re.compile(
37
+ r"\b(albany|troy|schenectady|saratoga|kingston|poughkeepsie|newburgh|"
38
+ r"yonkers|white plains|hudson|rhinebeck|peekskill|beacon|tarrytown|"
39
+ r"new paltz|catskill|tivoli|hyde park|coxsackie|cohoes|amsterdam|"
40
+ r"glens falls|lake george|nyack|garrison|cold spring|highland|saugerties)\b",
41
+ re.IGNORECASE,
42
+ )
43
 
44
  _BOROUGHS = ("Manhattan", "Bronx", "Brooklyn", "Queens", "Staten Island")
45
 
 
103
  return out
104
 
105
 
106
+ def _looks_upstate(text: str) -> bool:
107
+ """Heuristic: should this query bypass NYC Geosearch?"""
108
+ if _UPSTATE_ZIP_RE.search(text):
109
+ return True
110
+ if _NON_NYC_HINTS.search(text):
111
+ return True
112
+ return False
113
+
114
+
115
+ def _in_nyc_bbox(lat: float, lon: float) -> bool:
116
+ s, w, n, e = NYC_BBOX
117
+ return s <= lat <= n and w <= lon <= e
118
+
119
+
120
+ def geocode_nominatim(text: str) -> GeocodeHit | None:
121
+ """National OSM Nominatim fallback. Used when NYC Geosearch can't
122
+ plausibly answer the query."""
123
+ try:
124
+ r = httpx.get(NOMINATIM_URL, params={
125
+ "q": text, "format": "jsonv2", "addressdetails": "1",
126
+ "limit": 1, "countrycodes": "us",
127
+ }, headers={"User-Agent": NOMINATIM_UA}, timeout=10)
128
+ r.raise_for_status()
129
+ rows = r.json()
130
+ except Exception as e:
131
+ log.warning("Nominatim fetch failed: %r", e)
132
+ return None
133
+ if not rows:
134
+ return None
135
+ row = rows[0]
136
+ addr = row.get("address") or {}
137
+ label = row.get("display_name") or text
138
+ return GeocodeHit(
139
+ address=label,
140
+ borough=addr.get("city") or addr.get("town") or addr.get("village") or addr.get("county"),
141
+ lat=float(row["lat"]),
142
+ lon=float(row["lon"]),
143
+ bbl=None,
144
+ bin=None,
145
+ raw={"source": "nominatim", **row},
146
+ )
147
+
148
+
149
  def geocode_one(text: str) -> GeocodeHit | None:
150
+ """Best match for `text`, using NYC Geosearch primary with a national
151
+ OSM Nominatim fallback for upstate / non-NYC queries.
152
+
153
+ Strategy:
154
+ 1. If query mentions a known non-NYC city or has an upstate ZIP,
155
+ go straight to Nominatim — Geosearch will silently fuzzy-snap
156
+ '257 Washington Ave, Albany' to Clinton Hill Brooklyn otherwise.
157
+ 2. Otherwise try Geosearch with the borough-hint post-filter.
158
+ 3. If Geosearch returns nothing OR returns a hit outside the NYC
159
+ bbox (which means even Geosearch knows it isn't NYC), escalate
160
+ to Nominatim.
161
+ """
162
+ if _looks_upstate(text):
163
+ log.info("upstate hint detected in %r — using Nominatim", text)
164
+ hit = geocode_nominatim(text)
165
+ if hit:
166
+ return hit
167
+
168
  hint = _detect_borough(text)
169
  hits = geocode(text, limit=8)
170
  if hint:
171
  in_boro = [h for h in hits if h.borough and h.borough.lower() == hint.lower()]
172
  if in_boro:
173
  return in_boro[0]
174
+
175
+ if hits:
176
+ top = hits[0]
177
+ if top.lat is not None and _in_nyc_bbox(top.lat, top.lon):
178
+ return top
179
+ # Geosearch returned a hit, but it's outside the NYC bbox — that
180
+ # means even the NYC API thinks the answer isn't NYC. Try
181
+ # Nominatim before giving up.
182
+ log.info("Geosearch top hit outside NYC bbox (%s, %s) — falling back",
183
+ top.lat, top.lon)
184
+
185
+ return geocode_nominatim(text)
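The two guards that drive the fallback ladder, `_looks_upstate` and `_in_nyc_bbox`, are pure functions and easy to check in isolation. A self-contained restatement with the constants copied from the module (the ZIP check alone; the module also matches a list of non-NYC city names):

```python
import re

NYC_BBOX = (40.49, -74.27, 40.92, -73.69)         # (south, west, north, east)
_UPSTATE_ZIP_RE = re.compile(r"\b1[2-4]\d{3}\b")  # 12xxx-14xxx = upstate NY

def in_nyc_bbox(lat: float, lon: float) -> bool:
    s, w, n, e = NYC_BBOX
    return s <= lat <= n and w <= lon <= e

def looks_upstate(text: str) -> bool:
    # ZIP hint only; a match escalates straight to Nominatim.
    return bool(_UPSTATE_ZIP_RE.search(text))

print(looks_upstate("257 Washington Ave, Albany NY 12206"))  # True
print(in_nyc_bbox(42.6526, -73.7562))   # False: Albany is north of the box
print(in_nyc_bbox(40.6892, -73.9902))   # True: a Brooklyn point
```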
app/intents/__init__.py ADDED
@@ -0,0 +1,3 @@
1
+ """Per-intent execution modules. Each intent knows how to take a planner
2
+ Plan and run only the specialists relevant to it, returning a
3
+ reconciler-ready set of documents and a paragraph."""
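Each intent module repeats a tiny `_doc` helper that wraps one evidence source into a `document <id>` chat message, so the reconciler prompts can demand citations in square brackets like `[nta_resolve]`. A standalone restatement (the sample strings are illustrative placeholders):

```python
def doc(doc_id: str, body_lines: list[str]) -> dict:
    # Mirrors the `_doc` helper each intent module defines: one message per
    # evidence source, with the doc id embedded in the role string.
    return {"role": f"document {doc_id}", "content": "\n".join(body_lines)}

messages = [
    doc("nta_resolve", ["Source: NYC DCP NTA 2020.",
                        "Target neighborhood: Example Neighborhood (XX01)."]),
    {"role": "system", "content": "Cite document ids in square brackets."},
    {"role": "user", "content": "Write the briefing now."},
]
```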
app/intents/development_check.py ADDED
@@ -0,0 +1,324 @@
1
+ """development_check intent — "what are they building in <X> and is it risky?"
2
+
3
+ Pipeline:
4
+ 1. Resolve target text → NTA polygon
5
+ 2. Pull active DOB construction permits (NB / A1 / DM, last ~18 mo)
6
+ inside the polygon
7
+ 3. Cross-reference each permit with the Sandy + DEP scenarios already
8
+ loaded in memory
9
+ 4. Aggregate counts; rank flagged projects by severity
10
+ 5. Reconcile via Granite 4.1 with a development-briefing prompt that
11
+ names specific projects and addresses
12
+ """
13
+ from __future__ import annotations
14
+
15
+ import logging
16
+ import time
17
+ from typing import Any
18
+
19
+ from app import llm
20
+ from app.areas import nta
21
+ from app.context import dob_permits
22
+ from app.rag import retrieve as rag_retrieve
23
+
24
+ log = logging.getLogger("riprap.intent.development_check")
25
+
26
+ # Reconciler model — see app/reconcile.py for the env-var contract.
27
+ import os as _os # noqa: E402
28
+
29
+ OLLAMA_MODEL = _os.environ.get("RIPRAP_RECONCILER_MODEL",
30
+ _os.environ.get("RIPRAP_OLLAMA_MODEL", "granite4.1:8b"))
31
+
32
+ EXTRA_SYSTEM_PROMPT = """Write a flood-exposure briefing about active construction in an NYC neighborhood. Use ONLY the facts in the provided documents.
33
+
34
+ Output this markdown skeleton verbatim, filling each `<...>` with content drawn only from the documents. After every numerical claim, append the document id in square brackets — e.g. `<count> [dob_permits]`. Bold at most one phrase per section using `**...**`. Omit any section whose supporting facts are absent from the documents.
35
+
36
+ ```
37
+ **Status.**
38
+ <one sentence: name the neighborhood from [nta_resolve] and the headline counts from [dob_permits] (total active projects, fraction in Sandy zone, fraction in DEP scenarios)>.
39
+
40
+ **Flagged projects.**
41
+ - <project address from [dob_permits]> ([dob_permits]). <job_type_label> issued <date>; owner <owner_business>. <flood-layer summary>.
42
+ - <next project from [dob_permits], same pattern>
43
+ - <continue for each flagged project, max 6>
44
+
45
+ **Pattern.**
46
+ <1-2 sentences observing which streets concentrate the flagged projects and the new-building / major-alteration mix from [dob_permits]>.
47
+
48
+ **Policy context.**
49
+ <1 sentence per RAG hit, citing the agency name and [rag_*]>.
50
+ ```
51
+
52
+ Constraints:
53
+ - Copy addresses, BBLs, dates, and owner names verbatim from the documents — no paraphrasing.
54
+ - If [dob_permits] reports 0 flagged projects, omit the **Flagged projects.** section and say so in **Status.**.
55
+ - If only [nta_resolve] is present and no [dob_permits], output exactly: `No grounded data available for this neighborhood.`
56
+ """
57
+
58
+
59
+ def run(plan, query: str, progress_q=None, strict: bool = False) -> dict[str, Any]:
60
+ """Execute the development_check Plan. If progress_q is provided
61
+ (a queue.Queue), each finalized step record is put on it so a
62
+ streaming endpoint can render the trace live.
63
+
64
+ strict=True routes through Mellea-validated reconciliation (rejection
65
+ sampling against four grounding requirements). Disables token
66
+ streaming — the briefing arrives in one shot after Mellea's loop
67
+ settles. Trace gains a `mellea_validate` row with rerolls + which
68
+ requirements passed.
69
+ """
70
+ t0 = time.time()
71
+ trace: list[dict] = []
72
+
73
+ def _emit(r: dict):
74
+ if progress_q is not None:
75
+ progress_q.put({"kind": "step", **r})
76
+
77
+ target_text = next(
78
+ (t["text"] for t in plan.targets if t.get("type") in ("nta", "borough")),
79
+ None,
80
+ )
81
+ rec = {"step": "nta_resolve", "started_at": t0, "ok": False}
82
+ trace.append(rec)
83
+ # Try the planner's target first; if it didn't pick one, fall back to
84
+ # scanning the raw query text for any known neighborhood/borough name.
85
+ matches = nta.resolve(target_text) if target_text else []
86
+ if not matches:
87
+ log.info("planner gave no usable target (%r); scanning query %r",
88
+ target_text, query)
89
+ matches = nta.resolve_from_text(query)
90
+ if not matches:
91
+ rec["err"] = f"no NTA match in target={target_text!r} or query={query!r}"
92
+ rec["elapsed_s"] = round(time.time() - t0, 2)
93
+ return _empty(plan, query, trace, error=rec["err"])
94
+ target = matches[0]
95
+ rec["ok"] = True
96
+ rec["result"] = {"nta_code": target["nta_code"],
97
+ "nta_name": target["nta_name"],
98
+ "borough": target["borough"],
99
+ "bbox": list(target["geometry"].bounds)}
100
+ rec["elapsed_s"] = round(time.time() - t0, 2)
101
+ _emit(rec)
102
+
103
+ poly = target["geometry"]
104
+ docs: list[dict] = []
105
+ permits_summary = None
106
+ rag_out: list = []
107
+
108
+ # ---- DOB permits ------------------------------------------------------
109
+ p_t0 = time.time()
110
+ prec = {"step": "dob_permits_nta", "started_at": p_t0, "ok": False}
111
+ trace.append(prec)
112
+ try:
113
+ # top_n=5: 5 flagged projects in the doc context is the sweet spot —
114
+ # rich enough for a journalist briefing, cheap enough to stay under
115
+ # ~25 s reconcile on T4 with the 8b model.
116
+ permits_summary = dob_permits.summary_for_polygon(poly, top_n=5)
117
+ prec["ok"] = True
118
+ prec["result"] = {
119
+ "n_total": permits_summary["n_total"],
120
+ "n_in_sandy": permits_summary["n_in_sandy"],
121
+ "n_in_dep_any": permits_summary["n_in_dep_any"],
122
+ # Pin data so the UI can render permits the moment this step
123
+ # finishes, instead of waiting for the `final` event.
124
+ "all_pins": permits_summary["all_pins"],
125
+ }
126
+ except Exception as e:
127
+ prec["err"] = str(e)
128
+ log.exception("dob_permits failed")
129
+ prec["elapsed_s"] = round(time.time() - p_t0, 2)
130
+ _emit(prec)
131
+
132
+ # ---- RAG --------------------------------------------------------------
133
+ if "rag" in plan.specialists:
134
+ r_t0 = time.time()
135
+ rrec = {"step": "rag_dev", "started_at": r_t0, "ok": False}
136
+ trace.append(rrec)
137
+ try:
138
+ q = (f"flood resilience new construction development {target['nta_name']} "
139
+ f"{target['borough']} hardening building code")
140
+ rag_out = rag_retrieve(q, k=2, min_score=0.50)
141
+ rrec["ok"] = True
142
+ rrec["result"] = {"hits": len(rag_out)}
143
+ except Exception as e:
144
+ rrec["err"] = str(e)
145
+ rrec["elapsed_s"] = round(time.time() - r_t0, 2)
146
+ _emit(rrec)
147
+
148
+ # ---- documents --------------------------------------------------------
149
+ docs.append(_doc("nta_resolve", [
150
+ "Source: NYC DCP Neighborhood Tabulation Areas 2020.",
151
+ f"Target neighborhood: {target['nta_name']} (NTA {target['nta_code']}), "
152
+ f"in the borough of {target['borough']}.",
153
+ ]))
154
+ if permits_summary:
155
+ ps = permits_summary
156
+ body = [
157
+ "Source: NYC DOB Permit Issuance (Socrata ipu4-2q9a), filtered to "
158
+ "active New Building / Major Alteration / Demolition jobs in the "
159
+ "trailing 18 months. Cross-referenced with NYC Sandy 2012 "
160
+ "inundation extent and 3 DEP Stormwater scenarios.",
161
+ f"Total active major-construction projects in {target['nta_name']}: "
162
+ f"{ps['n_total']}.",
163
+ f"Of these: {ps['n_in_sandy']} fall inside the 2012 Sandy "
164
+ f"inundation zone; {ps['n_in_dep_any']} fall inside at least one "
165
+ f"DEP Stormwater scenario; {ps['n_in_dep_severe']} fall in the "
166
+ f"deeper DEP bands (1-4 ft or >4 ft).",
167
+ ]
168
+ if ps.get("by_job_type"):
169
+ mix = "; ".join(f"{n} {k}" for k, n in ps["by_job_type"].items())
170
+ body.append(f"Job-type mix: {mix}.")
171
+ for p in ps["flagged_top"]:
172
+ scen_str = (", ".join(p["dep_scenarios"]) or "none")
173
+ body.append(
174
+ f"- {p['address']}, {p['borough']} (BBL {p.get('bbl') or 'unknown'}). "
175
+ f"{p['job_type_label']}, permit issued {p['issuance_date']}, "
176
+ f"status {p['permit_status']}. "
177
+ f"Owner: {p.get('owner_business') or 'unknown'}. "
178
+ f"In Sandy zone: {p['in_sandy']}; in DEP scenarios: {scen_str}; "
179
+ f"max DEP depth class: {p['dep_max_class']}."
180
+ )
181
+ docs.append(_doc("dob_permits", body))
182
+ for h in rag_out:
183
+ docs.append(_doc(h["doc_id"], [
184
+ f"Source: {h['citation']}, page {h.get('page', '')}.",
185
+ f"Retrieved passage (verbatim): {h['text']}",
186
+ ]))
187
+
188
+ # ---- reconcile --------------------------------------------------------
189
+ rec_t0 = time.time()
190
+ rec_step = {"step": "reconcile_development", "started_at": rec_t0, "ok": False}
191
+ trace.append(rec_step)
192
+ paragraph = ""
193
+ audit = {"raw": "", "dropped": []}
194
+ mellea_meta = None
195
+ if len(docs) <= 1:
196
+ paragraph = ("**Status.** No active construction permit data available "
197
+ f"for {target['nta_name']} [nta_resolve].")
198
+ audit = {"raw": paragraph, "dropped": []}
199
+ rec_step["ok"] = True
200
+ elif strict:
201
+ # Streaming Mellea path: tokens stream during each attempt; on
202
+ # validation failure we emit a mellea_attempt event and reroll.
203
+ rec_step["step"] = "mellea_reconcile_development"
204
+ try:
205
+ from app.mellea_validator import DEFAULT_LOOP_BUDGET, reconcile_strict_streaming
206
+ from app.reconcile import trim_docs_to_plan as _trim
207
+ docs = _trim(docs, set(plan.specialists or []))
208
+ def _on_token(delta: str, attempt_idx: int):
209
+ if progress_q is not None:
210
+ progress_q.put({"kind": "token", "delta": delta,
211
+ "attempt": attempt_idx})
212
+ def _on_attempt_end(attempt_idx, passed, failed):
213
+ if progress_q is not None:
214
+ progress_q.put({"kind": "mellea_attempt",
215
+ "attempt": attempt_idx,
216
+ "passed": passed, "failed": failed})
217
+ mres = reconcile_strict_streaming(
218
+ docs, EXTRA_SYSTEM_PROMPT,
219
+ user_prompt="Write the development briefing now.",
220
+ model=OLLAMA_MODEL, loop_budget=DEFAULT_LOOP_BUDGET,
221
+ on_token=_on_token if progress_q else None,
222
+ on_attempt_end=_on_attempt_end if progress_q else None,
223
+ )
224
+ paragraph = mres["paragraph"]
225
+ audit = {"raw": paragraph, "dropped": []}
226
+ mellea_meta = {
227
+ "rerolls": mres["rerolls"],
228
+ "n_attempts": mres["n_attempts"],
229
+ "requirements_passed": mres["requirements_passed"],
230
+ "requirements_failed": mres["requirements_failed"],
231
+ "requirements_total": mres["requirements_total"],
232
+ "model": mres["model"],
233
+ "loop_budget": mres["loop_budget"],
234
+ }
235
+ rec_step["ok"] = True
236
+ rec_step["result"] = {
237
+ "rerolls": mellea_meta["rerolls"],
238
+ "passed": f"{len(mellea_meta['requirements_passed'])}/{mellea_meta['requirements_total']}",
239
+ "paragraph_chars": len(paragraph),
240
+ }
241
+ except Exception as e:
242
+ rec_step["err"] = str(e)
243
+ log.exception("Mellea-validated reconcile failed")
244
+ paragraph = ""
245
+ audit = {"raw": "", "dropped": []}
246
+ else:
247
+ def _on_token(delta: str):
248
+ if progress_q is not None:
249
+ progress_q.put({"kind": "token", "delta": delta})
250
+ try:
251
+ paragraph, audit = _reconcile(docs, on_token=_on_token if progress_q else None)
252
+ rec_step["ok"] = True
253
+ rec_step["result"] = {"paragraph_chars": len(paragraph),
254
+ "dropped": len(audit["dropped"])}
255
+ except Exception as e:
256
+ rec_step["err"] = str(e)
257
+ log.exception("development reconcile failed")
258
+ rec_step["elapsed_s"] = round(time.time() - rec_t0, 2)
259
+ _emit(rec_step)
260
+
261
+ target_safe = {k: v for k, v in target.items() if k != "geometry"}
262
+ target_safe["bbox"] = list(target["geometry"].bounds)
263
+ return {
264
+ "intent": "development_check",
265
+ "query": query,
266
+ "plan": {
267
+ "intent": plan.intent,
268
+ "targets": plan.targets,
269
+ "specialists": plan.specialists,
270
+ "rationale": plan.rationale,
271
+ },
272
+ "target": target_safe,
273
+ "n_matches": len(matches),
274
+ "dob_summary": permits_summary,
275
+ "rag": rag_out,
276
+ "paragraph": paragraph,
277
+ "audit": audit,
278
+ "mellea": mellea_meta,
279
+ "trace": trace,
280
+ "total_s": round(time.time() - t0, 2),
281
+ }
282
+
283
+
284
+ def _doc(doc_id: str, body_lines: list[str]) -> dict:
285
+ return {"role": f"document {doc_id}", "content": "\n".join(body_lines)}
286
+
287
+
288
+ def _reconcile(docs: list[dict], on_token=None) -> tuple[str, dict]:
289
+ from app.reconcile import verify_paragraph
290
+ messages = docs + [
291
+ {"role": "system", "content": EXTRA_SYSTEM_PROMPT},
292
+ {"role": "user", "content": "Write the development briefing now."},
293
+ ]
294
+ # num_ctx 6144 covers a typical dev_check prompt: system ~700 + nta
295
+ # doc + DOB body with 5 flagged projects ~3000 + RAG hits ~1000.
296
+ # 12288 was over-allocating KV cache — costly on T4. num_predict caps
297
+ # the briefing at ~600 tokens (4 sections + 5 bullet projects).
298
+ OPTS = {"temperature": 0, "num_ctx": 6144, "num_predict": 600}
299
+ if on_token is None:
300
+ resp = llm.chat(model=OLLAMA_MODEL, messages=messages, options=OPTS)
301
+ raw = resp["message"]["content"].strip()
302
+ else:
303
+ chunks: list[str] = []
304
+ for chunk in llm.chat(model=OLLAMA_MODEL, messages=messages,
305
+ stream=True, options=OPTS):
306
+ delta = (chunk.get("message") or {}).get("content") or ""
307
+ if delta:
308
+ chunks.append(delta)
309
+ on_token(delta)
310
+ raw = "".join(chunks).strip()
311
+ cleaned, dropped = verify_paragraph(raw, docs)
312
+ return cleaned, {"raw": raw, "dropped": dropped}
313
+
314
+
315
+ def _empty(plan, query, trace, error):
316
+ return {
317
+ "intent": "development_check",
318
+ "query": query,
319
+ "error": error,
320
+ "plan": {"intent": plan.intent, "targets": plan.targets,
321
+ "specialists": plan.specialists, "rationale": plan.rationale},
322
+ "trace": trace,
323
+ "paragraph": f"Could not resolve target to an NTA: {error}",
324
+ }
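Every intent module builds its trace the same way: append the record before doing the work, flip `ok` or attach `err`, and stamp `elapsed_s` in a `finally`. A condensed restatement of that convention (`run_step` here is illustrative; live_now's `_run_step` is the in-tree version, and development_check writes the same record shape inline):

```python
import time

def run_step(trace: list, name: str, fn):
    """One trace record per step: ok/result on success, ok=False/err on failure."""
    t0 = time.time()
    rec = {"step": name, "started_at": t0, "ok": False}
    trace.append(rec)  # append first, so a crash still leaves a record
    try:
        rec["result"] = fn()
        rec["ok"] = True
    except Exception as e:
        rec["err"] = f"{type(e).__name__}: {e}"
    finally:
        rec["elapsed_s"] = round(time.time() - t0, 2)
    return rec

trace = []
run_step(trace, "nta_resolve", lambda: {"nta_code": "BK0101"})
run_step(trace, "dob_permits_nta", lambda: 1 / 0)  # deliberate failure
ok_steps = [r["step"] for r in trace if r["ok"]]
```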
app/intents/live_now.py ADDED
@@ -0,0 +1,231 @@
1
+ """live_now intent — only fire live specialists. No geocode, no static
2
+ historic/modeled layers. Reconciler emits a "right now" status note.
3
+
4
+ Targets are usually `{"type": "nyc"}` for the whole city; if the user
5
+ named a specific borough we still query the same gauges (NOAA only
6
+ has 3 NYC stations) and the same NWS forecast zones (the API takes a
7
+ lat/lon point — we use a borough centroid).
8
+ """
9
+ from __future__ import annotations
10
+
11
+ import logging
12
+ import time
13
+ from typing import Any
14
+
15
+ from app import llm
16
+ from app.context import noaa_tides, nws_alerts, nws_obs
17
+ from app.live import ttm_forecast
18
+
19
+ log = logging.getLogger("riprap.intent.live_now")
20
+
21
+ import os as _os # noqa: E402
22
+
23
+ # live_now stays on the smaller model: short outputs, speed matters more.
24
+ OLLAMA_MODEL = _os.environ.get("RIPRAP_LIVE_MODEL",
25
+ _os.environ.get("RIPRAP_OLLAMA_MODEL", "granite4.1:3b"))
26
+
27
+ # NWS API requires a lat/lon point; pick a representative one per borough.
28
+ BOROUGH_POINTS = {
29
+ "Manhattan": (40.7831, -73.9712), # Central Park
30
+ "Brooklyn": (40.6500, -73.9500), # Park Slope-ish
31
+ "Queens": (40.7282, -73.7949), # Forest Hills
32
+ "Bronx": (40.8448, -73.8648), # Fordham
33
+ "Staten Island": (40.5795, -74.1502), # central SI
34
+ "NYC": (40.7128, -74.0060), # Lower Manhattan default
35
+ }
36
+
37
+
38
+ EXTRA_SYSTEM_PROMPT = """Write a current-conditions flood briefing for NYC. Use ONLY the facts in the provided documents.
39
+
40
+ Output this markdown skeleton verbatim, filling each `<...>` with content drawn only from the documents. After every numerical claim, append the document id in square brackets — e.g. `<value> [noaa_tides]`. Bold at most one phrase per section using `**...**`. Omit any section whose supporting facts are absent from the documents.
41
+
42
+ ```
43
+ **Status.**
44
+ <one sentence on whether flood-relevant conditions are active right now, citing the strongest live signal>.
45
+
46
+ **Live signals.**
47
+ <1-3 sentences citing each live signal that fired: NWS alerts from [nws_alerts], tide observation and residual from [noaa_tides], recent precipitation from [nws_obs], any TTM forecast peak from [ttm_forecast]>.
48
+ ```
49
+
50
+ Constraints:
51
+ - Be brief — current-conditions reports are read in seconds.
52
+ - Copy numerical values verbatim from documents. Do not round.
53
+ - Do not invoke historic events (Sandy, Ida) — this is a now-only report.
54
+ - If every live document indicates calm, write only: `**Status.** No active flood-relevant signals at this time [live_target].`
55
+ """
56
+
57
+
58
+ def run(plan, query: str, progress_q=None) -> dict[str, Any]:
59
+ t0 = time.time()
60
+ trace: list[dict] = []
61
+
62
+ def _emit(r: dict):
63
+ if progress_q is not None:
64
+ progress_q.put({"kind": "step", **r})
65
+
66
+ boro = next((t.get("text") for t in plan.targets if t.get("type") == "borough"), None)
67
+ if boro and boro in BOROUGH_POINTS:
68
+ lat, lon = BOROUGH_POINTS[boro]
69
+ place = boro
70
+ else:
71
+ lat, lon = BOROUGH_POINTS["NYC"]
72
+ place = "NYC"
73
+
74
+ docs: list[dict] = []
75
+ tides_out = alerts_out = obs_out = ttm_out = None
76
+
77
+ if "noaa_tides" in plan.specialists:
78
+ tides_out = _run_step(trace, "noaa_tides", lambda: noaa_tides.summary_for_point(lat, lon), progress_q)
79
+ if "nws_alerts" in plan.specialists:
80
+ alerts_out = _run_step(trace, "nws_alerts", lambda: nws_alerts.summary_for_point(lat, lon), progress_q)
81
+ if "nws_obs" in plan.specialists:
82
+ obs_out = _run_step(trace, "nws_obs", lambda: nws_obs.summary_for_point(lat, lon), progress_q)
83
+ if "ttm_forecast" in plan.specialists:
84
+ ttm_out = _run_step(trace, "ttm_forecast", lambda: ttm_forecast.summary_for_point(lat, lon), progress_q)
85
+
86
+ # ---- documents ----
87
+ docs.append({"role": "document live_target", "content":
88
+ f"Source: planner. Live-conditions report for {place}. "
89
+ f"Coordinates used for NWS lookups: {lat:.4f}, {lon:.4f}."})
90
+
91
+ if tides_out and tides_out.get("observed_ft_mllw") is not None:
92
+ body = [
93
+ f"Source: NOAA CO-OPS Tides & Currents. Station: {tides_out['station_name']} "
94
+ f"(NOAA {tides_out['station_id']}, {tides_out['distance_km']} km from {place}).",
95
+ f"Observation time: {tides_out.get('obs_time') or 'unknown'}.",
96
+ f"Observed water level: {tides_out['observed_ft_mllw']} ft above MLLW.",
97
+ ]
98
+ if tides_out.get("predicted_ft_mllw") is not None:
99
+ body.append(f"Astronomical tide prediction at the same instant: "
100
+ f"{tides_out['predicted_ft_mllw']} ft.")
101
+ if tides_out.get("residual_ft") is not None:
102
+ body.append(f"Residual (observed - predicted): {tides_out['residual_ft']} ft. "
103
+ f"Positive = surge component above tide; negative = setdown.")
104
+ docs.append(_doc("noaa_tides", body))
105
+
106
+ if alerts_out and alerts_out.get("n_active", 0) > 0:
107
+ body = [f"Source: NWS Public Alerts API. Active flood-relevant alerts: "
108
+ f"{alerts_out['n_active']}."]
109
+ for a in alerts_out["alerts"][:4]:
110
+ body.append(
111
+ f"- {a.get('event','?')} (severity: {a.get('severity','?')}, "
112
+ f"urgency: {a.get('urgency','?')}); expires {a.get('expires','')[:16]}; "
113
+ f"area: {(a.get('areaDesc') or '')[:120]}."
114
+ )
115
+ if a.get("headline"):
116
+ body.append(f" Headline: {a['headline'][:240]}")
117
+ docs.append(_doc("nws_alerts", body))
118
+
119
+ if obs_out and (obs_out.get("precip_last_hour_mm") is not None
120
+ or obs_out.get("precip_last_6h_mm") is not None):
121
+ body = [
122
+ f"Source: NWS Station Observations. Nearest ASOS: {obs_out['station_name']} "
123
+ f"({obs_out['station_id']}, {obs_out['distance_km']} km).",
124
+ f"Observation time: {obs_out.get('obs_time') or 'unknown'}.",
125
+ ]
126
+ if obs_out.get("precip_last_hour_mm") is not None:
127
+ body.append(f"Precipitation last 1 h: {obs_out['precip_last_hour_mm']} mm.")
128
+ if obs_out.get("precip_last_6h_mm") is not None:
129
+ body.append(f"Precipitation last 6 h: {obs_out['precip_last_6h_mm']} mm.")
130
+ docs.append(_doc("nws_obs", body))
131
+
132
+ if ttm_out and ttm_out.get("available") and ttm_out.get("interesting"):
133
+ docs.append(_doc("ttm_forecast", [
134
+ "Source: Granite TimeSeries TTM r2 (Ekambaram et al. 2024).",
135
+ f"Forecast peak surge residual at {ttm_out['station_name']}: "
136
+ f"{ttm_out['forecast_peak_ft']} ft, expected "
137
+ f"{ttm_out['forecast_peak_minutes_ahead']} minutes from now.",
138
+ f"Recent peak |residual| in context window: "
139
+ f"{ttm_out['history_peak_abs_ft']} ft.",
140
+ ]))
141
+
142
+ # ---- reconcile ----
143
+ rec_t0 = time.time()
144
+ rec_step = {"step": "reconcile_live_now", "started_at": rec_t0, "ok": False}
145
+ trace.append(rec_step)
146
+ if len(docs) <= 1: # only the live_target doc, no actual signals
147
+ paragraph = ("**Status.** **No active flood-relevant signals at this time** for "
148
+ f"{place} [live_target].")
149
+ audit = {"raw": paragraph, "dropped": []}
150
+ rec_step["ok"] = True
151
+ else:
152
+ def _on_token(delta: str):
153
+ if progress_q is not None:
154
+ progress_q.put({"kind": "token", "delta": delta})
155
+ try:
156
+ paragraph, audit = _reconcile(docs, on_token=_on_token if progress_q else None)
157
+ rec_step["ok"] = True
158
+ except Exception as e:
159
+ paragraph = "Could not produce a live-conditions report."
160
+ audit = {"raw": "", "dropped": []}
161
+ rec_step["err"] = str(e)
162
+ rec_step["elapsed_s"] = round(time.time() - rec_t0, 2)
163
+ _emit(rec_step)
164
+
165
+ return {
166
+ "intent": "live_now",
167
+ "query": query,
168
+ "place": place,
169
+ "plan": {
170
+ "intent": plan.intent,
171
+ "targets": plan.targets,
172
+ "specialists": plan.specialists,
173
+ "rationale": plan.rationale,
174
+ },
175
+ "noaa_tides": tides_out,
176
+ "nws_alerts": alerts_out,
177
+ "nws_obs": obs_out,
178
+ "ttm_forecast": ttm_out,
179
+ "paragraph": paragraph,
180
+ "audit": audit,
181
+ "trace": trace,
182
+ "total_s": round(time.time() - t0, 2),
183
+ }
184
+
185
+
186
+ def _run_step(trace: list, name: str, fn, progress_q=None) -> Any:
187
+ t0 = time.time()
188
+ rec = {"step": name, "started_at": t0, "ok": False}
189
+ trace.append(rec)
190
+ try:
191
+ out = fn()
192
+ rec["ok"] = True
193
+ rec["result"] = {k: out.get(k) for k in list(out.keys())[:3]} if isinstance(out, dict) else None
194
+ return out
195
+ except Exception as e:
196
+ rec["err"] = str(e)
197
+ log.exception("%s failed", name)
198
+ return None
199
+ finally:
200
+ rec["elapsed_s"] = round(time.time() - t0, 2)
201
+ if progress_q is not None:
202
+ progress_q.put({"kind": "step", **rec})
203
+
204
+
205
+ def _doc(doc_id: str, body_lines: list[str]) -> dict:
206
+ return {"role": f"document {doc_id}", "content": "\n".join(body_lines)}
207
+
208
+
209
+ def _reconcile(docs: list[dict], on_token=None) -> tuple[str, dict]:
210
+ from app.reconcile import verify_paragraph
211
+ messages = docs + [
212
+ {"role": "system", "content": EXTRA_SYSTEM_PROMPT},
213
+ {"role": "user", "content": "Write the live-conditions briefing now."},
214
+ ]
215
+ # live_now is the smallest intent: ~4 live docs, short briefing.
216
+ # num_predict 200 caps to a 2-section status note.
217
+ OPTS = {"temperature": 0, "num_ctx": 2048, "num_predict": 200}
218
+ if on_token is None:
219
+ resp = llm.chat(model=OLLAMA_MODEL, messages=messages, options=OPTS)
220
+ raw = resp["message"]["content"].strip()
221
+ else:
222
+ chunks: list[str] = []
223
+ for chunk in llm.chat(model=OLLAMA_MODEL, messages=messages,
224
+ stream=True, options=OPTS):
225
+ delta = (chunk.get("message") or {}).get("content") or ""
226
+ if delta:
227
+ chunks.append(delta)
228
+ on_token(delta)
229
+ raw = "".join(chunks).strip()
230
+ cleaned, dropped = verify_paragraph(raw, docs)
231
+ return cleaned, {"raw": raw, "dropped": dropped}
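The `verify_paragraph` call above is the grounding gate: the raw reconciler output is checked against the documents before it reaches the user, and unverifiable sentences are dropped. The repo's actual implementation is not shown in this diff; the sketch below illustrates only the sentence-scoped citation check (every `[doc_id]` must resolve to a provided document), with the function name and regex being assumptions for illustration.

```python
import re


def drop_uncited_sentences(paragraph: str, known_doc_ids: set) -> tuple:
    """Keep sentences whose [doc_id] citations all resolve to known docs.

    Hypothetical sketch of one of the grounding checks; uncited prose is
    passed through, sentences with unresolvable citations are dropped.
    """
    kept, dropped = [], []
    for sent in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
        cites = re.findall(r"\[([a-z0-9_*]+)\]", sent)
        if cites and not all(c in known_doc_ids for c in cites):
            dropped.append(sent)  # cites a document we never provided
        else:
            kept.append(sent)
    return " ".join(kept), dropped
```

The returned `dropped` list is what a caller could surface in an audit field, mirroring the `{"raw": ..., "dropped": ...}` shape used above.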
app/intents/neighborhood.py ADDED
@@ -0,0 +1,492 @@
+"""neighborhood intent — resolve target text to one or more NTA polygons,
+then run polygon-level specialists and reconcile.
+
+The set of polygon-capable specialists is currently:
+  - sandy_inundation.coverage_for_polygon
+  - dep_stormwater.coverage_for_polygon (per scenario)
+  - nyc311.summary_for_polygon
+  - microtopo.microtopo_for_polygon
+
+Other specialists (FloodNet, Ida HWM, Prithvi) are still point-based;
+in Phase 2 we'll add polygon support for them. For now, neighborhood
+mode produces the four signals above + RAG, and the reconciler emits
+a structurally different briefing aimed at a place rather than an
+address.
+"""
+from __future__ import annotations
+
+import logging
+import time
+from typing import Any
+
+from app import llm
+from app.areas import nta
+from app.context import microtopo, nyc311
+from app.flood_layers import dep_stormwater, sandy_inundation
+from app.rag import retrieve as rag_retrieve
+
+log = logging.getLogger("riprap.intent.neighborhood")
+
+import os as _os  # noqa: E402
+
+OLLAMA_MODEL = _os.environ.get("RIPRAP_RECONCILER_MODEL",
+                               _os.environ.get("RIPRAP_OLLAMA_MODEL", "granite4.1:8b"))
+
+EXTRA_SYSTEM_PROMPT = """Write a flood-exposure briefing for an NYC neighborhood. Use ONLY the facts in the provided documents.
+
+Output this markdown skeleton verbatim, filling each `<...>` with content drawn only from the documents. After every numerical claim, append the document id in square brackets — e.g. `<value> [sandy_nta]`. Bold at most one phrase per section using `**...**`. Omit any section whose supporting facts are absent from the documents.
+
+```
+**Status.**
+<one sentence: name the neighborhood from [nta_resolve] and the dominant exposure pattern>.
+
+**Empirical evidence.**
+<1-3 sentences citing observed flood evidence: Sandy coverage from [sandy_nta], 311 counts from [nyc311_nta], any FloodNet or HWM signals>.
+
+**Modeled scenarios.**
+<1-2 sentences citing modeled flooding from [dep_*_nta] (fraction of polygon in each scenario) and terrain from [microtopo_nta] (median HAND, fraction of polygon with HAND below 1 m)>.
+
+**Policy context.**
+<1 sentence per RAG hit, citing the agency name and [rag_*]>.
+```
+
+Constraints:
+- Copy numerical values verbatim from documents. Do not round, paraphrase, or substitute.
+- Speak about the place as a polygon (use phrases like "of the neighborhood" or "of the NTA"), not as an address.
+- If only [nta_resolve] is present and no other documents, output exactly: `No grounded data available for this neighborhood.`
+"""
+
+
+def run(plan, query: str, progress_q=None, strict: bool = False) -> dict[str, Any]:
+    """Execute the planner's neighborhood Plan.
+
+    Resolves all targets to NTAs, picks the largest matching NTA (or the
+    first if several match equally well), runs the polygon specialists,
+    and reconciles via Granite 4.1.
+
+    strict=True routes the reconciler through Mellea-validated rejection
+    sampling and disables token streaming.
+    """
+    t0 = time.time()
+    trace: list[dict] = []
+
+    def _emit(r: dict):
+        if progress_q is not None:
+            progress_q.put({"kind": "step", **r})
+
+    # Resolve targets to NTAs. Try the planner's pick first; if it gave no
+    # usable target, scan the raw query text for any known neighborhood name.
+    target_text = next(
+        (t["text"] for t in plan.targets if t.get("type") in ("nta", "borough")),
+        None,
+    )
+    rec = {"step": "nta_resolve", "started_at": t0, "ok": False}
+    trace.append(rec)
+    matches = nta.resolve(target_text) if target_text else []
+    if not matches:
+        matches = nta.resolve_from_text(query)
+    if not matches:
+        rec["err"] = f"no NTA match in target={target_text!r} or query={query!r}"
+        rec["elapsed_s"] = round(time.time() - t0, 2)
+        return _empty_result(plan, query, trace, error=rec["err"])
+    target = matches[0]
+    rec["ok"] = True
+    rec["result"] = {
+        "nta_code": target["nta_code"],
+        "nta_name": target["nta_name"],
+        "borough": target["borough"],
+        "n_matches": len(matches),
+        # Bbox lets the UI fly-to and render the polygon while the rest
+        # of the specialists are still running.
+        "bbox": list(target["geometry"].bounds),
+    }
+    rec["elapsed_s"] = round(time.time() - t0, 2)
+    _emit(rec)
+
+    poly = target["geometry"]
+    docs: list[dict] = []
+    sandy_out = None
+    dep_out = {}
+    nyc311_out = None
+    micro_out = None
+    rag_out = []
+    prithvi_live_out = None
+    terramind_out = None
+
+    # ---- sandy ----
+    if "sandy" in plan.specialists:
+        s_t0 = time.time()
+        srec = {"step": "sandy_nta", "started_at": s_t0, "ok": False}
+        trace.append(srec)
+        try:
+            sandy_out = sandy_inundation.coverage_for_polygon(poly)
+            srec["ok"] = True
+            srec["result"] = {"fraction": sandy_out["fraction"], "inside": sandy_out["inside"]}
+        except Exception as e:
+            srec["err"] = str(e)
+            log.exception("sandy polygon failed")
+        srec["elapsed_s"] = round(time.time() - s_t0, 2)
+        _emit(srec)
+
+    # ---- dep_stormwater ----
+    if "dep_stormwater" in plan.specialists:
+        for scen in ["dep_extreme_2080", "dep_moderate_2050", "dep_moderate_current"]:
+            d_t0 = time.time()
+            drec = {"step": f"{scen}_nta", "started_at": d_t0, "ok": False}
+            trace.append(drec)
+            try:
+                cov = dep_stormwater.coverage_for_polygon(poly, scen)
+                dep_out[scen] = cov
+                drec["ok"] = True
+                drec["result"] = {"fraction_any": cov["fraction_any"]}
+            except Exception as e:
+                drec["err"] = str(e)
+                log.exception("%s polygon failed", scen)
+            drec["elapsed_s"] = round(time.time() - d_t0, 2)
+            _emit(drec)
+
+    # ---- nyc311 ----
+    if "nyc311" in plan.specialists:
+        n_t0 = time.time()
+        nrec = {"step": "nyc311_nta", "started_at": n_t0, "ok": False}
+        trace.append(nrec)
+        try:
+            nyc311_out = nyc311.summary_for_polygon(poly, years=3)
+            nrec["ok"] = True
+            nrec["result"] = {"n": nyc311_out["n"]}
+        except Exception as e:
+            nrec["err"] = str(e)
+            log.exception("nyc311 polygon failed")
+        nrec["elapsed_s"] = round(time.time() - n_t0, 2)
+        _emit(nrec)
+
+    # ---- microtopo ----
+    if "microtopo" in plan.specialists:
+        m_t0 = time.time()
+        mrec = {"step": "microtopo_nta", "started_at": m_t0, "ok": False}
+        trace.append(mrec)
+        try:
+            micro_out = microtopo.microtopo_for_polygon(poly)
+            mrec["ok"] = micro_out is not None
+            mrec["result"] = {
+                "elev_median_m": (micro_out or {}).get("elev_median_m"),
+                "frac_hand_lt1": (micro_out or {}).get("frac_hand_lt1"),
+            }
+        except Exception as e:
+            mrec["err"] = str(e)
+            log.exception("microtopo polygon failed")
+        mrec["elapsed_s"] = round(time.time() - m_t0, 2)
+        _emit(mrec)
+
+    # ---- Prithvi-EO live water mask (NTA centroid) ----
+    # Polygon-scoped queries don't have a single point of interest, but
+    # the NTA centroid is a fair sampling point: the 5 km chip the
+    # specialist fetches comfortably covers any NTA. The reconciler
+    # gets a [prithvi_live] doc with the % water observed today, and
+    # the frontend gets a GeoJSON layer to paint over the NTA polygon.
+    try:
+        from app.flood_layers import prithvi_live as plive_mod
+        if plive_mod.ENABLE:
+            p_t0 = time.time()
+            prec = {"step": "prithvi_eo_live", "started_at": p_t0, "ok": False}
+            trace.append(prec)
+            centroid = poly.centroid
+            prithvi_live_out = plive_mod.fetch(centroid.y, centroid.x)
+            prec["ok"] = bool(prithvi_live_out and prithvi_live_out.get("ok"))
+            if prec["ok"]:
+                prec["result"] = {
+                    "scene_date": (prithvi_live_out.get("item_datetime") or "")[:10],
+                    "cloud_cover": prithvi_live_out.get("cloud_cover"),
+                    "pct_water_5km": prithvi_live_out.get("pct_water_full"),
+                }
+            else:
+                prec["err"] = ((prithvi_live_out or {}).get("err")
+                               or (prithvi_live_out or {}).get("skipped")
+                               or "no observation")
+            prec["elapsed_s"] = round(time.time() - p_t0, 2)
+            _emit(prec)
+    except Exception as e:
+        log.exception("prithvi_live (neighborhood) failed")
+        prithvi_live_out = {"ok": False, "err": str(e)}
+
+    # ---- TerraMind synthesis (NTA centroid) ----
+    # Generative-prior tier — synthesized ESRI Land Cover from the
+    # local LiDAR DEM at the NTA centroid. Renders as dashed-outline
+    # polygons on the map alongside the polygon-aggregated specialists.
+    try:
+        from app.context import terramind_synthesis as tm_mod
+        if tm_mod.ENABLE:
+            t_t0 = time.time()
+            trec = {"step": "terramind_synthesis", "started_at": t_t0, "ok": False}
+            trace.append(trec)
+            centroid = poly.centroid
+            terramind_out = tm_mod.fetch(centroid.y, centroid.x)
+            trec["ok"] = bool(terramind_out and terramind_out.get("ok"))
+            if trec["ok"]:
+                trec["result"] = {
+                    "tim_chain": terramind_out.get("tim_chain"),
+                    "dominant_class": (terramind_out.get("dominant_class_display")
+                                       or terramind_out.get("dominant_class")),
+                    "dominant_pct": terramind_out.get("dominant_pct"),
+                    "n_classes": terramind_out.get("n_classes_observed"),
+                }
+            else:
+                trec["err"] = ((terramind_out or {}).get("err")
+                               or (terramind_out or {}).get("skipped")
+                               or "no synthesis")
+            trec["elapsed_s"] = round(time.time() - t_t0, 2)
+            _emit(trec)
+    except Exception as e:
+        log.exception("terramind (neighborhood) failed")
+        terramind_out = {"ok": False, "err": str(e)}
+
+    # ---- rag ----
+    if "rag" in plan.specialists:
+        r_t0 = time.time()
+        rrec = {"step": "rag_nta", "started_at": r_t0, "ok": False}
+        trace.append(rrec)
+        try:
+            q = (f"flood exposure {target['nta_name']} {target['borough']} "
+                 "vulnerability hardening mitigation")
+            rag_out = rag_retrieve(q, k=3, min_score=0.45)
+            rrec["ok"] = True
+            rrec["result"] = {"hits": len(rag_out)}
+        except Exception as e:
+            rrec["err"] = str(e)
+            log.exception("rag polygon failed")
+        rrec["elapsed_s"] = round(time.time() - r_t0, 2)
+        _emit(rrec)
+
+    # ---- build documents ----
+    docs.append(_doc("nta_resolve", [
+        "Source: NYC DCP Neighborhood Tabulation Areas 2020.",
+        f"Target neighborhood: {target['nta_name']} (NTA {target['nta_code']}), "
+        f"in the borough of {target['borough']}.",
+        f"Community District: {target.get('cdta') or 'unknown'}.",
+    ]))
+    if sandy_out and sandy_out["inside"]:
+        docs.append(_doc("sandy_nta", [
+            "Source: NYC Sandy Inundation Zone (NYC OD 5xsi-dfpx).",
+            f"Fraction of {target['nta_name']} inside the 2012 inundation extent: "
+            f"{sandy_out['fraction'] * 100:.1f}%.",
+            f"Total NTA area: {sandy_out['polygon_area_m2'] / 1e6:.2f} km².",
+        ]))
+    for scen, cov in dep_out.items():
+        if cov["fraction_any"] > 0:
+            cls = cov["fraction_class"]
+            docs.append(_doc(f"{scen}_nta", [
+                f"Source: {cov['label']}.",
+                f"Fraction of {target['nta_name']} inside any modeled flooded area: "
+                f"{cov['fraction_any'] * 100:.1f}%.",
+                f"Of which: {cls.get(1, 0) * 100:.1f}% in nuisance band (>4 in to 1 ft), "
+                f"{cls.get(2, 0) * 100:.1f}% in 1-4 ft band, "
+                f"{cls.get(3, 0) * 100:.1f}% in >4 ft band.",
+            ]))
+    if nyc311_out and nyc311_out.get("n", 0) > 0:
+        body = [
+            "Source: NYC 311 service requests (Socrata erm2-nwe9), aggregated inside the NTA polygon.",
+            f"Flood-related complaints in the last 3 years inside {target['nta_name']}: "
+            f"{nyc311_out['n']}.",
+        ]
+        if nyc311_out.get("by_descriptor"):
+            top = "; ".join(f"{k}: {v}" for k, v in list(nyc311_out["by_descriptor"].items())[:3])
+            body.append(f"Top descriptors: {top}.")
+        docs.append(_doc("nyc311_nta", body))
+    if micro_out and micro_out.get("n_cells", 0) > 0:
+        body = [
+            "Source: USGS 3DEP DEM (precomputed citywide GeoTIFF) with derived HAND and TWI rasters; aggregated over NTA polygon.",
+            f"Polygon contains {micro_out['n_cells']} 30-m DEM cells.",
+            f"Median elevation: {micro_out['elev_median_m']} m; "
+            f"10th-percentile elevation: {micro_out['elev_p10_m']} m.",
+        ]
+        if micro_out.get("hand_median_m") is not None:
+            body.append(
+                f"Median HAND (Height Above Nearest Drainage): "
+                f"{micro_out['hand_median_m']} m. "
+                f"Fraction of polygon cells with HAND below 1 m "
+                f"(near-channel, water reaches at flood): "
+                f"{(micro_out.get('frac_hand_lt1') or 0) * 100:.1f}%."
+            )
+        if micro_out.get("twi_median") is not None:
+            body.append(
+                f"Median TWI: {micro_out['twi_median']}. "
+                f"Fraction of polygon cells with TWI > 10 (saturation-prone): "
+                f"{(micro_out.get('frac_twi_gt10') or 0) * 100:.1f}%."
+            )
+        docs.append(_doc("microtopo_nta", body))
+    if prithvi_live_out and prithvi_live_out.get("ok"):
+        docs.append(_doc("prithvi_live", [
+            "Source: Prithvi-EO 2.0 (Sen1Floods11 fine-tune) live "
+            "segmentation over a Sentinel-2 L2A scene from Microsoft "
+            f"Planetary Computer, sampled at the NTA centroid of "
+            f"{target['nta_name']}.",
+            f"Sentinel-2 scene id: {prithvi_live_out.get('item_id')}.",
+            f"Observation date: "
+            f"{(prithvi_live_out.get('item_datetime') or '')[:10]}.",
+            f"Cloud cover: {prithvi_live_out.get('cloud_cover', 0):.3f}%.",
+            f"% water across the 5 km chip around the centroid: "
+            f"{prithvi_live_out.get('pct_water_full', 0):.2f}.",
+        ]))
+    if terramind_out and terramind_out.get("ok"):
+        body = [
+            "Source: TerraMind 1.0 base (IBM/ESA, Apache-2.0) any-to-any "
+            "generative foundation model. SYNTHETIC PRIOR — generated "
+            "categorical land-cover from the LiDAR DEM at the NTA "
+            f"centroid of {target['nta_name']}; not a measurement.",
+            f"Chain: {' -> '.join(terramind_out.get('tim_chain') or ['DEM', 'LULC_synthetic'])}.",
+            f"Diffusion steps: {terramind_out.get('diffusion_steps')}.",
+            f"Diffusion seed: {terramind_out.get('diffusion_seed')}.",
+            f"Dominant synthetic class: "
+            f"{terramind_out.get('dominant_class_display') or terramind_out.get('dominant_class')} "
+            f"at {terramind_out.get('dominant_pct', 0):.1f}% (tentative ESRI "
+            "Land Cover labels).",
+        ]
+        for label, pct in (terramind_out.get("class_fractions") or {}).items():
+            body.append(f"  - {label}: {pct:.1f}%")
+        body.append("Use 'TerraMind generated a plausible synthetic "
+                    "land-cover prior' framing — never 'imaged' or "
+                    "'reconstructed'.")
+        docs.append(_doc("terramind_synthetic", body))
+    for h in rag_out:
+        docs.append(_doc(h["doc_id"], [
+            f"Source: {h['citation']}, page {h.get('page', '')}.",
+            f"Retrieved passage (verbatim): {h['text']}",
+        ]))
+
+    # ---- reconcile ----
+    rec_t0 = time.time()
+    rec_step = {"step": "reconcile_neighborhood", "started_at": rec_t0, "ok": False}
+    trace.append(rec_step)
+    paragraph = ""
+    audit = {"raw": "", "dropped": []}
+    mellea_meta = None
+    if docs and strict:
+        rec_step["step"] = "mellea_reconcile_neighborhood"
+        try:
+            from app.mellea_validator import DEFAULT_LOOP_BUDGET, reconcile_strict_streaming
+            from app.reconcile import trim_docs_to_plan as _trim
+            docs = _trim(docs, set(plan.specialists or []))
+
+            def _on_token(delta: str, attempt_idx: int):
+                if progress_q is not None:
+                    progress_q.put({"kind": "token", "delta": delta,
+                                    "attempt": attempt_idx})
+
+            def _on_attempt_end(attempt_idx, passed, failed):
+                if progress_q is not None:
+                    progress_q.put({"kind": "mellea_attempt",
+                                    "attempt": attempt_idx,
+                                    "passed": passed, "failed": failed})
+
+            mres = reconcile_strict_streaming(
+                docs, EXTRA_SYSTEM_PROMPT,
+                user_prompt="Write the cited briefing now.",
+                model=OLLAMA_MODEL, loop_budget=DEFAULT_LOOP_BUDGET,
+                on_token=_on_token if progress_q else None,
+                on_attempt_end=_on_attempt_end if progress_q else None,
+            )
+            paragraph = mres["paragraph"]
+            audit = {"raw": paragraph, "dropped": []}
+            mellea_meta = {
+                "rerolls": mres["rerolls"],
+                "n_attempts": mres["n_attempts"],
+                "requirements_passed": mres["requirements_passed"],
+                "requirements_failed": mres["requirements_failed"],
+                "requirements_total": mres["requirements_total"],
+                "model": mres["model"], "loop_budget": mres["loop_budget"],
+            }
+            rec_step["ok"] = True
+            rec_step["result"] = {
+                "rerolls": mellea_meta["rerolls"],
+                "passed": f"{len(mellea_meta['requirements_passed'])}/{mellea_meta['requirements_total']}",
+                "paragraph_chars": len(paragraph),
+            }
+        except Exception as e:
+            rec_step["err"] = str(e)
+            log.exception("Mellea-validated reconcile failed")
+    elif docs:
+        def _on_token(delta: str):
+            if progress_q is not None:
+                progress_q.put({"kind": "token", "delta": delta})
+        try:
+            paragraph, audit = _reconcile(docs, on_token=_on_token if progress_q else None)
+            rec_step["ok"] = True
+            rec_step["result"] = {"paragraph_chars": len(paragraph),
+                                  "dropped": len(audit["dropped"])}
+        except Exception as e:
+            rec_step["err"] = str(e)
+            log.exception("neighborhood reconcile failed")
+    else:
+        paragraph = "No grounded data available for this neighborhood."
+        rec_step["ok"] = True
+        rec_step["result"] = {"paragraph_chars": len(paragraph)}
+    rec_step["elapsed_s"] = round(time.time() - rec_t0, 2)
+    _emit(rec_step)
+
+    target_safe = {k: v for k, v in target.items() if k != "geometry"}
+    target_safe["bbox"] = list(target["geometry"].bounds)  # [minx, miny, maxx, maxy]
+    return {
+        "intent": "neighborhood",
+        "query": query,
+        "plan": {
+            "intent": plan.intent,
+            "targets": plan.targets,
+            "specialists": plan.specialists,
+            "rationale": plan.rationale,
+        },
+        "target": target_safe,
+        "n_matches": len(matches),
+        "sandy_nta": sandy_out,
+        "dep_nta": dep_out,
+        "nyc311_nta": nyc311_out,
+        "microtopo_nta": micro_out,
+        "prithvi_live": prithvi_live_out,
+        "terramind": terramind_out,
+        "rag": rag_out,
+        "paragraph": paragraph,
+        "audit": audit,
+        "mellea": mellea_meta,
+        "trace": trace,
+        "total_s": round(time.time() - t0, 2),
+    }
+
+
+def _doc(doc_id: str, body_lines: list[str]) -> dict:
+    return {"role": f"document {doc_id}", "content": "\n".join(body_lines)}
+
+
+def _reconcile(docs: list[dict], on_token=None) -> tuple[str, dict]:
+    from app.reconcile import verify_paragraph
+    messages = docs + [
+        {"role": "system", "content": EXTRA_SYSTEM_PROMPT},
+        {"role": "user", "content": "Write the cited briefing now."},
+    ]
+    # num_ctx 4096 covers our actual prompt (system ~600 + 6 docs ~2000)
+    # with margin; 8192 was over-allocating KV cache. num_predict caps the
+    # briefing at ~400 tokens — enough for 4 sections, no runaway.
+    OPTS = {"temperature": 0, "num_ctx": 4096, "num_predict": 400}
+    if on_token is None:
+        resp = llm.chat(model=OLLAMA_MODEL, messages=messages, options=OPTS)
+        raw = resp["message"]["content"].strip()
+    else:
+        chunks: list[str] = []
+        for chunk in llm.chat(model=OLLAMA_MODEL, messages=messages,
+                              stream=True, options=OPTS):
+            delta = (chunk.get("message") or {}).get("content") or ""
+            if delta:
+                chunks.append(delta)
+                on_token(delta)
+        raw = "".join(chunks).strip()
+    cleaned, dropped = verify_paragraph(raw, docs)
+    return cleaned, {"raw": raw, "dropped": dropped}
+
+
+def _empty_result(plan, query: str, trace: list, error: str) -> dict:
+    return {
+        "intent": "neighborhood",
+        "query": query,
+        "error": error,
+        "plan": {
+            "intent": plan.intent,
+            "targets": plan.targets,
+            "specialists": plan.specialists,
+            "rationale": plan.rationale,
+        },
+        "trace": trace,
+        "paragraph": f"Could not resolve target to an NTA: {error}",
+    }
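The `microtopo_nta` document above reports polygon-level aggregates (median HAND, fraction of cells below a threshold). The actual `microtopo_for_polygon` raster sampling is not shown in this diff; the sketch below only illustrates the final aggregation step over already-sampled cell values, with the function name and output keys being assumptions for illustration.

```python
def hand_fractions(hand_values, threshold_m=1.0):
    """Aggregate per-cell HAND samples into the polygon-level summary.

    Hypothetical sketch: None cells (nodata) are dropped, then we report
    the cell count, a simple median, and the fraction below threshold_m.
    """
    vals = [v for v in hand_values if v is not None]
    if not vals:
        return None  # silence over confabulation: no cells, no summary
    below = sum(1 for v in vals if v < threshold_m)
    return {
        "n_cells": len(vals),
        "median_m": sorted(vals)[len(vals) // 2],
        "frac_below": below / len(vals),
    }
```

Multiplying `frac_below` by 100 and formatting with `:.1f` gives the "Fraction of polygon cells with HAND below 1 m" percentage quoted in the document body.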
app/intents/single_address.py ADDED
@@ -0,0 +1,73 @@
+"""single_address intent — the existing linear FSM, wrapped behind the
+planner-aware execution interface. The planner's specialist list is
+respected only as an OPT-OUT: if the planner explicitly omitted a
+specialist we'd otherwise run, we skip it. The fixed FSM stays the
+canonical path because (a) it's well tested, (b) order of execution
+matters slightly (geocode before everything else), and (c) executor
+parallelism for an address is bounded by Granite 4.1 reconcile time
+anyway."""
+from __future__ import annotations
+
+from app.fsm import run as run_linear
+
+
+def run(plan, query: str, progress_q=None, strict: bool = False) -> dict:
+    """Execute the planner's single_address Plan via the existing linear
+    FSM. If progress_q is provided, FSM steps and Granite reconcile tokens
+    are forwarded to it for live streaming.
+
+    strict=True flips the FSM's reconcile step to Mellea-validated
+    rejection sampling (via a thread-local flag) and disables token
+    streaming for that step."""
+    from app.fsm import (
+        iter_steps,
+        set_mellea_attempt_callback,
+        set_planned_specialists,
+        set_strict_mode,
+        set_token_callback,
+    )
+    planner_addr = next(
+        (t["text"] for t in plan.targets if t.get("type") == "address"),
+        None,
+    )
+    addr = planner_addr if (planner_addr and len(planner_addr) >= len(query) * 0.7) else query
+    set_strict_mode(strict)
+    set_planned_specialists(plan.specialists or [])
+    if progress_q is not None:
+        def _on_token(delta: str):
+            progress_q.put({"kind": "token", "delta": delta})
+
+        def _on_mellea_attempt(attempt_idx, passed, failed):
+            progress_q.put({"kind": "mellea_attempt",
+                            "attempt": attempt_idx,
+                            "passed": passed, "failed": failed})
+
+        # Streaming Mellea now emits tokens during each attempt — wire
+        # the token callback for both strict and non-strict paths.
+        set_token_callback(_on_token)
+        set_mellea_attempt_callback(_on_mellea_attempt)
+        try:
+            final = None
+            for ev in iter_steps(addr):
+                if ev["kind"] == "step":
+                    progress_q.put({"kind": "step", **ev})
+                else:
+                    final = ev
+            out = {**(final or {}), "trace": []}
+        finally:
+            set_token_callback(None)
+            set_mellea_attempt_callback(None)
+            set_strict_mode(False)
+            set_planned_specialists(None)
+    else:
+        try:
+            out = run_linear(addr)
+        finally:
+            set_strict_mode(False)
+            set_planned_specialists(None)
+    out["intent"] = "single_address"
+    out["plan"] = {
+        "intent": plan.intent,
+        "targets": plan.targets,
+        "specialists": plan.specialists,
+        "rationale": plan.rationale,
+    }
+    return out
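The `addr` line above encodes a small guard against planner truncation: the planner-extracted address is trusted only if it retains at least 70% of the raw query's length; otherwise the raw query goes to the geocoder. The same rule, isolated as a standalone function (the name `pick_address` is hypothetical, for illustration only):

```python
from __future__ import annotations


def pick_address(planner_addr, query):
    """Prefer the planner's extracted address unless it looks truncated.

    Heuristic from the intent wrapper: an extraction shorter than 70% of
    the query probably dropped part of the address, so fall back to
    geocoding the raw query text.
    """
    if planner_addr and len(planner_addr) >= 0.7 * len(query):
        return planner_addr
    return query
```

So for a query that is mostly address text, the extraction wins; for a long question containing a short place fragment, the raw query is geocoded instead.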
app/live/__init__.py ADDED
File without changes
app/live/floodnet_forecast.py ADDED
@@ -0,0 +1,184 @@
1
+ """Granite TimeSeries TTM r2 forecast on FloodNet sensor flood events.
2
+
3
+ This is the strongest single TTM win for the NYU CUSP audience.
4
+ FloodNet (CUSP/Brooklyn College, Charlie Mydlarz + Andrea Silverman)
5
+ operates the sensor network and publishes the historical events; they
6
+ do not publish per-sensor forecasts. Riprap producing a forecast on
7
+ FloodNet's own data is a genuine ecosystem-extension capability — and
8
+ unlike the surge / 311 forecasts, the audience explicitly cares about
9
+ this dataset.
10
+
11
+ Architecture:
12
+ - Nearest FloodNet sensor to the queried address (reuse
13
+ `app.context.floodnet.sensors_near`).
14
+ - 512 days of binary daily-event history at that sensor (1 if any
15
+ labeled flood event started on that day, else 0).
16
+ - TTM r2 (512 → 96) reused from `app.live.ttm_forecast._load_model` —
17
+ *no new model class loaded into memory*. The existing 311 daily
18
+ forecaster has already paid this load cost.
19
+ - 96-day-ahead daily forecast → aggregated into 4-week and 12-week
20
+ expected counts so the briefing narration stays human-readable.
21
+
22
+ Silence over confabulation: returns `available: False` with a
23
+ reason field on every failure path. Sensors with fewer than 5
24
+ flood events in their entire history yield no forecast (the TTM
25
+ output on near-empty histories is dominated by quantization noise).
26
+
27
+ Doc-id format: `floodnet_forecast_<deployment_id>` so it's distinct
28
+ from the existing `[floodnet]` event-history doc.
29
+ """
30
+ from __future__ import annotations
31
+
32
+ import logging
33
+ from datetime import datetime, timedelta, timezone
34
+
35
+ import numpy as np
36
+
37
+ from app.context.floodnet import flood_events_for, sensors_near
38
+ from app.live.ttm_forecast import (
39
+ DAILY_CONTEXT,
40
+ DAILY_PREDICTION,
41
+ _MODEL_LOAD_ERROR,
42
+ _run_ttm,
43
+ )
44
+
45
+ log = logging.getLogger("riprap.floodnet_forecast")
46
+
47
+ DOC_ID_PREFIX = "floodnet_forecast"
48
+ CITATION = (
49
+ "FloodNet NYC ultrasonic depth sensors (api.floodnet.nyc) + "
50
+ "IBM Granite TimeSeries TTM r2 (Ekambaram et al. 2024, NeurIPS) "
51
+ "via granite-tsfm — daily flood-event recurrence forecast"
52
+ )
53
+
54
+ # A sensor with <5 historical events in 512 days has too sparse a
55
+ # signal for TTM to produce a meaningful forecast. The model still
56
+ # runs, but the output is dominated by quantization noise around
57
+ # zero; emitting a doc from that state is exactly the kind of
58
+ # pseudo-quantitative claim the four-tier discipline guards against.
59
+ MIN_EVENTS_FOR_FORECAST = 5
60
+
61
+ # Search radius for nearest-sensor lookup. Wider than the existing
62
+ # `floodnet` specialist's 600 m (which scans for *all* sensors at
63
+ # the address) — we just need *one* relevant sensor for the forecast.
64
+ NEAREST_SENSOR_RADIUS_M = 1500
65
+
66
+
67
+ def _haversine_m(lat1, lon1, lat2, lon2) -> float:
68
+ from math import asin, cos, radians, sin, sqrt
69
+ R = 6371000.0
70
+ p1, p2 = radians(lat1), radians(lat2)
71
+ dp = radians(lat2 - lat1); dl = radians(lon2 - lon1)
72
+ a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
73
+ return 2 * R * asin(sqrt(a))
74
+
75
+
76
+ def _build_daily_event_series(
77
+ deployment_id: str, days: int
78
+ ) -> tuple[np.ndarray, list[str], int]:
79
+ """Pull flood events for one sensor over `days` days, return a
80
+ daily binary series (1 if ≥1 flood event started that day, 0
81
+ otherwise) plus the event count."""
82
+ since = datetime.now(timezone.utc) - timedelta(days=days + 2)
83
+     events = flood_events_for([deployment_id], since=since)
+     end = datetime.now(timezone.utc).date()
+     start = end - timedelta(days=days - 1)
+     by_day: dict[str, int] = {}
+     for e in events:
+         ds = (e.start_time or "")[:10]
+         if not ds:
+             continue
+         by_day[ds] = by_day.get(ds, 0) + 1
+     series: list[int] = []
+     labels: list[str] = []
+     for i in range(days):
+         d = start + timedelta(days=i)
+         d_iso = d.isoformat()
+         labels.append(d_iso)
+         series.append(by_day.get(d_iso, 0))
+     return np.array(series, dtype=np.float32), labels, len(events)
+ 
+ 
+ def summary_for_point(lat: float, lon: float) -> dict:
+     """Forecast flood-event recurrence at the nearest FloodNet sensor.
+ 
+     Returns a dict with `available: bool`. On success, includes the
+     sensor identity, history summary, and forecast aggregates.
+     """
+     try:
+         sensors = sensors_near(lat, lon, NEAREST_SENSOR_RADIUS_M)
+     except Exception as e:
+         log.warning("FloodNet sensor lookup failed: %r", e)
+         return {"available": False, "reason": "FloodNet API unreachable"}
+ 
+     if not sensors:
+         return {"available": False,
+                 "reason": f"no FloodNet sensor within {NEAREST_SENSOR_RADIUS_M} m"}
+ 
+     # Closest by haversine. Some deployments have null geometry; skip those.
+     geo_sensors = [s for s in sensors if s.lat is not None and s.lon is not None]
+     if not geo_sensors:
+         return {"available": False, "reason": "nearest sensor has no geometry"}
+     nearest = min(geo_sensors,
+                   key=lambda s: _haversine_m(lat, lon, s.lat, s.lon))
+     distance_m = _haversine_m(lat, lon, nearest.lat, nearest.lon)
+ 
+     try:
+         history, labels, total_events = _build_daily_event_series(
+             nearest.deployment_id, days=DAILY_CONTEXT
+         )
+     except Exception as e:
+         log.warning("FloodNet history fetch failed for %s: %r",
+                     nearest.deployment_id, e)
+         return {"available": False, "reason": "history fetch failed"}
+ 
+     if total_events < MIN_EVENTS_FOR_FORECAST:
+         return {
+             "available": False,
+             "reason": (f"sensor has only {total_events} historical events "
+                        f"(<{MIN_EVENTS_FOR_FORECAST}); forecast omitted"),
+             "sensor_id": nearest.deployment_id,
+             "sensor_name": nearest.name,
+         }
+ 
+     forecast = _run_ttm(history, DAILY_CONTEXT, DAILY_PREDICTION)
+     if forecast is None:
+         return {"available": False,
+                 "reason": _MODEL_LOAD_ERROR or "TTM inference failed"}
+ 
+     fc = np.clip(forecast, 0, None)
+     fc28 = float(fc[:28].sum())
+     fc_total = float(fc.sum())
+     fc_peak_offset = int(fc.argmax()) + 1
+     fc_peak_value = float(fc.max())
+ 
+     hist_total = int(history.sum())
+     hist_recent_28d = float(history[-28:].sum())
+ 
+     # "Accelerating" if the next-28-days expected count materially
+     # exceeds the prior-28-days observed count.
+     accelerating = (hist_recent_28d > 0
+                     and fc28 > 1.5 * hist_recent_28d)
+ 
+     return {
+         "available": True,
+         "doc_id": f"{DOC_ID_PREFIX}_{nearest.deployment_id}",
+         "sensor_id": nearest.deployment_id,
+         "sensor_name": nearest.name,
+         "sensor_street": nearest.street,
+         "sensor_borough": nearest.borough,
+         "sensor_lat": nearest.lat,
+         "sensor_lon": nearest.lon,
+         "distance_from_query_m": round(distance_m, 1),
+         "history_window_days": DAILY_CONTEXT,
+         "history_total_events": hist_total,
+         "history_recent_28d_events": int(hist_recent_28d),
+         "forecast_horizon_days": DAILY_PREDICTION,
+         "forecast_28d_expected_events": round(fc28, 2),
+         "forecast_total_horizon_events": round(fc_total, 2),
+         "forecast_peak_day_offset": fc_peak_offset,
+         "forecast_peak_day_value": round(fc_peak_value, 3),
+         "accelerating": accelerating,
+         "model": "granite-timeseries-ttm-r2",
+         "citation": CITATION,
+     }
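The zero-filled daily-series construction above (bucket events by ISO date, then walk the window day by day so silent days become explicit zeros) can be exercised standalone. This is a minimal sketch with hypothetical event dates, not FloodNet's actual API shapes:

```python
from datetime import date, timedelta

def zero_filled_daily_counts(event_dates: list[str], end: date, days: int):
    """Bucket ISO-date strings into per-day counts, then emit a
    fixed-length window ending at `end`, zero-filling silent days."""
    by_day: dict[str, int] = {}
    for ds in event_dates:
        by_day[ds] = by_day.get(ds, 0) + 1
    start = end - timedelta(days=days - 1)
    labels = [(start + timedelta(days=i)).isoformat() for i in range(days)]
    return [by_day.get(d, 0) for d in labels], labels

counts, labels = zero_filled_daily_counts(
    ["2024-01-02", "2024-01-02"], end=date(2024, 1, 3), days=3)
# counts == [0, 2, 0] — Jan 1 and Jan 3 had no events, Jan 2 had two.
```

The explicit zeros matter for TTM: the model needs a regular-cadence series, so days with no API rows must appear as 0, not be absent.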
app/live/ttm_forecast.py ADDED
@@ -0,0 +1,363 @@
+ """Granite TimeSeries TTM r2 — short-horizon nowcast for the live tide
+ residual (storm surge / wind setup) at the NYC harbor entrance.
+ 
+ Why TTM here, vs the existing live NOAA fetcher:
+ - The existing `noaa_tides` specialist returns a single 6-min snapshot:
+   observed, predicted, residual = observed - predicted. That's "right now."
+ - TTM forecasts the next ~9.6 hours of the *residual* — the meteorologic
+   component (surge + wind setup). NOAA already publishes the astronomical
+   tide; TTM tells us if the surge component is about to peak.
+ - This is the genuinely useful add: a nowcast of the part NOAA *doesn't*
+   predict.
+ 
+ Architecture: ibm-granite/granite-timeseries-ttm-r2, ~1.5M params,
+ zero-shot multivariate (we use it univariate here on the residual
+ series). 512-step context @ 6-min cadence = ~51 h of history;
+ 96-step horizon = ~9.6 h ahead.
+ 
+ Citation: Ekambaram, V., et al. (2024). "Tiny Time Mixers (TTMs):
+ Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of
+ Multivariate Time Series." NeurIPS 2024.
+ 
+ Gated emission: a doc is only added when the forecast peak residual
+ exceeds an absolute threshold (default 0.3 ft / 9 cm). On a calm day
+ the model still runs, but the reconciler sees no doc — silence over
+ confabulation.
+ """
+ from __future__ import annotations
+ 
+ import logging
+ from datetime import datetime, timedelta
+ 
+ import httpx
+ import numpy as np
+ 
+ log = logging.getLogger("riprap.ttm_forecast")
+ 
+ DOC_ID = "ttm_forecast"
+ CITATION = ("IBM Granite TimeSeries TTM r2 (Ekambaram et al. 2024, NeurIPS); "
+             "ibm-granite/granite-timeseries-ttm-r2 via granite-tsfm")
+ 
+ # Three NOAA stations covering NYC harbor + Long Island Sound + Bight.
+ # step_ttm_forecast picks the closest to the queried address (matches the
+ # existing nearest-gauge behaviour in step_noaa_tides). This means an
+ # inland-Queens query forecasts at Kings Point (Long Island Sound), a
+ # Coney Island query forecasts at Sandy Hook (Bight), and a Manhattan
+ # query forecasts at the Battery — each gauge characterises a different
+ # storm-surge regime.
+ STATIONS = [
+     ("8518750", "The Battery, NY", 40.7006, -74.0142),
+     ("8516945", "Kings Point, NY", 40.8103, -73.7649),
+     ("8531680", "Sandy Hook, NJ", 40.4669, -74.0094),
+ ]
+ NOAA_URL = "https://api.tidesandcurrents.noaa.gov/api/prod/datagetter"
+ 
+ CONTEXT_LENGTH = 512    # ~51 h at 6-min cadence (surge forecast)
+ PREDICTION_LENGTH = 96  # ~9.6 h horizon (surge forecast)
+ MIN_INTERESTING_RESIDUAL_FT = 0.3  # ~9 cm — gate for doc emission
+ 
+ # 311 daily-counts forecast — TTM r2's smallest pretrained config is
+ # 512 context, which is awkward for weekly counts on a single address.
+ # Daily aggregation (512 days ≈ 17 months of complaint history) lets
+ # the model run natively at its standard resolution; we forecast the
+ # next 96 days (~3 months).
+ DAILY_CONTEXT = 512
+ DAILY_PREDICTION = 96
+ NYC_311_URL = "https://data.cityofnewyork.us/resource/erm2-nwe9.json"
+ NYC_311_FLOOD_DESCRIPTORS = (
+     "Sewer Backup (Use Comments) (SA)",
+     "Catch Basin Clogged/Flooding (Use Comments) (SC)",
+     "Street Flooding (SJ)",
+     "Manhole Overflow (Use Comments) (SA1)",
+     "Flooding on Street",
+ )
+ 
+ 
+ # ---- Lazy-loaded model singleton -----------------------------------------
+ 
+ _MODELS: dict[tuple[int, int], object] = {}
+ _MODEL_LOAD_ERROR: str | None = None
+ 
+ 
+ def _load_model(context_length: int = CONTEXT_LENGTH,
+                 prediction_length: int = PREDICTION_LENGTH):
+     """TTM r2 is configured per (context, prediction) length pair. Cache
+     by that pair so each distinct shape gets its own model handle on
+     first use. (The surge forecaster and the daily 311 forecaster
+     currently both use 512→96, so they share one handle.)"""
+     global _MODEL_LOAD_ERROR
+     key = (context_length, prediction_length)
+     if key in _MODELS:
+         return _MODELS[key]
+     if _MODEL_LOAD_ERROR is not None:
+         return None
+     try:
+         import torch  # noqa: F401
+         from tsfm_public.toolkit.get_model import get_model
+         m = get_model(
+             "ibm-granite/granite-timeseries-ttm-r2",
+             context_length=context_length,
+             prediction_length=prediction_length,
+         )
+         m.eval()
+         _MODELS[key] = m
+         log.info("TTM r2 loaded (context=%d horizon=%d)",
+                  context_length, prediction_length)
+         return m
+     except Exception as e:
+         _MODEL_LOAD_ERROR = repr(e)
+         log.exception("TTM model load failed; future calls will be skipped")
+         return None
+ 
+ 
+ # Closest-of-three station selection (mirrors app/context/noaa_tides.py).
+ def _haversine_km(lat1, lon1, lat2, lon2) -> float:
+     from math import asin, cos, radians, sin, sqrt
+     R = 6371.0
+     p1, p2 = radians(lat1), radians(lat2)
+     dp = radians(lat2 - lat1); dl = radians(lon2 - lon1)
+     a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
+     return 2 * R * asin(sqrt(a))
+ 
+ 
+ def _nearest_station(lat: float, lon: float):
+     return min(STATIONS, key=lambda s: _haversine_km(lat, lon, s[2], s[3]))
+ 
+ 
+ # ---- NOAA history fetch --------------------------------------------------
+ 
+ def _fetch_noaa_series(begin_iso: str, end_iso: str, product: str,
+                        station_id: str) -> dict:
+     """One-shot NOAA datagetter for a date range. Returns the JSON body."""
+     r = httpx.get(NOAA_URL, params={
+         "begin_date": begin_iso, "end_date": end_iso,
+         "station": station_id, "product": product,
+         "datum": "MLLW", "units": "english", "time_zone": "lst_ldt",
+         "format": "json",
+     }, timeout=15.0)
+     r.raise_for_status()
+     return r.json()
+ 
+ 
+ def _residual_series(station_id: str,
+                      n_obs_needed: int = CONTEXT_LENGTH) -> tuple[np.ndarray, list[str]] | None:
+     """Build the recent residual series (observed - predicted) at 6-min
+     cadence, length CONTEXT_LENGTH. Returns (values_ft, timestamps_iso).
+     Returns None if NOAA refused, returned mismatched shapes, or the
+     series is too short."""
+     # Fetch slightly more than we need to absorb the occasional missing
+     # 6-min sample; we'll trim to exact length below.
+     end = datetime.utcnow()
+     # NOAA recommends LST/LDT for time_zone matching across products
+     begin = end - timedelta(minutes=6 * (n_obs_needed + 50))
+     fmt = "%Y%m%d %H:%M"
+     begin_s = begin.strftime(fmt)
+     end_s = end.strftime(fmt)
+     try:
+         obs_j = _fetch_noaa_series(begin_s, end_s, "water_level", station_id)
+         pred_j = _fetch_noaa_series(begin_s, end_s, "predictions", station_id)
+     except Exception as e:
+         log.warning("NOAA fetch failed: %r", e)
+         return None
+     obs_data = obs_j.get("data") or []
+     pred_data = pred_j.get("predictions") or []
+     if not obs_data or not pred_data:
+         return None
+     # Both products are 6-min cadence and share timestamps; align by t.
+     obs_by_t = {row["t"]: float(row["v"]) for row in obs_data if row.get("v")}
+     pred_by_t = {row["t"]: float(row["v"]) for row in pred_data if row.get("v")}
+     common_ts = sorted(set(obs_by_t) & set(pred_by_t))
+     if len(common_ts) < n_obs_needed:
+         log.warning("only %d aligned NOAA samples (need %d)",
+                     len(common_ts), n_obs_needed)
+         return None
+     common_ts = common_ts[-n_obs_needed:]
+     residual = np.array([obs_by_t[t] - pred_by_t[t] for t in common_ts],
+                         dtype=np.float32)
+     return residual, common_ts
+ 
+ 
+ # ---- Forecast --------------------------------------------------------------
+ 
+ def _run_ttm(history: np.ndarray,
+              context_length: int = CONTEXT_LENGTH,
+              prediction_length: int = PREDICTION_LENGTH) -> np.ndarray | None:
+     """Channel-wise standardize, run model, de-standardize. Returns a
+     `prediction_length`-step de-standardized forecast in input units."""
+     model = _load_model(context_length, prediction_length)
+     if model is None:
+         return None
+     import torch
+     mu = float(history.mean())
+     sigma = float(history.std() + 1e-6)
+     normed = (history - mu) / sigma
+     x = torch.from_numpy(normed.astype(np.float32))[None, :, None]
+     try:
+         with torch.no_grad():
+             out = model(past_values=x)
+     except Exception as e:
+         log.exception("TTM inference failed: %r", e)
+         return None
+     pred = out.prediction_outputs[0, :, 0].cpu().numpy()
+     return pred * sigma + mu
+ 
+ 
+ def summary_for_point(lat: float, lon: float) -> dict:
+     """Surge forecast at the NOAA gauge nearest the queried address.
+ 
+     Three gauges cover NYC: Battery (harbor entrance), Kings Point
+     (LI Sound), Sandy Hook (Bight). Surge regimes differ — Sandy 2012
+     peaked at +14 ft at the Battery vs. lower at Kings Point because
+     the gauges respond to different forcing geometries. Picking the
+     closest gauge to the queried address makes the forecast
+     address-relevant rather than always city-wide.
+     """
+     sid, sname, slat, slon = _nearest_station(lat, lon)
+     distance_km = round(_haversine_km(lat, lon, slat, slon), 1)
+ 
+     series = _residual_series(sid)
+     if series is None:
+         return {"available": False,
+                 "reason": "NOAA history fetch returned insufficient data",
+                 "station_id": sid, "station_name": sname,
+                 "distance_km": distance_km}
+     history, timestamps = series
+     forecast = _run_ttm(history, CONTEXT_LENGTH, PREDICTION_LENGTH)
+     if forecast is None:
+         return {"available": False,
+                 "reason": _MODEL_LOAD_ERROR or "TTM inference failed",
+                 "station_id": sid, "station_name": sname,
+                 "distance_km": distance_km}
+ 
+     history_peak = float(np.max(np.abs(history)))
+     fc_peak_idx = int(np.argmax(np.abs(forecast)))
+     fc_peak_ft = float(forecast[fc_peak_idx])
+     fc_peak_minutes_ahead = (fc_peak_idx + 1) * 6
+     fc_peak_time = datetime.utcnow() + timedelta(minutes=fc_peak_minutes_ahead)
+ 
+     interesting = (abs(fc_peak_ft) >= MIN_INTERESTING_RESIDUAL_FT or
+                    history_peak >= MIN_INTERESTING_RESIDUAL_FT)
+ 
+     return {
+         "available": True,
+         "interesting": interesting,
+         "station_id": sid,
+         "station_name": sname,
+         "distance_km": distance_km,
+         "context_length": int(len(history)),
+         "horizon_steps": int(len(forecast)),
+         "history_peak_abs_ft": round(history_peak, 2),
+         "history_recent_ft": round(float(history[-1]), 2),
+         "forecast_peak_ft": round(fc_peak_ft, 2),
+         "forecast_peak_minutes_ahead": fc_peak_minutes_ahead,
+         "forecast_peak_time_utc": fc_peak_time.isoformat(timespec="minutes") + "Z",
+         "threshold_ft": MIN_INTERESTING_RESIDUAL_FT,
+     }
+ 
+ 
+ # ---- Per-address daily 311 flood-complaint forecast ----------------------
+ 
+ def _fetch_311_flood_daily(lat: float, lon: float,
+                            radius_m: int = 200,
+                            days: int = DAILY_CONTEXT,
+                            ) -> tuple[np.ndarray, list[str]] | None:
+     """Pull `days` of daily flood-complaint counts within `radius_m` of
+     (lat, lon) from NYC OpenData. Returns (counts_array_length_days,
+     date_labels) or None on failure. Missing days are zero-filled."""
+     from collections import defaultdict
+     from datetime import datetime as _dt
+     from datetime import timedelta as _td
+     end = _dt.utcnow().date()
+     start = end - _td(days=days + 1)
+     descs = " OR ".join(f"descriptor='{d}'" for d in NYC_311_FLOOD_DESCRIPTORS)
+     where = (
+         f"created_date between '{start.isoformat()}T00:00:00' and "
+         f"'{end.isoformat()}T23:59:59' AND "
+         f"latitude IS NOT NULL AND longitude IS NOT NULL AND "
+         f"({descs}) AND "
+         f"within_circle(location, {lat}, {lon}, {radius_m})"
+     )
+     try:
+         r = httpx.get(NYC_311_URL,
+                       params={"$select": "created_date",
+                               "$where": where,
+                               "$limit": "50000"},
+                       timeout=20.0)
+         r.raise_for_status()
+         rows = r.json()
+     except Exception as e:
+         log.warning("311 flood fetch for TTM failed: %r", e)
+         return None
+ 
+     counts: dict[str, int] = defaultdict(int)
+     for row in rows or []:
+         ds = (row.get("created_date") or "")[:10]
+         if not ds:
+             continue
+         counts[ds] += 1
+ 
+     series: list[int] = []
+     labels: list[str] = []
+     for i in range(days):
+         d = end - _td(days=days - 1 - i)
+         d_iso = d.isoformat()
+         labels.append(d_iso)
+         series.append(counts.get(d_iso, 0))
+     return np.array(series, dtype=np.float32), labels
+ 
+ 
+ def weekly_311_forecast_for_point(lat: float, lon: float,
+                                   radius_m: int = 200) -> dict:
+     """TTM r2 zero-shot forecast on per-address daily 311
+     flood-complaint counts. Despite the name — kept for FSM-call-site
+     stability — this now operates on daily resolution (TTM r2's
+     smallest native config is 512 context, awkward for weekly).
+     History: 512 days (~17 months); forecast: 96 days (~3 months).
+     Returns daily and weekly summaries so the reconciler narration
+     stays human-readable.
+ 
+     Designed not to raise. Returns `available: False` with a reason
+     field on any failure path."""
+     series = _fetch_311_flood_daily(lat, lon, radius_m=radius_m)
+     if series is None:
+         return {"available": False, "reason": "311 history fetch failed"}
+     history, labels = series
+     forecast = _run_ttm(history, DAILY_CONTEXT, DAILY_PREDICTION)
+     if forecast is None:
+         return {"available": False,
+                 "reason": _MODEL_LOAD_ERROR or "TTM inference failed"}
+ 
+     fc_clipped = np.clip(forecast, 0, None)
+     hist_total = int(history.sum())
+     hist_mean_per_day = float(history.mean())
+     hist_recent_mean_30d = float(history[-30:].mean())
+     fc_total = float(fc_clipped.sum())
+     fc_mean_per_day = float(fc_clipped.mean())
+     fc_peak_day = float(fc_clipped.max())
+     fc_peak_day_offset = int(fc_clipped.argmax()) + 1
+ 
+     # Aggregate to weekly equivalents for the briefing narration —
+     # readers think in weeks, not days.
+     history_weekly_mean = hist_mean_per_day * 7
+     forecast_weekly_mean = fc_mean_per_day * 7
+ 
+     accelerating = (hist_recent_mean_30d > 0 and
+                     fc_mean_per_day > 1.5 * hist_recent_mean_30d)
+ 
+     return {
+         "available": True,
+         "radius_m": radius_m,
+         "days_context": DAILY_CONTEXT,
+         "days_horizon": DAILY_PREDICTION,
+         "history_total_complaints": hist_total,
+         "history_mean_per_day": round(hist_mean_per_day, 3),
+         "history_recent_30d_mean": round(hist_recent_mean_30d, 3),
+         "history_weekly_equivalent": round(history_weekly_mean, 2),
+         "forecast_total_next_horizon": round(fc_total, 1),
+         "forecast_mean_per_day": round(fc_mean_per_day, 3),
+         "forecast_weekly_equivalent": round(forecast_weekly_mean, 2),
+         "forecast_peak_day": round(fc_peak_day, 2),
+         "forecast_peak_day_offset": fc_peak_day_offset,
+         "accelerating": accelerating,
+         "context_window_start": labels[0] if labels else None,
+         "context_window_end": labels[-1] if labels else None,
+     }
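The standardize → forecast → de-standardize round trip in `_run_ttm` can be sanity-checked without loading TTM at all, by substituting a trivial stand-in model. The persistence forecaster below is a hypothetical stand-in for illustration, not anything this file ships:

```python
import numpy as np

def run_standardized(history: np.ndarray, model_fn, horizon: int) -> np.ndarray:
    """Mirror of _run_ttm's wrapper math: z-score the context, run the
    model in normalized space, then map the forecast back to input units."""
    mu = float(history.mean())
    sigma = float(history.std() + 1e-6)   # epsilon guards a flat series
    normed = (history - mu) / sigma
    pred = model_fn(normed, horizon)       # forecast in z-space
    return pred * sigma + mu               # back to feet / counts / etc.

def persistence(normed: np.ndarray, horizon: int) -> np.ndarray:
    # Trivial stand-in "model": repeat the last observed value.
    return np.full(horizon, normed[-1], dtype=np.float32)

hist = np.array([0.0, 0.1, 0.2, 0.4], dtype=np.float32)
fc = run_standardized(hist, persistence, horizon=3)
# De-standardization recovers the input scale: each step is ~0.4,
# the last raw value, regardless of the units of the series.
```

The wrapper is scale-invariant by construction, which is why the same `_run_ttm` serves both the residual-in-feet surge series and the complaints-per-day 311 series.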
app/llm.py ADDED
@@ -0,0 +1,296 @@
+ """LiteLLM-backed shim around the ollama.chat call surface.
+ 
+ Single function `chat(model, messages, options, stream)` that returns the
+ same dict / iterator-of-dicts shape `ollama.chat` returns, so existing
+ call sites swap `import ollama` -> `from app import llm` with no other
+ changes.
+ 
+ Backend selection (env):
+     RIPRAP_LLM_PRIMARY  = "vllm" | "ollama"   (default: ollama)
+     RIPRAP_LLM_BASE_URL = http://amd:8000/v1  (vllm only)
+     RIPRAP_LLM_API_KEY  = <token>             (vllm only)
+     RIPRAP_LLM_FALLBACK = "ollama" | ""       (default: "ollama" when
+                                                primary=vllm, else "")
+     OLLAMA_BASE_URL     = http://host:11434   (ollama backend only)
+ 
+ Model routing: callers may pass either Ollama tags ("granite4.1:8b") or
+ logical aliases ("granite-8b"). Mapped to:
+     vllm   -> openai/granite-4.1-{3b,8b}     on RIPRAP_LLM_BASE_URL
+     ollama -> ollama_chat/granite4.1:{3b,8b} on OLLAMA_BASE_URL
+ 
+ When primary=vllm with fallback=ollama, the LiteLLM Router auto-fails
+ over to the local Ollama deployment if the AMD endpoint errors (timeout,
+ connection refused, 5xx). Existing call sites are unaware of the swap.
+ """
+ 
+ from __future__ import annotations
+ 
+ import logging
+ import os
+ import re
+ from collections.abc import Iterator
+ from typing import Any
+ 
+ import litellm
+ from litellm import Router
+ 
+ log = logging.getLogger(__name__)
+ 
+ litellm.suppress_debug_info = True
+ litellm.drop_params = True  # silently drop unsupported params instead of erroring
+ 
+ _VLLM_BASE = os.environ.get("RIPRAP_LLM_BASE_URL", "").rstrip("/")
+ _VLLM_KEY = os.environ.get("RIPRAP_LLM_API_KEY", "") or "EMPTY"
+ _PRIMARY = os.environ.get("RIPRAP_LLM_PRIMARY", "ollama").lower()
+ _FALLBACK = os.environ.get(
+     "RIPRAP_LLM_FALLBACK",
+     "ollama" if _PRIMARY == "vllm" else "",
+ ).lower()
+ 
+ _OLLAMA_BASE = os.environ.get(
+     "OLLAMA_BASE_URL",
+     os.environ.get("OLLAMA_HOST", "http://localhost:11434"),
+ )
+ if not _OLLAMA_BASE.startswith("http"):
+     _OLLAMA_BASE = "http://" + _OLLAMA_BASE
+ 
+ # alias -> (vllm-served-name, ollama-tag)
+ # In our hackathon vLLM deployment only the 8B is served (one served-name
+ # per vLLM process and we don't want a second container). Override the
+ # 3B served-name with RIPRAP_LLM_VLLM_3B_NAME if you stand up a second
+ # vLLM serving the 3B and want the planner to hit it specifically.
+ _VLLM_8B = os.environ.get("RIPRAP_LLM_VLLM_8B_NAME", "granite-4.1-8b")
+ _VLLM_3B = os.environ.get("RIPRAP_LLM_VLLM_3B_NAME", _VLLM_8B)
+ # Ollama tag overrides: HF Spaces' build disk fills past the threshold
+ # when both granite4.1:3b and granite4.1:8b are pulled alongside the
+ # Phase 1 / Phase 4 EO toolchain. Set RIPRAP_OLLAMA_3B_TAG=granite4.1:8b
+ # on disk-constrained deployments — the planner output is short, so
+ # the 8B-vs-3B difference is latency, not correctness.
+ #
+ # RIPRAP_OLLAMA_8B_TAG is also the cheapest knob for swapping quants
+ # without touching code: e.g. "granite4.1:8b-q3_K_M" gives ~1 GB of
+ # memory back vs the default Q4_K_M, at minor grounding-discipline cost
+ # (re-run the Hollis probe before committing — see CLAUDE.md).
+ _OLLAMA_3B_TAG = os.environ.get("RIPRAP_OLLAMA_3B_TAG", "granite4.1:3b")
+ _OLLAMA_8B_TAG = os.environ.get("RIPRAP_OLLAMA_8B_TAG", "granite4.1:8b")
+ _LOGICAL: dict[str, tuple[str, str]] = {
+     "granite-3b": (_VLLM_3B, _OLLAMA_3B_TAG),
+     "granite-8b": (_VLLM_8B, _OLLAMA_8B_TAG),
+ }
+ _OLLAMA_TO_LOGICAL = {v[1]: k for k, v in _LOGICAL.items()}
+ # Also accept the canonical hardcoded tag names so callers passing
+ # `granite4.1:3b` resolve to the alias even when the env override
+ # remapped that alias to `granite4.1:8b`.
+ _OLLAMA_TO_LOGICAL.setdefault("granite4.1:3b", "granite-3b")
+ _OLLAMA_TO_LOGICAL.setdefault("granite4.1:8b", "granite-8b")
+ 
+ 
+ def _build_router() -> Router:
+     model_list: list[dict[str, Any]] = []
+     fallbacks: list[dict[str, list[str]]] = []
+     use_vllm = _PRIMARY == "vllm" and bool(_VLLM_BASE)
+ 
+     for alias, (vllm_name, ollama_tag) in _LOGICAL.items():
+         if use_vllm:
+             model_list.append({
+                 "model_name": alias,
+                 "litellm_params": {
+                     "model": f"openai/{vllm_name}",
+                     "api_base": _VLLM_BASE,
+                     "api_key": _VLLM_KEY,
+                     "timeout": 240,
+                     "stream_timeout": 240,
+                 },
+             })
+             if _FALLBACK == "ollama":
+                 fb_alias = f"{alias}-ollama"
+                 model_list.append({
+                     "model_name": fb_alias,
+                     "litellm_params": {
+                         "model": f"ollama_chat/{ollama_tag}",
+                         "api_base": _OLLAMA_BASE,
+                         "timeout": 240,
+                         "stream_timeout": 240,
+                     },
+                 })
+                 fallbacks.append({alias: [fb_alias]})
+         else:
+             model_list.append({
+                 "model_name": alias,
+                 "litellm_params": {
+                     "model": f"ollama_chat/{ollama_tag}",
+                     "api_base": _OLLAMA_BASE,
+                     "timeout": 240,
+                     "stream_timeout": 240,
+                 },
+             })
+ 
+     log.info("llm router primary=%s fallback=%s vllm_base=%s ollama_base=%s",
+              _PRIMARY, _FALLBACK or "<none>",
+              _VLLM_BASE or "<unset>", _OLLAMA_BASE)
+     return Router(
+         model_list=model_list,
+         fallbacks=fallbacks,
+         num_retries=0,  # Router fallback handles the failover; no point
+                         # burning seconds re-hitting a dead endpoint.
+         timeout=240,
+     )
+ 
+ 
+ _router = _build_router()
+ 
+ 
+ def _resolve_alias(model: str) -> str:
+     if model in _LOGICAL:
+         return model
+     if model in _OLLAMA_TO_LOGICAL:
+         return _OLLAMA_TO_LOGICAL[model]
+     return model  # pass through; let the router report unknowns
+ 
+ 
+ def _opts_to_kwargs(options: dict | None) -> dict:
+     """Translate ollama-style options dict to LiteLLM kwargs.
+ 
+     Ollama-only knobs (num_ctx) are forwarded via extra_body so that the
+     ollama_chat backend still receives them; OpenAI/vLLM ignores them
+     (litellm.drop_params=True).
+     """
+     kw: dict[str, Any] = {}
+     extra: dict[str, Any] = {}
+     if options:
+         if "temperature" in options:
+             kw["temperature"] = options["temperature"]
+         if "top_p" in options:
+             kw["top_p"] = options["top_p"]
+         if "num_predict" in options:
+             kw["max_tokens"] = options["num_predict"]
+         for k in ("num_ctx",):
+             if k in options:
+                 extra[k] = options[k]
+     if extra:
+         kw["extra_body"] = extra
+     return kw
+ 
+ 
+ def _extract_documents(messages: list[dict]) -> list[dict]:
+     """Pull document-role messages into Granite's HF chat-template format.
+ 
+     Ollama's Modelfile template recognizes `role: "document <id>"` and
+     bundles the message into a <documents> block automatically. The HF
+     tokenizer chat template (used by vLLM) does *not* — it silently
+     drops non-standard roles. To make vLLM honor the same grounding
+     contract, we extract the documents into the chat-template kwarg
+     `documents=[{"doc_id": ..., "text": ...}]` while leaving the
+     original document-role messages in place so the Ollama backend
+     keeps working unchanged on the fallback path.
+     """
+     docs: list[dict] = []
+     for m in messages:
+         role = m.get("role", "")
+         if role.startswith("document "):
+             docs.append({
+                 "doc_id": role.split(" ", 1)[1],
+                 "text": m.get("content", ""),
+             })
+     return docs
+ 
+ 
+ # vLLM's Granite chat template emits citations as `[doc_id=foo]`; the rest
+ # of Riprap (Mellea checks, frontend chip rendering, citations regex) all
+ # expect the bare `[foo]` form that Ollama's Modelfile template produces.
+ # Normalize transparently so the two backends are interchangeable.
+ _CITE_NORMALIZE_RE = re.compile(r"\[doc_id=([A-Za-z0-9_]+)\]")
+ 
+ 
+ def _normalize_citations(text: str) -> str:
+     return _CITE_NORMALIZE_RE.sub(r"[\1]", text)
+ 
+ 
+ def _to_ollama_shape(resp) -> dict:
+     msg = resp.choices[0].message
+     content = _normalize_citations(msg.content or "")
+     return {"message": {"role": "assistant", "content": content}}
+ 
+ 
+ def _stream_to_ollama_shape(stream) -> Iterator[dict]:
+     for chunk in stream:
+         try:
+             delta = chunk.choices[0].delta
+             content = getattr(delta, "content", None) or ""
+         except (IndexError, AttributeError):
+             content = ""
+         # Per-chunk normalize is safe: `[doc_id=X]` arrives as a single
+         # token sequence inside one chunk in practice, and the regex is
+         # idempotent / no-op on partial matches.
+         if content:
+             content = _normalize_citations(content)
+         yield {"message": {"role": "assistant", "content": content}}
+ 
+ 
+ def _default_hardware_label() -> str:
+     """Best-guess hardware label for the UI badge.
+ 
+     Auto-detected from env. Operators can override with
+     RIPRAP_HARDWARE_LABEL (e.g. "AMD MI300X" / "NVIDIA T4" / "Apple M3 Pro").
+     """
+     if _PRIMARY == "vllm" and _VLLM_BASE:
+         return "AMD MI300X"
+     if os.environ.get("SPACE_ID") or os.environ.get("HF_SPACE_ID"):
+         return "NVIDIA T4"
+     return "Local"
+ 
+ 
+ def backend_info() -> dict[str, Any]:
+     """Static description of the active LLM routing for the /api/backend
+     endpoint and the UI badge. Does not perform a network call; the
+     /api/backend handler does its own reachability ping."""
+     primary_engine = "vLLM" if _PRIMARY == "vllm" and _VLLM_BASE else "Ollama"
+     fallback_engine = (
+         "Ollama" if (_PRIMARY == "vllm" and _FALLBACK == "ollama")
+         else None
+     )
+     return {
+         "primary": _PRIMARY if _VLLM_BASE or _PRIMARY != "vllm" else "ollama",
+         "engine": os.environ.get("RIPRAP_ENGINE_LABEL", primary_engine),
+         "hardware": os.environ.get("RIPRAP_HARDWARE_LABEL",
+                                    _default_hardware_label()),
+         "model": os.environ.get("RIPRAP_RECONCILER_MODEL", _OLLAMA_8B_TAG),
+         "vllm_base_url": _VLLM_BASE or None,
+         "ollama_base_url": _OLLAMA_BASE,
+         "fallback_engine": fallback_engine,
+     }
+ 
+ 
+ def chat(model: str, messages: list[dict], options: dict | None = None,
+          stream: bool = False, format: str | None = None):
+     """Drop-in replacement for ollama.chat with router-managed failover.
+ 
+     Returns:
+       - stream=False: dict shaped like ollama's response
+         ({"message": {"role": "assistant", "content": "..."}}).
+       - stream=True: iterator yielding chunk dicts of the same shape.
+ 
+     `format="json"` mirrors Ollama's JSON-mode forcing — translated to
+     OpenAI's response_format for vLLM, and passed through unchanged for
+     the Ollama backend.
+     """
+     alias = _resolve_alias(model)
+     kwargs = _opts_to_kwargs(options)
+     docs = _extract_documents(messages)
+     if docs:
+         # Merge into extra_body so Granite's HF chat template (vLLM)
+         # picks them up. Ollama backend ignores extra_body and keeps
+         # using the role="document <id>" messages already in `messages`.
+         eb = kwargs.setdefault("extra_body", {})
+         eb["documents"] = docs
+         eb.setdefault("chat_template_kwargs", {})["documents"] = docs
+     if format == "json":
+         # OpenAI/vLLM path
+         kwargs["response_format"] = {"type": "json_object"}
+         # Ollama path (LiteLLM forwards this via extra_body for ollama_chat)
+         kwargs.setdefault("extra_body", {})["format"] = "json"
+     if stream:
+         s = _router.completion(model=alias, messages=messages,
+                                stream=True, **kwargs)
+         return _stream_to_ollama_shape(s)
+     resp = _router.completion(model=alias, messages=messages, **kwargs)
+     return _to_ollama_shape(resp)
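The citation-normalization contract that makes the two backends interchangeable is easy to exercise in isolation; the regex and substitution below reproduce the `_normalize_citations` logic from this file:

```python
import re

CITE_NORMALIZE_RE = re.compile(r"\[doc_id=([A-Za-z0-9_]+)\]")

def normalize_citations(text: str) -> str:
    # vLLM's Granite chat template emits [doc_id=foo]; downstream
    # consumers (Mellea checks, citation chips) expect the bare [foo]
    # form that Ollama's Modelfile template produces.
    return CITE_NORMALIZE_RE.sub(r"[\1]", text)

vllm_style = "Residual peaked at 0.8 ft [doc_id=ttm_forecast]."
assert normalize_citations(vllm_style) == "Residual peaked at 0.8 ft [ttm_forecast]."
# Already-bare citations pass through unchanged, so the call is idempotent:
assert normalize_citations("0.8 ft [ttm_forecast].") == "0.8 ft [ttm_forecast]."
```

Idempotence is what makes the per-chunk application on the streaming path safe: applying the substitution to already-normalized text is a no-op.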
app/mellea_validator.py ADDED
@@ -0,0 +1,501 @@
+ """Mellea-validated reconciliation for Riprap.
+ 
+ Wraps the existing Granite-via-Ollama reconciliation in IBM Research's
+ Mellea framework: typed output + programmatic post-conditions +
+ rejection sampling. Replaces post-hoc sentence-dropping with
+ "don't accept output until requirements pass."
+ 
+ Streaming and rejection sampling are mutually exclusive — by the time
+ we'd validate, the user has watched the bad output appear. Strict mode
+ trades streaming for compliance; the UI shows a "validating" skeleton
+ instead of token-by-token render.
+ 
+ The four invariants ported from the parent project's mellea_probe:
+ 
+     1. no_invented_numbers — every number in output appears in source
+     2. no_placeholder_tokens — output never contains "[source]" or
+        raw <document> markup
+     3. every_claim_cited — each numeric token has a [doc_id] within
+        ~40 chars
+     4. referenced_doc_ids_exist — cited doc_ids ⊆ input doc_ids
+ """
+ from __future__ import annotations
+ 
+ import logging
+ import os
+ import re
+ import time
+ from typing import Any
+ 
+ from mellea import start_session
+ from mellea.stdlib.requirements import req, simple_validate
+ from mellea.stdlib.sampling import RejectionSamplingStrategy
+ 
+ from app import llm
+ 
+ log = logging.getLogger("riprap.mellea")
+ 
+ # Default reconciler model — same env-var contract as app/reconcile.py.
+ DEFAULT_MODEL = os.environ.get(
+     "RIPRAP_RECONCILER_MODEL",
+     os.environ.get("RIPRAP_OLLAMA_MODEL", "granite4.1:8b"),
+ )
+ 
+ # Loop budget — try up to N samples before falling back to the last
+ # candidate even if it didn't pass all requirements. Low ceiling so a
+ # pathological case can't run away with latency.
+ #
+ # Override at process start with RIPRAP_MELLEA_MAX_ATTEMPTS. We default
+ # to 2 on the local Ollama path (where each attempt is 30-90 s on the
+ # Mac) and 3 on remote/vLLM (where attempts are seconds). This caps
+ # worst-case demo latency without giving up the principal grounding
+ # guarantee — the first-attempt pass rate on the curated probes is >85%.
+ def _default_loop_budget() -> int:
+     try:
+         n = int(os.environ.get("RIPRAP_MELLEA_MAX_ATTEMPTS", "0"))
+         if n > 0:
+             return n
+     except ValueError:
+         pass
+     return 2 if os.environ.get("RIPRAP_LLM_PRIMARY", "ollama").lower() == "ollama" else 3
+ 
+ 
+ DEFAULT_LOOP_BUDGET = _default_loop_budget()
+ 
+ # Number tokens — \b enforces a word boundary so identifier codes like
+ # QN1206, B12 (community board), or M14 (bus route) are skipped entirely.
67
+ # Inside QN1206 there's no \b between any chars, so no submatch leaks.
68
+ _NUM_RE = re.compile(r"\b-?\d[\d,]*(?:\.\d+)?\b")
69
+ _CITE_RE = re.compile(r"\[(?P<id>[a-z][a-z0-9_]*)\]")
70
+ # Same trivial-numbers list as the post-hoc verifier — well-known service
71
+ # line numbers, single digits.
72
+ _TRIVIAL_NUMS = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "100",
73
+ "311", "911", "211"}
74
+
75
+
76
+ def _strip_markdown_for_check(text: str) -> str:
77
+ """Drop bold markers + citation tags so the numeric scan is clean."""
78
+ text = re.sub(r"\*\*([^*]+)\*\*", r"\1", text)
79
+ text = re.sub(r"\[[a-z0-9_]+\]", "", text, flags=re.I)
80
+ return text
81
+
82
+
83
+ def _normalize_num(s: str) -> set[str]:
84
+ forms = {s}
85
+ no_comma = s.replace(",", "")
86
+ forms.add(no_comma)
87
+ if "." in no_comma:
88
+ forms.add(no_comma.rstrip("0").rstrip("."))
89
+ return {f for f in forms if f}
90
+
91
+
92
+ def _haystack(doc_msgs: list[dict]) -> str:
93
+ return "\n".join(m.get("content", "") for m in doc_msgs)
94
+
95
+
96
+ def _doc_ids(doc_msgs: list[dict]) -> set[str]:
97
+ """Each doc message has role like "document <id>"; extract ids."""
98
+ out = set()
99
+ for m in doc_msgs:
100
+ role = m.get("role", "")
101
+ if role.startswith("document "):
102
+ out.add(role.split(" ", 1)[1].strip())
103
+ return out
104
+
105
+
106
+ # --- the four invariants ---------------------------------------------------
107
+
108
+
109
+ def _check_no_invented_numbers(doc_msgs: list[dict]):
110
+ haystack = _haystack(doc_msgs)
111
+ def _fn(text: str):
112
+ clean = _strip_markdown_for_check(text)
113
+ invented = []
114
+ for n in _NUM_RE.findall(clean):
115
+ if n in _TRIVIAL_NUMS:
116
+ continue
117
+ forms = _normalize_num(n)
118
+ if not any(f in haystack for f in forms):
119
+ invented.append(n)
120
+ return not invented # pass = no invented numbers
121
+ return _fn
122
+
123
+
124
+ def _check_no_placeholder_tokens():
125
+ def _fn(text: str):
126
+ bad = []
127
+ if "[source]" in text.lower():
128
+ bad.append("[source]")
129
+ if "<document" in text:
130
+ bad.append("<document>")
131
+ if "</document" in text:
132
+ bad.append("</document>")
133
+ return not bad
134
+ return _fn
135
+
136
+
137
+ def _check_every_claim_cited():
138
+ """Each non-trivial numeric token must have a [doc_id] somewhere in
139
+ the same sentence. Sentence boundaries are conservative: a period
140
+ followed by whitespace, or end of text. This matches how a reader
141
+ actually attributes claims — the citation can be anywhere in the
142
+ sentence, not just adjacent to the number."""
143
+ # Sentence end = `. ` or `.\n` or end-of-string. Question/exclamation
144
+ # marks rarely appear in these briefings; period is enough.
145
+ _SENT_END = re.compile(r"\.[\s)]|\.$")
146
+
147
+ def _sentence_span(text: str, pos: int) -> tuple[int, int]:
148
+ # Walk backwards to the previous sentence terminator.
149
+ start = 0
150
+ for m in _SENT_END.finditer(text, 0, pos):
151
+ start = m.end()
152
+ # Walk forwards to the next.
153
+ m = _SENT_END.search(text, pos)
154
+ end = m.start() + 1 if m else len(text)
155
+ return start, end
156
+
157
+ def _fn(text: str):
158
+ clean = re.sub(r"\*\*([^*]+)\*\*", r"\1", text)
159
+ for m in _NUM_RE.finditer(clean):
160
+ n = m.group(0)
161
+ if n in _TRIVIAL_NUMS:
162
+ continue
163
+ s, e = _sentence_span(clean, m.start())
164
+ if not _CITE_RE.search(clean[s:e]):
165
+ return False
166
+ return True
167
+ return _fn
168
+
169
+
170
+ def _failing_sentences_for_citations(text: str) -> list[str]:
171
+ """Return the sentences in `text` that contain a non-trivial number
172
+ but no [doc_id] citation. Used to give the model targeted reroll
173
+ feedback so it can fix the exact spots that failed."""
174
+ clean = re.sub(r"\*\*([^*]+)\*\*", r"\1", text)
175
+ sents = re.split(r"\.[\s)]|\.$", clean)
176
+ bad = []
177
+ for s in sents:
178
+ nums = [n for n in _NUM_RE.findall(s) if n not in _TRIVIAL_NUMS]
179
+ if nums and not _CITE_RE.search(s):
180
+ bad.append(s)
181
+ return bad
182
+
183
+
184
+ def _check_referenced_doc_ids_exist(doc_msgs: list[dict]):
185
+ valid = _doc_ids(doc_msgs)
186
+ def _fn(text: str):
187
+ cited = {m.group("id") for m in _CITE_RE.finditer(text)}
188
+ rogue = cited - valid
189
+ return not rogue
190
+ return _fn
191
+
192
+
193
+ # --- main entry point ------------------------------------------------------
194
+
195
+
196
+ def reconcile_strict(doc_msgs: list[dict],
197
+ system_prompt: str,
198
+ user_prompt: str = "Write the cited briefing now.",
199
+ model: str | None = None,
200
+ loop_budget: int = DEFAULT_LOOP_BUDGET,
201
+ ollama_options: dict | None = None) -> dict[str, Any]:
202
+ """Run Granite reconciliation with Mellea rejection sampling.
203
+
204
+ Returns a dict with:
205
+ paragraph — final validated text
206
+ rerolls — number of resamples (0 = passed first try)
207
+ requirements_passed — list of requirement names that passed in the
208
+ accepted sample
209
+ requirements_failed — list of requirement names that failed
210
+ (empty on accepted sample)
211
+ elapsed_s — total seconds including rerolls
212
+ model — model id used
213
+ loop_budget — configured budget
214
+ """
215
+ model = model or DEFAULT_MODEL
216
+ t0 = time.time()
217
+
218
+ # Per-requirement closures wired with the doc context.
219
+ # Keep the validator functions in our own table so we can re-run them
220
+ # on the final paragraph to produce reliable pass/fail metadata for
221
+ # the report — Mellea's internal validation-result objects vary by
222
+ # version and aren't great for downstream display.
223
+ checks = [
224
+ ("numerics_grounded",
225
+ "All numbers in the output must appear verbatim in the source documents.",
226
+ _check_no_invented_numbers(doc_msgs)),
227
+ ("no_placeholder_tokens",
228
+ "The output must not contain placeholder tokens like [source] or raw <document> markup.",
229
+ _check_no_placeholder_tokens()),
230
+ ("citations_dense",
231
+ "Every numeric claim must have a [doc_id] citation within ~120 characters.",
232
+ _check_every_claim_cited()),
233
+ ("citations_resolve",
234
+ "Every cited [doc_id] must correspond to a real source document.",
235
+ _check_referenced_doc_ids_exist(doc_msgs)),
236
+ ]
237
+ requirements = [
238
+ req(desc, validation_fn=simple_validate(fn, reason=name))
239
+ for name, desc, fn in checks
240
+ ]
241
+
242
+ session = start_session(backend_name="ollama", model_id=model,
243
+ model_options=ollama_options or {})
244
+ try:
245
+ # Build the prompt: system + serialized doc context + user task.
246
+ # Mellea's instruct() takes the whole instruction; we serialize
247
+ # the doc messages into the description so the haystack is
248
+ # available to the model the same way it would be via
249
+ # ollama.chat with role="document <id>" messages.
250
+ doc_block = "\n\n".join(
251
+ f"<document id=\"{m['role'].split(' ', 1)[1] if m['role'].startswith('document ') else 'unknown'}\">\n"
252
+ f"{m['content']}\n</document>"
253
+ for m in doc_msgs
254
+ )
255
+ instruction = (
256
+ f"{system_prompt}\n\n"
257
+ f"DOCUMENTS:\n{doc_block}\n\n"
258
+ f"TASK: {user_prompt}"
259
+ )
260
+
261
+ result = session.instruct(
262
+ description=instruction,
263
+ strategy=RejectionSamplingStrategy(
264
+ loop_budget=loop_budget,
265
+ requirements=requirements,
266
+ ),
267
+ requirements=requirements,
268
+ return_sampling_results=True,
269
+ model_options={"temperature": 0,
270
+ "num_ctx": int(os.environ.get("RIPRAP_MELLEA_NUM_CTX", "4096")),
271
+ "num_predict": int(os.environ.get("RIPRAP_MELLEA_NUM_PREDICT", "400")),
272
+ **(ollama_options or {})},
273
+ )
274
+
275
+ paragraph = _extract_text(result).strip()
276
+ n_attempts = _extract_attempts(result)
277
+ rerolls = max(0, n_attempts - 1)
278
+ finally:
279
+ try:
280
+ session.cleanup()
281
+ except Exception:
282
+ pass
283
+
284
+ # Re-run our own checks on the final paragraph for clean pass/fail
285
+ # metadata. This is what shows up in the report's compliance section.
286
+ passed: list[str] = []
287
+ failed: list[str] = []
288
+ for name, _desc, fn in checks:
289
+ try:
290
+ if fn(paragraph):
291
+ passed.append(name)
292
+ else:
293
+ failed.append(name)
294
+ except Exception as e:
295
+ log.warning("requirement %s raised: %r", name, e)
296
+ failed.append(name)
297
+
298
+ return {
299
+ "paragraph": paragraph,
300
+ "rerolls": rerolls,
301
+ "n_attempts": n_attempts,
302
+ "requirements_total": len(checks),
303
+ "requirements_passed": passed,
304
+ "requirements_failed": failed,
305
+ "elapsed_s": round(time.time() - t0, 2),
306
+ "model": model,
307
+ "loop_budget": loop_budget,
308
+ }
309
+
310
+
311
+ def reconcile_strict_streaming(
312
+ doc_msgs: list[dict],
313
+ system_prompt: str,
314
+ user_prompt: str = "Write the cited briefing now.",
315
+ model: str | None = None,
316
+ loop_budget: int = DEFAULT_LOOP_BUDGET,
317
+ ollama_options: dict | None = None,
318
+ on_token=None,
319
+ on_attempt_end=None,
320
+ ) -> dict[str, Any]:
321
+ """Hand-rolled rejection sampler that *streams* each attempt to the
322
+ user instead of waiting silently for Mellea to validate behind the
323
+ scenes. Same compliance contract as reconcile_strict — runs the
324
+ same four checks, accepts the first attempt that passes, falls back
325
+ to the last attempt if the budget is exhausted.
326
+
327
+ Callbacks (both optional, both fire on the calling thread):
328
+ on_token(delta: str, attempt_idx: int)
329
+ — fires for every token chunk as it arrives from Granite.
330
+ on_attempt_end(attempt_idx: int, passed: list[str], failed: list[str])
331
+ — fires after each attempt with its per-requirement outcome.
332
+ The frontend uses this to render reroll banners + reset the
333
+ briefing buffer when a new attempt begins.
334
+ """
335
+ model = model or DEFAULT_MODEL
336
+ t0 = time.time()
337
+
338
+ checks = [
339
+ ("numerics_grounded",
340
+ _check_no_invented_numbers(doc_msgs)),
341
+ ("no_placeholder_tokens",
342
+ _check_no_placeholder_tokens()),
343
+ ("citations_dense",
344
+ _check_every_claim_cited()),
345
+ ("citations_resolve",
346
+ _check_referenced_doc_ids_exist(doc_msgs)),
347
+ ]
348
+
349
+ base_messages = doc_msgs + [
350
+ {"role": "system", "content": system_prompt},
351
+ {"role": "user", "content": user_prompt},
352
+ ]
353
+ # num_ctx 4096 fits a typical trimmed prompt (≈700 system + ≈2500 docs);
354
+ # num_predict 400 caps the 4-section briefing at ≈300-350 tokens. With
355
+ # RIPRAP_TRIM_DOCS=1 and the planner picking 6-9 specialists, the 4096
356
+ # window has been sufficient on every probe; the previous 6144/600 was
357
+ # sized for the *untrimmed* fan-out and was forcing Ollama to grow the
358
+ # KV cache (33% more memory + a full re-init) every Mellea attempt.
359
+ # Override with RIPRAP_MELLEA_NUM_CTX / RIPRAP_MELLEA_NUM_PREDICT.
360
+ base_opts = {"temperature": 0,
361
+ "num_ctx": int(os.environ.get("RIPRAP_MELLEA_NUM_CTX", "4096")),
362
+ "num_predict": int(os.environ.get("RIPRAP_MELLEA_NUM_PREDICT", "400")),
363
+ **(ollama_options or {})}
364
+
365
+ paragraph = ""
366
+ last_passed: list[str] = []
367
+ last_failed: list[str] = [name for name, _ in checks]
368
+ last_paragraph = ""
369
+ attempts = 0
370
+
371
+ for attempt_idx in range(loop_budget):
372
+ attempts = attempt_idx + 1
373
+ # On reroll, append a tight feedback message naming what failed AND
374
+ # the specific failing sentences (so the model knows exactly which
375
+ # ones to fix). Granite responds well to surgical corrections.
376
+ messages = list(base_messages)
377
+ if attempt_idx > 0 and last_failed:
378
+ feedback = [
379
+ f"Your previous draft failed: {', '.join(last_failed)}.",
380
+ ]
381
+ if "citations_dense" in last_failed and last_paragraph:
382
+ bad = _failing_sentences_for_citations(last_paragraph)
383
+ if bad:
384
+ feedback.append(
385
+ "Specific sentences with uncited numbers:"
386
+ )
387
+ for s in bad[:3]:
388
+ feedback.append(f" - {s.strip()}")
389
+ feedback.append(
390
+ "Add a [doc_id] citation at the end of each. "
391
+ "Re-emit the FULL briefing."
392
+ )
393
+ else:
394
+ feedback.append(
395
+ "Re-write so every sentence containing a number ends "
396
+ "with a [doc_id] citation."
397
+ )
398
+ messages.append({"role": "user", "content": "\n".join(feedback)})
399
+
400
+ chunks: list[str] = []
401
+ for chunk in llm.chat(model=model, messages=messages,
402
+ stream=True, options=base_opts):
403
+ delta = (chunk.get("message") or {}).get("content") or ""
404
+ if delta:
405
+ chunks.append(delta)
406
+ if on_token is not None:
407
+ try:
408
+ on_token(delta, attempt_idx)
409
+ except Exception:
410
+ log.exception("on_token callback raised")
411
+ paragraph = "".join(chunks).strip()
412
+
413
+ passed: list[str] = []
414
+ failed: list[str] = []
415
+ for name, fn in checks:
416
+ try:
417
+ (passed if fn(paragraph) else failed).append(name)
418
+ except Exception as e:
419
+ log.warning("requirement %s raised: %r", name, e)
420
+ failed.append(name)
421
+
422
+ last_passed = passed
423
+ last_failed = failed
424
+ last_paragraph = paragraph
425
+ if on_attempt_end is not None:
426
+ try:
427
+ on_attempt_end(attempt_idx, passed, failed)
428
+ except Exception:
429
+ log.exception("on_attempt_end callback raised")
430
+
431
+ if not failed:
432
+ break
433
+
434
+ return {
435
+ "paragraph": paragraph,
436
+ "rerolls": max(0, attempts - 1),
437
+ "n_attempts": attempts,
438
+ "requirements_total": len(checks),
439
+ "requirements_passed": last_passed,
440
+ "requirements_failed": last_failed,
441
+ "elapsed_s": round(time.time() - t0, 2),
442
+ "model": model,
443
+ "loop_budget": loop_budget,
444
+ }
445
+
446
+
447
+ def _extract_text(result) -> str:
448
+ """SamplingResult / ModelOutputThunk text extraction."""
449
+ for attr in ("sample", "result", "value", "content"):
450
+ v = getattr(result, attr, None)
451
+ if v is not None:
452
+ if hasattr(v, "value"):
453
+ return str(v.value)
454
+ return str(v)
455
+ return str(result)
456
+
457
+
458
+ def _extract_attempts(result) -> int:
459
+ """How many samples were drawn before stopping."""
460
+ for attr in ("n_attempts", "num_attempts", "attempts"):
461
+ v = getattr(result, attr, None)
462
+ if isinstance(v, int):
463
+ return v
464
+ samples = getattr(result, "sample_validations", None) or getattr(result, "samples", None)
465
+ if isinstance(samples, list):
466
+ return len(samples)
467
+ return 1
468
+
469
+
470
+ def _extract_pass_fail(result) -> tuple[list[str], list[str]]:
471
+ """Best-effort extraction of which requirements passed on the
472
+ accepted sample. mellea v0.4 exposes sample_validations as a list
473
+ where each entry is itself a list of (Requirement, ValidationResult)
474
+ tuples — duck-type defensively.
475
+ """
476
+ validations = getattr(result, "sample_validations", None)
477
+ if not validations:
478
+ return [], []
479
+ last = validations[-1] if isinstance(validations, list) else validations
480
+ passed: list[str] = []
481
+ failed: list[str] = []
482
+ items = last if isinstance(last, list) else [last]
483
+ for item in items:
484
+ # Item might be (Requirement, ValidationResult) tuple, or a single
485
+ # ValidationResult, or a Requirement, depending on mellea version.
486
+ ok = None
487
+ descr = ""
488
+ if isinstance(item, tuple) and len(item) >= 2:
489
+ descr = str(item[0])[:80]
490
+ v = item[1]
491
+ ok = bool(getattr(v, "passed", getattr(v, "is_valid",
492
+ getattr(v, "result", False))))
493
+ else:
494
+ descr = str(getattr(item, "requirement", item))[:80]
495
+ ok = bool(getattr(item, "passed", getattr(item, "is_valid",
496
+ getattr(item, "result", False))))
497
+ if ok:
498
+ passed.append(descr)
499
+ else:
500
+ failed.append(descr)
501
+ return passed, failed
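The four invariants are plain predicates over the output text, so they can be exercised without Mellea in the loop. A minimal standalone sketch of the numerics-grounding check (regexes and trivial-numbers list copied from the module; the sample briefing and `sandy` doc id are invented for illustration):

```python
import re

# Same patterns as app/mellea_validator.py: the \b word boundaries keep
# identifier codes like QN1206 or M14 from matching as bare numbers.
NUM_RE = re.compile(r"\b-?\d[\d,]*(?:\.\d+)?\b")
TRIVIAL = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "100",
           "311", "911", "211"}

def invented_numbers(text: str, haystack: str) -> list[str]:
    """Non-trivial numbers in `text` that appear nowhere in the source docs."""
    out = []
    for n in NUM_RE.findall(text):
        if n in TRIVIAL:
            continue
        forms = {n, n.replace(",", "")}  # accept comma-grouped or plain form
        if not any(f in haystack for f in forms):
            out.append(n)
    return out

docs = "Sandy inundation reached 3,100 buildings in the NTA."
good = "Sandy flooded 3,100 buildings [sandy]."
bad = "Sandy flooded 4,200 buildings [sandy]."
print(invented_numbers(good, docs))  # []
print(invented_numbers(bad, docs))   # ['4,200']
```

Rejection sampling then simply rerolls the generation until this (and the other three predicates) return clean, up to the loop budget.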
app/planner.py ADDED
@@ -0,0 +1,250 @@
+"""Riprap query planner — Granite 4.1 routes a natural-language query
+to one of several intents and selects which specialists to invoke.
+
+This is the agentic kernel: instead of running every specialist on
+every query, the planner reads the query and emits a structured plan.
+The executor then runs only the relevant specialists, in parallel
+where dependencies permit.
+
+Output is a single JSON object with a fixed schema (see
+PLAN_SCHEMA_DESC). We use Ollama's `format='json'` constrained-decoding
+mode so Granite 4.1 cannot emit malformed structure. A deterministic
+post-validator sanity-checks the plan against the supported intents and
+specialists.
+"""
+from __future__ import annotations
+
+import json
+import logging
+import os
+from dataclasses import dataclass
+from typing import Any
+
+from app import llm
+
+log = logging.getLogger("riprap.planner")
+
+# Routing is a small structured-output task; speed wins over depth here.
+# Pin to the 3b variant explicitly — even if a deployment pulls 8b for
+# reconciliation, the planner stays small to keep TTFB low.
+OLLAMA_MODEL = os.environ.get("RIPRAP_PLANNER_MODEL",
+                              os.environ.get("RIPRAP_OLLAMA_MODEL", "granite4.1:3b"))
+
+# ---- Plan schema -----------------------------------------------------------
+#
+# The set of intents Riprap currently supports. Every plan picks exactly
+# one; the executor maps intent → action graph in app/intents/.
+
+INTENTS = {
+    "single_address": (
+        "Use ONLY when the query contains a specific street ADDRESS — "
+        "house number + street name (e.g. '116-50 Sutphin Blvd', '350 5th "
+        "Ave Manhattan'). If the query names only a neighborhood or "
+        "borough without a house number, the intent is 'neighborhood', "
+        "even if phrased as a yes/no question like 'is X at risk?' or "
+        "'is X safe?'."
+    ),
+    "neighborhood": (
+        "Use when the query names a NEIGHBORHOOD or BOROUGH with no "
+        "specific street address (e.g. 'Brighton Beach', 'Carroll "
+        "Gardens', 'Brooklyn', 'is Red Hook at risk?', 'show me Hollis "
+        "flooding'). Skip geocoding; resolve to NTA polygon(s) and run "
+        "polygon-level specialists."
+    ),
+    "live_now": (
+        "User asked about CURRENT CONDITIONS in NYC (e.g. 'is there "
+        "flooding right now', 'what's the surge tonight'). Skip historic "
+        "and modeled specialists; focus on live-data specialists."
+    ),
+    "development_check": (
+        "User asked about CURRENT/IN-PROGRESS CONSTRUCTION OR DEVELOPMENT "
+        "in a place, with implicit interest in flood risk for those projects "
+        "(e.g. 'what are they building in Gowanus and is it risky?', "
+        "'show me new construction in flood zones', 'are there projects "
+        "underway in Red Hook?'). Resolve target to NTA polygon, pull active "
+        "DOB construction permits inside it, cross-reference each project "
+        "with Sandy + DEP flood layers, return a flagged-projects list."
+    ),
+}
+
+SPECIALISTS = {
+    # name: (description, which intents may invoke it)
+    "geocode":        ("Resolve address text to lat/lon via NYC DCP Geosearch.", ["single_address"]),
+    "nta_resolve":    ("Resolve a neighborhood or borough name to NTA polygon(s).", ["neighborhood"]),
+    "sandy":          ("2012 Sandy inundation extent (point-in-polygon or % of NTA).", ["single_address", "neighborhood"]),
+    "dep_stormwater": ("DEP Stormwater Maps — 3 modeled scenarios.", ["single_address", "neighborhood"]),
+    "floodnet":       ("Live FloodNet ultrasonic sensors + trigger history.", ["single_address", "neighborhood", "live_now"]),
+    "nyc311":         ("NYC 311 flood-related complaints in buffer or polygon.", ["single_address", "neighborhood"]),
+    "noaa_tides":     ("Live NOAA Battery / Kings Pt / Sandy Hook water level.", ["single_address", "neighborhood", "live_now"]),
+    "nws_alerts":     ("Live NWS active flood-relevant alerts at point.", ["single_address", "neighborhood", "live_now"]),
+    "nws_obs":        ("Live NWS hourly precip from nearest ASOS station.", ["single_address", "neighborhood", "live_now"]),
+    "ttm_forecast":   ("Granite TTM r2 surge-residual nowcast at the Battery.", ["single_address", "neighborhood", "live_now"]),
+    "microtopo":      ("LiDAR-derived terrain (HAND, TWI, percentile) at point or aggregated over polygon.", ["single_address", "neighborhood"]),
+    "ida_hwm":        ("USGS Hurricane Ida 2021 high-water marks proximity.", ["single_address", "neighborhood"]),
+    "prithvi":        ("Prithvi-EO 2.0 Hurricane Ida 2021 satellite flood polygons.", ["single_address", "neighborhood"]),
+    "rag":            ("Retrieve relevant agency-report passages over the policy corpus.", ["single_address", "neighborhood", "development_check"]),
+    "dob_permits":    ("Active NYC DOB construction permits inside a polygon, each cross-referenced with Sandy + DEP flood scenarios. Use for 'what are they building' / 'projects in progress' queries.", ["development_check"]),
+}
+
+
+@dataclass
+class Plan:
+    intent: str
+    targets: list[dict[str, str]]
+    specialists: list[str]
+    rationale: str
+
+
+PLAN_SCHEMA_DESC = """The output JSON must have exactly these keys:
+
+{
+  "intent": one of [single_address, neighborhood, live_now, development_check],
+  "targets": [
+    // one or more target objects, each with:
+    // {"type": "address", "text": "<address text>"}  when intent=single_address
+    // {"type": "nta", "text": "<neighborhood>"}      when intent=neighborhood
+    // {"type": "borough", "text": "<borough>"}       when intent=neighborhood (boro-wide)
+    // {"type": "nyc", "text": "NYC"}                 when intent=live_now (no specific place)
+  ],
+  "specialists": [list of specialist names from the SPECIALISTS catalog the executor should run],
+  "rationale": "<one sentence: why this intent + this set of specialists>"
+}
+
+Hard rules:
+- Pick ONE intent only.
+- Specialists must be drawn from the catalog and must be applicable to the chosen intent.
+- For intent=single_address: ALWAYS include "geocode". Typically include all static + live specialists.
+- For intent=neighborhood: ALWAYS include "nta_resolve". Skip "geocode". Include polygon-capable specialists.
+- For intent=live_now: ONLY live specialists. Skip historic/modeled (sandy, dep_*, ida_hwm, prithvi).
+- For intent=development_check: ALWAYS include "nta_resolve" AND "dob_permits". Sandy + DEP are also useful so the model can compare project locations to flood layers.
+- IMPORTANT — TARGETS: extract neighborhood/borough names directly from the query text. If the query says "in Gowanus", "what about Brighton Beach", "around Carroll Gardens", etc., the target MUST be {"type": "nta", "text": "<the place name>"}. Use {"type": "nyc"} ONLY when the query mentions NYC as a whole and no specific place. Failing to extract a place name will cause the executor to give up — be explicit.
+- "targets" is a list because the user may name multiple places (e.g. "compare Brighton Beach and Coney Island").
+- "rationale" is one short sentence — what your reasoning was.
+"""
+
+
+SYSTEM_PROMPT = f"""You are Riprap's query planner. You read a user's natural-language flood-risk query and emit a structured execution plan.
+
+You do NOT have access to any data. You only decide which intent fits the query and which specialists are relevant. Another component (the executor) will run the specialists.
+
+Available intents:
+{chr(10).join(f"  - {k}: {v}" for k, v in INTENTS.items())}
+
+Available specialists (and which intents they apply to):
+{chr(10).join(f"  - {name}: {desc} (intents: {', '.join(intents)})" for name, (desc, intents) in SPECIALISTS.items())}
+
+{PLAN_SCHEMA_DESC}
+
+Output ONLY the JSON object. No commentary, no markdown."""
+
+
+# ---- Planner call ----------------------------------------------------------
+
+def plan(query: str, model: str = OLLAMA_MODEL, on_token=None) -> Plan:
+    """Ask Granite 4.1 to plan a query. Returns a validated Plan.
+
+    If on_token is provided, the planner runs in streaming mode and
+    on_token(delta) is called for each chunk of the JSON output as
+    Granite generates it. The streaming endpoint uses this to show the
+    agent's reasoning forming live in the UI.
+    """
+    messages = [
+        {"role": "system", "content": SYSTEM_PROMPT},
+        {"role": "user", "content": query},
+    ]
+    if on_token is None:
+        resp = llm.chat(model=model, messages=messages,
+                        format="json", options={"temperature": 0})
+        raw = resp["message"]["content"].strip()
+    else:
+        chunks: list[str] = []
+        for chunk in llm.chat(model=model, messages=messages,
+                              format="json", stream=True,
+                              options={"temperature": 0}):
+            delta = (chunk.get("message") or {}).get("content") or ""
+            if delta:
+                chunks.append(delta)
+                on_token(delta)
+        raw = "".join(chunks).strip()
+    log.info("planner raw: %s", raw[:400])
+    try:
+        d = json.loads(raw)
+    except json.JSONDecodeError as e:
+        raise ValueError(f"planner emitted non-JSON: {raw!r}") from e
+    return _validate(d, raw_query=query)
+
+
+def _validate(d: dict[str, Any], raw_query: str) -> Plan:
+    """Defensive parse + sanitize. The model might pick an invalid intent
+    or a specialist that isn't applicable; fall back to single_address
+    with the raw query as the address (the most common case)."""
+    intent = d.get("intent")
+    if intent not in INTENTS:
+        log.warning("planner picked invalid intent %r; defaulting to single_address", intent)
+        intent = "single_address"
+
+    raw_targets = d.get("targets") or []
+    targets: list[dict[str, str]] = []
+    for t in raw_targets:
+        if not isinstance(t, dict):
+            continue
+        t_type = t.get("type")
+        t_text = (t.get("text") or "").strip()
+        if not t_text or t_type not in ("address", "nta", "borough", "nyc"):
+            continue
+        targets.append({"type": t_type, "text": t_text})
+    if not targets:
+        # Reasonable fallback: assume the raw query IS the target
+        if intent == "single_address":
+            targets = [{"type": "address", "text": raw_query}]
+        elif intent == "neighborhood":
+            targets = [{"type": "nta", "text": raw_query}]
+        else:
+            targets = [{"type": "nyc", "text": "NYC"}]
+
+    raw_specialists = d.get("specialists") or []
+    specialists: list[str] = []
+    for s in raw_specialists:
+        if isinstance(s, str) and s in SPECIALISTS:
+            _, applicable = SPECIALISTS[s]
+            if intent in applicable:
+                specialists.append(s)
+    # Enforce a floor: each intent has canonical specialists that should
+    # always run. The planner picks ADDITIONS; we ensure the minimum.
+    required = _required_specialists(intent)
+    added = [s for s in required if s not in specialists]
+    if added:
+        log.info("planner missed required %s for intent=%s; adding", added, intent)
+        specialists = list(dict.fromkeys(specialists + required))
+    if not specialists:
+        specialists = _default_specialists(intent)
+
+    rationale = (d.get("rationale") or "").strip() or "(no rationale provided)"
+    return Plan(intent=intent, targets=targets, specialists=specialists, rationale=rationale)
+
+
+def _required_specialists(intent: str) -> list[str]:
+    """Floor: specialists that are ALWAYS run for an intent regardless of
+    what the planner emitted. Captures load-bearing signals the planner
+    sometimes forgets (sandy / dep for neighborhood; geocode for address)."""
+    if intent == "single_address":
+        return ["geocode", "sandy", "dep_stormwater", "microtopo"]
+    if intent == "neighborhood":
+        return ["nta_resolve", "sandy", "dep_stormwater", "nyc311"]
+    if intent == "live_now":
+        return ["nws_alerts", "noaa_tides"]
+    if intent == "development_check":
+        return ["nta_resolve", "dob_permits", "sandy", "dep_stormwater"]
+    return []
+
+
+def _default_specialists(intent: str) -> list[str]:
+    if intent == "single_address":
+        return ["geocode", "sandy", "dep_stormwater", "floodnet", "nyc311",
+                "noaa_tides", "nws_alerts", "nws_obs", "ttm_forecast",
+                "microtopo", "ida_hwm", "prithvi", "rag"]
+    if intent == "neighborhood":
+        return ["nta_resolve", "sandy", "dep_stormwater", "nyc311",
+                "microtopo", "rag"]
+    if intent == "live_now":
+        return ["noaa_tides", "nws_alerts", "nws_obs", "ttm_forecast", "floodnet"]
+    return []
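The deterministic post-validator is the load-bearing half of the planner: whatever JSON the model emits, the executor only ever sees a plan with a legal intent, at least one target, and the required specialist floor. A reduced sketch of that contract (intent names mirror the module; the function, catalog, and query are illustrative, not the shipped code):

```python
INTENTS = {"single_address", "neighborhood", "live_now", "development_check"}
REQUIRED = {
    "single_address": ["geocode", "sandy"],
    "neighborhood": ["nta_resolve", "sandy"],
    "live_now": ["nws_alerts", "noaa_tides"],
    "development_check": ["nta_resolve", "dob_permits"],
}

def validate_plan(d: dict, raw_query: str) -> dict:
    intent = d.get("intent")
    if intent not in INTENTS:
        intent = "single_address"  # safest default: treat the query as an address
    specialists = [s for s in d.get("specialists") or [] if isinstance(s, str)]
    # Floor: required specialists always run; the planner only picks additions.
    for s in REQUIRED[intent]:
        if s not in specialists:
            specialists.append(s)
    # Fallback target: assume the raw query IS the place being asked about.
    targets = d.get("targets") or [{"type": "address", "text": raw_query}]
    return {"intent": intent, "targets": targets, "specialists": specialists}

plan = validate_plan({"intent": "bogus", "specialists": []}, "350 5th Ave")
print(plan["intent"])       # single_address
print(plan["specialists"])  # ['geocode', 'sandy']
```

This mirrors the belt-and-suspenders split in the module: `format='json'` guarantees well-formed JSON, and the validator guarantees a well-formed *plan*.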
app/rag.py CHANGED
@@ -12,6 +12,7 @@ The index is small (~1k chunks across 5 PDFs).
 from __future__ import annotations

 import logging
 import re
 from dataclasses import dataclass
 from pathlib import Path
@@ -102,6 +103,18 @@ def _chunks_from_pdf(path: Path, target_chars: int = 700) -> list[Chunk]:


 _INDEX: dict | None = None


 def _ensure_index():
@@ -132,8 +145,28 @@ def _ensure_index():
     return _INDEX


 def warm():
     _ensure_index()


 def retrieve(query: str, k: int = 4, min_score: float = 0.30) -> list[dict]:
@@ -142,19 +175,57 @@ def retrieve(query: str, k: int = 4, min_score: float = 0.30) -> list[dict]:
         return []
     qv = idx["model"].encode([query], convert_to_numpy=True,
                              normalize_embeddings=True).astype("float32")
-     # cosine similarity (vectors are L2-normalized)
     sims = (idx["embs"] @ qv.T).ravel()
-     top = np.argsort(-sims)[:k * 3]  # over-fetch then de-dupe per doc
-     out: list[dict] = []
-     seen_per_doc: dict[str, int] = {}
     for i in top:
         if sims[i] < min_score:
             continue
         c = idx["chunks"][i]
-         if seen_per_doc.get(c.doc_id, 0) >= 1:  # at most 1 chunk per doc
             continue
-         seen_per_doc[c.doc_id] = seen_per_doc.get(c.doc_id, 0) + 1
-         out.append({
             "doc_id": c.doc_id,
             "title": c.title,
             "citation": c.citation,
@@ -163,6 +234,6 @@ def retrieve(query: str, k: int = 4, min_score: float = 0.30) -> list[dict]:
             "text": c.text,
             "score": float(sims[i]),
         })
-         if len(out) >= k:
             break
-     return out
 from __future__ import annotations

 import logging
+ import os
 import re
 from dataclasses import dataclass
 from pathlib import Path

 _INDEX: dict | None = None
+ _RERANKER = None  # lazy CrossEncoder
+
+ # Reranker switch: when "1", retrieve() over-fetches K*5 candidates without
+ # the per-doc dedup, scores them via the Granite Embedding Reranker R2
+ # cross-encoder, then dedups to K. Falls back to the baseline ranker when
+ # disabled. See experiments/03_granite_reranker/RESULTS.md for the
+ # reasoning behind inverting dedup vs rerank.
+ _RERANKER_ENABLE = os.environ.get("RIPRAP_RERANKER_ENABLE", "").lower() in ("1", "true", "yes")
+ _RERANKER_MODEL_NAME = os.environ.get(
+     "RIPRAP_RERANKER_MODEL",
+     "ibm-granite/granite-embedding-reranker-english-r2",
+ )


 def _ensure_index():
     return _INDEX


+ def _ensure_reranker():
+     """Lazy-load the cross-encoder. Returns None if disabled or load fails;
+     callers fall back to the baseline ranker silently."""
+     global _RERANKER
+     if not _RERANKER_ENABLE:
+         return None
+     if _RERANKER is not None:
+         return _RERANKER
+     try:
+         from sentence_transformers import CrossEncoder
+         log.info("rag: loading reranker %s", _RERANKER_MODEL_NAME)
+         _RERANKER = CrossEncoder(_RERANKER_MODEL_NAME)
+         log.info("rag: reranker ready")
+     except Exception:
+         log.exception("rag: reranker load failed; falling back to baseline")
+         _RERANKER = False  # sentinel: don't retry every call
+     return _RERANKER or None
+
+
 def warm():
     _ensure_index()
+     _ensure_reranker()


 def retrieve(query: str, k: int = 4, min_score: float = 0.30) -> list[dict]:
         return []
     qv = idx["model"].encode([query], convert_to_numpy=True,
                              normalize_embeddings=True).astype("float32")
     sims = (idx["embs"] @ qv.T).ravel()
+
+     reranker = _ensure_reranker()
+     if reranker is not None:
+         # Over-fetch K*5 candidates (no per-doc dedup yet), rerank, then
+         # dedup to K. This keeps high-relevance chunks alive long enough
+         # for the cross-encoder to see them — the legacy path's
+         # dedup-before-rank threw them away.
+         cand_n = min(len(idx["chunks"]), max(k * 5, 20))
+         top_idx = np.argsort(-sims)[:cand_n]
+         candidates = [(int(i), idx["chunks"][int(i)],
+                        float(sims[int(i)])) for i in top_idx
+                       if float(sims[int(i)]) >= min_score]
+         if not candidates:
+             return []
+         pairs = [[query, c.text] for _, c, _ in candidates]
+         scores = reranker.predict(pairs)
+         ranked = sorted(zip(candidates, scores, strict=True),
+                         key=lambda x: float(x[1]), reverse=True)
+         out: list[dict] = []
+         seen_per_doc: dict[str, int] = {}
+         for (_i, c, retr_score), rerank_score in ranked:
+             if seen_per_doc.get(c.doc_id, 0) >= 1:
+                 continue
+             seen_per_doc[c.doc_id] = 1
+             out.append({
+                 "doc_id": c.doc_id,
+                 "title": c.title,
+                 "citation": c.citation,
+                 "file": c.file,
+                 "page": c.page,
+                 "text": c.text,
+                 "score": float(rerank_score),
+                 "retriever_score": retr_score,
+             })
+             if len(out) >= k:
+                 break
+         return out
+
+     # Baseline ranker (unchanged behaviour when reranker disabled)
+     top = np.argsort(-sims)[:k * 3]
+     out2: list[dict] = []
+     seen_per_doc2: dict[str, int] = {}
     for i in top:
         if sims[i] < min_score:
             continue
         c = idx["chunks"][i]
+         if seen_per_doc2.get(c.doc_id, 0) >= 1:
             continue
+         seen_per_doc2[c.doc_id] = 1
+         out2.append({
             "doc_id": c.doc_id,
             "title": c.title,
             "citation": c.citation,
             "text": c.text,
             "score": float(sims[i]),
         })
+         if len(out2) >= k:
             break
+     return out2
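The inversion the comment describes (over-fetch wide, rerank, then dedup) can be isolated from the embedding machinery. A sketch with a stand-in scoring function in place of the CrossEncoder (`rerank_then_dedup` and `score_fn` are illustrative names, not the shipped API):

```python
def rerank_then_dedup(candidates, score_fn, k):
    """Score every over-fetched candidate first, THEN dedup to one chunk
    per doc_id — the ordering the reranker path switches to.
    `candidates` is (doc_id, text) pairs; `score_fn` stands in for a
    cross-encoder's relevance score on a single passage."""
    ranked = sorted(candidates, key=lambda c: score_fn(c[1]), reverse=True)
    out, seen = [], set()
    for doc_id, text in ranked:
        if doc_id in seen:
            continue  # keep only the best-scoring chunk per document
        seen.add(doc_id)
        out.append((doc_id, text))
        if len(out) >= k:
            break
    return out


cands = [("a", "weak a"), ("b", "strong b"), ("a", "strong a"), ("c", "mid c")]
scores = {"weak a": 0.1, "strong b": 0.9, "strong a": 0.8, "mid c": 0.5}
print(rerank_then_dedup(cands, scores.__getitem__, k=2))
# [('b', 'strong b'), ('a', 'strong a')]
```

Under the legacy dedup-before-rank order, whichever chunk of doc `a` the retriever surfaced first would have survived even if it was the weaker one; scoring first lets the better chunk win.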
app/reconcile.py CHANGED
@@ -21,41 +21,73 @@ import os
 import re
 from typing import Any

- import ollama

 log = logging.getLogger("riprap.reconcile")

- OLLAMA_MODEL = os.environ.get("HELIOS_NYC_OLLAMA_MODEL", "granite4.1:3b")
-
- # Granite auto-prepends its own grounded-generation system prompt when the
- # message list contains "document" roles. This adds *additional* rules.
- EXTRA_SYSTEM_PROMPT = """You are Riprap's grounded reconciler. Produce a SHORT factual paragraph (4-7 sentences) summarising flood risk at a NYC address. Use ONLY information from the documents provided.
-
- Citation format — STRICT:
- - After every factual or numerical claim, cite the originating document by its doc_id in square brackets, e.g. [sandy] or [floodnet].
- - Use square brackets [ and ]. Never parentheses, never the word "source".
- - A claim drawn from multiple documents may carry multiple tags, e.g. [sandy][floodnet].
-
- Hard rules — non-negotiable:
  - Copy numerical values verbatim from documents. Do not round.
- - Do NOT name a specific weather event (Hurricane Sandy, Ida, Henri, Ophelia, etc.) unless THIS document set explicitly mentions that event applies to THIS address. The fact that a RAG passage discusses an event in passing is NOT licence to apply it to the address. If you mention an event, you must cite the specific document supporting that the event affected this address.
- - Do NOT invent dates, sensor IDs, hazard categories, or street/neighborhood names beyond what the documents contain.
- - For RAG documents whose id starts with `rag_`: paraphrase the retrieved passage at the policy / agency level: talk about what the agency report SAYS about flood risk in general or for this asset class — do not assert findings the report did not make about this specific address. Cite with the doc_id.
- - Stay neutral. No editorialising. No future speculation.
- - If no documents are present, output exactly: No grounded data available for this address.
-
- Microtopo interpretation hint:
- - A LOW percentile (e.g. 5%) means the address is at a topographic LOW POINT in its surroundings — water tends to pool there. A HIGH percentile (e.g. 80%) means the address sits on relatively HIGH ground. Get this direction right or omit the percentile.
 """


 # ---- Hallucination guardrail: numeric grounding post-check -----------------

- _NUM_RE = re.compile(r"-?\d[\d,]*(?:\.\d+)?")
 _SENTENCE_END_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z\[])")
- # Strings that are too generic to be useful as grounding evidence; ignore
- # them when matching numeric tokens.
- _TRIVIAL_NUMS = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "100"}


 def _normalize_num(s: str) -> set[str]:
@@ -77,44 +109,94 @@ def _docs_corpus(doc_msgs: list[dict]) -> str:
     return "\n".join(m.get("content", "") for m in doc_msgs)


 def verify_paragraph(paragraph: str, doc_msgs: list[dict]) -> tuple[str, list[dict]]:
     """Drop sentences whose numeric tokens don't appear in any source doc.

-     Returns (clean_paragraph, dropped_sentences_with_reason). Sentences are
-     split on sentence-end punctuation followed by whitespace + a capital
-     letter or '['. The bracketed-citation tags `[doc_id]` and document
-     roles in the source message list are excluded from the haystack so we
-     don't accidentally accept fabricated values that happen to be
-     substrings of doc_ids.
     """
     haystack = _docs_corpus(doc_msgs)
-
-     sentences = _SENTENCE_END_RE.split(paragraph.strip())
-     kept: list[str] = []
     dropped: list[dict] = []
-
-     for sent in sentences:
-         sent_stripped = sent.strip()
-         if not sent_stripped:
-             continue
-         # remove citation tags before extracting numbers (they're not claims)
-         sent_no_cites = re.sub(r"\[[a-z0-9_]+\]", "", sent_stripped, flags=re.I)
-         nums = _NUM_RE.findall(sent_no_cites)
-         ungrounded = []
-         for n in nums:
-             if n in _TRIVIAL_NUMS:
                  continue
-             forms = _normalize_num(n)
-             if not any(f in haystack for f in forms):
-                 ungrounded.append(n)
-
-         if ungrounded:
-             dropped.append({"sentence": sent_stripped, "ungrounded_numbers": ungrounded})
-             log.warning("dropped ungrounded sentence: %r (nums: %s)", sent_stripped, ungrounded)
-             continue
-         kept.append(sent_stripped)

-     cleaned = " ".join(kept).strip()
     if not cleaned:
         cleaned = "Could not produce a verifiable summary; see the data panels."
     return cleaned, dropped
@@ -127,15 +209,194 @@ def _doc_message(doc_id: str, body_lines: list[str]) -> dict:
     return {"role": f"document {doc_id}", "content": "\n".join(body_lines)}


 def build_documents(state: dict[str, Any]) -> list[dict]:
     """Build Granite-native document-role messages, gated so absent
-     specialists emit no document at all."""
     docs: list[dict] = []

-     geo = state.get("geocode")
     if geo:
         body = [
-             f"Source: NYC DCP Geosearch (geosearch.planninglabs.nyc).",
             f"Resolved address: {geo['address']}.",
             f"Borough: {geo.get('borough') or 'unknown'}.",
             f"Coordinates: {geo['lat']:.5f} N, {geo['lon']:.5f} W.",
@@ -148,7 +409,7 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
     # 2012 extent. Granite has a strong training prior associating NYC + flood
     # + Brooklyn with Sandy and will misread "outside" as "inside" if given
     # the chance — silence-over-confabulation rules.
-     if state.get("sandy") is True:
         body = [
             "Source: NYC Sandy Inundation Zone (NYC OpenData 5xsi-dfpx, "
             "empirical extent of areas flooded by Hurricane Sandy in 2012).",
@@ -159,18 +420,18 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         docs.append(_doc_message("sandy", body))

     dep = state.get("dep")
-     if dep:
         for scen, info in dep.items():
             if info.get("depth_class", 0) > 0:
                 body = [
                     f"Source: {info['citation']}.",
-                     f"Address inside scenario footprint: yes.",
                     f"Modeled depth class: {info['depth_label']}.",
                 ]
                 docs.append(_doc_message(scen, body))

     fn = state.get("floodnet")
-     if fn and fn.get("n_sensors", 0) > 0:
         body = [
             "Source: FloodNet NYC ultrasonic depth sensor network (api.floodnet.nyc).",
             f"Sensors within {fn['radius_m']} m: {fn['n_sensors']}.",
@@ -187,7 +448,7 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         docs.append(_doc_message("floodnet", body))

     pw = state.get("prithvi_water")
-     if pw and pw.get("nearest_distance_m") is not None:
         body = [
             "Source: Prithvi-EO 2.0 (300M params, NASA/IBM, Apache-2.0). "
             "Sen1Floods11 fine-tune for water/flood semantic segmentation, "
@@ -211,7 +472,7 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         docs.append(_doc_message("prithvi_water", body))

     ida = state.get("ida_hwm")
-     if ida and (ida.get("n_within_radius") or 0) > 0:
         body = [
             "Source: USGS STN Hurricane Ida 2021 high-water marks (Event 312, NY State).",
             f"USGS HWMs within {ida['radius_m']} m: {ida['n_within_radius']}.",
@@ -225,7 +486,7 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         docs.append(_doc_message("ida_hwm", body))

     mt = state.get("microtopo")
-     if mt:
         # Compute a categorical topographic position so Granite can't flip
         # the directional reading of the percentile.
         p200 = mt["rel_elev_pct_200m"]
@@ -280,7 +541,7 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         )
         docs.append(_doc_message("microtopo", body))

-     rag_hits = state.get("rag") or []
     for h in rag_hits:
         body = [
             f"Source: {h['citation']}, page {h['page']}.",
@@ -288,8 +549,275 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
         ]
         docs.append(_doc_message(h["doc_id"], body))

     nyc311 = state.get("nyc311")
-     if nyc311 and nyc311.get("n", 0) > 0:
         body = [
             "Source: NYC 311 service requests (Socrata erm2-nwe9, 2010-present).",
             f"311 flood-related complaints within {nyc311['radius_m']} m, last {nyc311['years']} years: {nyc311['n']}.",
@@ -302,13 +830,145 @@ def build_documents(state: dict[str, Any]) -> list[dict]:
             body.append(f"Per-year counts: {yrs}.")
         docs.append(_doc_message("nyc311", body))

     return docs


 def reconcile(state: dict[str, Any], model: str = OLLAMA_MODEL,
-               return_audit: bool = False):
     """Run Granite reconciliation, then drop sentences with ungrounded numbers.

     If return_audit=True, returns (paragraph, audit_dict) where audit_dict
     has 'raw' (Granite's original output) and 'dropped' (list of dropped
     sentences with their ungrounded numeric tokens).
@@ -318,21 +978,28 @@ def reconcile(state: dict[str, Any], model: str = OLLAMA_MODEL,
         msg = "No grounded data available for this address."
         return (msg, {"raw": msg, "dropped": []}) if return_audit else msg

-     messages = (
-         doc_msgs
-         + [
-             {"role": "system", "content": EXTRA_SYSTEM_PROMPT},
-             {"role": "user", "content": "Write the cited paragraph now."},
-         ]
-     )
-     resp = ollama.chat(
-         model=model,
-         messages=messages,
-         options={"temperature": 0, "num_ctx": 8192},
-     )
-     raw = resp["message"]["content"].strip()
-     cleaned, dropped = verify_paragraph(raw, doc_msgs)

     if return_audit:
         return cleaned, {"raw": raw, "dropped": dropped}
     return cleaned
 import re
 from typing import Any

+ from app import llm

 log = logging.getLogger("riprap.reconcile")

+ # Reconciliation is the synthesis step — citation discipline + structured
+ # output adherence both improve materially with the 8b variant.
+ # RIPRAP_RECONCILER_MODEL is the canonical name; RIPRAP_OLLAMA_MODEL is
+ # kept as a back-compat fallback. Default is now 8b on production
+ # deployments (HF Space ships granite4.1:8b in the container).
+ OLLAMA_MODEL = os.environ.get("RIPRAP_RECONCILER_MODEL",
+                               os.environ.get("RIPRAP_OLLAMA_MODEL", "granite4.1:8b"))
+
+ CITATION_NOAA_TIDES = ("NOAA CO-OPS Tides & Currents API "
+                        "(api.tidesandcurrents.noaa.gov), 6-min cadence")
+ CITATION_NWS_ALERTS = ("NWS Public Alerts API (api.weather.gov/alerts/active), "
+                        "filtered to flood-relevant event types")
+ CITATION_NWS_OBS = ("NWS Station Observations API "
+                     "(api.weather.gov/stations/<id>/observations/latest)")
+ CITATION_TTM_FORECAST = (
+     "Granite TimeSeries TTM r2 (Ekambaram et al. 2024, NeurIPS) — "
+     "ibm-granite/granite-timeseries-ttm-r2 via granite-tsfm. "
+     "Zero-shot forecast of the surge residual (observed minus astronomical "
+     "tide) at the Battery, NY (NOAA station 8518750). 6-min cadence, "
+     "~51 h context, ~9.6 h horizon."
+ )
+
+ # The Ollama chat template auto-prepends Granite's own grounded-generation
+ # system suffix once the message list contains role="document" entries.
+ # This text is OUR additional system prompt, prepended to that suffix.
+ EXTRA_SYSTEM_PROMPT = """Write a flood-exposure briefing for an NYC address. Use ONLY the facts in the provided documents.
+
+ Output this markdown skeleton verbatim, filling each `<...>` with content drawn only from the documents. **Every sentence that contains a number MUST end with a `[doc_id]` citation — including derived measurements (TWI, percentile, ratio).** Repeat the source citation if the value is reused. Bold at most one phrase per section using `**...**`. Omit any section whose supporting facts are absent from the documents.
+
+ ```
+ **Status.**
+ <one sentence: dominant exposure signal(s) for this address, citing the strongest documents>.
+
+ **Empirical evidence.**
+ <1-3 sentences citing observed flood evidence: Sandy from [sandy], 311 counts from [nyc311], FloodNet from [floodnet], Ida HWMs from [ida_hwm], Prithvi polygons from [prithvi_water]>.
+
+ **Modeled scenarios.**
+ <1-2 sentences citing modeled flooding from [dep_*] and terrain from [microtopo] (HAND, TWI, percentile). When a [floodnet_forecast_*] doc is present, add one sentence on the forecast event recurrence at the cited sensor>.
+
+ **Policy context.**
+ <1 sentence per RAG hit, citing the agency name and [rag_*]>.
+ ```
+
+ Constraints:
 - Copy numerical values verbatim from documents. Do not round.
+ - Name a specific weather event only if a document explicitly applies it to this address.
+ - For RAG documents (doc_ids starting with `rag_`): describe what the report SAYS at the policy or asset-class level. Do not assert findings the report did not make about this specific address.
+ - Microtopo percentile direction: a LOW percentile means topographic LOW POINT (water pools); HIGH percentile means HIGH GROUND. State the direction correctly or omit the percentile.
+ - If no documents are present, output exactly: `No grounded data available for this address.`
 """
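The prompt's rule that every number-bearing sentence must end in a citation is also checkable mechanically, in the spirit of the `citations_dense` validator named in the commit message. An illustrative sketch only — the regexes and sentence split here are assumptions, not the shipped check:

```python
import re

# A sentence "carries" a citation if it ends with a [doc_id] tag
# (optionally followed by the closing period).
_CITE_END_RE = re.compile(r"\[[a-z0-9_]+\]\.?\s*$")
_HAS_DIGIT_RE = re.compile(r"\d")


def citations_dense(paragraph: str) -> bool:
    """True iff every sentence containing a digit ends with a [doc_id] tag."""
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return all(
        _CITE_END_RE.search(s) or not _HAS_DIGIT_RE.search(s)
        for s in sentences
    )


print(citations_dense("Depth was 2.3 ft [floodnet]. No alerts are active."))  # True
print(citations_dense("Depth was 2.3 ft. No alerts are active."))  # False
```

Scoping the requirement to sentences (rather than the whole paragraph) is what makes the check useful for rejection sampling: one uncited number fails the draft without penalising citation-free narrative sentences.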


 # ---- Hallucination guardrail: numeric grounding post-check -----------------

+ # Numbers must be preceded by whitespace, start-of-string, or punctuation
+ # OTHER than '-'. This prevents `Extreme-2080` from being parsed as the
+ # negative number `-2080` (the hyphen is a word separator, not a sign).
+ _NUM_RE = re.compile(r"(?:(?<=^)|(?<=[\s(\[/]))-?\d[\d,]*(?:\.\d+)?")
 _SENTENCE_END_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z\[])")
+ # Strings that are too generic OR are well-known NYC system names rather
+ # than measurements (311, 911 are city service lines, not values).
+ _TRIVIAL_NUMS = {"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "100",
+                  "311", "911", "211"}


 def _normalize_num(s: str) -> set[str]:
     return "\n".join(m.get("content", "") for m in doc_msgs)


+ # Recognise structured-output section headers like `**Status.**` on their
+ # own line. These are NOT sentences and are kept verbatim.
+ _SECTION_HEADER_RE = re.compile(r"^\s*\*\*[A-Z][A-Za-z\s/]+\.\*\*\s*$", re.MULTILINE)
+
+ # Granite sometimes emits the four headers inline rather than on their own
+ # lines (e.g. `**Status.** This address ... **Empirical evidence.** ...`).
+ # Normalise to one-per-line so the section-renderer regex matches.
+ _KNOWN_SECTION_HEADERS = ["Status", "Empirical evidence", "Modeled scenarios",
+                           "Policy context"]
+ _INLINE_HEADER_RE = re.compile(
+     r"\*\*(" + "|".join(re.escape(h) for h in _KNOWN_SECTION_HEADERS) + r")\.\*\*"
+ )
+
+
+ def _split_inline_headers(text: str) -> str:
+     """Inject a newline before each `**Header.**` so headers sit on their own
+     line. The render path and verifier both depend on this."""
+     text = _INLINE_HEADER_RE.sub(lambda m: f"\n**{m.group(1)}.**\n", text)
+     text = re.sub(r"\n{3,}", "\n\n", text)
+     return text.strip()
+
+
+ def _strip_markdown(text: str) -> str:
+     """Remove bold markers and citation tags so the numeric scan operates on
+     raw content. Used only for the haystack-substring check, not the rendered
+     output."""
+     text = re.sub(r"\*\*([^*]+)\*\*", r"\1", text)  # **bold** -> bold
+     text = re.sub(r"\[[a-z0-9_]+\]", "", text, flags=re.I)  # drop [doc_id]
+     return text
+
+
 def verify_paragraph(paragraph: str, doc_msgs: list[dict]) -> tuple[str, list[dict]]:
     """Drop sentences whose numeric tokens don't appear in any source doc.

+     Section-header lines (e.g. `**Status.**`) and inline bold (`**foo**`)
+     are preserved verbatim; the verifier strips them only for the
+     numeric-grounding check. Sentences are split on sentence-end
+     punctuation followed by whitespace + a capital letter or '['.
+
+     Returns (clean_paragraph, dropped_sentences_with_reason).
     """
+     paragraph = _split_inline_headers(paragraph)
     haystack = _docs_corpus(doc_msgs)
+     out_blocks: list[str] = []
     dropped: list[dict] = []
+     body_buf: list[str] = []
+
+     def flush_body():
+         if not body_buf:
+             return
+         body = " ".join(body_buf).strip()
+         body_buf.clear()
+         if not body:
+             return
+         sentences = _SENTENCE_END_RE.split(body)
+         kept_sents: list[str] = []
+         for sent in sentences:
+             sent_stripped = sent.strip()
+             if not sent_stripped:
+                 continue
+             sent_clean = _strip_markdown(sent_stripped)
+             nums = _NUM_RE.findall(sent_clean)
+             ungrounded = []
+             for n in nums:
+                 if n in _TRIVIAL_NUMS:
+                     continue
+                 forms = _normalize_num(n)
+                 if not any(f in haystack for f in forms):
+                     ungrounded.append(n)
+             if ungrounded:
+                 dropped.append({"sentence": sent_stripped,
+                                 "ungrounded_numbers": ungrounded})
+                 log.warning("dropped ungrounded sentence: %r (nums: %s)",
+                             sent_stripped, ungrounded)
+                 continue
+             kept_sents.append(sent_stripped)
+         if kept_sents:
+             out_blocks.append(" ".join(kept_sents))
+
+     for line in paragraph.splitlines():
+         if _SECTION_HEADER_RE.match(line):
+             flush_body()
+             out_blocks.append(line.strip())
+         else:
+             body_buf.append(line.strip())
+     flush_body()

+     cleaned = "\n".join(b for b in out_blocks if b).strip()
     if not cleaned:
         cleaned = "Could not produce a verifiable summary; see the data panels."
     return cleaned, dropped

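Stripped of the section-header bookkeeping, the numeric-grounding drop reduces to: split into sentences, ignore citation tags and trivial numbers, and keep only sentences whose numbers occur verbatim in the document corpus. A simplified, self-contained sketch (not the shipped `verify_paragraph`; the regexes and trivial-number set are abbreviated):

```python
import re

_NUM = re.compile(r"\d[\d,]*(?:\.\d+)?")
TRIVIAL = {"0", "1", "2", "3", "10", "100", "311", "911"}


def drop_ungrounded(paragraph: str, haystack: str) -> tuple[str, list[str]]:
    """Keep only sentences whose non-trivial numbers appear verbatim in
    the source corpus; return (cleaned paragraph, dropped sentences)."""
    kept, dropped = [], []
    for sent in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
        # Citation tags like [floodnet] are not claims; strip before scanning.
        nums = [n for n in _NUM.findall(re.sub(r"\[[a-z0-9_]+\]", "", sent))
                if n not in TRIVIAL]
        if all(n in haystack for n in nums):
            kept.append(sent)
        else:
            dropped.append(sent)
    return " ".join(kept), dropped


hay = "Sensors within 500 m: 2. Peak depth 7.4 in."
good = "Two sensors sit within 500 m [floodnet]."
bad = "Peak depth reached 9.9 in [floodnet]."
clean, gone = drop_ungrounded(f"{good} {bad}", hay)
print(clean)  # the grounded sentence survives
print(gone)   # the fabricated 9.9 sentence is dropped
```

The substring test is deliberately crude: it never validates that a number means what the sentence claims, only that the model did not invent the digits — which is exactly the silence-over-confabulation trade the module's comments describe.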
     return {"role": f"document {doc_id}", "content": "\n".join(body_lines)}


+ def trim_docs_to_plan(doc_msgs: list[dict],
+                       planned_specialists: set[str] | None) -> list[dict]:
+     """Drop document messages whose doc_id family wasn't in the planner's
+     specialist list.
+
+     The FSM's parallel fan-out runs every specialist regardless of what
+     the planner asked for; this lets the user see all the data come in
+     via the trace + map. But for the reconciler we want only what the
+     planner judged relevant, both to cut prompt tokens (≈30-50% on
+     typical single_address queries) and to keep the briefing focused.
+
+     Doc IDs are mapped to specialist family prefixes:
+         sandy              -> {sandy}
+         dep_stormwater     -> {dep_*}
+         floodnet           -> {floodnet}
+         nyc311             -> {nyc311}
+         microtopo          -> {microtopo}
+         mta_entrances      -> {mta_entrance_*}
+         nycha_developments -> {nycha_dev_*}
+         doe_schools        -> {doe_school_*}
+         doh_hospitals      -> {nyc_hospital_*}   # historical id naming
+         ida_hwm            -> {ida_hwm}
+         prithvi_water      -> {prithvi_water}
+         noaa_tides         -> {noaa_tides}
+         nws_alerts         -> {nws_alerts}
+         nws_obs            -> {nws_obs}
+         ttm_forecast       -> {ttm_forecast}
+         ttm_311_forecast   -> {ttm_311_forecast}
+         floodnet_forecast  -> {floodnet_forecast_*}
+         terramind          -> {terramind_*, syn_*}
+         rag                -> {rag_*}
+         nta_resolve        -> {nta_resolve, nta_*}
+         dob_permits        -> {dob_*}
+
+     Always preserved (never trimmed):
+         geocode, scope_note, nta_resolve — they orient the briefing or
+         gate scope and the planner doesn't always name them explicitly.
+
+     Set RIPRAP_TRIM_DOCS=0 to disable (defaults on).
+     """
+     import os as _os  # local import to keep module top tidy
+     if not planned_specialists or not doc_msgs:
+         return doc_msgs
+     if _os.environ.get("RIPRAP_TRIM_DOCS", "1").lower() in ("0", "false", "no"):
+         return doc_msgs
+
+     # Build the allowed-prefix set from the planner's specialists.
+     PREFIXES_BY_SPECIALIST: dict[str, tuple[str, ...]] = {
+         "sandy": ("sandy",),
+         "dep_stormwater": ("dep_",),
+         "floodnet": ("floodnet",),
+         "nyc311": ("nyc311",),
+         "microtopo": ("microtopo",),
+         "ida_hwm": ("ida_hwm",),
+         "prithvi_water": ("prithvi_water",),
+         "noaa_tides": ("noaa_tides",),
+         "nws_alerts": ("nws_alerts",),
+         "nws_obs": ("nws_obs",),
+         "ttm_forecast": ("ttm_forecast",),
+         "ttm_311_forecast": ("ttm_311_forecast",),
+         "floodnet_forecast": ("floodnet_forecast",),
+         "terramind": ("terramind", "syn_"),
+         "rag": ("rag_",),
+         "rag_mta": ("rag_",),
+         "nta_resolve": ("nta_resolve", "nta_"),
+         "dob_permits": ("dob_",),
+         "mta_entrances": ("mta_entrance",),
+         "nycha_developments": ("nycha_dev",),
+         "doe_schools": ("doe_school", "nyc_school"),
+         "doh_hospitals": ("doh_hospital", "nyc_hospital"),
+     }
+     ALWAYS_KEEP = ("geocode", "scope_note", "nta_resolve")
+
+     allowed_prefixes: set[str] = set()
+     for spec in planned_specialists:
+         for p in PREFIXES_BY_SPECIALIST.get(spec, ()):
+             allowed_prefixes.add(p)
+     if not allowed_prefixes:
+         return doc_msgs  # planner gave us nothing matchable; bail safely
+
+     kept: list[dict] = []
+     for m in doc_msgs:
+         role = m.get("role", "")
+         if not role.startswith("document "):
+             kept.append(m)
+             continue
+         doc_id = role[len("document "):].strip()
+         if doc_id.startswith(ALWAYS_KEEP):
+             kept.append(m)
+             continue
+         if any(doc_id.startswith(p) for p in allowed_prefixes):
+             kept.append(m)
+     return kept
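The prefix-family trimming can be exercised with a toy message list. This condenses the logic to its core (abbreviated prefix table; `str.removeprefix` requires Python 3.9+):

```python
PREFIXES = {"floodnet": ("floodnet",), "rag": ("rag_",)}
ALWAYS_KEEP = ("geocode", "scope_note")


def trim(doc_msgs, planned):
    """Keep a document iff its doc_id starts with an allowed family prefix
    or is in the always-keep set — condensed from trim_docs_to_plan."""
    allowed = tuple(p for s in planned for p in PREFIXES.get(s, ()))
    kept = []
    for m in doc_msgs:
        doc_id = m["role"].removeprefix("document ").strip()
        if doc_id.startswith(ALWAYS_KEEP) or doc_id.startswith(allowed):
            kept.append(m)
    return kept


msgs = [{"role": "document geocode"},
        {"role": "document floodnet"},
        {"role": "document rag_fema_2021"},
        {"role": "document sandy"}]
print([m["role"] for m in trim(msgs, {"floodnet"})])
# ['document geocode', 'document floodnet']
```

`str.startswith` accepting a tuple of prefixes is what makes the family mapping cheap: one membership test per document, no regex needed.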
+
+
  def build_documents(state: dict[str, Any]) -> list[dict]:
308
  """Build Granite-native document-role messages, gated so absent
309
+ specialists emit no document at all.
310
+
311
+ Scope guard: if the resolved address is OUTSIDE the NYC bbox, only
312
+ the geocode + live national specialists emit documents. NYC-specific
313
+ layers (Sandy, DEP, FloodNet, NYC 311, microtopo, Ida HWMs, Prithvi,
314
+ NYC RAG corpus) are suppressed and a `scope_note` doc is added telling
315
+ the reconciler not to invoke NYC content.
316
+ """
317
  docs: list[dict] = []
318
 
319
+ geo = state.get("geocode") or {}
320
+ NYC_S, NYC_W, NYC_N, NYC_E = 40.49, -74.27, 40.92, -73.69
321
+ out_of_nyc = (
322
+ geo.get("lat") is not None and geo.get("lon") is not None and not (
323
+ NYC_S <= geo["lat"] <= NYC_N and NYC_W <= geo["lon"] <= NYC_E
324
+ )
325
+ )
326
+ if out_of_nyc:
327
+ # Compose a single live-conditions snapshot from whatever the
328
+ # national specialists produced. This always emits when out_of_nyc,
329
+ # even on a calm day, so the reconciler has SOMETHING grounded to
330
+ # report instead of only a list of what doesn't apply.
331
+ place_label = (geo.get("borough") or geo.get("address") or
332
+ f"{geo['lat']:.4f}, {geo['lon']:.4f}")
333
+ body = [
334
+ "Source: Riprap planner + national live specialists. Scope "
335
+ "guard: this address is OUTSIDE NYC; NYC-specific datasets "
336
+ "are not in scope at this location.",
337
+ f"Resolved location: {place_label} ({geo['lat']:.4f}, "
338
+ f"{geo['lon']:.4f}).",
339
+ ]
340
+ tides = state.get("noaa_tides") or {}
341
+ if tides.get("station_id") and tides.get("error") is None:
342
+ tline = (f"NOAA Tides & Currents — nearest gauge: "
343
+ f"{tides.get('station_name')} (NOAA "
344
+ f"{tides.get('station_id')}, "
345
+ f"{tides.get('distance_km')} km from address).")
346
+ body.append(tline)
347
+ if tides.get("observed_ft_mllw") is not None:
348
+ body.append(
349
+ f"Observed water level: {tides['observed_ft_mllw']} ft "
350
+ f"above MLLW; predicted: "
351
+ f"{tides.get('predicted_ft_mllw')} ft; residual "
352
+ f"(observed minus predicted): "
353
+ f"{tides.get('residual_ft')} ft."
354
+ )
355
+ else:
356
+ body.append("No water-level observation reported by the "
357
+ "gauge in the last poll.")
358
+ alerts = state.get("nws_alerts") or {}
359
+ body.append(
360
+ f"NWS Public Alerts at point: {alerts.get('n_active', 0)} "
361
+ "active flood-relevant alert(s)."
362
+ )
363
+ if alerts.get("alerts"):
364
+ for a in alerts["alerts"][:3]:
365
+ body.append(
366
+ f"- {a.get('event','?')} (severity "
367
+ f"{a.get('severity','?')}, urgency "
368
+ f"{a.get('urgency','?')}); expires "
369
+ f"{(a.get('expires') or '')[:16]}; area: "
370
+ f"{(a.get('areaDesc') or '')[:120]}."
371
+ )
372
+ obs = state.get("nws_obs") or {}
373
+ if obs.get("station_id") and obs.get("error") is None:
374
+ line = (f"Nearest NWS ASOS: {obs.get('station_name')} "
375
+ f"({obs.get('station_id')}, "
376
+ f"{obs.get('distance_km')} km).")
377
+ body.append(line)
378
+ if obs.get("precip_last_hour_mm") is not None:
379
+ body.append(
380
+ f"Precipitation last 1 h: "
381
+ f"{obs['precip_last_hour_mm']} mm; last 6 h: "
382
+ f"{obs.get('precip_last_6h_mm')} mm."
383
+ )
384
+ else:
385
+ body.append("No precipitation reported in the last hourly "
386
+ "observation.")
387
+ ttm = state.get("ttm_forecast") or {}
388
+ if ttm.get("available") and ttm.get("interesting"):
389
+ body.append(
390
+ f"Granite TTM r2 surge forecast at the Battery: peak "
391
+ f"residual {ttm.get('forecast_peak_ft')} ft expected in "
392
+ f"{ttm.get('forecast_peak_minutes_ahead')} minutes — note "
393
+ f"this gauge is in NYC harbor, not local to this address."
394
+ )
395
+ docs.append(_doc_message("scope_note", body))
396
+
397
  if geo:
398
  body = [
399
+ "Source: NYC DCP Geosearch (geosearch.planninglabs.nyc).",
400
  f"Resolved address: {geo['address']}.",
401
  f"Borough: {geo.get('borough') or 'unknown'}.",
402
  f"Coordinates: {geo['lat']:.5f} N, {geo['lon']:.5f} W.",
 
      # 2012 extent. Granite has a strong training prior associating NYC + flood
      # + Brooklyn with Sandy and will misread "outside" as "inside" if given
      # the chance — silence-over-confabulation rules.
+     if not out_of_nyc and state.get("sandy") is True:
          body = [
              "Source: NYC Sandy Inundation Zone (NYC OpenData 5xsi-dfpx, "
              "empirical extent of areas flooded by Hurricane Sandy in 2012).",

          docs.append(_doc_message("sandy", body))

      dep = state.get("dep")
+     if not out_of_nyc and dep:
          for scen, info in dep.items():
              if info.get("depth_class", 0) > 0:
                  body = [
                      f"Source: {info['citation']}.",
+                     "Address inside scenario footprint: yes.",
                      f"Modeled depth class: {info['depth_label']}.",
                  ]
                  docs.append(_doc_message(scen, body))

      fn = state.get("floodnet")
+     if not out_of_nyc and fn and fn.get("n_sensors", 0) > 0:
          body = [
              "Source: FloodNet NYC ultrasonic depth sensor network (api.floodnet.nyc).",
              f"Sensors within {fn['radius_m']} m: {fn['n_sensors']}.",

          docs.append(_doc_message("floodnet", body))

      pw = state.get("prithvi_water")
+     if not out_of_nyc and pw and pw.get("nearest_distance_m") is not None:
          body = [
              "Source: Prithvi-EO 2.0 (300M params, NASA/IBM, Apache-2.0). "
              "Sen1Floods11 fine-tune for water/flood semantic segmentation, "

          docs.append(_doc_message("prithvi_water", body))

      ida = state.get("ida_hwm")
+     if not out_of_nyc and ida and (ida.get("n_within_radius") or 0) > 0:
          body = [
              "Source: USGS STN Hurricane Ida 2021 high-water marks (Event 312, NY State).",
              f"USGS HWMs within {ida['radius_m']} m: {ida['n_within_radius']}.",

          docs.append(_doc_message("ida_hwm", body))

      mt = state.get("microtopo")
+     if not out_of_nyc and mt:
          # Compute a categorical topographic position so Granite can't flip
          # the directional reading of the percentile.
          p200 = mt["rel_elev_pct_200m"]

          )
          docs.append(_doc_message("microtopo", body))

+     rag_hits = [] if out_of_nyc else (state.get("rag") or [])
      for h in rag_hits:
          body = [
              f"Source: {h['citation']}, page {h['page']}.",

          ]
          docs.append(_doc_message(h["doc_id"], body))

+     # ---- GLiNER typed extractions (Phase 2 specialist) -------------------
+     # Per-source structured fields the reconciler can cite as
+     # [gliner_<source>] in addition to the parent [rag_<source>].
+     gliner = (state.get("gliner") or {})
+     if not out_of_nyc and gliner:
+         for source, payload in gliner.items():
+             ents = payload.get("entities") or []
+             if not ents:
+                 continue
+             body = [
+                 f"Source PDF (parent retriever doc_id: {payload.get('rag_doc_id', '?')}, "
+                 f"title: {payload.get('title', '?')}).",
+                 f"Paragraph excerpt: \"{payload.get('paragraph_excerpt', '')}\"",
+                 "Typed entities extracted by GLiNER (verbatim spans):",
+             ]
+             for e in ents:
+                 body.append(
+                     f" - [{e['label']}] {e['text']} (score={e.get('score', 0):.2f})"
+                 )
+             docs.append(_doc_message(f"gliner_{source}", body))
+
+     # ---- TerraMind synthesis (Phase 4 cognitive engine) ------------------
+     # Synthetic-prior tier — explicitly fourth epistemic class alongside
+     # empirical / modeled / proxy. Reconciler narration must frame this
+     # as "TerraMind generated a plausible land-cover map from terrain
+     # context", never "imaged" or "reconstructed". Class labels are
+     # tentative against ESRI Land Cover 2020-2022 schema.
+     tm = state.get("terramind")
+     if not out_of_nyc and tm and tm.get("ok"):
+         body = [
+             "Source: TerraMind 1.0 base (IBM/ESA, Apache-2.0) any-to-any "
+             "generative foundation model. This is a SYNTHETIC PRIOR, "
+             "not a measurement: TerraMind generates plausible categorical "
+             "land-cover maps from terrain context, never observations.",
+             f"Chain: {' -> '.join(tm.get('tim_chain') or ['DEM', 'LULC_synthetic'])}.",
+             f"Diffusion steps: {tm.get('diffusion_steps', '?')}.",
+             f"Diffusion seed (reproducibility): {tm.get('diffusion_seed', '?')}.",
+             f"Input DEM mean elevation at this address: "
+             f"{tm.get('dem_mean_m', 0):.2f} m (NYC 30 m LiDAR raster).",
+             f"Label schema: {tm.get('label_schema', 'ESRI Land Cover, tentative')}.",
+             f"Dominant synthetic land-cover class: "
+             f"{tm.get('dominant_class_display') or tm.get('dominant_class', 'unknown')} at "
+             f"{tm.get('dominant_pct', 0):.1f}% of the 5 km area.",
+             f"Synthetic class fractions ({tm.get('n_classes_observed', 0)} "
+             f"classes observed):",
+         ]
+         for label, pct in (tm.get("class_fractions") or {}).items():
+             body.append(f" - {label}: {pct:.1f}%")
+         body.extend([
+             "synthetic_modality: true",
+             "Use only the careful framing 'TerraMind generated a "
+             "plausible synthetic land-cover prior from the terrain "
+             "context, with class labels tentatively aligned to ESRI "
+             "schema'. Do NOT claim measurement, imaging, observation, "
+             "or reconstruction.",
+         ])
+         docs.append(_doc_message("terramind_synthetic", body))
+
+     # ---- Prithvi-EO live water (Phase 1 specialist) ----------------------
+     # Per-query Sentinel-2 water-segmentation observation. Distinct from
+     # `prithvi_water` (the offline 2021 Ida polygons) — this one fires
+     # against today's imagery and emits a dated observation.
+     plive = state.get("prithvi_live")
+     if not out_of_nyc and plive and plive.get("ok"):
+         body = [
+             "Source: Prithvi-EO 2.0 (Sen1Floods11 fine-tune) live "
+             "segmentation over a Sentinel-2 L2A scene from Microsoft "
+             "Planetary Computer.",
+             f"Sentinel-2 scene id: {plive.get('item_id', 'unknown')}.",
+             f"Observation date: {(plive.get('item_datetime') or 'unknown')[:10]}.",
+             f"Cloud cover: {plive.get('cloud_cover', 0):.3f}%.",
+             f"% water within 500 m of address: "
+             f"{plive.get('pct_water_within_500m', 0):.2f}.",
+             f"% water across 5 km chip: "
+             f"{plive.get('pct_water_full', 0):.2f}.",
+         ]
+         docs.append(_doc_message("prithvi_live", body))
+
+     # ---- live signals -------------------------------------------------------
+     # NOAA tides, NWS alerts, NWS hourly obs change by the minute; reconciler
+     # treats these as "right now" context, not historical record.
+
+     # Live signals fold into scope_note for out-of-NYC; only emit standalone
+     # docs when the address is inside NYC (where the briefing has multiple
+     # sections that each cite different live sources).
+     tides = state.get("noaa_tides")
+     if not out_of_nyc and tides and tides.get("observed_ft_mllw") is not None:
+         body = [
+             f"Source: {CITATION_NOAA_TIDES}.",
+             f"Nearest tide gauge: {tides['station_name']} (NOAA station "
+             f"{tides['station_id']}, {tides['distance_km']} km away).",
+             f"Observation time (LST/LDT): {tides.get('obs_time') or 'unknown'}.",
+             f"Current observed water level above MLLW: {tides['observed_ft_mllw']} ft.",
+         ]
+         if tides.get("predicted_ft_mllw") is not None:
+             body.append(
+                 f"Astronomical tide prediction at the same instant: "
+                 f"{tides['predicted_ft_mllw']} ft above MLLW."
+             )
+         if tides.get("residual_ft") is not None:
+             interp = (
+                 "approximately at predicted level"
+                 if abs(tides["residual_ft"]) < 0.5 else
+                 "elevated above prediction (positive residual is consistent with "
+                 "wind-driven setup or storm surge)"
+                 if tides["residual_ft"] > 0 else
+                 "below prediction (negative residual is consistent with offshore wind)"
+             )
+             body.append(
+                 f"Residual (observed minus predicted): {tides['residual_ft']} ft — "
+                 f"{interp}."
+             )
+         body.append(
+             "Note: this is real-time tidal context for nearby coastal water level. "
+             "The address itself may be inland — the reading describes the bay/harbor "
+             "level the gauge is in, not the address."
+         )
+         docs.append(_doc_message("noaa_tides", body))
+
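The residual interpretation above is a three-way band test: within ±0.5 ft counts as "at predicted level", otherwise the sign decides. A standalone sketch of that logic (the function name and the `band_ft` parameter are illustrative, not part of the module):

```python
def classify_residual(residual_ft: float, band_ft: float = 0.5) -> str:
    """Three-way reading of observed-minus-predicted tide level.

    Mirrors the inline conditional: within +/- band_ft counts as "at
    predicted"; positive residuals suggest wind setup / storm surge,
    negative ones offshore wind.
    """
    if abs(residual_ft) < band_ft:
        return "approximately at predicted level"
    if residual_ft > 0:
        return "elevated above prediction"
    return "below prediction"
```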
+     alerts = state.get("nws_alerts") or {}
+     active = alerts.get("alerts") or []
+     if not out_of_nyc and active:
+         body = [
+             f"Source: {CITATION_NWS_ALERTS}.",
+             f"Active flood-relevant alerts at this address right now: {len(active)}.",
+         ]
+         for a in active[:4]:
+             body.append(
+                 f"- {a.get('event','(event)')} (severity: {a.get('severity','?')}, "
+                 f"urgency: {a.get('urgency','?')}); issued {a.get('sent','')[:16]}, "
+                 f"expires {a.get('expires','')[:16]}; "
+                 f"sender: {a.get('sender_name','NWS')}; "
+                 f"area: {(a.get('areaDesc') or '')[:120]}."
+             )
+             if a.get("headline"):
+                 body.append(f"  Headline (verbatim): {a['headline'][:240]}")
+         body.append(
+             "These are official NWS alerts retrieved live; if any FLOOD or "
+             "FLASH FLOOD WARNING/WATCH is in this list, it applies to the "
+             "address right now and should be foregrounded."
+         )
+         docs.append(_doc_message("nws_alerts", body))
+
+     ttm = state.get("ttm_forecast")
+     if not out_of_nyc and ttm and ttm.get("available") and ttm.get("interesting"):
+         body = [
+             f"Source: {CITATION_TTM_FORECAST}.",
+             f"Gauge: {ttm['station_name']} (NOAA {ttm['station_id']}, "
+             f"{ttm.get('distance_km', '?')} km from address — closest of "
+             "Battery / Kings Point / Sandy Hook).",
+             f"Context window: {ttm['context_length']} samples (~"
+             f"{ttm['context_length']*6/60:.1f} h of 6-min residual).",
+             f"Forecast horizon: {ttm['horizon_steps']} samples (~"
+             f"{ttm['horizon_steps']*6/60:.1f} h ahead).",
+             f"Recent residual: {ttm['history_recent_ft']} ft "
+             f"(residual = observed water level minus astronomical prediction).",
+             f"Recent peak |residual| in context: {ttm['history_peak_abs_ft']} ft.",
+             f"Forecast peak residual: {ttm['forecast_peak_ft']} ft, expected "
+             f"{ttm['forecast_peak_minutes_ahead']} minutes from now "
+             f"(at {ttm['forecast_peak_time_utc']} UTC).",
+             "INTERPRETATION: positive residual is a wind-driven setup or "
+             "storm-surge component on top of the tide; the model predicts the "
+             "non-tidal part NOAA's astronomical predictor does not cover.",
+         ]
+         docs.append(_doc_message("ttm_forecast", body))
+
+     # Per-address 311 flood-complaint forecast — different time scale,
+     # different signal entirely. TTM r2 zero-shot on daily counts
+     # (~17 months of history → ~3 months of forecast). Aggregated to
+     # weekly for the narration since readers think in weeks.
+     ttm311 = state.get("ttm_311_forecast")
+     if not out_of_nyc and ttm311 and ttm311.get("available"):
+         accel = ('YES — forecast > 50% above recent 30-day baseline'
+                  if ttm311.get('accelerating')
+                  else 'no — forecast in line with recent baseline')
+         body = [
+             "Source: IBM Granite TimeSeries TTM r2 (Ekambaram et al. 2024, "
+             "NeurIPS) zero-shot forecast on NYC 311 flood-complaint history "
+             "(Sewer Backup, Catch Basin Clogged/Flooding, Street Flooding, "
+             "Manhole Overflow) within "
+             f"{ttm311.get('radius_m', 200)} m of the address.",
+             f"Context window: {ttm311['days_context']} days "
+             f"({ttm311['days_context'] // 7} weeks) ending "
+             f"{ttm311.get('context_window_end', '?')}.",
+             f"Total complaints in context window: "
+             f"{ttm311['history_total_complaints']}.",
+             f"History recent 30-day rate: {ttm311['history_recent_30d_mean']} "
+             f"complaints/day "
+             f"(≈{ttm311['history_weekly_equivalent']} per week).",
+             f"Forecast horizon: {ttm311['days_horizon']} days "
+             f"({ttm311['days_horizon'] // 7} weeks) ahead.",
+             f"Forecast rate: {ttm311['forecast_mean_per_day']} complaints/day "
+             f"(≈{ttm311['forecast_weekly_equivalent']} per week).",
+             f"Forecast peak day: {ttm311['forecast_peak_day']} complaints, "
+             f"day +{ttm311['forecast_peak_day_offset']}.",
+             f"Acceleration cue: {accel}.",
+             "INTERPRETATION: this is a per-address pattern forecast, not "
+             "a city-wide trend. Zero-history addresses get a zero-baseline "
+             "forecast (legitimate); the more relevant cite is when there's "
+             "a multi-month complaint history that the model is extrapolating.",
+         ]
+         docs.append(_doc_message("ttm_311_forecast", body))
+
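The "acceleration cue" in both TTM-based specialists reduces to one comparison: is the forecast rate more than 50% above the recent baseline? A sketch of that decision (the function name, the zero-baseline handling, and the `threshold` parameter are illustrative assumptions, not the specialist's actual API):

```python
def is_accelerating(forecast_mean_per_day: float,
                    baseline_30d_mean: float,
                    threshold: float = 0.5) -> bool:
    """True when the forecast rate exceeds the recent baseline by more
    than `threshold` (0.5 == 50%). Zero-history addresses (baseline 0)
    only count as accelerating if the forecast is actually positive.
    """
    if baseline_30d_mean == 0:
        return forecast_mean_per_day > 0
    return forecast_mean_per_day > (1 + threshold) * baseline_30d_mean
```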
+     # FloodNet sensor forecast — TTM r2 on the nearest sensor's
+     # historical flood-event recurrence. Reuses the (512, 96)
+     # singleton from ttm_311_forecast — same model class, different
+     # data stream. Doc id includes the sensor deployment id so the
+     # citation is unambiguous when multiple sensors are nearby.
+     fnf = state.get("floodnet_forecast")
+     if not out_of_nyc and fnf and fnf.get("available"):
+         accel = ("YES — next-28-day forecast > 50% above prior-28-day "
+                  "observed count"
+                  if fnf.get("accelerating")
+                  else "no — forecast in line with recent baseline")
+         doc_id = fnf.get("doc_id") or "floodnet_forecast"
+         body = [
+             "Source: FloodNet NYC ultrasonic depth sensor network "
+             "(api.floodnet.nyc) historical flood events, forecast by "
+             "IBM Granite TimeSeries TTM r2 (Ekambaram et al. 2024, "
+             "NeurIPS).",
+             f"Sensor: {fnf['sensor_name']} (deployment "
+             f"{fnf['sensor_id']}) at {fnf['sensor_street']}, "
+             f"{fnf['sensor_borough']}.",
+             f"Distance from query: {fnf['distance_from_query_m']} m.",
+             f"History window: {fnf['history_window_days']} days; "
+             f"{fnf['history_total_events']} flood events observed total, "
+             f"{fnf['history_recent_28d_events']} in the most recent "
+             f"28 days.",
+             f"Forecast horizon: {fnf['forecast_horizon_days']} days.",
+             f"Forecast next-28-day expected events: "
+             f"{fnf['forecast_28d_expected_events']}.",
+             f"Forecast peak day offset: +{fnf['forecast_peak_day_offset']} "
+             f"(value {fnf['forecast_peak_day_value']}).",
+             f"Acceleration cue: {accel}.",
+             "INTERPRETATION: this is a per-sensor recurrence forecast — "
+             "expected count of labelled flood events at that specific "
+             "deployment over the horizon, not an above-curb-event "
+             "probability. CUSP/Brooklyn College operates the sensors and "
+             "publishes the historical events; this forecast is Riprap's "
+             "extension to the same dataset, computable per-query.",
+         ]
+         docs.append(_doc_message(doc_id, body))
+
+     obs = state.get("nws_obs")
+     if not out_of_nyc and obs and obs.get("station_id") and obs.get("error") is None and (
+         obs.get("precip_last_hour_mm") is not None or
+         obs.get("precip_last_6h_mm") is not None
+     ):
+         body = [
+             f"Source: {CITATION_NWS_OBS}.",
+             f"Nearest hourly METAR station: {obs['station_name']} ({obs['station_id']}, "
+             f"{obs['distance_km']} km away).",
+             f"Observation time: {obs.get('obs_time') or 'unknown'}.",
+         ]
+         if obs.get("precip_last_hour_mm") is not None:
+             body.append(f"Precipitation last 1 h: {obs['precip_last_hour_mm']} mm.")
+         if obs.get("precip_last_3h_mm") is not None:
+             body.append(f"Precipitation last 3 h: {obs['precip_last_3h_mm']} mm.")
+         if obs.get("precip_last_6h_mm") is not None:
+             body.append(f"Precipitation last 6 h: {obs['precip_last_6h_mm']} mm.")
+         body.append(
+             "Heavy short-duration rainfall (e.g. >25 mm/h or >50 mm/6 h) is the "
+             "primary driver of NYC pluvial / sewer-backup flooding; the static "
+             "DEP scenarios assume specific rainfall intensities."
+         )
+         docs.append(_doc_message("nws_obs", body))
+
      nyc311 = state.get("nyc311")
+     if not out_of_nyc and nyc311 and nyc311.get("n", 0) > 0:
          body = [
              "Source: NYC 311 service requests (Socrata erm2-nwe9, 2010-present).",
              f"311 flood-related complaints within {nyc311['radius_m']} m, last {nyc311['years']} years: {nyc311['n']}.",

          body.append(f"Per-year counts: {yrs}.")
          docs.append(_doc_message("nyc311", body))

+     # ---- Register specialists (transit / housing / education / healthcare) ----
+     # Each emits one doc per asset so the reconciler can cite specifically
+     # (e.g. [mta_entrance_54], [nycha_dev_004]). Caps keep the total payload
+     # bounded; specialists already truncated to their per-query maxes.
+     mta = state.get("mta_entrances")
+     if not out_of_nyc and mta and mta.get("available"):
+         for e in mta.get("entrances", [])[:6]:
+             sid = e.get("station_id")
+             body = [
+                 "Source: MTA Open Data subway entrances "
+                 "+ NYC OEM Sandy 2012 Inundation Zone (5xsi-dfpx) "
+                 "+ NYC DEP Stormwater Flood Maps + USGS 3DEP DEM.",
+                 (f"Station {e.get('station_name')} ({e.get('daytime_routes')}), "
+                  f"entrance type {e.get('entrance_type')}, "
+                  f"{e.get('distance_m')} m from query."),
+                 (f"Entrance elevation {e.get('elevation_m')} m, "
+                  f"HAND (height above nearest drainage) {e.get('hand_m')} m."),
+             ]
+             if e.get("inside_sandy_2012"):
+                 body.append("This entrance is inside the 2012 Sandy "
+                             "Inundation Zone (empirical).")
+             else:
+                 body.append("This entrance is NOT inside the 2012 Sandy "
+                             "Inundation Zone.")
+             if (e.get("dep_extreme_2080_class") or 0) > 0:
+                 body.append(
+                     f"NYC DEP Extreme-2080 scenario: "
+                     f"{e.get('dep_extreme_2080_label')}.")
+             if (e.get("dep_moderate_2050_class") or 0) > 0:
+                 body.append(
+                     f"NYC DEP Moderate-2050 scenario: "
+                     f"{e.get('dep_moderate_2050_label')}.")
+             body.append("ADA-accessible (heuristic from entrance_type): "
+                         f"{'yes' if e.get('ada_accessible') else 'no'}.")
+             docs.append(_doc_message(f"mta_entrance_{sid}", body))
+
+     nycha = state.get("nycha_developments")
+     if not out_of_nyc and nycha and nycha.get("available"):
+         for d in nycha.get("developments", [])[:4]:
+             tds = d.get("tds_num")
+             body = [
+                 "Source: NYC Open Data NYCHA Developments (phvi-damg) "
+                 "+ NYC OEM Sandy 2012 Inundation Zone (5xsi-dfpx) "
+                 "+ NYC DEP Stormwater Flood Maps + USGS 3DEP DEM.",
+                 (f"NYCHA development {d.get('development')} (TDS {tds}, "
+                  f"{d.get('borough')}), footprint {d.get('footprint_km2')} km², "
+                  f"{d.get('distance_m')} m from query."),
+                 (f"Representative-point elevation {d.get('rep_elevation_m')} m, "
+                  f"HAND {d.get('rep_hand_m')} m."),
+                 (f"{d.get('pct_inside_sandy_2012')}% of footprint inside the "
+                  "2012 Sandy Inundation Zone (empirical)."),
+             ]
+             if (d.get("pct_in_dep_extreme_2080") or 0) > 0:
+                 body.append(
+                     f"{d.get('pct_in_dep_extreme_2080')}% of footprint inside "
+                     "NYC DEP Extreme-2080 scenario "
+                     f"(of which {d.get('pct_in_dep_extreme_2080_deep')}% in the "
+                     "deepest >4 ft band).")
+             if (d.get("pct_in_dep_moderate_2050") or 0) > 0:
+                 body.append(
+                     f"{d.get('pct_in_dep_moderate_2050')}% of footprint inside "
+                     "NYC DEP Moderate-2050 scenario.")
+             docs.append(_doc_message(f"nycha_dev_{tds}", body))
+
+     schools = state.get("doe_schools")
+     if not out_of_nyc and schools and schools.get("available"):
+         for s in schools.get("schools", [])[:5]:
+             lc = s.get("loc_code")
+             body = [
+                 "Source: NYC DOE Locations Points "
+                 "+ NYC OEM Sandy 2012 Inundation Zone (5xsi-dfpx) "
+                 "+ NYC DEP Stormwater Flood Maps + USGS 3DEP DEM.",
+                 (f"School {s.get('loc_name')} ({lc}, {s.get('address')}, "
+                  f"{s.get('borough')}), {s.get('distance_m')} m from query."),
+                 (f"School-point elevation {s.get('elevation_m')} m, "
+                  f"HAND {s.get('hand_m')} m."),
+             ]
+             if s.get("inside_sandy_2012"):
+                 body.append("This school is inside the 2012 Sandy "
+                             "Inundation Zone (empirical).")
+             else:
+                 body.append("This school is NOT inside the 2012 Sandy "
+                             "Inundation Zone (centroid-point join; "
+                             "building-footprint join is a documented "
+                             "follow-up).")
+             if (s.get("dep_extreme_2080_class") or 0) > 0:
+                 body.append(
+                     f"NYC DEP Extreme-2080 scenario: "
+                     f"{s.get('dep_extreme_2080_label')}.")
+             if (s.get("dep_moderate_2050_class") or 0) > 0:
+                 body.append(
+                     f"NYC DEP Moderate-2050 scenario: "
+                     f"{s.get('dep_moderate_2050_label')}.")
+             docs.append(_doc_message(f"doe_school_{lc}", body))
+
+     hospitals = state.get("doh_hospitals")
+     if not out_of_nyc and hospitals and hospitals.get("available"):
+         for h in hospitals.get("hospitals", [])[:4]:
+             fid = h.get("fac_id")
+             body = [
+                 "Source: NYS DOH Health Facility Certification (vn5v-hh5r) "
+                 "+ NYC OEM Sandy 2012 Inundation Zone (5xsi-dfpx) "
+                 "+ NYC DEP Stormwater Flood Maps + USGS 3DEP DEM.",
+                 (f"Hospital {h.get('facility_name')} (NYS DOH facility "
+                  f"{fid}, {h.get('address')}, {h.get('borough')}), "
+                  f"operator {h.get('operator_name')}, "
+                  f"ownership {h.get('ownership_type')}, "
+                  f"{h.get('distance_m')} m from query."),
+                 (f"Hospital-point elevation {h.get('elevation_m')} m, "
+                  f"HAND {h.get('hand_m')} m."),
+             ]
+             if h.get("inside_sandy_2012"):
+                 body.append("This hospital is inside the 2012 Sandy "
+                             "Inundation Zone (empirical).")
+             else:
+                 body.append("This hospital is NOT inside the 2012 Sandy "
+                             "Inundation Zone (centroid-point join; "
+                             "building-footprint join is a documented "
+                             "follow-up).")
+             if (h.get("dep_extreme_2080_class") or 0) > 0:
+                 body.append(
+                     f"NYC DEP Extreme-2080 scenario: "
+                     f"{h.get('dep_extreme_2080_label')}.")
+             if (h.get("dep_moderate_2050_class") or 0) > 0:
+                 body.append(
+                     f"NYC DEP Moderate-2050 scenario: "
+                     f"{h.get('dep_moderate_2050_label')}.")
+             docs.append(_doc_message(f"nyc_hospital_{fid}", body))
+
      return docs


  def reconcile(state: dict[str, Any], model: str = OLLAMA_MODEL,
+               return_audit: bool = False, on_token=None):
      """Run Granite reconciliation, then drop sentences with ungrounded numbers.

+     If on_token is provided, the model is run in streaming mode and
+     on_token(delta) is called for each chunk as Granite generates.
+
      If return_audit=True, returns (paragraph, audit_dict) where audit_dict
      has 'raw' (Granite's original output) and 'dropped' (list of dropped
      sentences with their ungrounded numeric tokens).

          msg = "No grounded data available for this address."
          return (msg, {"raw": msg, "dropped": []}) if return_audit else msg

+     messages = doc_msgs + [
+         {"role": "system", "content": EXTRA_SYSTEM_PROMPT},
+         {"role": "user", "content": "Write the cited paragraph now."},
+     ]
+     # single_address: 13 specialists may fire, doc bodies are short.
+     # num_ctx 4096 covers ~700 system + ~2500 docs. num_predict 400 caps
+     # the 4-section briefing at ~300-350 tokens.
+     OPTS = {"temperature": 0, "num_ctx": 4096, "num_predict": 400}
+     if on_token is None:
+         resp = llm.chat(model=model, messages=messages, options=OPTS)
+         raw = resp["message"]["content"].strip()
+     else:
+         chunks: list[str] = []
+         for chunk in llm.chat(model=model, messages=messages, stream=True,
+                               options=OPTS):
+             delta = (chunk.get("message") or {}).get("content") or ""
+             if delta:
+                 chunks.append(delta)
+                 on_token(delta)
+         raw = "".join(chunks).strip()

+     cleaned, dropped = verify_paragraph(raw, doc_msgs)
      if return_audit:
          return cleaned, {"raw": raw, "dropped": dropped}
      return cleaned
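The `verify_paragraph` helper itself is outside this hunk; the sentence-scoped numeric-grounding idea the docstring describes (drop any sentence whose numbers do not appear in the source docs) can be sketched as follows. The function name, the regexes, and the sentence splitting are all simplifications of mine, not the real checker:

```python
import re

_NUM = re.compile(r"\d+(?:\.\d+)?")


def verify_numbers_sketch(paragraph: str, doc_texts: list[str]):
    """Drop any sentence containing a numeric token that does not
    appear verbatim in the grounding documents. Returns the cleaned
    paragraph and the list of dropped sentences."""
    grounded: set[str] = set()
    for text in doc_texts:
        grounded.update(_NUM.findall(text))
    kept, dropped = [], []
    for sent in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
        if not sent:
            continue
        if all(n in grounded for n in _NUM.findall(sent)):
            kept.append(sent)
        else:
            dropped.append(sent)
    return " ".join(kept), dropped
```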
app/register_builder.py CHANGED
@@ -9,15 +9,16 @@ from __future__ import annotations
  import json
  import sys
  import time

  from pathlib import Path
- from typing import Any, Callable

  import geopandas as gpd
- from shapely.geometry import Point

  from app.context import floodnet, microtopo, nyc311
  from app.flood_layers import dep_stormwater, ida_hwm, sandy_inundation
- from app.rag import retrieve as rag_retrieve, warm as rag_warm

  from app.reconcile import reconcile as run_reconcile
  from app.score import score_frame


  import json
  import sys
  import time
+ from collections.abc import Callable
  from pathlib import Path
+ from typing import Any

  import geopandas as gpd

  from app.context import floodnet, microtopo, nyc311
  from app.flood_layers import dep_stormwater, ida_hwm, sandy_inundation
+ from app.rag import retrieve as rag_retrieve
+ from app.rag import warm as rag_warm
  from app.reconcile import reconcile as run_reconcile
  from app.score import score_frame

app/registers/__init__.py ADDED
File without changes
app/registers/_footprint.py ADDED
@@ -0,0 +1,84 @@
+ """Buffered point-overlap helpers for the register specialists.
+
+ The four register specialists (MTA entrances, NYCHA developments,
+ DOE schools, NYS DOH hospitals) all need to test whether an asset
+ intersects a flood polygon. NYCHA developments are already polygons
+ (real building-group footprints), so polygon-vs-polygon `intersects`
+ is correct. The other three are stored as point centroids:
+
+ - MTA entrances are physical entrances; the point is the centerline
+ - DOE schools are address centroids (administrative point), but the
+   actual building extends ~50 m around it
+ - NYS DOH hospitals are address centroids; campuses are 80–250 m wide
+
+ Pure point-in-polygon on the centroid produces false negatives at
+ the boundary: NYU Langone, Stuyvesant HS, P.S. 89 all sit on
+ buildings whose footprints overlap the 2012 Sandy zone but whose
+ recorded centroid points just miss it.
+
+ The honest fix is a join against the actual NYC Building Footprints
+ + PLUTO BBL → footprint dataset (~400 MB). That's a separate
+ ingestion task. This module is the surgical-and-shippable
+ intermediate fix: buffer the centroid by an asset-class-appropriate
+ radius, then ask `intersects` against the same Sandy / DEP polygons
+ the existing helpers use. The `footprint_buffer_m` is recorded in
+ the specialist output so the trace UI shows what radius was used —
+ auditability over hidden assumptions.
+ """
+ from __future__ import annotations
+
+ import logging
+
+ log = logging.getLogger("riprap.register.footprint")
+
+ # Per-asset-class footprint buffer (metres). Conservative enough to
+ # catch known canonical false-negatives (NYU Langone, Stuyvesant HS,
+ # P.S. 89) without sweeping in obviously-distant buildings.
+ BUFFER_MTA_ENTRANCE_M = 8
+ BUFFER_DOE_SCHOOL_M = 50
+ BUFFER_DOH_HOSPITAL_M = 100
+
+
+ def inside_sandy_buffered(lat: float, lon: float, buffer_m: float) -> bool:
+     """True if the buffer of (lat, lon) by buffer_m metres intersects
+     the 2012 Sandy Inundation Zone."""
+     try:
+         import geopandas as gpd
+         from shapely.geometry import Point
+
+         from app.flood_layers import sandy_inundation
+         # Project before buffering so the buffer is metric. EPSG:2263
+         # is NYC State Plane (feet) — convert metres to feet for buffer.
+         ft = buffer_m * 3.280839895
+         pt = gpd.GeoDataFrame(
+             geometry=[Point(lon, lat)], crs="EPSG:4326"
+         ).to_crs("EPSG:2263")
+         pt["geometry"] = pt.geometry.buffer(ft)
+         return bool(sandy_inundation.join(pt).iloc[0])
+     except Exception:
+         log.exception("buffered sandy join failed")
+         return False
+
+
+ def dep_class_buffered(lat: float, lon: float, buffer_m: float,
+                        scenario: str) -> tuple[int | None, str | None]:
+     """Max DEP depth class within `buffer_m` of (lat, lon).
+
+     Returns (depth_class, depth_label). Higher class wins on overlap,
+     matching `dep_stormwater.join`'s semantics. None on failure.
+     """
+     try:
+         import geopandas as gpd
+         from shapely.geometry import Point
+
+         from app.flood_layers import dep_stormwater
+         ft = buffer_m * 3.280839895
+         pt = gpd.GeoDataFrame(
+             geometry=[Point(lon, lat)], crs="EPSG:4326"
+         ).to_crs("EPSG:2263")
+         pt["geometry"] = pt.geometry.buffer(ft)
+         j = dep_stormwater.join(pt, scenario).iloc[0]
+         return int(j["depth_class"]), str(j["depth_label"])
+     except Exception:
+         log.exception("buffered dep join failed for %s", scenario)
+         return None, None
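The buffered-overlap test above reduces to "does a disc of radius r around the centroid touch the polygon". Without GeoPandas, the same predicate in planar coordinates is a point-in-polygon test plus a minimum point-to-edge distance. A self-contained sketch in arbitrary planar units (this is my illustration of the geometry, not the module's EPSG:2263 pipeline):

```python
import math


def _point_in_polygon(x: float, y: float, poly: list) -> bool:
    # Ray casting: count crossings of a horizontal ray from (x, y).
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def _dist_to_segment(px, py, ax, ay, bx, by) -> float:
    # Distance from (px, py) to the segment (ax, ay)-(bx, by).
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def buffered_intersects(x: float, y: float, radius: float, poly: list) -> bool:
    # The disc intersects the polygon iff the centre is inside, or
    # some edge passes within `radius` of the centre.
    if _point_in_polygon(x, y, poly):
        return True
    n = len(poly)
    return any(
        _dist_to_segment(x, y, *poly[i], *poly[(i + 1) % n]) <= radius
        for i in range(n)
    )
```

With `radius = 0` this degenerates to plain point-in-polygon, which is exactly the false-negative mode the docstring describes.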
app/registers/doe_schools.py ADDED
@@ -0,0 +1,200 @@
+ """doe_school_exposure — flood-exposure briefing per NYC public school.
+
+ Point-based register specialist (1,992 NYC DOE school points). Same
+ join pattern as the MTA-entrance specialist. Per queried (lat, lon),
+ returns up to N schools within `radius_m`, enriched with:
+
+ - inside_sandy_2012 (point-in-polygon, empirical)
+ - dep_extreme_2080_class (point-in-polygon, modeled)
+ - dep_moderate_2050_class (point-in-polygon, modeled)
+ - elevation_m (USGS 3DEP DEM, proxy)
+ - hand_m (derived HAND raster, proxy)
+
+ doc_id format: `doe_school_<loc_code>`. Schools are physical
+ buildings that serve as evacuation hubs in city OEM plans, so
+ "this school sits inside the 2012 Sandy zone" is a structural
+ claim that's directly relevant to flood planning.
+ """
+
+ from __future__ import annotations
+
+ import json
+ import logging
+ import math
+ import sys
+ from dataclasses import dataclass
+ from functools import lru_cache
+ from pathlib import Path
+
+ _ROOT = Path(__file__).resolve().parents[2]
+ if str(_ROOT) not in sys.path:
+     sys.path.insert(0, str(_ROOT))
+
+ log = logging.getLogger("riprap.doe_school")
+
+ DATA = _ROOT / "data"
+ SCHOOLS = DATA / "schools.geojson"
+
+ DEFAULT_RADIUS_M = 1500
+ DEFAULT_MAX_PER_QUERY = 6
+
+ BORO_NAME = {"1": "MANHATTAN", "2": "BRONX", "3": "BROOKLYN",
+              "4": "QUEENS", "5": "STATEN ISLAND"}
+
+ MANAGED_BY_LABEL = {"1": "DOE-managed", "2": "Charter or other"}
+
+
+ @dataclass
+ class SchoolFinding:
+     loc_code: str
+     loc_name: str
+     address: str
+     borough: str
+     bin: str
+     bbl: str
+     managed_by: str
+     school_lat: float
+     school_lon: float
+     distance_m: float
+     elevation_m: float | None
+     hand_m: float | None
+     inside_sandy_2012: bool
+     dep_extreme_2080_class: int | None
+     dep_extreme_2080_label: str | None
+     dep_moderate_2050_class: int | None
+     dep_moderate_2050_label: str | None
+
+
+ def _haversine_m(lat1, lon1, lat2, lon2) -> float:
+     R = 6371000.0
+     p1, p2 = math.radians(lat1), math.radians(lat2)
+     dp = math.radians(lat2 - lat1)
+     dl = math.radians(lon2 - lon1)
+     a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
+     return 2 * R * math.asin(math.sqrt(a))
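One degree of latitude is about 111.19 km at the R = 6371 km Earth radius used above, which gives a quick sanity check on the haversine helper. A standalone restatement (my own check, not part of the module's test suite):

```python
import math


def haversine_m(lat1, lon1, lat2, lon2) -> float:
    # Same great-circle formula as _haversine_m above, restated standalone.
    R = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))


# One degree of latitude at this Earth radius: ~111,195 m.
d = haversine_m(40.0, -74.0, 41.0, -74.0)
```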
+
+
+ @lru_cache(maxsize=1)
+ def _load_schools():
+     import geopandas as gpd
+     gdf = gpd.read_file(SCHOOLS)
+     gdf["lat"] = gdf.geometry.y
+     gdf["lon"] = gdf.geometry.x
+     return gdf.reset_index(drop=True)
+
+
+ def _schools_near(lat: float, lon: float, radius_m: float):
+     gdf = _load_schools()
+     deg = radius_m / 90_000
+     sub = gdf[(gdf["lat"].between(lat - deg, lat + deg))
+               & (gdf["lon"].between(lon - deg, lon + deg))].copy()
+     if sub.empty:
+         return sub
+     sub["distance_m"] = sub.apply(
+         lambda r: _haversine_m(lat, lon, r["lat"], r["lon"]), axis=1)
+     return sub[sub["distance_m"] <= radius_m].sort_values("distance_m")
+
+
+ def _sample_raster(raster_path: Path, lat: float, lon: float) -> float | None:
+     if not raster_path.exists():
+         return None
+     try:
+         import rasterio
+         with rasterio.open(raster_path) as src:
+             v = next(src.sample([(lon, lat)]))[0]
+             v = float(v)
+             if math.isnan(v) or v == src.nodata:
+                 return None
+             return v
+     except Exception:
+         log.exception("raster sample failed for %s", raster_path)
+         return None
+
+
+ from app.registers._footprint import (
+     BUFFER_DOE_SCHOOL_M,
+     dep_class_buffered,
+     inside_sandy_buffered,
+ )
+
+
+ def _inside_sandy(lat: float, lon: float) -> bool:
+     return inside_sandy_buffered(lat, lon, BUFFER_DOE_SCHOOL_M)
+
+
+ def _dep_class(lat: float, lon: float, scenario: str):
+     return dep_class_buffered(lat, lon, BUFFER_DOE_SCHOOL_M, scenario)
+
+
+ def summary_for_point(lat: float, lon: float,
+                       radius_m: float = DEFAULT_RADIUS_M,
+                       max_schools: int = DEFAULT_MAX_PER_QUERY) -> dict:
+     near = _schools_near(lat, lon, radius_m)
+     if near.empty:
+         return {"available": False,
+                 "n_schools": 0,
+                 "radius_m": radius_m,
+                 "schools": []}
+
+     near = near.head(max_schools)
+     findings: list[SchoolFinding] = []
+     for _, row in near.iterrows():
+         slat, slon = float(row["lat"]), float(row["lon"])
+         elev = _sample_raster(DATA / "nyc_dem_30m.tif", slat, slon)
+         hand = _sample_raster(DATA / "hand.tif", slat, slon)
+         in_sandy = _inside_sandy(slat, slon)
+         d80c, d80l = _dep_class(slat, slon, "dep_extreme_2080")
+         d50c, d50l = _dep_class(slat, slon, "dep_moderate_2050")
+         boronum = str(row.get("boronum", ""))
+         findings.append(SchoolFinding(
+             loc_code=str(row["loc_code"]),
+             loc_name=str(row["loc_name"]),
+             address=str(row["address"]).strip(),
+             borough=BORO_NAME.get(boronum, boronum),
+             bin=str(row["bin"]),
+             bbl=str(row["bbl"]),
+             managed_by=MANAGED_BY_LABEL.get(str(row["managed_by"]),
+                                             str(row["managed_by"])),
+             school_lat=round(slat, 5),
+             school_lon=round(slon, 5),
159
+ distance_m=round(float(row["distance_m"]), 1),
160
+ elevation_m=round(elev, 2) if elev is not None else None,
161
+ hand_m=round(hand, 2) if hand is not None else None,
162
+ inside_sandy_2012=in_sandy,
163
+ dep_extreme_2080_class=d80c,
164
+ dep_extreme_2080_label=d80l,
165
+ dep_moderate_2050_class=d50c,
166
+ dep_moderate_2050_label=d50l,
167
+ ))
168
+
169
+ n_in_sandy = sum(1 for f in findings if f.inside_sandy_2012)
170
+ n_dep_2080 = sum(1 for f in findings
171
+ if (f.dep_extreme_2080_class or 0) > 0)
172
+ return {
173
+ "available": True,
174
+ "n_schools": len(findings),
175
+ "radius_m": radius_m,
176
+ "footprint_buffer_m": BUFFER_DOE_SCHOOL_M,
177
+ "n_inside_sandy_2012": n_in_sandy,
178
+ "n_in_dep_extreme_2080": n_dep_2080,
179
+ "schools": [vars(f) for f in findings],
180
+ "citation": ("NYC DOE Locations Points + NYC OEM Sandy 2012 "
181
+ "Inundation Zone (5xsi-dfpx) + NYC DEP Stormwater "
182
+ "Flood Maps + USGS 3DEP DEM"),
183
+ }
184
+
185
+
186
+ def main() -> int:
187
+ import argparse
188
+ ap = argparse.ArgumentParser()
189
+ ap.add_argument("--lat", type=float, required=True)
190
+ ap.add_argument("--lon", type=float, required=True)
191
+ ap.add_argument("--radius", type=float, default=DEFAULT_RADIUS_M)
192
+ ap.add_argument("--max", type=int, default=DEFAULT_MAX_PER_QUERY)
193
+ args = ap.parse_args()
194
+ s = summary_for_point(args.lat, args.lon, args.radius, args.max)
195
+ print(json.dumps(s, indent=2, default=str))
196
+ return 0
197
+
198
+
199
+ if __name__ == "__main__":
200
+ sys.exit(main())
app/registers/doh_hospitals.py ADDED
@@ -0,0 +1,194 @@
1
+ """nys_doh_hospital_exposure — flood-exposure briefing per NYC hospital.
2
+
3
+ Point-based register specialist on 67 NYC hospitals from the NYS DOH
4
+ Health Facility Certification Information dataset (Article 28
5
+ hospitals only, filtered to the 5 NYC counties). Same join pattern
6
+ as MTA entrances and DOE schools.
7
+
8
+ Hospitals are essential infrastructure: a hospital inside the 2012
9
+ Sandy Inundation Zone tells planners and emergency-management
10
+ audiences something concrete about lifeline-asset exposure. NYU
11
+ Langone, Bellevue, and Coney Island Hospital all evacuated patients
12
+ during Sandy — those events are public-record and well-documented.
13
+
14
+ doc_id format: `nyc_hospital_<fac_id>` (NYS DOH facility ID).
15
+ """
16
+
17
+ from __future__ import annotations
18
+
19
+ import json
20
+ import logging
21
+ import math
22
+ import sys
23
+ from dataclasses import dataclass
24
+ from functools import lru_cache
25
+ from pathlib import Path
26
+
27
+ _ROOT = Path(__file__).resolve().parents[2]
28
+ if str(_ROOT) not in sys.path:
29
+ sys.path.insert(0, str(_ROOT))
30
+
31
+ log = logging.getLogger("riprap.hospital")
32
+
33
+ DATA = _ROOT / "data"
34
+ HOSPITALS = DATA / "hospitals.geojson"
35
+
36
+ DEFAULT_RADIUS_M = 3000 # hospitals are sparse; wider radius
37
+ DEFAULT_MAX_PER_QUERY = 5
38
+
39
+ COUNTY_TO_BOROUGH = {
40
+ "New York": "MANHATTAN", "Kings": "BROOKLYN", "Bronx": "BRONX",
41
+ "Queens": "QUEENS", "Richmond": "STATEN ISLAND",
42
+ }
43
+
44
+
45
+ @dataclass
46
+ class HospitalFinding:
47
+ fac_id: str
48
+ facility_name: str
49
+ address: str
50
+ borough: str
51
+ operator_name: str
52
+ ownership_type: str
53
+ hospital_lat: float
54
+ hospital_lon: float
55
+ distance_m: float
56
+ elevation_m: float | None
57
+ hand_m: float | None
58
+ inside_sandy_2012: bool
59
+ dep_extreme_2080_class: int | None
60
+ dep_extreme_2080_label: str | None
61
+ dep_moderate_2050_class: int | None
62
+ dep_moderate_2050_label: str | None
63
+
64
+
65
+ def _haversine_m(lat1, lon1, lat2, lon2) -> float:
66
+ R = 6371000.0
67
+ p1, p2 = math.radians(lat1), math.radians(lat2)
68
+ dp = math.radians(lat2 - lat1); dl = math.radians(lon2 - lon1)
69
+ a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
70
+ return 2 * R * math.asin(math.sqrt(a))
71
+
72
+
73
+ @lru_cache(maxsize=1)
74
+ def _load_hospitals():
75
+ import geopandas as gpd
76
+ gdf = gpd.read_file(HOSPITALS)
77
+ gdf["lat"] = gdf.geometry.y
78
+ gdf["lon"] = gdf.geometry.x
79
+ return gdf.reset_index(drop=True)
80
+
81
+
82
+ def _hospitals_near(lat: float, lon: float, radius_m: float):
83
+ gdf = _load_hospitals()
84
+ deg = radius_m / 80_000 # 1 deg of longitude is ~84.4 km at NYC latitude; 80 km/deg over-covers both axes
85
+ sub = gdf[(gdf["lat"].between(lat - deg, lat + deg))
86
+ & (gdf["lon"].between(lon - deg, lon + deg))].copy()
87
+ if sub.empty:
88
+ return sub
89
+ sub["distance_m"] = sub.apply(
90
+ lambda r: _haversine_m(lat, lon, r["lat"], r["lon"]), axis=1)
91
+ return sub[sub["distance_m"] <= radius_m].sort_values("distance_m")
92
+
93
+
94
+ def _sample_raster(raster_path: Path, lat: float, lon: float) -> float | None:
95
+ if not raster_path.exists():
96
+ return None
97
+ try:
98
+ import rasterio
99
+ with rasterio.open(raster_path) as src:
100
+ v = next(src.sample([(lon, lat)]))[0]
101
+ v = float(v)
102
+ if math.isnan(v) or v == src.nodata:
103
+ return None
104
+ return v
105
+ except Exception:
106
+ log.exception("raster sample failed for %s", raster_path)
107
+ return None
108
+
109
+
110
+ from app.registers._footprint import (
111
+ BUFFER_DOH_HOSPITAL_M,
112
+ dep_class_buffered,
113
+ inside_sandy_buffered,
114
+ )
115
+
116
+
117
+ def _inside_sandy(lat: float, lon: float) -> bool:
118
+ return inside_sandy_buffered(lat, lon, BUFFER_DOH_HOSPITAL_M)
119
+
120
+
121
+ def _dep_class(lat: float, lon: float, scenario: str):
122
+ return dep_class_buffered(lat, lon, BUFFER_DOH_HOSPITAL_M, scenario)
123
+
124
+
125
+ def summary_for_point(lat: float, lon: float,
126
+ radius_m: float = DEFAULT_RADIUS_M,
127
+ max_hospitals: int = DEFAULT_MAX_PER_QUERY) -> dict:
128
+ near = _hospitals_near(lat, lon, radius_m)
129
+ if near.empty:
130
+ return {"available": False,
131
+ "n_hospitals": 0,
132
+ "radius_m": radius_m,
133
+ "hospitals": []}
134
+
135
+ near = near.head(max_hospitals)
136
+ findings: list[HospitalFinding] = []
137
+ for _, row in near.iterrows():
138
+ hlat, hlon = float(row["lat"]), float(row["lon"])
139
+ elev = _sample_raster(DATA / "nyc_dem_30m.tif", hlat, hlon)
140
+ hand = _sample_raster(DATA / "hand.tif", hlat, hlon)
141
+ in_sandy = _inside_sandy(hlat, hlon)
142
+ d80c, d80l = _dep_class(hlat, hlon, "dep_extreme_2080")
143
+ d50c, d50l = _dep_class(hlat, hlon, "dep_moderate_2050")
144
+ findings.append(HospitalFinding(
145
+ fac_id=str(row["fac_id"]),
146
+ facility_name=str(row["facility_name"]),
147
+ address=f"{row['address1']}, {row['city']}".strip(", "),
148
+ borough=COUNTY_TO_BOROUGH.get(str(row["county"]), str(row["county"])),
149
+ operator_name=str(row["operator_name"]),
150
+ ownership_type=str(row["ownership_type"]),
151
+ hospital_lat=round(hlat, 5),
152
+ hospital_lon=round(hlon, 5),
153
+ distance_m=round(float(row["distance_m"]), 1),
154
+ elevation_m=round(elev, 2) if elev is not None else None,
155
+ hand_m=round(hand, 2) if hand is not None else None,
156
+ inside_sandy_2012=in_sandy,
157
+ dep_extreme_2080_class=d80c,
158
+ dep_extreme_2080_label=d80l,
159
+ dep_moderate_2050_class=d50c,
160
+ dep_moderate_2050_label=d50l,
161
+ ))
162
+
163
+ n_in_sandy = sum(1 for f in findings if f.inside_sandy_2012)
164
+ n_dep_2080 = sum(1 for f in findings
165
+ if (f.dep_extreme_2080_class or 0) > 0)
166
+ return {
167
+ "available": True,
168
+ "n_hospitals": len(findings),
169
+ "radius_m": radius_m,
170
+ "footprint_buffer_m": BUFFER_DOH_HOSPITAL_M,
171
+ "n_inside_sandy_2012": n_in_sandy,
172
+ "n_in_dep_extreme_2080": n_dep_2080,
173
+ "hospitals": [vars(f) for f in findings],
174
+ "citation": ("NYS DOH Health Facility Certification (vn5v-hh5r) + "
175
+ "NYC OEM Sandy 2012 Inundation Zone (5xsi-dfpx) + "
176
+ "NYC DEP Stormwater Flood Maps + USGS 3DEP DEM"),
177
+ }
178
+
179
+
180
+ def main() -> int:
181
+ import argparse
182
+ ap = argparse.ArgumentParser()
183
+ ap.add_argument("--lat", type=float, required=True)
184
+ ap.add_argument("--lon", type=float, required=True)
185
+ ap.add_argument("--radius", type=float, default=DEFAULT_RADIUS_M)
186
+ ap.add_argument("--max", type=int, default=DEFAULT_MAX_PER_QUERY)
187
+ args = ap.parse_args()
188
+ s = summary_for_point(args.lat, args.lon, args.radius, args.max)
189
+ print(json.dumps(s, indent=2, default=str))
190
+ return 0
191
+
192
+
193
+ if __name__ == "__main__":
194
+ sys.exit(main())
app/registers/mta_entrances.py ADDED
@@ -0,0 +1,244 @@
1
+ """mta_entrance_exposure — flood-exposure briefing per subway entrance.
2
+
3
+ The headline new specialist for the IBM senior technical staffer's
4
+ "subway entrances" reaction. Joins:
5
+
6
+ - MTA Open Data subway-entrance geometry (data/mta_entrances.geojson,
7
+ 2120 entrances city-wide).
8
+ - NYC OEM Sandy 2012 Inundation Zone (data/sandy_inundation.geojson)
9
+ — empirical evidence (a flood actually happened here).
10
+ - NYC DEP Stormwater Flood Maps for Extreme-2080, Moderate-2050,
11
+ Moderate-current scenarios — modeled evidence.
12
+ - USGS 3DEP DEM (data/nyc_dem_30m.tif) for entrance-level elevation.
13
+ - HAND raster (data/hand.tif) for height above nearest drainage.
14
+ - Entrance type → ADA-status heuristic (Elevator / Ramp = accessible).
15
+
16
+ Per queried address, returns the entrances within a configurable
17
+ radius (default 800 m) with structured per-entrance claims the
18
+ reconciler can cite. doc_id format: `mta_entrance_<station_id>`.
19
+
20
+ Honest scope (per Riprap discipline):
21
+ - This is an EXPOSURE specialist, not a damage forecast. We say
22
+ "this entrance sits inside the 2012 Sandy zone" — we don't say
23
+ "this entrance will flood again in the next storm".
24
+ - The Sandy / DEP layers are point-in-polygon over public-record
25
+ geometry; ADA status from the MTA Open Data `entrance_type`
26
+ column is a heuristic, not the authoritative MTA accessibility
27
+ list.
28
+ - Documented MTA Sandy-recovery records for specific stations are
29
+ NOT included in this first cut — only the empirical-inundation
30
+ membership. Adding station-level recovery citations requires
31
+ parsing the MTA's "Hurricane Sandy: Three Years Later" report
32
+ and is a follow-up.
33
+ """
34
+
35
+ from __future__ import annotations
36
+
37
+ import json
38
+ import logging
39
+ import math
40
+ import sys
41
+ from dataclasses import dataclass
42
+ from functools import lru_cache
43
+ from pathlib import Path
44
+
45
+ # Ensure `app/` is importable when this experiment is invoked directly
46
+ # from its own subdir.
47
+ _ROOT = Path(__file__).resolve().parents[2]
48
+ if str(_ROOT) not in sys.path:
49
+ sys.path.insert(0, str(_ROOT))
50
+
51
+ log = logging.getLogger("riprap.mta_entrance")
52
+
53
+ DATA = Path(__file__).resolve().parents[2] / "data"
54
+ MTA_ENTRANCES = DATA / "mta_entrances.geojson"
55
+
56
+ ADA_ACCESSIBLE_TYPES = {"Elevator", "Ramp"}
57
+
58
+ DEFAULT_RADIUS_M = 800
59
+ DEFAULT_MAX_PER_QUERY = 8 # cap per station so doc payload stays small
60
+
61
+
62
+ @dataclass
63
+ class EntranceFinding:
64
+ station_id: str
65
+ station_name: str
66
+ daytime_routes: str
67
+ borough: str
68
+ entrance_type: str
69
+ entrance_lat: float
70
+ entrance_lon: float
71
+ distance_m: float
72
+ ada_accessible: bool
73
+ elevation_m: float | None
74
+ hand_m: float | None # height above nearest drainage
75
+ inside_sandy_2012: bool
76
+ dep_extreme_2080_class: int | None # 0/1/2/3
77
+ dep_extreme_2080_label: str | None
78
+ dep_moderate_2050_class: int | None
79
+ dep_moderate_2050_label: str | None
80
+
81
+
82
+ def _haversine_m(lat1, lon1, lat2, lon2) -> float:
83
+ R = 6371000.0
84
+ p1, p2 = math.radians(lat1), math.radians(lat2)
85
+ dp = math.radians(lat2 - lat1); dl = math.radians(lon2 - lon1)
86
+ a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
87
+ return 2 * R * math.asin(math.sqrt(a))
88
+
89
+
90
+ @lru_cache(maxsize=1)
91
+ def _load_entrances():
92
+ import geopandas as gpd
93
+ import pandas as pd
94
+ gdf = gpd.read_file(MTA_ENTRANCES)
95
+ # The lat/lon columns are strings in this GeoJSON; coerce so we
96
+ # can do range comparisons in the bbox prefilter.
97
+ gdf["entrance_latitude"] = pd.to_numeric(gdf["entrance_latitude"],
98
+ errors="coerce")
99
+ gdf["entrance_longitude"] = pd.to_numeric(gdf["entrance_longitude"],
100
+ errors="coerce")
101
+ gdf = gdf[gdf["entrance_latitude"].notna()
102
+ & gdf["entrance_longitude"].notna()].copy()
103
+ return gdf.reset_index(drop=True)
104
+
105
+
106
+ def _entrances_near(lat: float, lon: float, radius_m: float):
107
+ gdf = _load_entrances()
108
+ # Coarse bbox prefilter to avoid haversine on 2120 rows every call.
109
+ deg = radius_m / 80_000 # 1 deg of longitude is ~84.4 km at NYC latitude; 80 km/deg over-covers both axes
110
+ sub = gdf[
111
+ (gdf["entrance_latitude"].between(lat - deg, lat + deg))
112
+ & (gdf["entrance_longitude"].between(lon - deg, lon + deg))
113
+ ].copy()
114
+ if sub.empty:
115
+ return sub
116
+ sub["distance_m"] = sub.apply(
117
+ lambda r: _haversine_m(lat, lon, r["entrance_latitude"],
118
+ r["entrance_longitude"]),
119
+ axis=1,
120
+ )
121
+ sub = sub[sub["distance_m"] <= radius_m].sort_values("distance_m")
122
+ return sub
123
+
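The bbox prefilter converts metres to degrees with a single divisor; for that shortcut to never drop an in-radius point, the divisor must not exceed the smallest metres-per-degree across both axes. A quick stdlib check of those figures (40.7 deg N as a stand-in for NYC is an assumption):

```python
import math

R = 6371000.0
m_per_deg_lat = math.pi * R / 180                              # ~111,195 m, everywhere
m_per_deg_lon = m_per_deg_lat * math.cos(math.radians(40.7))   # shrinks with latitude, ~84,300 m at NYC

# A metres->degrees divisor larger than m_per_deg_lon would shave the east/west
# bbox edges, silently excluding points before the exact haversine pass.
```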
124
+
125
+ def _sample_raster(raster_path: Path, lat: float, lon: float) -> float | None:
126
+ """Read one pixel from a raster at (lat, lon). Returns None if the
127
+ point is outside the raster or the raster is missing.
128
+
129
+ The cached NYC rasters are all EPSG:4326. rasterio.sample handles
130
+ coordinate-to-pixel translation directly — simpler than building
131
+ a windowed read."""
132
+ if not raster_path.exists():
133
+ return None
134
+ try:
135
+ import rasterio
136
+ with rasterio.open(raster_path) as src:
137
+ v = next(src.sample([(lon, lat)]))[0]
138
+ if v is None:
139
+ return None
140
+ v = float(v)
141
+ if math.isnan(v) or v == src.nodata:
142
+ return None
143
+ return v
144
+ except Exception:
145
+ log.exception("raster sample failed for %s", raster_path)
146
+ return None
147
+
148
+
149
+ from app.registers._footprint import (
150
+ BUFFER_MTA_ENTRANCE_M,
151
+ dep_class_buffered,
152
+ inside_sandy_buffered,
153
+ )
154
+
155
+
156
+ def _inside_sandy(lat: float, lon: float) -> bool:
157
+ """Sandy join with a small (8 m) buffer to capture entrances at the
158
+ polygon edge — the entrance point is the centerline of the stair
159
+ well, the actual opening is wider."""
160
+ return inside_sandy_buffered(lat, lon, BUFFER_MTA_ENTRANCE_M)
161
+
162
+
163
+ def _dep_class(lat: float, lon: float, scenario: str) -> tuple[int | None, str | None]:
164
+ return dep_class_buffered(lat, lon, BUFFER_MTA_ENTRANCE_M, scenario)
165
+
166
+
167
+ def summary_for_point(lat: float, lon: float,
168
+ radius_m: float = DEFAULT_RADIUS_M,
169
+ max_entrances: int = DEFAULT_MAX_PER_QUERY) -> dict:
170
+ """Return all subway entrances within `radius_m` of (lat, lon),
171
+ enriched with flood-exposure fields. Empty list when no entrances
172
+ are nearby (silence over confabulation)."""
173
+ near = _entrances_near(lat, lon, radius_m)
174
+ if near.empty:
175
+ return {"available": False,
176
+ "n_entrances": 0,
177
+ "radius_m": radius_m,
178
+ "entrances": []}
179
+
180
+ near = near.head(max_entrances)
181
+ findings: list[EntranceFinding] = []
182
+ for _, row in near.iterrows():
183
+ elat, elon = float(row["entrance_latitude"]), float(row["entrance_longitude"])
184
+ ada = str(row["entrance_type"]) in ADA_ACCESSIBLE_TYPES
185
+ elev = _sample_raster(DATA / "nyc_dem_30m.tif", elat, elon)
186
+ hand = _sample_raster(DATA / "hand.tif", elat, elon)
187
+ in_sandy = _inside_sandy(elat, elon)
188
+ dep_2080_class, dep_2080_label = _dep_class(elat, elon, "dep_extreme_2080")
189
+ dep_2050_class, dep_2050_label = _dep_class(elat, elon, "dep_moderate_2050")
190
+ findings.append(EntranceFinding(
191
+ station_id=str(row["station_id"]),
192
+ station_name=str(row["stop_name"]),
193
+ daytime_routes=str(row["daytime_routes"]),
194
+ borough=str(row["borough"]),
195
+ entrance_type=str(row["entrance_type"]),
196
+ entrance_lat=elat, entrance_lon=elon,
197
+ distance_m=round(float(row["distance_m"]), 1),
198
+ ada_accessible=ada,
199
+ elevation_m=round(elev, 2) if elev is not None else None,
200
+ hand_m=round(hand, 2) if hand is not None else None,
201
+ inside_sandy_2012=in_sandy,
202
+ dep_extreme_2080_class=dep_2080_class,
203
+ dep_extreme_2080_label=dep_2080_label,
204
+ dep_moderate_2050_class=dep_2050_class,
205
+ dep_moderate_2050_label=dep_2050_label,
206
+ ))
207
+
208
+ # Citywide rollups across the returned entrances.
209
+ n_in_sandy = sum(1 for f in findings if f.inside_sandy_2012)
210
+ n_in_dep_2080 = sum(1 for f in findings
211
+ if (f.dep_extreme_2080_class or 0) > 0)
212
+ n_ada = sum(1 for f in findings if f.ada_accessible)
213
+ return {
214
+ "available": True,
215
+ "n_entrances": len(findings),
216
+ "radius_m": radius_m,
217
+ "footprint_buffer_m": BUFFER_MTA_ENTRANCE_M,
218
+ "n_inside_sandy_2012": n_in_sandy,
219
+ "n_in_dep_extreme_2080": n_in_dep_2080,
220
+ "n_ada_accessible": n_ada,
221
+ "entrances": [vars(f) for f in findings],
222
+ "citation": ("MTA Open Data subway entrances + NYC OEM Sandy 2012 "
223
+ "Inundation Zone (5xsi-dfpx) + NYC DEP Stormwater "
224
+ "Flood Maps + USGS 3DEP DEM"),
225
+ }
226
+
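The roll-up counts in `summary_for_point` are plain generator sums over the findings; the same pattern works on the serialized dicts. A toy illustration (the findings below are invented, not real entrances):

```python
findings = [  # hypothetical per-entrance claims
    {"inside_sandy_2012": True,  "dep_extreme_2080_class": 3,    "ada_accessible": False},
    {"inside_sandy_2012": False, "dep_extreme_2080_class": 0,    "ada_accessible": True},
    {"inside_sandy_2012": True,  "dep_extreme_2080_class": None, "ada_accessible": False},
]

n_in_sandy = sum(1 for f in findings if f["inside_sandy_2012"])
# `or 0` maps a None ("layer unavailable") class to "not exposed" before the > 0 test.
n_in_dep_2080 = sum(1 for f in findings if (f["dep_extreme_2080_class"] or 0) > 0)
n_ada = sum(1 for f in findings if f["ada_accessible"])
```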
227
+
228
+ def main() -> int:
229
+ """CLI smoke test."""
230
+ import argparse
231
+ ap = argparse.ArgumentParser()
232
+ ap.add_argument("--lat", type=float, required=True)
233
+ ap.add_argument("--lon", type=float, required=True)
234
+ ap.add_argument("--radius", type=float, default=DEFAULT_RADIUS_M)
235
+ ap.add_argument("--max", type=int, default=DEFAULT_MAX_PER_QUERY)
236
+ args = ap.parse_args()
237
+ s = summary_for_point(args.lat, args.lon, args.radius, args.max)
238
+ print(json.dumps(s, indent=2, default=str))
239
+ return 0
240
+
241
+
242
+ if __name__ == "__main__":
243
244
+ sys.exit(main())
app/registers/nycha.py ADDED
@@ -0,0 +1,270 @@
1
+ """nycha_development_exposure — flood-exposure briefing per NYCHA development.
2
+
3
+ Same pattern as the MTA-entrance specialist, but NYCHA developments are
4
+ *polygons* not points, so the metrics shift to overlap fractions:
5
+
6
+ - % of footprint inside the 2012 Sandy Inundation Zone (empirical)
7
+ - % of footprint inside DEP Extreme-2080 / Moderate-2050 scenarios
8
+ (modeled, broken out by depth class)
9
+ - Representative-point elevation, HAND, TWI (proxy)
10
+ - Footprint area (km²)
11
+ - Distance from query point to development boundary
12
+
13
+ Joins:
14
+ - data/nycha.geojson (NYC Open Data, 218 NYCHA developments)
15
+ - data/sandy_inundation.geojson
16
+ - DEP Stormwater Flood Map polygons (3 scenarios)
17
+ - data/nyc_dem_30m.tif, data/hand.tif
18
+
19
+ Per queried (lat, lon), returns developments whose centroid is within
20
+ the radius (default 2000 m — NYCHA developments are sparser than
21
+ subway entrances, so the radius is wider).
22
+
23
+ Honest scope:
24
+ - This is exposure, not damage forecast. We say "85% of this
25
+ development's footprint is inside the 2012 Sandy zone" — not
26
+ "this development will flood next storm".
27
+ - All overlap fractions are computed in EPSG:2263 (NYC State Plane,
28
+ feet) for accurate area arithmetic in the city.
29
+ """
30
+
31
+ from __future__ import annotations
32
+
33
+ import json
34
+ import logging
35
+ import math
36
+ import sys
37
+ from dataclasses import dataclass
38
+ from functools import lru_cache
39
+ from pathlib import Path
40
+
41
+ _ROOT = Path(__file__).resolve().parents[2]
42
+ if str(_ROOT) not in sys.path:
43
+ sys.path.insert(0, str(_ROOT))
44
+
45
+ log = logging.getLogger("riprap.nycha")
46
+
47
+ DATA = _ROOT / "data"
48
+ NYCHA = DATA / "nycha.geojson"
49
+
50
+ DEFAULT_RADIUS_M = 2000
51
+ DEFAULT_MAX_PER_QUERY = 5
52
+
53
+
54
+ @dataclass
55
+ class DevelopmentFinding:
56
+ development: str
57
+ tds_num: str
58
+ borough: str
59
+ centroid_lat: float
60
+ centroid_lon: float
61
+ distance_m: float
62
+ footprint_km2: float
63
+ rep_elevation_m: float | None
64
+ rep_hand_m: float | None
65
+ pct_inside_sandy_2012: float
66
+ pct_in_dep_extreme_2080: float # any-depth (class>=1)
67
+ pct_in_dep_extreme_2080_deep: float # class==3 only ("Deep Contiguous")
68
+ pct_in_dep_moderate_2050: float
69
+
70
+
71
+ @lru_cache(maxsize=1)
72
+ def _load_nycha():
73
+ import geopandas as gpd
74
+ gdf = gpd.read_file(NYCHA).to_crs("EPSG:2263") # feet, accurate areas
75
+ gdf["centroid_2263"] = gdf.geometry.centroid
76
+ return gdf.reset_index(drop=True)
77
+
78
+
79
+ @lru_cache(maxsize=1)
80
+ def _load_sandy_2263():
81
+ """Load the Sandy zone in EPSG:2263 once. Already used by
82
+ app.flood_layers.sandy_inundation but we want the geometry directly
83
+ for overlap-fraction math."""
84
+ import geopandas as gpd
85
+ g = gpd.read_file(DATA / "sandy_inundation.geojson").to_crs("EPSG:2263")
86
+ # Some NYC OEM Sandy polygons have hole-orientation issues that
87
+ # blow up unary_union. buffer(0) fixes self-intersections without
88
+ # changing the footprint at sub-foot precision.
89
+ g["geometry"] = g.geometry.buffer(0)
90
+ return g.geometry.union_all()
91
+
92
+
93
+ @lru_cache(maxsize=4)
94
+ def _load_dep_2263(scenario: str):
95
+ """DEP scenario polygons in EPSG:2263, with depth_class column."""
96
+ import geopandas as gpd
97
+ p = DATA / "dep" / f"{scenario}.geojson"
98
+ if not p.exists():
99
+ # Fallback to whatever the existing dep_stormwater module loaded.
100
+ from app.flood_layers import dep_stormwater
101
+ gdf = dep_stormwater.load(scenario)
102
+ return gdf.to_crs("EPSG:2263") if gdf.crs is not None else gdf
103
+ return gpd.read_file(p).to_crs("EPSG:2263")
104
+
105
+
106
+ def _haversine_m(lat1, lon1, lat2, lon2) -> float:
107
+ R = 6371000.0
108
+ p1, p2 = math.radians(lat1), math.radians(lat2)
109
+ dp = math.radians(lat2 - lat1); dl = math.radians(lon2 - lon1)
110
+ a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
111
+ return 2 * R * math.asin(math.sqrt(a))
112
+
113
+
114
+ def _sample_raster(raster_path: Path, lat: float, lon: float) -> float | None:
115
+ if not raster_path.exists():
116
+ return None
117
+ try:
118
+ import rasterio
119
+ with rasterio.open(raster_path) as src:
120
+ v = next(src.sample([(lon, lat)]))[0]
121
+ v = float(v)
122
+ if math.isnan(v) or v == src.nodata:
123
+ return None
124
+ return v
125
+ except Exception:
126
+ log.exception("raster sample failed for %s", raster_path)
127
+ return None
128
+
129
+
130
+ def _developments_near(lat: float, lon: float, radius_m: float):
131
+ """Return developments whose centroid is within `radius_m` of
132
+ (lat, lon). Uses haversine on centroids re-projected back to
133
+ EPSG:4326 — the bbox prefilter gets us close, then exact distance."""
134
+ import geopandas as gpd
135
+ gdf = _load_nycha()
136
+ # Re-project centroids to 4326 for haversine
137
+ cents_4326 = gpd.GeoSeries(gdf["centroid_2263"], crs="EPSG:2263").to_crs("EPSG:4326")
138
+ deg = radius_m / 80_000 # 1 deg of longitude is ~84.4 km at NYC latitude; 80 km/deg over-covers both axes
139
+ cent_lat = cents_4326.y
140
+ cent_lon = cents_4326.x
141
+ mask = ((cent_lat >= lat - deg) & (cent_lat <= lat + deg)
142
+ & (cent_lon >= lon - deg) & (cent_lon <= lon + deg))
143
+ sub = gdf[mask].copy()
144
+ if sub.empty:
145
+ return sub, []
146
+ sub["clat"] = cent_lat[mask].values
147
+ sub["clon"] = cent_lon[mask].values
148
+ sub["distance_m"] = sub.apply(
149
+ lambda r: _haversine_m(lat, lon, r["clat"], r["clon"]),
150
+ axis=1,
151
+ )
152
+ sub = sub[sub["distance_m"] <= radius_m].sort_values("distance_m")
153
+ return sub, sub.index.tolist()
154
+
155
+
156
+ def _overlap_pct(geom_2263, mask_geom_2263) -> float:
157
+ """% of geom_2263's area that intersects mask_geom_2263."""
158
+ if mask_geom_2263 is None or mask_geom_2263.is_empty:
159
+ return 0.0
160
+ inter = geom_2263.intersection(mask_geom_2263)
161
+ if inter.is_empty:
162
+ return 0.0
163
+ return round(100.0 * inter.area / max(geom_2263.area, 1e-9), 2)
164
+
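For axis-aligned boxes the overlap fraction reduces to interval arithmetic, which makes `_overlap_pct`'s contract easy to sanity-check without GeoPandas. A stdlib stand-in (not the shapely path used above):

```python
def rect_overlap_pct(a, b):
    """% of rectangle a's area covered by rectangle b; rects are (minx, miny, maxx, maxy)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    w = max(0.0, min(ax1, bx1) - max(ax0, bx0))   # overlap width, 0 if disjoint in x
    h = max(0.0, min(ay1, by1) - max(ay0, by0))   # overlap height, 0 if disjoint in y
    area_a = max((ax1 - ax0) * (ay1 - ay0), 1e-9)  # same divide-by-zero guard as _overlap_pct
    return round(100.0 * w * h / area_a, 2)

# A 10x10 footprint half-covered by a shifted flood polygon -> 50.0
half = rect_overlap_pct((0, 0, 10, 10), (5, 0, 15, 10))
```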
165
+
166
+ def _dep_overlap(geom_2263, scenario: str) -> tuple[float, float]:
167
+ """Return (pct_any_depth, pct_deep_contiguous) of a polygon's area
168
+ inside the DEP scenario."""
169
+ try:
170
+ gdf = _load_dep_2263(scenario)
171
+ except Exception:
172
+ log.exception("DEP load failed for %s", scenario)
173
+ return 0.0, 0.0
174
+ if gdf is None or gdf.empty:
175
+ return 0.0, 0.0
176
+ # Bbox-prefilter the DEP polygons to those near our development.
177
+ minx, miny, maxx, maxy = geom_2263.bounds
178
+ cand = gdf.cx[minx:maxx, miny:maxy]
179
+ if cand.empty:
180
+ return 0.0, 0.0
181
+ # DEP NYC stormwater FGDB uses `Flooding_Category` (int16):
182
+ # 1=nuisance, 2=shallow, 3=deep contiguous (>4 ft).
183
+ cat_col = "Flooding_Category" if "Flooding_Category" in cand.columns else None
184
+ any_geom = cand.geometry.buffer(0).union_all()
185
+ if cat_col:
186
+ deep = cand[cand[cat_col] == 3]
187
+ deep_geom = deep.geometry.buffer(0).union_all() if not deep.empty else None
188
+ else:
189
+ deep_geom = None
190
+ pct_any = _overlap_pct(geom_2263, any_geom)
191
+ pct_deep = _overlap_pct(geom_2263, deep_geom) if deep_geom is not None else 0.0
192
+ return pct_any, pct_deep
193
+
194
+
195
+ def summary_for_point(lat: float, lon: float,
196
+ radius_m: float = DEFAULT_RADIUS_M,
197
+ max_developments: int = DEFAULT_MAX_PER_QUERY) -> dict:
198
+ near, _ = _developments_near(lat, lon, radius_m)
199
+ if near.empty:
200
+ return {"available": False,
201
+ "n_developments": 0,
202
+ "radius_m": radius_m,
203
+ "developments": []}
204
+
205
+ near = near.head(max_developments)
206
+ sandy_2263 = _load_sandy_2263()
207
+
208
+ findings: list[DevelopmentFinding] = []
209
+ for _, row in near.iterrows():
210
+ geom = row.geometry
211
+ # Representative interior point gives a more meaningful elevation
212
+ # than the centroid for irregular development footprints.
213
+ rep = geom.representative_point()
214
+ # Re-project the rep point to 4326 for raster sampling
215
+ import geopandas as gpd
216
+ rep_4326 = gpd.GeoSeries([rep], crs="EPSG:2263").to_crs("EPSG:4326").iloc[0]
217
+ rep_lat, rep_lon = rep_4326.y, rep_4326.x
218
+
219
+ elev = _sample_raster(DATA / "nyc_dem_30m.tif", rep_lat, rep_lon)
220
+ hand = _sample_raster(DATA / "hand.tif", rep_lat, rep_lon)
221
+ pct_sandy = _overlap_pct(geom, sandy_2263)
222
+ pct_2080_any, pct_2080_deep = _dep_overlap(geom, "dep_extreme_2080")
223
+ pct_2050_any, _ = _dep_overlap(geom, "dep_moderate_2050")
224
+
225
+ findings.append(DevelopmentFinding(
226
+ development=str(row["developmen"]),
227
+ tds_num=str(row["tds_num"]),
228
+ borough=str(row["borough"]),
229
+ centroid_lat=round(float(row["clat"]), 5),
230
+ centroid_lon=round(float(row["clon"]), 5),
231
+ distance_m=round(float(row["distance_m"]), 1),
232
+ footprint_km2=round(geom.area / 10.7639 / 1_000_000, 4), # sq-ft -> km²
233
+ rep_elevation_m=round(elev, 2) if elev is not None else None,
234
+ rep_hand_m=round(hand, 2) if hand is not None else None,
235
+ pct_inside_sandy_2012=pct_sandy,
236
+ pct_in_dep_extreme_2080=pct_2080_any,
237
+ pct_in_dep_extreme_2080_deep=pct_2080_deep,
238
+ pct_in_dep_moderate_2050=pct_2050_any,
239
+ ))
240
+
241
+ n_majority_sandy = sum(1 for f in findings if f.pct_inside_sandy_2012 >= 50)
242
+ n_any_2080 = sum(1 for f in findings if f.pct_in_dep_extreme_2080 > 0)
243
+ return {
244
+ "available": True,
245
+ "n_developments": len(findings),
246
+ "radius_m": radius_m,
247
+ "n_majority_inside_sandy_2012": n_majority_sandy,
248
+ "n_with_dep_2080_overlap": n_any_2080,
249
+ "developments": [vars(f) for f in findings],
250
+ "citation": ("NYC Open Data NYCHA Developments (phvi-damg) + "
251
+ "NYC OEM Sandy 2012 Inundation Zone (5xsi-dfpx) + "
252
+ "NYC DEP Stormwater Flood Maps + USGS 3DEP DEM"),
253
+ }
254
+
255
+
256
+ def main() -> int:
257
+ import argparse
258
+ ap = argparse.ArgumentParser()
259
+ ap.add_argument("--lat", type=float, required=True)
260
+ ap.add_argument("--lon", type=float, required=True)
261
+ ap.add_argument("--radius", type=float, default=DEFAULT_RADIUS_M)
262
+ ap.add_argument("--max", type=int, default=DEFAULT_MAX_PER_QUERY)
263
+ args = ap.parse_args()
264
+ s = summary_for_point(args.lat, args.lon, args.radius, args.max)
265
+ print(json.dumps(s, indent=2, default=str))
266
+ return 0
267
+
268
+
269
+ if __name__ == "__main__":
270
+ sys.exit(main())
app/score.py CHANGED
@@ -1,47 +1,345 @@
- """Transparent exposure scoring rubric. Published, not a black box.
-
- Each signal contributes a small integer; sum -> tier 1..4.
  """
  from __future__ import annotations

  import pandas as pd

- WEIGHTS = {
-     "sandy": 3,                 # empirical Sandy 2012 inundation
-     "dep_extreme_2080": 2,      # pluvial scenario, 3.66 in/hr + 2080 SLR
-     "dep_moderate_2050": 2,     # pluvial scenario, 2.13 in/hr + 2050 SLR
-     "dep_moderate_current": 1,  # pluvial scenario, 2.13 in/hr current
-     "complaints_3plus": 1,      # >=3 flood-related 311s within 200m, last 5 years
-     "floodnet_trigger": 1,      # FloodNet sensor within 400m with >=1 trigger event
-     "policy_named": 1,          # named in HMP/NPCC4/agency plan paragraph (RAG hit)
  }


  def tier(score: int) -> int:
-     if score >= 6:
-         return 1
-     if score >= 4:
-         return 2
-     if score >= 2:
-         return 3
-     if score >= 1:
-         return 4
      return 0


  def score_row(signals: dict) -> tuple[int, int]:
-     s = 0
-     for k, w in WEIGHTS.items():
-         if signals.get(k):
-             s += w
-     return s, tier(s)


  def score_frame(df: pd.DataFrame) -> pd.DataFrame:
      out = df.copy()
-     out["score"] = 0
-     for k, w in WEIGHTS.items():
-         if k in out.columns:
-             out["score"] += out[k].astype(bool).astype(int) * w
-     out["tier"] = out["score"].map(tier)
      return out
+ """Riprap exposure scoring: a research-grounded deterministic rubric.
+
+ This is an EXPOSURE index, not a damage probability. It produces a tier
+ 1-4 (0 when no signal fires) from a thematic additive composite over
+ min-max-normalized indicators within sub-indices. The same input always
+ produces the same tier; live signals (NWS alerts, surge residual, hourly
+ precip) are NOT in this score — they are surfaced as a separate "current
+ conditions" badge per NPCC4 / IPCC AR6 WG II's distinction between
+ exposure (a quasi-stationary property of place) and event occurrence
+ (time-varying).
+
+ Methodology:
+ - Cutter, Boruff & Shirley, 2003. "Social Vulnerability to Environmental
+   Hazards." Social Science Quarterly 84(2): 242-261. — hazards-of-place
+   composite construction.
+ - Tate, 2012. "Social Vulnerability Indices: A Comparative Assessment
+   Using Uncertainty and Sensitivity Analysis." Natural Hazards 63:
+   325-347. — equal weights within thematic groups are the most
+   rank-stable default; differential weighting is hard to defend.
+ - Balica, Wright & van der Meulen, 2012. "A Flood Vulnerability Index
+   for Coastal Cities." Natural Hazards 64: 73-105. — multiplicative
+   override behaviour; we recover the important part as a "max-empirical
+   floor" rather than a full multiplicative form.
+
+ Per-indicator citations:
+ - HAND breakpoints: Nobre et al., 2011. "Height Above the Nearest
+   Drainage." J. Hydrology 404: 13-29.
+ - TWI: Beven & Kirkby, 1979. Hydrological Sciences Bulletin 24; Sørensen,
+   Zinko & Seibert, 2006. HESS 10: 101-112. (Half-weight because TWI is
+   noisier than HAND in flat urban DEMs; we percentile-bin rather than
+   use absolute cutoffs.)
+ - Zone hierarchy: NYC NPCC4 (2024) Ch. 3; NYC Hazard Mitigation Plan 2024.
+ - USGS HWM proximity floor: USGS HWM positional uncertainty is typically
+   5-30 m horizontal, so 100 m gives ~3σ headroom for a true "this
+   address was inundated" signal.
+
+ Scope limit: we have no labeled flood-damage outcomes. The tier is a
+ literature-grounded exposure prior, not a calibrated loss prediction.
+ For insurance pricing, use FEMA Risk Rating 2.0 (claims-driven GLM).
  """
  from __future__ import annotations

  import pandas as pd

+ # ---------- Indicator schemas ----------------------------------------------
+ #
+ # Each sub-index is a mapping {indicator_name: weight}. Within a sub-index,
+ # the weighted sum is normalized by the maximum possible weight, giving a
+ # 0-1 score per sub-index. The composite is the sum of the three sub-index
+ # scores (range 0-3), then mapped to tiers.
+ #
+ # Why equal weights within thematic groups: Tate 2012's uncertainty
+ # analysis showed that differential weighting is the most-attacked axis
+ # of any composite vulnerability/exposure index. Equal weights are the
+ # safest default; agency tiering (which puts FEMA 1% above 0.2%, Sandy
+ # above modeled scenarios) supplies the remaining structure.

+ REGULATORY = {
+     # FEMA NFHL — regulatory baseline. SFHA (1%) is the mandate threshold.
+     "fema_1pct": 1.00,
+     "fema_02pct": 0.50,
+     # NYC DEP Stormwater Maps (2021) — modeled pluvial scenarios.
+     # Moderate-2050 is weighted above Extreme-2080 because NPCC4
+     # explicitly designates 2080 SLR + 7 in/hr as a TAIL scenario.
+     "dep_moderate_2050": 0.75,
+     "dep_extreme_2080": 0.50,
+     "dep_tidal_2050": 0.75,
+ }

+ HYDROLOGICAL = {
+     # HAND (Height Above Nearest Drainage), banded per Nobre et al. 2011.
+     # Bands: <1 m (channel/floodplain, near-certain wet) -> 1.0
+     #        1-3 m (floodplain)                          -> 0.66
+     #        3-10 m (transitional)                       -> 0.33
+     #        >10 m (hillslope, dry)                      -> 0
+     "hand_band": 1.00,
+     # TWI quartile (top quartile = saturation-prone). Half-weight
+     # because TWI is noisier than HAND in urban DEMs; we percentile-bin
+     # within NYC rather than using absolute cutoffs.
+     "twi_quartile": 0.50,
+     # Local-relief inversions: low percentile = topographic low point.
+     # Bins: <10th=1.0, 10-25th=0.66, 25-50th=0.33, >=50th=0.
+     "elev_pct_200m_inv": 0.50,
+     "elev_pct_750m_inv": 0.50,
+     # Basin relief contributes a small additional terrain term.
+     "basin_relief_band": 0.25,
+ }

+ EMPIRICAL = {
+     # Sandy 2012 inundation — empirical post-event extent. Also triggers
+     # the max-empirical FLOOR rule below.
+     "sandy": 1.00,
+     # USGS Hurricane Ida 2021 high-water marks. Within 100 m -> "direct"
+     # (also triggers the floor); 100-800 m -> "neighborhood proximity".
+     "ida_hwm_within_100m": 1.00,
+     "ida_hwm_within_800m": 0.50,
+     # Prithvi-EO 2.0 satellite-derived inundation polygon (Hurricane Ida
+     # pre/post diff) — semi-empirical: model-derived but conditioned on
+     # observed Sentinel-2 imagery.
+     "prithvi_polygon": 0.75,
+     # NYC 311 flood-related complaint count, banded over a 5-year window:
+     #     >=10 -> 1.0, 3-9 -> 0.66, 1-2 -> 0.33, 0 -> 0
+     # Weight capped at 0.75 because 311 has documented socio-economic
+     # reporting bias (engagement varies by neighborhood).
+     "complaints_band": 0.75,
+     # FloodNet trigger flag (any labeled flood event at any sensor
+     # within 600 m, last 3 years). Same 0.75 cap as 311, since both have
+     # spatial coverage bias.
+     "floodnet_trigger": 0.75,
+ }


+ def _hand_band(hand_m: float | None) -> float:
+     """Nobre et al. 2011 HAND classes adapted for NYC's flat urban terrain."""
+     if hand_m is None:
+         return 0.0
+     if hand_m < 1.0:
+         return 1.0
+     if hand_m < 3.0:
+         return 0.66
+     if hand_m < 10.0:
+         return 0.33
+     return 0.0


+ def _percentile_inv_band(pct: float | None) -> float:
+     """Inverted relief percentile: lower = more exposed (water pools here)."""
+     if pct is None:
+         return 0.0
+     if pct < 10:
+         return 1.0
+     if pct < 25:
+         return 0.66
+     if pct < 50:
+         return 0.33
+     return 0.0


+ def _twi_quartile(twi: float | None) -> float:
+     """TWI thresholds calibrated to NYC's flat 30 m DEM. The top-quartile
+     cutoff comes from the NYC-wide TWI distribution; here we approximate
+     it with literature-typical breakpoints (per Sørensen 2006's
+     site-specific advice)."""
+     if twi is None:
+         return 0.0
+     if twi >= 12:
+         return 1.0
+     if twi >= 10:
+         return 0.66
+     if twi >= 8:
+         return 0.33
+     return 0.0


+ def _basin_relief_band(relief_m: float | None) -> float:
+     if relief_m is None:
+         return 0.0
+     # Higher basin relief in a flat area means the address sits in a real
+     # depression. Banding is empirical for NYC.
+     if relief_m >= 8:
+         return 1.0
+     if relief_m >= 4:
+         return 0.66
+     if relief_m >= 2:
+         return 0.33
+     return 0.0


+ def _complaints_band(n: int | None) -> float:
+     if not n:
+         return 0.0
+     if n >= 10:
+         return 1.0
+     if n >= 3:
+         return 0.66
+     if n >= 1:
+         return 0.33
+     return 0.0


+ # ---------- Sub-index computation ------------------------------------------

+ def _normalize(weighted: float, weights: dict[str, float]) -> float:
+     max_w = sum(weights.values())
+     return weighted / max_w if max_w else 0.0


+ def regulatory_subindex(s: dict) -> float:
+     """0..1. All inputs are binary (inside zone or not)."""
+     w = REGULATORY
+     raw = sum(w[k] * (1.0 if s.get(k) else 0.0) for k in w)
+     return _normalize(raw, w)


+ def hydrological_subindex(s: dict) -> float:
+     """0..1. Inputs are continuous; convert to ordinal bands first."""
+     w = HYDROLOGICAL
+     bands = {
+         "hand_band": _hand_band(s.get("hand_m")),
+         "twi_quartile": _twi_quartile(s.get("twi")),
+         "elev_pct_200m_inv": _percentile_inv_band(s.get("rel_elev_pct_200m")),
+         "elev_pct_750m_inv": _percentile_inv_band(s.get("rel_elev_pct_750m")),
+         "basin_relief_band": _basin_relief_band(s.get("basin_relief_m")),
+     }
+     raw = sum(w[k] * bands[k] for k in w)
+     return _normalize(raw, w)


+ def empirical_subindex(s: dict) -> float:
+     """0..1. Mix of binary and banded count signals."""
+     w = EMPIRICAL
+     vals = {
+         "sandy": 1.0 if s.get("sandy") else 0.0,
+         "ida_hwm_within_100m": 1.0 if s.get("ida_hwm_within_100m") else 0.0,
+         "ida_hwm_within_800m": 1.0 if s.get("ida_hwm_within_800m") else 0.0,
+         "prithvi_polygon": 1.0 if s.get("prithvi_polygon") else 0.0,
+         "complaints_band": _complaints_band(s.get("complaints_count")),
+         "floodnet_trigger": 1.0 if s.get("floodnet_trigger") else 0.0,
+     }
+     raw = sum(w[k] * vals[k] for k in w)
+     return _normalize(raw, w)


+ # ---------- Composite + tier mapping ---------------------------------------

+ # Tier breakpoints over the composite (range 0-3, since each sub-index is
+ # 0-1). Tuned so that "Sandy + DEP-2050 + HAND<1m" lands in Tier 1, and a
+ # single positive signal lands in Tier 4. Documented in METHODOLOGY.md.
+ TIER_BREAKPOINTS = [
+     (1.50, 1),  # high — multiple sub-indices saturated
+     (1.00, 2),  # elevated — at least one strong sub-index
+     (0.50, 3),  # moderate — partial signals across categories
+     (0.01, 4),  # limited — a single contextual signal
+ ]

+ TIER_LABELS = {
+     1: ("High exposure", "Multiple sub-indices saturated; empirical and/or "
+        "modeled scenarios both indicate substantial exposure."),
+     2: ("Elevated exposure", "At least one sub-index near saturation; "
+        "significant overlap with empirical or modeled scenarios."),
+     3: ("Moderate exposure", "Partial signals across categories; scenario- "
+        "or neighborhood-specific exposure."),
+     4: ("Limited exposure", "A single contextual signal; no positive "
+        "scenario hits."),
+     0: ("No flagged exposure", "No positive flood signal across the "
+        "assessed sources."),
  }


+ def composite(signals: dict) -> dict:
+     """Compute sub-indices, composite score, and tier with the floor rule.
+
+     Returns: {
+         'subindices': {'regulatory': 0..1, 'hydrological': 0..1,
+                        'empirical': 0..1},
+         'composite': 0..3,
+         'tier': 0..4,
+         'floor_applied': bool,
+     }
+
+     Max-empirical floor: if Sandy 2012 inundation OR a USGS Ida HWM within
+     100 m fired, the tier is capped at 2 (cannot be worse). This recovers
+     the multiplicative behavior — empirical evidence overrides terrain or
+     modeled scenarios — without giving up additive transparency.
+     """
+     reg = regulatory_subindex(signals)
+     hyd = hydrological_subindex(signals)
+     emp = empirical_subindex(signals)
+     composite_score = reg + hyd + emp
+
+     raw_tier = 0
+     for bp, t in TIER_BREAKPOINTS:  # `bp`, not `breakpoint` (shadows builtin)
+         if composite_score >= bp:
+             raw_tier = t
+             break
+
+     floor_applied = bool(signals.get("sandy") or signals.get("ida_hwm_within_100m"))
+     if floor_applied and (raw_tier == 0 or raw_tier > 2):
+         final_tier = 2
+     else:
+         final_tier = raw_tier
+
+     return {
+         "subindices": {
+             "regulatory": round(reg, 3),
+             "hydrological": round(hyd, 3),
+             "empirical": round(emp, 3),
+         },
+         "composite": round(composite_score, 3),
+         "tier": final_tier,
+         "floor_applied": floor_applied,
+     }


+ # ---------- Backward-compat shims ------------------------------------------
+ # The register CLI and register_builder consume a flat `tier` column on a
+ # DataFrame. The shim materializes composite() over rows and writes back
+ # `score` (composite scaled 0-100) and `tier`.

  def tier(score: int) -> int:
+     """Legacy bridge for callers that still pass the old small-integer
+     additive score. Keeps the original integer breakpoints unchanged;
+     prefer composite() for new code."""
+     if score >= 6:
+         return 1
+     if score >= 4:
+         return 2
+     if score >= 2:
+         return 3
+     if score >= 1:
+         return 4
      return 0


+ # Legacy WEIGHTS map kept so riprap.py and any external consumer
+ # continue to import without breaking. The new composite() is the
+ # authoritative scorer.
+ WEIGHTS = {
+     "sandy": 3,
+     "dep_extreme_2080": 2,
+     "dep_moderate_2050": 2,
+     "dep_moderate_current": 1,
+     "complaints_3plus": 1,
+     "floodnet_trigger": 1,
+     "policy_named": 1,
+ }


  def score_row(signals: dict) -> tuple[int, int]:
+     """Legacy-shape wrapper around composite(). Returns (composite_x100, tier)."""
+     c = composite(signals)
+     return int(round(c["composite"] * 100)), c["tier"]


  def score_frame(df: pd.DataFrame) -> pd.DataFrame:
+     """Vectorized composite over a DataFrame whose columns name our
+     indicators. Missing columns are treated as 0 / None.
+
+     Adds columns: subindex_regulatory, subindex_hydrological,
+     subindex_empirical, composite, score, tier, floor_applied.
+     `score` is the composite scaled 0-100 for register CSV legibility.
+     """
      out = df.copy()
+     rows = out.to_dict(orient="records")
+     results = [composite(r) for r in rows]
+     out["subindex_regulatory"] = [r["subindices"]["regulatory"] for r in results]
+     out["subindex_hydrological"] = [r["subindices"]["hydrological"] for r in results]
+     out["subindex_empirical"] = [r["subindices"]["empirical"] for r in results]
+     out["composite"] = [r["composite"] for r in results]
+     out["score"] = (out["composite"] * 100).round().astype(int)
+     out["tier"] = [r["tier"] for r in results]
+     out["floor_applied"] = [r["floor_applied"] for r in results]
      return out
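A minimal, self-contained sketch of how the new scorer arrives at a tier, including the max-empirical floor. The weight sums (3.5 regulatory, 2.75 hydrological, 4.75 empirical) are copied from the schemas in the diff above; the signal values are illustrative, not from any real address:

```python
# Condensed re-derivation of the composite + floor math from app/score.py.
REG_MAX, HYD_MAX, EMP_MAX = 3.5, 2.75, 4.75
TIER_BREAKPOINTS = [(1.50, 1), (1.00, 2), (0.50, 3), (0.01, 4)]

# Hypothetical address: inside Sandy extent, HAND < 1 m, in DEP 2050 zone.
signals = {"sandy": True, "hand_m": 0.5, "dep_moderate_2050": True}

reg = 0.75 / REG_MAX   # only dep_moderate_2050 (weight 0.75) fires
hyd = 1.00 / HYD_MAX   # hand_m < 1 m -> band 1.0 at weight 1.0
emp = 1.00 / EMP_MAX   # sandy fires at weight 1.0
composite_score = reg + hyd + emp               # ~0.788 of a possible 3

raw_tier = next((t for bp, t in TIER_BREAKPOINTS
                 if composite_score >= bp), 0)  # -> 3 ("moderate")
floor = bool(signals.get("sandy"))              # empirical floor fires
final_tier = 2 if floor and raw_tier > 2 else raw_tier

print(round(composite_score, 3), raw_tier, final_tier)  # 0.788 3 2
```

This shows the design point of the floor rule: the additive composite alone would call this address Tier 3, but the observed Sandy inundation caps the tier at 2 without disturbing the transparent additive score.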
audit/2026-05-03-evening-audit.md ADDED
@@ -0,0 +1,183 @@
+ # Riprap Hackathon Week Audit — 2026-05-03 Evening
+
+ ## TL;DR
+
+ The four register specialists (MTA entrances, NYCHA, DOE schools, DOH hospitals) are **shipped, FSM-wired, and validated end-to-end** as of commit `86861be` — the "subway-entrance specialist drift" hypothesis is wrong; it landed Sunday afternoon. The TerraMind-NYC fine-tune is running in its dedicated session (eval spec v2 in place, v1 postmortemed). The biggest real drift is in **pitch artifacts**: `pitch/cold_open.md` was deleted by commit `1cb5ee6` (Sunday 18:59 ET) along with the entire `pitch/` directory — `MONDAY.md` still shows it as ✓. The Build-in-Public posts, methodology paper PDF, ASCE materials, historical-event mode, and the four extra TTM specialists are **not started**. Visual identity v0.4.1/v0.4.2 is largely landed in `web/sveltekit/`. The planner-level refusal shim from Phase 6 is **documented as shipping in the FSM but is not actually wired into `app/planner.py`**.
+
+ ## Specialist roster
+
+ | Specialist | Exists | Wired into FSM | Tested | Tier | Last touched | Notes |
+ |---|---|---|---|---|---|---|
+ | `geocode` | ✓ | ✓ | ✓ (integration) | reference | baseline | `app/geocode.py` |
+ | `sandy_inundation` | ✓ | ✓ | ✓ | empirical | baseline | NYC-only gated |
+ | `dep_stormwater` | ✓ | ✓ | ✓ | modeled | baseline | 3 scenarios |
+ | `floodnet` | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `nyc311` | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `noaa_tides` | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `nws_alerts` | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `nws_obs` | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `ttm_forecast` (Battery surge) | ✓ | ✓ | ✓ | modeled | baseline | TTM r2 |
+ | `ttm_311_forecast` | ✓ | ✓ | ✓ | modeled | baseline | per-address TTM r2 |
+ | `microtopo_lidar` | ✓ | ✓ | ✓ | proxy | baseline | |
+ | `ida_hwm_2021` | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `prithvi_eo_v2` (baked Ida polys) | ✓ | ✓ | ✓ | empirical | baseline | |
+ | `prithvi_eo_live` (Sentinel-2) | ✓ | ✓ (heavy) | ✓ | empirical | baseline | gated by `RIPRAP_HEAVY_SPECIALISTS` |
+ | `terramind_synthesis` (DEM→LULC) | ✓ | ✓ (heavy) | — | synthetic | baseline | |
+ | `rag_granite_embedding` | ✓ | ✓ | ✓ | reference | baseline | |
+ | `gliner_extract` | ✓ | ✓ | ✓ | reference | baseline | |
+ | **`mta_entrance_exposure`** | ✓ | ✓ | ✗ (no per-specialist test) | mixed | 2026-05-03 | first output Sheepshead Bay |
+ | **`nycha_development_exposure`** | ✓ | ✓ (heavy) | ✗ | mixed | 2026-05-03 | first output Red Hook |
+ | **`doe_school_exposure`** | ✓ | ✓ (heavy) | ✗ | mixed | 2026-05-03 | first output Coney Island |
+ | **`doh_hospital_exposure`** | ✓ | ✓ (heavy) | ✗ | mixed | 2026-05-03 | first output Coney Island |
+ | FEMA OpenFEMA NFIP claims | ✗ | ✗ | ✗ | — | — | not started |
+ | NWS NWPS reach forecast | ✗ | ✗ | ✗ | — | — | not started |
+ | USGS NWIS streamgages | ✗ | ✗ | ✗ | — | — | not started |
+ | NYC DEP CSO/Bluebelt/GI | ✗ | ✗ | ✗ | — | — | not started |
+ | TTM streamgage stage forecast | ✗ | ✗ | ✗ | — | — | not started |
+ | TTM FloodNet sensor depth | ✗ | ✗ | ✗ | — | — | not started |
+ | TTM NWS rainfall accumulation | ✗ | ✗ | ✗ | — | — | not started |
+ | TTM citywide 311 sewer-backup | ✗ | ✗ | ✗ | — | — | not started |
+ | Granite Guardian terminal check | ✗ | ✗ (pivoted) | — | — | — | replaced by planner-level refusal in design — but see Anomalies |
+
+ ## Foundation models
+
+ | Model | Imported | Instantiated | Called | Routed via LiteLLM | Notes |
+ |---|---|---|---|---|---|
+ | Granite 4.1:3b (planner) | ✓ | ✓ | ✓ | ✓ | `app/planner.py` via `app.llm.chat` |
+ | Granite 4.1:8b (reconciler) | ✓ | ✓ | ✓ | ✓ | `app/reconcile.py`, `app/mellea_validator.py` |
+ | Granite Embedding 278M | ✓ | ✓ | ✓ | n/a (HF transformers) | `app/rag.py` |
+ | Granite Reranker R2 | ✓ | ✓ | ✓ (when enabled) | n/a | gated; see test_phase3 |
+ | GLiNER medium v2.1 | ✓ | ✓ | ✓ | n/a | `app/context/gliner_extract.py` |
+ | Prithvi-EO 2.0 Sen1Floods11 | ✓ | ✓ (heavy) | ✓ (live) | n/a | `app/flood_layers/prithvi_live.py` |
+ | TerraMind 1.0 base | ✓ | ✓ (heavy) | ✓ | n/a | `app/context/terramind_synthesis.py` |
+ | Granite TTM r2 (surge + 311) | ✓ | ✓ | ✓ | n/a | `app/live/ttm_forecast.py` — only 2 of planned 6 instances |
+ | Granite Guardian 3.2 3B-A800M | ✗ | ✗ | ✗ | — | dropped per Phase 6 pivot |
+
+ ## Data sources
+
+ | Source | Status | Consumer | Notes |
+ |---|---|---|---|
+ | Sandy Inundation 2012 (NYC OEM) | implemented | `sandy_inundation`, all 4 registers | `data/sandy_inundation.geojson` |
+ | NYC DEP Stormwater Flood Map | implemented | `dep_stormwater`, registers | 3 scenarios |
+ | FloodNet sensors | implemented | `floodnet` | |
+ | NYC 311 service requests | implemented | `nyc311`, `ttm_311_forecast` | |
+ | NOAA CO-OPS tides | implemented | `noaa_tides`, `ttm_forecast` | |
+ | NWS alerts + obs | implemented | `nws_alerts`, `nws_obs` | |
+ | Hurricane Ida HWMs (USGS) | implemented | `ida_hwm` | |
+ | Prithvi-EO Ida polygons (baked) | implemented | `prithvi_water` | |
+ | Sentinel-2 via Planetary Computer | implemented | `prithvi_live` | heavy |
+ | MTA Subway Entrances 2024 | implemented | `mta_entrances` | `data/mta_entrances.geojson` (2120 entrances) |
+ | USGS 3DEP DEM 1m / HAND | implemented | registers | `data/nyc_dem_30m.tif`, `data/hand.tif` |
+ | NYCHA Development Data Book | implemented | `nycha` | per `b196bd8`+ |
+ | NYC DOE School Locations | implemented | `doe_schools` | |
+ | NYS DOH / NYC hospitals | implemented | `doh_hospitals` | |
+ | MTA Sandy-recovery report | not started | (queued) | Monday plan — `[mta_recovery_<station_id>]` doc messages |
+ | FEMA OpenFEMA NFIP claims | not started | — | |
+ | NWS NWPS reach forecast API | not started | — | |
+ | USGS NWIS streamgages | not started | — | |
+ | NYC DEP CSO outfalls | not started | — | |
+ | NYC DEP Bluebelt | not started | — | |
+ | NYC DEP Green Infrastructure DB | not started | — | |
+ | PLUTO building footprints | not started | (queued — fixes centroid-edge) | NYU Langone/Stuyvesant/P.S. 89 false-negatives |
+
+ ## Design system v0.4.1/v0.4.2 integration
+
+ | Item | Status | Notes |
+ |---|---|---|
+ | Carto Positron / Voyager basemap | ✓ | `web/sveltekit/src/lib/components/map/baseStyle.ts` |
+ | IBM Plex Sans/Mono/Serif | ✓ | `tokens.css` |
+ | Four-tier color palette (CSS vars, WCAG-fixed values) | ✓ | `tokens.css` matches the spec hex codes |
+ | Epistemic-tier glyph SVG | ✓ | `lib/components/glyphs/TierGlyph.svelte` |
+ | Per-claim margin glyph rendering | ✓ | `Briefing.svelte` + `Claim.svelte` |
+ | Section-head tier badges | ✓ | `SectionHead.svelte`, `TierBadge.svelte` |
+ | Hoverable inline citations + drawer | ✓ | `Cite.svelte`, `CitationDrawer.svelte` |
+ | Trace UI as `<details>` tree with tier badges | ✓ | `TraceUI.svelte`, `TraceRow.svelte` |
+ | Layers panel with tier badges | ✓ | `MapLegend.svelte` (4 layer entries hit by demo deck) |
+ | Cold-start with sample queries | ✓ | `ColdStart.svelte` |
+ | Trust-signal footer | ✓ | `AppFooter.svelte` |
+ | WeasyPrint PDF template | ✗ | only routed at `/print/{query_id}` (browser print); no WeasyPrint dep in `requirements.txt` |
+ | Browser print stylesheet | ✓ | `lib/print.css` |
+ | Loading / skeleton states | ✓ | `SkeletonBriefing.svelte` |
+ | Error states | ✓ | `ErrorCard.svelte` |
+ | Refusal state UI | ✓ component exists | `GuardianRefusal.svelte` — but back-end refusal classifier not wired (see Anomalies) |
+ | Reroll banner | ✓ | `RerollBanner.svelte` |
+ | Synthetic-stripe SVG pattern | ✓ | `synStripe.ts`, `ThumbStripe.svelte` |
+ | Granite version string = 4.1 | ✓ (sampled) | |
+ | RegisterCard evidence format | ✓ | `RegisterCard.svelte` (rendered in `nyu-langone` demo run) |
+ | Dark mode | unverified | not searched |
+
+ ## Accessibility
+
+ | WCAG 2.2 AA item | Status | Notes |
+ |---|---|---|
+ | Tier color contrast verified | ✓ | tokens.css comments document per-color ratios + AA/AAA passes |
+ | Color independence (glyph shape) | ✓ | TierGlyph component exists |
+ | Skip-links | ✓ | `SkipLinks.svelte` |
+ | Focus rings | unverified | `--accent-graphical: #D17C00` token exists but per-element outline rules not audited |
+ | Heading hierarchy | unverified | not audited |
+ | Touch-target sizing | unverified | not audited |
+ | `role="log"` aria-live polite for streaming | ✓ | found in `agentStream.ts`, `Briefing.svelte`, `RerollBanner.svelte`, `SkeletonBriefing.svelte` |
+ | Map `role="application"` + alt-text | unverified | grep didn't surface — needs walk through `RipMap.svelte` |
+ | `prefers-reduced-motion` respected | ✓ (partial) | rules in `tokens.css` and `styles.css`; per-component coverage unverified |
+ | Plain-language redirect for resident queries | unverified | `ColdStart.svelte` mentions FloodHelpNY redirect per spec — not visually verified |
+ | Glyph alt-text (`role="img"`) | unverified | not audited |
+
+ ## Keep-list and pitch artifacts
+
+ - ✓ `experiments/05_terramind_nyc_finetune/eval/eval_spec.md` — present (also `eval_spec_v2.md` with v1 postmortem at `eval/v1_synth_sar_postmortem.md`)
+ - ⚠ `pitch/cold_open.md` — **DELETED** by commit `1cb5ee6` (2026-05-03 18:59 ET, "Demo deck: 10/10 live SSE tests"). MONDAY.md still says ✓. Last good content is in commit `b4239de`.
+ - ✓ `experiments/06_granite_guardian/adversarial_queries.jsonl` — present + planner-pivot results in `planner_refusal_summary.md` and `RESULTS.md`
+ - ✗ `experiments/07_historical_event_mode/` — does not exist
+ - ✗ Methodology paper draft (6-8 page PDF) — only `METHODOLOGY.md` (264 lines, scoring-methodology only, not the publication paper draft)
+ - ✗ `pitch/` directory — gone (deleted with cold_open.md). Demo-side artifacts now live in `web/sveltekit/tests/e2e/demo-script.md` and the (gitignored) `pitch/screenshots-2026-05-03/`
+ - ✗ `asce/` — does not exist
+ - ✗ Build-in-Public posts — no `posts/`, `build_in_public/`, or comparable directory
+
+ ## Integration tests
+
+ - 26 tests collected in `tests/test_integration.py` (parametrized over `brighton`, `hollis`, `hunts`); plus `test_agent_e2e.py`, `test_agent_full.py`, `test_sample_queries.py`. Not executed (would exceed the 30 s budget — they hit the live SSE stream).
+ - The 4 new register specialists have **no per-specialist integration test** in `tests/`. Coverage is via the e2e `demo-queries.spec.ts` Playwright suite (`web/sveltekit/tests/e2e/`), which runs them in the FSM during `nyu-langone`, `red-hook-houses-nycha`, `coney-island`, `sheepshead-bay` queries.
+ - Frontend Playwright suites: `coldstart`, `demo-queries`, `layers`, `print`, `sample`, `states`, `sticky-map` (7 spec files).
+
+ ## experiments/ directory
+
+ - `00_endpoints` — completed (RESULTS.md, 8/8 endpoint smokes)
+ - `01_prithvi_live_water` — completed
+ - `02_gliner_extraction` — completed
+ - `03_granite_reranker` — completed
+ - `04_terramind_synthetic_sar` — parked-as-research per commit `271e673`
+ - `05_sam2_promptable` — empty directory (mid-flight or abandoned scaffolding)
+ - `05_terramind_finetune` — early micro-FT scaffold (`micro.py` + `RESULTS.md`); superseded by `05_terramind_nyc_finetune/`
+ - `05_terramind_nyc_finetune` — **active in another session**; eval_spec_v2 in place, training subdir present
+ - `06_chronos_bolt_forecast` — empty directory (not started)
+ - `06_granite_guardian` — completed-as-pivot (Guardian → planner shim; `planner_refusal_summary.md` documents FAIL on the 5% FP gate)
+ - `07_mta_entrances` — completed and migrated to `app/registers/mta_entrances.py`
+ - `08_nycha_developments` — completed and migrated
+ - `09_doe_schools` — completed and migrated
+ - `10_doh_hospitals` — completed and migrated
+
+ ## Anomalies and weird things
+
+ - `experiments/05_sam2_promptable/` and `experiments/06_chronos_bolt_forecast/` are **empty directories** — either abandoned scaffolds or interrupted sessions. There is also `experiments/05_terramind_finetune/` (early micro-FT) sitting next to `experiments/05_terramind_nyc_finetune/` (current).
+ - **Numbering collision at `05_*` and `06_*`** between the empty/legacy dirs and the active ones.
+ - **Planner-level refusal shim is documented as shipping but is not in `app/`.** `experiments/06_granite_guardian/RESULTS.md` and MONDAY.md both say "the planner-level refusal shim still ships in the FSM as a polite-refusal layer." A grep for `refusal|guardian` across `app/` (including `app/planner.py`) returns no hits. The frontend `GuardianRefusal.svelte` component exists but has no backend signal to display.
+ - **`pitch/cold_open.md` deletion** by `1cb5ee6` is almost certainly accidental — that commit's message describes adding 6 demo queries and a `demo-script.md`; deleting the cold-open is unrelated and not mentioned. Likely a casualty of moving screenshots into a gitignored path.
+ - **`Riprap.zip` at repo root** is untracked — a leftover archive.
+ - **CLAUDE.md / MONDAY.md disagree on the AMD droplet IP**: CLAUDE.md never mentions an IP (uses `<droplet-ip>` placeholders); MONDAY.md explicitly says CLAUDE.md is wrong (cites `165.245.134.44`) and that `129.212.182.52` is production. A grep of CLAUDE.md doesn't surface the wrong IP, so the MONDAY.md note may itself be stale.
+ - **MONDAY.md status table is out of sync** with the deletion of `pitch/cold_open.md`.
+ - No TODO/FIXME/XXX comments in `app/` Python or in `web/sveltekit/src/`.
+ - No imports from `experiments.*` inside `app/` or `web/`.
+ - No specialists registered in `app/fsm.py` are missing from `app/registers/` or `app/context/` (and vice versa — clean).
+ - WeasyPrint is referenced in MONDAY.md / the spec but is **not in `requirements.txt`** — the `/print/{query_id}` route serves a browser-print page only.
+
+ ## The single most important gap
+
+ The originally-suspected subway-entrance specialist gap is not real — that work shipped Sunday afternoon and is wired through the FSM, the reconciler, and the demo Playwright suite. The single most important *actual* gap is the **deletion of `pitch/cold_open.md` (and the entire `pitch/` directory) in commit `1cb5ee6`**. The cold-open phrasing was an explicit Sunday keep-list item ("seven-tunnels framing, no inflated dollar figure"), Sunday's MONDAY.md handoff still treats it as ✓, and the AMD demo on May 10 will require it. The content is recoverable from `git show b4239de:pitch/cold_open.md` and should be restored before any other Monday work begins.
+
+ ## Recommended next-session priorities
+
+ 1. **Restore `pitch/cold_open.md`** — `git show b4239de:pitch/cold_open.md > pitch/cold_open.md` and commit. ~5 min. Done = file present, MONDAY.md row still accurate, content matches the seven-tunnels framing.
+ 2. **Wire the planner-level refusal shim into `app/planner.py`** — the documented contract from Phase 6 (FN=0% safety-critical) is not actually live. ~30–60 min. Done = planner returns a `refusal_reason` field on the 50 should-refuse adversarial queries; `GuardianRefusal.svelte` renders end-to-end on at least one out-of-scope query in the e2e suite.
+ 3. **PLUTO building-footprint join for register centroid-edge cases** — a single change unlocks NYU Langone / Stuyvesant / P.S. 89 flipping to `inside_sandy_2012=true` across all four register specialists. Pre-existing queue item from MONDAY.md. ~2–3 hr. Done = the three known-failing addresses each show `inside_sandy_2012=true` in the FSM trace and the briefing cites `[doh_hospital_*]` / `[doe_school_*]` accordingly.
+ 4. **MapLibre rendering for the 4 register specialists** — entrance points coloured by Sandy/DEP, NYCHA polygon fills graded by `pct_inside_sandy`, school + hospital points in the same color ramp. The data is in state; the map layers aren't yet drawing it. ~2–4 hr. Done = layers panel shows ≥4 new layer entries on `red-hook-houses-nycha` and `nyu-langone`; e2e screenshot diff captures them.
+ 5. **Remove dead/empty experiment dirs and clarify numbering** — delete `experiments/05_sam2_promptable/` and `experiments/06_chronos_bolt_forecast/`, and decide whether `experiments/05_terramind_finetune/` should be folded into the NYC fine-tune dir or kept as a separate phase artifact. ~15 min. Done = no empty dirs; numbering collision documented or resolved.
audit/2026-05-04-morning-handoff.md ADDED
@@ -0,0 +1,165 @@
+ # Riprap overnight handoff — Monday 2026-05-04
+
+ Continuation point for the wake-up session. Read this first; everything
+ points outwards from here.
+
+ ## TL;DR
+
+ All eight priorities from the overnight wiring pass landed. The
+ audit-flagged drift items are closed: cold-open restored, Guardian
+ gone, trace UI now clickable, register-specialist Sandy false-negatives
+ fixed, register pins on the map, FloodNet TTM forecast wired,
+ TTM specialists grouped in trace, `experiments/` cleaned. The single
+ load-bearing UX feature to verify in the morning is the trace-UI
+ drilldown — clicking any specialist row reveals its raw structured
+ output, which is the auditability contract for the entire system.
+ ## Commits landed (overnight)
+
+ | Commit | Priority | What |
+ |---------|----------|------|
+ | `a2143fc` | P1 | Restore `pitch/cold_open.md` from `b4239de` (accidentally deleted in `1cb5ee6`). |
+ | `4b9e55e` | P2 | Remove `GuardianRefusal.svelte`, `RefusalCategory` type, `.guardian-*` CSS, Playwright assertion. Mellea is the sole grounding mechanism. |
+ | `3e4f922` | P3 | **Trace UI clickable drilldown.** Click any row → raw structured output panel (formatted JSON, copy button, status-aware label, max-height + scroll). |
+ | `47ed3fb` | P4 | Buffered-footprint overlap (`app/registers/_footprint.py`) — MTA 8m / DOE 50m / DOH 100m. NYU Langone, Stuyvesant HS, P.S. 89 flip to `inside_sandy_2012=true`. |
+ | `792f4ee` | P5 | Map: register-asset pins (subway 4px / school 5px / hospital 6px / NYCHA-centroid 7px), colored by Sandy exposure, click popup with name + `[doc_id]`. |
+ | `3d991e9` | P6 | **`floodnet_forecast` specialist.** TTM r2 (512, 96) forecast on nearest FloodNet sensor's daily flood-event series — reuses the existing model singleton, no new model class loaded. |
+ | `90644e4` | P7 | Trace UI groups TTM specialists under `forecasting.granite-timeseries-ttm-r2 [N instances]`. `leafSteps` walks recursively so children still count toward fired/silent/errors. |
+ | `36e28d1` | P8 | Drop `Riprap.zip`, empty `05_sam2_promptable/`, empty `06_chronos_bolt_forecast/`. Rename `05_terramind_finetune` → `05a_terramind_finetune_micro` (dedupe with active NYC fine-tune dir). |
+
+ Two further commits update MONDAY.md and add this handoff.
+
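Commit `47ed3fb`'s buffered overlap reduces to one question per asset: is the point inside the Sandy zone polygon, or within its asset class's buffer distance of the boundary? A self-contained sketch of that idea — the real `app/registers/_footprint.py` is not shown in this diff, so the function names, the single-ring assumption, and the flat-earth metre conversion below are all illustrative; only the per-class buffer radii come from the commit message:

```python
import math

# Per-asset-class buffers (metres), per commit 47ed3fb.
BUFFER_M = {"mta_entrance": 8.0, "doe_school": 50.0, "doh_hospital": 100.0}


def _to_xy(lon: float, lat: float, lat0: float = 40.7) -> tuple[float, float]:
    # Equirectangular degrees→metres approximation; fine at city scale.
    m_per_deg = 111_320.0
    return (lon * m_per_deg * math.cos(math.radians(lat0)), lat * m_per_deg)


def _point_in_poly(pt, poly) -> bool:
    # Ray casting over a closed ring of (x, y) vertices.
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        if (y1 > y) != (y2 > y) and x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
            inside = not inside
    return inside


def _dist_to_segment(p, a, b) -> float:
    # Distance from point p to segment ab, all in metres.
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def inside_sandy(asset_kind: str, lonlat, zone_ring) -> bool:
    """True if the buffered asset point overlaps the Sandy zone ring."""
    pt = _to_xy(*lonlat)
    poly = [_to_xy(lon, lat) for lon, lat in zone_ring]
    if _point_in_poly(pt, poly):
        return True
    buf = BUFFER_M[asset_kind]
    return any(
        _dist_to_segment(pt, a, b) <= buf
        for a, b in zip(poly, poly[1:] + poly[:1])
    )
```

A production version would use a proper geometry library and projected CRS, but the buffer-per-asset-class shape is the point: the same borderline point can be "outside" for an 8 m subway entrance and "inside" for a 100 m hospital campus.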
+ ## Verify first when you wake
+
+ Run a Red Hook query (rich output, exercises everything) and check:
+
+ ```bash
+ .venv/bin/uvicorn web.main:app --host 127.0.0.1 --port 7860 --log-level info
+ # then visit http://127.0.0.1:7860/q/red%20hook%20houses
+ ```
+
+ 1. **Trace drilldown.** Click any specialist row in the run-trace
+    panel. You should see a structured output panel with formatted
+    JSON, a "Copy" button, and a status-coloured label
+    (Output / Silent reason / Error). Multiple rows can be expanded
+    simultaneously. *This is the load-bearing feature.* If clicking
+    doesn't expand the row, check the browser console; the build is
+    committed in `web/sveltekit/build/`.
+ 2. **TTM grouping.** The trace should show
+    `forecasting.granite-timeseries-ttm-r2 · 3 instances` (or 2 if
+    floodnet_forecast finds no usable sensor) as a single
+    auto-expanded parent with the TTM children nested under it. The
+    top-of-trace fired/silent/errors counters should still include
+    the TTM children — that's the recursion fix in `TraceUI.svelte`.
+ 3. **Register pins on the map.** Click a subway/school/hospital pin.
+    Popup should show name, kind, `inside_sandy_2012`, and
+    `[mta_entrance_…]` / `[doe_school_…]` / `[nyc_hospital_…]` /
+    `[nycha_dev_…]` doc-id, the same one cited in the briefing.
+ 4. **Buffered Sandy join.** Run the NYU Langone single-address query
+    (`570 First Ave Manhattan` or similar). The hospital row should
+    show `inside_sandy_2012=true` in its trace drilldown panel and
+    the briefing should cite `[nyc_hospital_…]` accordingly.
+ 5. **No Guardian card anywhere.** No `GuardianRefusal.svelte`, no
+    `.guardian-*` CSS class, no `RefusalCategory` import. Mellea
+    reroll banner is the only integrity-narration UI.
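The "no usable sensor" branch in item 2 is the calibration floor on `floodnet_forecast`: sensors with fewer than five historical flood events skip the forecast and report a silent reason instead. A hypothetical sketch of that guard — threshold value matches the text, but the function name, return shape, and event-counting rule are illustrative:

```python
MIN_EVENTS = 5  # below this, TTM output is dominated by quantization noise


def maybe_forecast(sensor_id: str, daily_events: list[float], forecast_fn):
    """Return (status, payload): 'fired' with a forecast dict, or 'silent' with a reason."""
    n = sum(1 for v in daily_events if v > 0)  # days with at least one flood event
    if n < MIN_EVENTS:
        return (
            "silent",
            f"sensor {sensor_id} has only {n} historical events; forecast omitted",
        )
    return "fired", forecast_fn(daily_events)


# Sparse sensor: guard trips, no forecast is produced.
status, payload = maybe_forecast(
    "bk-001", [0.0] * 30 + [1.0, 2.0], lambda series: {"horizon": 96}
)
# → status == "silent"
```

The trace-facing reason string is what makes the silence auditable rather than a dropped row.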
+ ## What's queued next (Monday morning, in priority order)
+
+ 1. **NYCHA polygon-fill on the map.** Add `geometry_geojson` field
+    to `app/registers/nycha.py:DevelopmentFinding` (serialise the
+    polygon as GeoJSON). The frontend `RipMap.svelte` already has
+    the `register-polygons` source + fill/line layers wired and
+    waiting for non-empty data. ~30 min. Done = NYCHA developments
+    render as graded fills (denser if more of the footprint inside
+    Sandy) at the Red Hook query.
+ 2. **TerraMind-NYC fine-tune morning routine.** From MONDAY.md:
+    refresh PC signed URLs on the AMD droplet, then proceed with the
+    eval-spec gates. That session is independent of overnight work.
+ 3. **MTA Sandy-recovery citation layer** (per MONDAY.md: parse the
+    "Hurricane Sandy: Three Years Later" report into per-station-id
+    facts → emit `[mta_recovery_<station_id>]` doc messages). 1–2 hr.
+ 4. **PLUTO + NYC Building Footprints** for the very-large-campus
+    register cases that the buffered-overlap doesn't catch
+    (Stuyvesant Town in particular — it's not in `nycha.geojson`
+    because it's privately owned post-Met-Life). Either a new
+    "large_residential_complex" register or an actual footprint join.
+ 5. **3 more TTM r2 specialists**: USGS streamgage stage, NWS rainfall
+    accumulation, citywide 311 sewer-backup rate. Each one reuses the
+    same singleton — same architectural template as
+    `floodnet_forecast`.
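Queued item 1 is a pure serialisation change: give each finding a GeoJSON string the map source can consume. A minimal sketch of the shape — the real `DevelopmentFinding` in `app/registers/nycha.py` is not in this diff, so the fields shown here (everything except `geometry_geojson` itself) are hypothetical, and the ring is assumed to be stored as lon/lat pairs:

```python
import json
from dataclasses import dataclass, field


@dataclass
class DevelopmentFinding:
    # Hypothetical mirror of app/registers/nycha.py; only geometry handling shown.
    name: str
    ring: list[tuple[float, float]]  # exterior ring, (lon, lat) pairs
    geometry_geojson: str = field(init=False)

    def __post_init__(self):
        coords = [list(p) for p in self.ring]
        if coords[0] != coords[-1]:
            coords.append(list(coords[0]))  # GeoJSON rings must be closed
        self.geometry_geojson = json.dumps(
            {"type": "Polygon", "coordinates": [coords]}
        )


f = DevelopmentFinding(
    "Red Hook Houses",
    [(-74.009, 40.676), (-74.006, 40.676), (-74.006, 40.678)],
)
```

The ring-closing step matters: MapLibre's GeoJSON source will reject or mis-render a Polygon whose first and last coordinates differ.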
+ ## What was deliberately kept out of scope tonight
+
+ Per the wiring-pass priorities document:
+
+ - USGS NWIS Bronx/Saw Mill/Hutchinson river forecasts.
+ - FEMA OpenFEMA NFIP claims tract-aggregated specialist.
+ - DEP CSO outfalls / Bluebelt / Green Infrastructure specialist.
+ - WCAG 2.2 AA full audit.
+ - Methodology paper draft (Saturday work).
+ - Historical-event mode (Saturday work).
+ - Build-in-Public posts.
+ - ASCE talk materials.
+ - Dark mode (explicit defer to v0.5).
+ - WeasyPrint server-side PDF (browser print is sufficient for demo).
+ - Per-specialist Python integration tests for the 4 register
+   specialists (e2e Playwright covers them).
+ ## Sharp edges to remember
+
+ - **`floodnet_forecast` silent floor.** Sensors with <5 historical
+   events skip the forecast entirely (output is dominated by
+   quantization noise around zero — exactly the kind of
+   pseudo-quantitative claim the four-tier discipline guards
+   against). Trace shows `silent` with reason
+   "sensor has only N historical events; forecast omitted".
+   Don't lower the threshold without revisiting the calibration.
+ - **Buffer choice in `app/registers/_footprint.py` is per-asset-class.**
+   100m hospital buffer catches NYU Langone but not the entire NYU
+   Langone Tisch Center (campus extends ~250m). Calibrated against
+   the three canonical addresses. Document any future change in the
+   same module's docstring.
+ - **NYCHA polygons not yet on the map.** Centroid-pin rendering is
+   shipped; polygon-fill needs the dataclass change above.
+ - **Trace UI `output` field carries the raw object.** Don't
+   re-stringify it in `q/[queryId]/+page.svelte` — the panel
+   formatter does that. The 240-char truncation that used to happen
+   in onStep is gone; if you're inspecting a giant payload, the
+   panel scrolls.
+ - **TTM grouping uses `status='fan'` as the auto-expand marker.**
+   The recursive `leafSteps` walker in `TraceUI.svelte` excludes
+   fan/merge nodes from counts but recurses into their children.
+   Don't add another structural-only status without updating the
+   recursion.
+ ## Files touched (overnight, by area)
+
+ - `app/registers/_footprint.py` (new)
+ - `app/registers/{mta_entrances,doe_schools,doh_hospitals}.py`
+ - `app/live/floodnet_forecast.py` (new)
+ - `app/fsm.py`, `app/reconcile.py`
+ - `web/sveltekit/src/lib/types/{trace,states,tier}.ts`
+ - `web/sveltekit/src/lib/components/trace/{TraceUI,TraceRow}.svelte`
+ - `web/sveltekit/src/lib/components/map/RipMap.svelte`
+ - `web/sveltekit/src/lib/styles.css`
+ - `web/sveltekit/src/routes/q/[queryId]/+page.svelte`
+ - `web/sveltekit/build/*` (rebuilt artefacts, committed)
+ - `web/static/agent.js` (legacy bundle: STEP_LABELS / SOURCE_LABELS
+   for `floodnet_forecast`)
+ - `MONDAY.md`, `pitch/cold_open.md`, `experiments/shared/licenses.md`
+ - Deletions: `Riprap.zip`,
+   `experiments/05_sam2_promptable/`,
+   `experiments/06_chronos_bolt_forecast/`,
+   `experiments/05_terramind_finetune/{micro.py, RESULTS.md}`
+   (renamed to `05a_terramind_finetune_micro/`),
+   `web/sveltekit/src/lib/components/states/GuardianRefusal.svelte`.
+ ## Tests run
+
+ - 18-test static Playwright suite passes after every UI change.
+ - Python smoke probes verified the buffered-footprint Sandy join
+   on the canonical addresses (NYU Langone, Stuyvesant HS, P.S. 89).
+ - Did NOT run `pytest tests/` (requires uvicorn + live SSE; the
+   morning verification routine above hits all the same code paths).
+ - Did NOT push to either remote — `git push && git push huggingface main`
+   when ready to deploy. HF rebuild ~10 min.
data/hospitals.geojson ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:164004fd98b88200014cb2da8cdb03f76bb456ec5d60820dd4e0da4da2e5f679
+ size 30022
data/nyc_ntas_2020.geojson ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bb5f1759872c8fa7ed18f7430c971b3b238a68927e0046d06d1ddb4ce90fc26b
+ size 4589872
experiments/00_endpoints/RESULTS.md ADDED
@@ -0,0 +1,51 @@
+ # Phase 0 — Endpoints smoke tests
+
+ 8/8 endpoints reachable from local dev machine. Run with:
+
+ ```bash
+ /Users/amsrahman/riprap-nyc/.venv/bin/python run_all.py
+ ```
+
+ | Endpoint | Status | Latency | Notes |
+ |----------|:------:|--------:|-------|
+ | Microsoft PC STAC (Sentinel-2 L2A search) | PASS | 1.2 s | keyless; 3 items in S2 Brooklyn bbox |
+ | NYC Open Data Socrata (311, PLUTO, Sandy) | PASS | 5.8 s | each dataset returns its row keys |
+ | USGS NWIS (Bronx River at NYBG) | PASS | 0.3 s | 2 series, 190 obs/24h |
+ | NOAA Tides (Battery 8518750) | PASS | 0.1 s | latest WL=1.056 ft |
+ | NOAA NWPS (gauges in NY+PA bbox) | PASS | 4.1 s | 750 gauges; **needs `srid=EPSG_4326`** |
+ | NWS API (NY active alerts) | PASS | 0.5 s | 10 active alerts (cold day) |
+ | FEMA OpenFEMA (FimaNfipClaims, NY) | PASS | 0.1 s | 1-row probe; aggregated only per project policy |
+ | HF Hub (small Apache-2.0 model) | PASS | 0.3 s | sentence-transformers/all-MiniLM-L6-v2 metadata |
+
+ ## Sharp edges discovered
+
+ 1. **NWPS silently empty without `srid=EPSG_4326`.** Default `srid` is
+    apparently a non-WGS84 system; bbox in geographic coords matches no
+    gauges. Endpoint returns `200 OK` with an empty array — no error
+    signal. Recorded in the smoke test comments so the next person
+    doesn't lose 20 minutes.
+
+ 2. **NYC Open Data Socrata is slow on cold connection.** ~6 s for three
+    sequential single-row fetches. Probably PoP-routing or DNS warmup.
+    Cache aggressively and batch.
+
+ 3. **NWS API requires User-Agent.** The smoke test sets one; without a
+    UA you'd get HTTP 403 (NWS docs say so but it's a quiet failure
+    mode in production).
+
+ 4. **OpenFEMA FimaNfipClaims schema is wide.** First-row keys include
+    `amountPaidOnBuildingClaim`, `amountPaidOnContentsClaim`,
+    `baseFloodElevation`, etc — these are the property-level fields we
+    are NOT allowed to surface. Specialists using OpenFEMA must
+    aggregate (e.g., `$select=count(*)&$filter=...&$apply=...`) before
+    ingesting, never store property-level rows.
+
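Edges 1 and 3 are both request-construction pitfalls that are cheap to guard against centrally. A hedged sketch — the `srid=EPSG_4326` parameter and the User-Agent requirement come from the findings above, but the function names, the base-params shape, and the `Accept` header are illustrative:

```python
def with_wgs84(params: dict) -> dict:
    # Edge 1: NWPS returns 200 OK with an empty array when srid is wrong,
    # so pin it explicitly on every gauge query rather than relying on defaults.
    return {**params, "srid": "EPSG_4326"}


def nws_headers(contact: str) -> dict:
    # Edge 3: api.weather.gov rejects User-Agent-less requests (HTTP 403),
    # and the docs ask that the UA include a contact address.
    return {
        "User-Agent": f"riprap-nyc ({contact})",
        "Accept": "application/geo+json",
    }
```

Centralising these in one helper module means a future specialist can't reintroduce the silent-empty-bbox failure by forgetting a query parameter.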
+ ## Cache contents
+
+ `.cache/*.json` — one per smoke test, holds the parsed first row /
+ metadata so subsequent dev iterations don't re-hit the endpoint.
+
+ ## Conclusion
+
+ All eight data sources are usable. Proceed to Phase 1 (Prithvi-EO
+ live water segmentation). No blocking issues.
experiments/00_endpoints/_runner.py ADDED
@@ -0,0 +1,51 @@
+ """Tiny harness shared by the 8 smoke tests.
+
+ Each test exposes a `probe()` callable that returns (ok, summary, payload).
+ Cache hits are kept in .cache/ as JSON or raw bytes; tests are idempotent.
+ """
+
+ from __future__ import annotations
+
+ import json
+ import sys
+ import time
+ import traceback
+ from pathlib import Path
+
+ CACHE = Path(__file__).parent / ".cache"
+ CACHE.mkdir(exist_ok=True)
+
+
+ def cache_path(key: str, ext: str = "json") -> Path:
+     return CACHE / f"{key}.{ext}"
+
+
+ def write_cache(key: str, obj, ext: str = "json") -> None:
+     p = cache_path(key, ext)
+     if ext == "json":
+         # Cap at 200 KB; a huge payload may truncate to invalid JSON,
+         # but the cache is best-effort dev convenience only.
+         p.write_text(json.dumps(obj, default=str)[:200_000])
+     else:
+         p.write_bytes(obj if isinstance(obj, bytes) else str(obj).encode())
+
+
+ def run(name: str, fn) -> tuple[bool, str, float]:
+     t0 = time.time()
+     try:
+         ok, summary, _payload = fn()  # payload is cached by the probe itself
+         dt = time.time() - t0
+         return ok, summary, dt
+     except Exception as e:
+         traceback.print_exc()
+         dt = time.time() - t0
+         return False, f"exception: {type(e).__name__}: {e}", dt
+
+
+ def cli(name: str, fn) -> int:
+     ok, summary, dt = run(name, fn)
+     badge = "PASS" if ok else "FAIL"
+     print(f"{badge} {name} ({dt:.2f}s) {summary}")
+     return 0 if ok else 1
+
+
+ if __name__ == "__main__":
+     sys.exit(0)
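A smoke test plugs into this harness by exposing a `probe()` and handing it to `cli`/`run`. The sketch below shows the shape; the probe body is fabricated (no real endpoint is hit), and `run` is inlined as a copy of the harness function so the example stands alone rather than importing `_runner`:

```python
import time
import traceback


def run(name, fn):
    # Inlined copy of _runner.run so this sketch is self-contained.
    t0 = time.time()
    try:
        ok, summary, _payload = fn()
        return ok, summary, time.time() - t0
    except Exception as e:
        traceback.print_exc()
        return False, f"exception: {type(e).__name__}: {e}", time.time() - t0


def probe():
    # A real smoke test would fetch from its endpoint and write_cache the
    # first row; this one fabricates a row so the (ok, summary, payload)
    # contract is visible.
    rows = [{"station": "8518750", "wl_ft": 1.056}]
    return bool(rows), f"{len(rows)} row(s), wl={rows[0]['wl_ft']} ft", rows


ok, summary, dt = run("noaa_tides", probe)
# → ok is True, summary == "1 row(s), wl=1.056 ft"
```

Because `run` catches everything, a probe that raises still produces a `(False, "exception: …", dt)` triple, which is what keeps `run_all.py` printing one PASS/FAIL line per endpoint instead of crashing mid-suite.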