seriffic Claude Opus 4.7 (1M context) committed on
Commit
f6423e1
·
1 Parent(s): da4b993

Voice pass: strip em-dashes from user-facing docs


Constraint set:
- No em-dashes (—). Replace with periods, semicolons, commas, or
parens depending on context.
- No "not just X but Y" constructions (none found; nothing to do).
- Drop AI-tells (no "delve / leverage / cutting-edge / robust /
comprehensive / it's worth noting" etc found; nothing to do).

Files touched:
- README.md (rewritten cleanly; new live-demo URL pointing at the
AMD-hackathon org Space; probe_addresses.py mentioned in the
local-development block)
- RESEARCH.md (rewritten cleanly; same content, no em-dashes)
- ARCHITECTURE.md (bulk em-dash sweep; manual fix for two parenthetical
sentences the regex broke: line 105 modes-see-§4 and line 134
Burr-FSM-DAGWorks)
- METHODOLOGY.md (bulk; heading "Riprap. Scoring Methodology" →
"Riprap scoring methodology")
- MONDAY.md (bulk; trailing-space-period sweep)
- services/riprap-models/README.md (bulk + manual: parenthetical bug
on which-surface-laptop-or-HF-Space hosts the FastAPI process; "Deploy"
headings now use ":" instead of "."; "Riprap Models" subtitle now
":" instead of ".")
- slides/deck.md (single em-dash in Marp title metadata fixed)
- slides/README.md (rewritten to match)

Total em-dashes across all eight user-facing docs went from ~150 to
0. Deck PDF/HTML/PPTX rebuilt from the cleaned deck.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
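
The bulk sweep and the final count described in the commit message can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: `naive_sweep` and `count_em_dashes` are hypothetical helpers, not the regex actually used for this commit. The rule below ignores parentheses on purpose, which is the kind of blind spot that would explain why two parenthetical sentences needed a manual fix.

```python
import re

EM_DASH = "\u2014"  # the character the voice pass removes


def count_em_dashes(text: str) -> int:
    """Return how many em-dash characters remain in the text."""
    return text.count(EM_DASH)


def naive_sweep(text: str) -> str:
    """Replace each em-dash with a period and capitalize the next word.

    Hypothetical rule, not the commit's actual regex. It does not look at
    surrounding parentheses, so a dash inside "( ... )" would be turned
    into a sentence break as well.
    """
    pattern = re.compile(r"\s*" + EM_DASH + r"\s*(\w)")
    return pattern.sub(lambda m: ". " + m.group(1).upper(), text)


sample = "Riprap is a state machine \u2014 a Burr FSM \u2014 that walks a fixed list."
swept = naive_sweep(sample)
remaining = count_em_dashes(swept)
```

Running `count_em_dashes` over each touched file before and after a sweep is one way to arrive at a before/after total like the one quoted above.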

Files changed (8)
  1. ARCHITECTURE.md +41 -41
  2. METHODOLOGY.md +16 -16
  3. MONDAY.md +12 -12
  4. README.md +46 -43
  5. RESEARCH.md +175 -173
  6. services/riprap-models/README.md +20 -18
  7. slides/README.md +20 -16
  8. slides/deck.md +1 -1
ARCHITECTURE.md CHANGED
@@ -1,13 +1,13 @@
1
- # Riprap — Architecture
2
 
3
  > **What it is.** A web tool that takes any NYC address and produces a
4
- > short, citation-grounded **flood-exposure briefing** — a tier (1–4)
5
  > with a paragraph of evidence, where every numeric claim links back to
6
  > the specific dataset, agency report, or model output it came from.
7
  >
8
  > **Who it's for.** Urban planners, journalists on deadline, NYCEM
9
  > grant writers filing FEMA BRIC sub-applications, agency capital
10
- > planners, researchers under FOIL/IRB constraints — *not* consumers
11
  > shopping for flood insurance.
12
  >
13
  > **Why local foundation models.** A newsroom with FOIL'd documents
@@ -21,56 +21,56 @@
21
 
22
  ## 1. A 60-second primer on NYC flooding
23
 
24
- Skip if you already know this. Most architecture docs assume you do —
25
- this one doesn't.
26
 
27
  ### 1.1 Three kinds of flood
28
 
29
  NYC gets hit by three flood mechanisms that look completely different
30
  on a map and are caused by different physics:
31
 
32
- - **Coastal / surge flooding** — The ocean rises into the city.
33
  Driven by storm surge (wind pushing water against the coast),
34
  astronomical high tide, and wave run-up. Affects the **shoreline:**
35
  Brighton Beach, Coney Island, Red Hook, Lower Manhattan, the
36
  Rockaways, Staten Island east shore. **Hurricane Sandy 2012** is
37
- the canonical event — water came over the seawall and flooded
38
  subway tunnels, hospitals, and electrical substations. Affects
39
  buildings that were dry that morning.
40
- - **Pluvial / stormwater flooding** — Rain falls faster than the
41
  drainage system can carry it away. Affects **inland low points,
42
  basement apartments, and chronically under-sewered neighborhoods**:
43
  Hollis (Queens), Carroll Gardens (Brooklyn), Jamaica. **Hurricane
44
- Ida 2021** is the canonical event for NYC — most of the deaths
45
  were in basement apartments far from any coast. Optical satellites
46
  largely *can't see* this kind of flooding because the water drains
47
  fast and is often sub-surface.
48
- - **Compound flooding** — Coastal + pluvial happening at the same
49
  time, with groundwater rising too. Currently the active research
50
  frontier (NPCC4 Ch. 3 calls it out explicitly). Most agencies model
51
  these mechanisms separately; reality combines them.
52
 
53
  A good civic flood tool has to cover all three and be honest about
54
  what each signal can and cannot see. Riprap surfaces evidence for all
55
- three but **doesn't predict damage** — see scope below.
56
 
57
  ### 1.2 Empirical vs modeled vs proxy
58
 
59
  Each piece of flood evidence falls into one of three classes, and the
60
  distinction matters for how much weight to give it:
61
 
62
- - **Empirical** — Something flooded a place and was measured. USGS
63
  high-water marks (people went out after Hurricane Ida and surveyed
64
  where water reached on building walls). The 2012 Sandy Inundation
65
  Zone (mapped by the city after the storm). FloodNet ultrasonic
66
  sensors that recorded an actual depth. **Highest-confidence**: this
67
  flood happened here.
68
- - **Modeled scenarios** — Hydraulic models simulate "what if" cases.
69
  FEMA's regulatory floodplains (1 % and 0.2 % annual chance). NYC
70
  DEP's Stormwater Maps (modeled water depth under three rainfall
71
  scenarios with varying sea-level-rise assumptions). **Useful but
72
  scenario-bounded**: this could happen here under those conditions.
73
- - **Proxy signals** — Indirect indicators of flooding. NYC 311
74
  complaints ("street flooding", "sewer backup") clustering around an
75
  address. Topographic indices (HAND, TWI) suggesting water *would*
76
  pool here based on terrain. **Useful but biased**: 311 reflects
@@ -79,7 +79,7 @@ distinction matters for how much weight to give it:
79
 
80
  Riprap surfaces all three classes. The score weights them in that
81
  order (empirical > modeled > proxy), with empirical hits granted a
82
- **floor rule** — see [§5](#5-the-scoring-rubric).
83
 
84
  ### 1.3 Hydrology indices used in this app
85
 
@@ -87,11 +87,11 @@ Two terrain-derived numbers come up repeatedly. They're cheap to
87
  compute from a Digital Elevation Model (DEM) and they're the
88
  hydrological literature's canonical exposure proxies:
89
 
90
- - **HAND (Height Above Nearest Drainage)** — Vertical distance from
91
  the address up to the nearest river/drainage channel. **<1 m** = at
92
  drainage level (water *will* reach here in flood). **>10 m** =
93
  hillslope (very dry). Nobre et al. 2011.
94
- - **TWI (Topographic Wetness Index)** — `ln(catchment_area / tan
95
  slope)`. **High TWI** = water tends to accumulate here (large
96
  contributing area, gentle slope). Beven & Kirkby 1979.
97
 
@@ -102,7 +102,7 @@ Neither is a flood prediction; both are exposure indicators that say
102
 
103
  ## 2. What Riprap actually produces
104
 
105
- For a given address (or any of three modes — see [§4](#4-three-user-modes)),
106
  Riprap returns:
107
 
108
  1. **A tier 1–4** computed by a deterministic, published rubric
@@ -113,11 +113,11 @@ Riprap returns:
113
  *Status*, *Empirical evidence*, *Modeled scenarios*, *Policy
114
  context*. A section is omitted entirely if no specialist fired for
115
  it (silence-over-confabulation contract).
116
- 3. **Evidence cards** — one per fired specialist, with the raw values
117
  and a link to the source dataset.
118
- 4. **Map overlay** — the address pinned, with the empirical and
119
  modeled flood extents that overlap it.
120
- 5. **Live "right now" signals** — active NWS flood alerts, current
121
  tide residual at the Battery, recent precipitation at the nearest
122
  ASOS, and a Granite TTM short-horizon forecast of the surge
123
  residual. **These do not modify the tier** (per IPCC AR6 WG II's
@@ -131,7 +131,7 @@ prose.
131
 
132
  ## 3. The Burr FSM and how the specialists chain
133
 
134
- Riprap is a **state machine** — a Burr FSM (DAGWorks) — that walks
135
  through a fixed list of "specialist" functions in order. Each
136
  specialist either produces a structured fact or stays silent. At the
137
  end, the reconciler reads all the produced facts and writes the
@@ -206,7 +206,7 @@ shows live as the FSM runs.
206
  | 8 | **nws_obs** *(live)* | Recent precipitation from the nearest airport ASOS station (KNYC / KLGA / KJFK / KEWR / KFRG). | live |
207
  | 9 | **ttm_forecast** *(live)* | Granite TTM r2 zero-shot forecast of the surge **residual** at the Battery for the next ~9.6 h. NOAA already publishes the astronomical tide; TTM forecasts the part NOAA doesn't. | live (model-derived) |
208
  | 10 | **microtopo** | LiDAR-derived terrain features at the point: elevation, HAND, TWI, local relief percentile. | proxy |
209
- | 11 | **ida_hwm** | USGS Hurricane Ida 2021 high-water marks — actual measured water heights surveyed in the days after the storm. | empirical |
210
  | 12 | **prithvi** | NASA/IBM Prithvi-EO 2.0 segmentation of Sentinel-2 imagery for the Ida pre/post pair. Pre-computed offline; serves point-in-polygon queries against the resulting 166 polygons. | empirical (model-derived) |
211
  | 13 | **rag** | Granite Embedding 278M retrieves the most-relevant paragraphs from 5 NYC policy PDFs (Comptroller, NPCC4, MTA, NYCHA, ConEd) given the address's borough + which scenarios fired. | policy |
212
  | 14 | **reconcile** | Granite 4.1 :3b reads all the documents produced by steps 1–13 and writes the cited briefing paragraph. See [§6](#6-document-grounded-reconciliation). | LLM synthesis |
@@ -219,7 +219,7 @@ address:
219
  | Step | What it returns |
220
  |---|---|
221
  | geocode | `(40.5780, -73.9617)`, BBL `3-08660-0001`, Brooklyn |
222
- | sandy | **YES** — inside the 2012 Sandy Inundation Zone |
223
  | dep_stormwater | `dep_moderate_2050`: depth 0.4-0.8 ft; `dep_extreme_2080`: depth 0.8-2.0 ft |
224
  | floodnet | 2 sensors within 600 m; 1 trigger event in last 3 yr (peak 14 cm) |
225
  | nyc311 | 11 flood-related complaints in 200 m, 5-yr window |
@@ -267,7 +267,7 @@ saw those headers and didn't invent them.
267
  |---------------------------------------|------------------|---|
268
  | `/` | **Single address** | Geocode → run the full FSM → cited paragraph + map. Live demo path. |
269
  | `/compare` | **Compare** | Two addresses side by side; parallel FSM runs (`asyncio.to_thread`, `OLLAMA_NUM_PARALLEL=2`). Useful for "this site vs the alternative". |
270
- | `/register/{schools,nycha,mta_entrances}` | **Register** | Pre-computed bulk runs over NYC public-asset registries — 126 schools, 45 NYCHA developments, ~1,900 MTA subway entrances. Loaded from `data/registers/*.json` at boot. |
271
 
272
  Single-address is the live path. Registers are pre-computed because
273
  running 1,900 reconciler calls at request time is a non-starter; the
@@ -302,14 +302,14 @@ Tier 4, 0 → Tier 0.
302
  ### 5.2 Max-empirical floor
303
 
304
  If **Sandy 2012 inundation** OR **a USGS Ida HWM within 100 m** fired,
305
- the tier is capped at **2 (Elevated)** — it cannot be worse,
306
  regardless of the additive composite.
307
 
308
  This recovers the *important* multiplicative behaviour Balica 2012
309
  argues for (empirical observations should not be cancelled by
310
  terrain or modeled scenarios) without giving up additive transparency.
311
  The 100 m radius is chosen because USGS HWM positional uncertainty is
312
- typically 5–30 m — 100 m gives ~3σ headroom for a confident "this
313
  address was inundated" signal.
314
 
315
  ### 5.3 Live signals stay out
@@ -366,7 +366,7 @@ omitted entirely.
366
 
367
  `app/mellea_validator.py` wraps the Granite-via-Ollama call in IBM
368
  Research's [Mellea](https://github.com/generative-computing/mellea)
369
- framework — instruct, validate, repair. The synthesis intents call
370
  `reconcile_strict_streaming(...)` which:
371
 
372
  1. **Streams** each generation attempt's tokens to the user (via the
@@ -374,21 +374,21 @@ framework — instruct, validate, repair. The synthesis intents call
374
  `progress_q` for the polygon intents).
375
  2. After each attempt, runs **four deterministic checks** on the
376
  accumulated paragraph:
377
- - **`numerics_grounded`** — every non-trivial number in the output
378
  appears verbatim in a source document.
379
- - **`no_placeholder_tokens`** — output contains no leaked
380
  `[source]` / `<document>` template markup.
381
- - **`citations_dense`** — every non-trivial number has a
382
  `[doc_id]` citation **somewhere in the same sentence** (sentence
383
  boundaries: `. ` / `.\n` / end-of-text).
384
- - **`citations_resolve`** — cited `doc_id`s are a subset of the
385
  input doc_ids.
386
  3. If any check fails, fires a `mellea_attempt` SSE event with the
387
  failed-requirement names, then **rerolls** with a feedback prompt
388
  that names the specific failing sentences (the model usually
389
  responds well to surgical corrections). Loop budget: 3 attempts.
390
 
391
- The frontend renders an inline banner above the briefing — amber on
392
  reroll (with the failed-req list), green on first-try pass. The final
393
  reconcile step in the trace shows the `passed: N/4 · rerolls: M`
394
  metadata for full audit transparency.
@@ -411,12 +411,12 @@ X}fact<|end_of_cite|>` mode. **It's deprecated in 4.x.** Verified:
411
  - The official Ollama chat template for `granite4.x` has no citation
412
  branch (the 3.3 / 4.0-preview templates did).
413
  - `granite_common` ships only `granite3/granite32` and
414
- `granite3/granite33` subdirs — no 4.x equivalent.
415
  - `granite-io` has only `granite_3_2/` and `granite_3_3/` processor
416
  dirs.
417
 
418
  The base 4.1 weights still contain the cite tokens (training residue),
419
- so the model emits them as real tokens when nudged — but only as an
420
  end-of-response list, not inline in prose. IBM's published 4.x
421
  grounding path is a separate **Citation Generation LoRA** (built on
422
  `granite-4.0-micro`, not 4.1) requiring HF transformers + LoRA
@@ -452,11 +452,11 @@ drop noise, <1 km² to drop tidal artifacts) into **166 polygons**
452
  baked into `data/prithvi_ida_2021.geojson`. The runtime FSM does a
453
  point-in-polygon test, not fresh inference. This is honest about
454
  where foundation models earn their keep: **once, to produce a
455
- defensible event-level signal — not per request**.
456
 
457
  ### 7.2 Why TTM r2 runs live
458
 
459
- TTM r2 is **1.5 M params** — vastly smaller than Prithvi or Granite
460
  4.1. Inference is millisecond-scale even on CPU. It forecasts only
461
  the residual (surge component) at the Battery, which complements the
462
  NOAA snapshot specialist; it does **not** try to forecast the
@@ -474,7 +474,7 @@ from static layers and is handled separately:
474
  unless source data changed.
475
  - **Cadence**: NOAA tides update every 6 min; NWS alerts on push;
476
  NWS obs ~hourly; TTM is computed per query (cheap).
477
- - **Failure mode**: graceful — if NOAA times out, no `noaa_tides`
478
  doc is emitted; the reconciler simply doesn't see it.
479
 
480
  This mirrors how First Street separates Flood Factor (static, 30-yr)
@@ -551,7 +551,7 @@ riprap-nyc/
551
  compare.html / .js two-address side-by-side
552
  register.html / .js bulk register browser
553
  style.css IBM Plex Sans, Planning Labs idiom
554
- dist/ Svelte 5 custom-element bundle (committed —
555
  HF Spaces doesn't run a Node build).
556
  Built from web/svelte/ via `npm run build`.
557
 
@@ -601,7 +601,7 @@ riprap-nyc/
601
  type, electrical hardening, drainage condition), social capacity,
602
  and financial absorption are out of scope.
603
  - **No sub-surface flooding.** Optical satellites can't see basement
604
- apartments or subway entrances — the dominant Hurricane Ida damage
605
  mode in NYC. Prithvi correctly emits no polygons for Hollis or
606
  Carroll Gardens. That silence is a feature, not a bug.
607
  - **Vintage-bounded.** FEMA NFHL is years stale; DEP Stormwater Maps
@@ -657,7 +657,7 @@ weights load and TTM downloads. Warm queries:
657
  adds ~25 s)
658
 
659
  The Svelte bundle in `web/static/dist/` is committed, so HF Spaces
660
- runs no Node build step — only the Python deps + Ollama install.
661
 
662
  ### 12.2 Local development
663
 
 
1
+ # Riprap architecture
2
 
3
  > **What it is.** A web tool that takes any NYC address and produces a
4
+ > short, citation-grounded **flood-exposure briefing**. A tier (1–4)
5
  > with a paragraph of evidence, where every numeric claim links back to
6
  > the specific dataset, agency report, or model output it came from.
7
  >
8
  > **Who it's for.** Urban planners, journalists on deadline, NYCEM
9
  > grant writers filing FEMA BRIC sub-applications, agency capital
10
+ > planners, researchers under FOIL/IRB constraints. Not consumers
11
  > shopping for flood insurance.
12
  >
13
  > **Why local foundation models.** A newsroom with FOIL'd documents
 
21
 
22
  ## 1. A 60-second primer on NYC flooding
23
 
24
+ Skip if you already know this. Most architecture docs assume you do.
25
+ This one doesn't.
26
 
27
  ### 1.1 Three kinds of flood
28
 
29
  NYC gets hit by three flood mechanisms that look completely different
30
  on a map and are caused by different physics:
31
 
32
+ - **Coastal / surge flooding**. The ocean rises into the city.
33
  Driven by storm surge (wind pushing water against the coast),
34
  astronomical high tide, and wave run-up. Affects the **shoreline:**
35
  Brighton Beach, Coney Island, Red Hook, Lower Manhattan, the
36
  Rockaways, Staten Island east shore. **Hurricane Sandy 2012** is
37
+ the canonical event. Water came over the seawall and flooded
38
  subway tunnels, hospitals, and electrical substations. Affects
39
  buildings that were dry that morning.
40
+ - **Pluvial / stormwater flooding**. Rain falls faster than the
41
  drainage system can carry it away. Affects **inland low points,
42
  basement apartments, and chronically under-sewered neighborhoods**:
43
  Hollis (Queens), Carroll Gardens (Brooklyn), Jamaica. **Hurricane
44
+ Ida 2021** is the canonical event for NYC. Most of the deaths
45
  were in basement apartments far from any coast. Optical satellites
46
  largely *can't see* this kind of flooding because the water drains
47
  fast and is often sub-surface.
48
+ - **Compound flooding**. Coastal + pluvial happening at the same
49
  time, with groundwater rising too. Currently the active research
50
  frontier (NPCC4 Ch. 3 calls it out explicitly). Most agencies model
51
  these mechanisms separately; reality combines them.
52
 
53
  A good civic flood tool has to cover all three and be honest about
54
  what each signal can and cannot see. Riprap surfaces evidence for all
55
+ three but **doesn't predict damage**. See scope below.
56
 
57
  ### 1.2 Empirical vs modeled vs proxy
58
 
59
  Each piece of flood evidence falls into one of three classes, and the
60
  distinction matters for how much weight to give it:
61
 
62
+ - **Empirical**. Something flooded a place and was measured. USGS
63
  high-water marks (people went out after Hurricane Ida and surveyed
64
  where water reached on building walls). The 2012 Sandy Inundation
65
  Zone (mapped by the city after the storm). FloodNet ultrasonic
66
  sensors that recorded an actual depth. **Highest-confidence**: this
67
  flood happened here.
68
+ - **Modeled scenarios**. Hydraulic models simulate "what if" cases.
69
  FEMA's regulatory floodplains (1 % and 0.2 % annual chance). NYC
70
  DEP's Stormwater Maps (modeled water depth under three rainfall
71
  scenarios with varying sea-level-rise assumptions). **Useful but
72
  scenario-bounded**: this could happen here under those conditions.
73
+ - **Proxy signals**. Indirect indicators of flooding. NYC 311
74
  complaints ("street flooding", "sewer backup") clustering around an
75
  address. Topographic indices (HAND, TWI) suggesting water *would*
76
  pool here based on terrain. **Useful but biased**: 311 reflects
 
79
 
80
  Riprap surfaces all three classes. The score weights them in that
81
  order (empirical > modeled > proxy), with empirical hits granted a
82
+ **floor rule**. See [§5](#5-the-scoring-rubric).
83
 
84
  ### 1.3 Hydrology indices used in this app
85
 
 
87
  compute from a Digital Elevation Model (DEM) and they're the
88
  hydrological literature's canonical exposure proxies:
89
 
90
+ - **HAND (Height Above Nearest Drainage)**. Vertical distance from
91
  the address up to the nearest river/drainage channel. **<1 m** = at
92
  drainage level (water *will* reach here in flood). **>10 m** =
93
  hillslope (very dry). Nobre et al. 2011.
94
+ - **TWI (Topographic Wetness Index)**. `ln(catchment_area / tan
95
  slope)`. **High TWI** = water tends to accumulate here (large
96
  contributing area, gentle slope). Beven & Kirkby 1979.
97
 
 
102
 
103
  ## 2. What Riprap actually produces
104
 
105
+ For a given address (or any of three modes; see [§4](#4-three-user-modes)),
106
  Riprap returns:
107
 
108
  1. **A tier 1–4** computed by a deterministic, published rubric
 
113
  *Status*, *Empirical evidence*, *Modeled scenarios*, *Policy
114
  context*. A section is omitted entirely if no specialist fired for
115
  it (silence-over-confabulation contract).
116
+ 3. **Evidence cards**. One per fired specialist, with the raw values
117
  and a link to the source dataset.
118
+ 4. **Map overlay**. The address pinned, with the empirical and
119
  modeled flood extents that overlap it.
120
+ 5. **Live "right now" signals**. Active NWS flood alerts, current
121
  tide residual at the Battery, recent precipitation at the nearest
122
  ASOS, and a Granite TTM short-horizon forecast of the surge
123
  residual. **These do not modify the tier** (per IPCC AR6 WG II's
 
131
 
132
  ## 3. The Burr FSM and how the specialists chain
133
 
134
+ Riprap is a **state machine**, a Burr FSM (DAGWorks), that walks
135
  through a fixed list of "specialist" functions in order. Each
136
  specialist either produces a structured fact or stays silent. At the
137
  end, the reconciler reads all the produced facts and writes the
 
206
  | 8 | **nws_obs** *(live)* | Recent precipitation from the nearest airport ASOS station (KNYC / KLGA / KJFK / KEWR / KFRG). | live |
207
  | 9 | **ttm_forecast** *(live)* | Granite TTM r2 zero-shot forecast of the surge **residual** at the Battery for the next ~9.6 h. NOAA already publishes the astronomical tide; TTM forecasts the part NOAA doesn't. | live (model-derived) |
208
  | 10 | **microtopo** | LiDAR-derived terrain features at the point: elevation, HAND, TWI, local relief percentile. | proxy |
209
+ | 11 | **ida_hwm** | USGS Hurricane Ida 2021 high-water marks. Actual measured water heights surveyed in the days after the storm. | empirical |
210
  | 12 | **prithvi** | NASA/IBM Prithvi-EO 2.0 segmentation of Sentinel-2 imagery for the Ida pre/post pair. Pre-computed offline; serves point-in-polygon queries against the resulting 166 polygons. | empirical (model-derived) |
211
  | 13 | **rag** | Granite Embedding 278M retrieves the most-relevant paragraphs from 5 NYC policy PDFs (Comptroller, NPCC4, MTA, NYCHA, ConEd) given the address's borough + which scenarios fired. | policy |
212
  | 14 | **reconcile** | Granite 4.1 :3b reads all the documents produced by steps 1–13 and writes the cited briefing paragraph. See [§6](#6-document-grounded-reconciliation). | LLM synthesis |
 
219
  | Step | What it returns |
220
  |---|---|
221
  | geocode | `(40.5780, -73.9617)`, BBL `3-08660-0001`, Brooklyn |
222
+ | sandy | **YES**. Inside the 2012 Sandy Inundation Zone |
223
  | dep_stormwater | `dep_moderate_2050`: depth 0.4-0.8 ft; `dep_extreme_2080`: depth 0.8-2.0 ft |
224
  | floodnet | 2 sensors within 600 m; 1 trigger event in last 3 yr (peak 14 cm) |
225
  | nyc311 | 11 flood-related complaints in 200 m, 5-yr window |
 
267
  |---------------------------------------|------------------|---|
268
  | `/` | **Single address** | Geocode → run the full FSM → cited paragraph + map. Live demo path. |
269
  | `/compare` | **Compare** | Two addresses side by side; parallel FSM runs (`asyncio.to_thread`, `OLLAMA_NUM_PARALLEL=2`). Useful for "this site vs the alternative". |
270
+ | `/register/{schools,nycha,mta_entrances}` | **Register** | Pre-computed bulk runs over NYC public-asset registries. 126 schools, 45 NYCHA developments, ~1,900 MTA subway entrances. Loaded from `data/registers/*.json` at boot. |
271
 
272
  Single-address is the live path. Registers are pre-computed because
273
  running 1,900 reconciler calls at request time is a non-starter; the
 
302
  ### 5.2 Max-empirical floor
303
 
304
  If **Sandy 2012 inundation** OR **a USGS Ida HWM within 100 m** fired,
305
+ the tier is capped at **2 (Elevated)**. It cannot be worse,
306
  regardless of the additive composite.
307
 
308
  This recovers the *important* multiplicative behaviour Balica 2012
309
  argues for (empirical observations should not be cancelled by
310
  terrain or modeled scenarios) without giving up additive transparency.
311
  The 100 m radius is chosen because USGS HWM positional uncertainty is
312
+ typically 5–30 m. 100 m gives ~3σ headroom for a confident "this
313
  address was inundated" signal.
314
 
315
  ### 5.3 Live signals stay out
 
366
 
367
  `app/mellea_validator.py` wraps the Granite-via-Ollama call in IBM
368
  Research's [Mellea](https://github.com/generative-computing/mellea)
369
+ framework. Instruct, validate, repair. The synthesis intents call
370
  `reconcile_strict_streaming(...)` which:
371
 
372
  1. **Streams** each generation attempt's tokens to the user (via the
 
374
  `progress_q` for the polygon intents).
375
  2. After each attempt, runs **four deterministic checks** on the
376
  accumulated paragraph:
377
+ - **`numerics_grounded`**. Every non-trivial number in the output
378
  appears verbatim in a source document.
379
+ - **`no_placeholder_tokens`**. Output contains no leaked
380
  `[source]` / `<document>` template markup.
381
+ - **`citations_dense`**. Every non-trivial number has a
382
  `[doc_id]` citation **somewhere in the same sentence** (sentence
383
  boundaries: `. ` / `.\n` / end-of-text).
384
+ - **`citations_resolve`**. Cited `doc_id`s are a subset of the
385
  input doc_ids.
386
  3. If any check fails, fires a `mellea_attempt` SSE event with the
387
  failed-requirement names, then **rerolls** with a feedback prompt
388
  that names the specific failing sentences (the model usually
389
  responds well to surgical corrections). Loop budget: 3 attempts.
390
 
391
+ The frontend renders an inline banner above the briefing. Amber on
392
  reroll (with the failed-req list), green on first-try pass. The final
393
  reconcile step in the trace shows the `passed: N/4 · rerolls: M`
394
  metadata for full audit transparency.
 
411
  - The official Ollama chat template for `granite4.x` has no citation
412
  branch (the 3.3 / 4.0-preview templates did).
413
  - `granite_common` ships only `granite3/granite32` and
414
+ `granite3/granite33` subdirs. No 4.x equivalent.
415
  - `granite-io` has only `granite_3_2/` and `granite_3_3/` processor
416
  dirs.
417
 
418
  The base 4.1 weights still contain the cite tokens (training residue),
419
+ so the model emits them as real tokens when nudged. But only as an
420
  end-of-response list, not inline in prose. IBM's published 4.x
421
  grounding path is a separate **Citation Generation LoRA** (built on
422
  `granite-4.0-micro`, not 4.1) requiring HF transformers + LoRA
 
452
  baked into `data/prithvi_ida_2021.geojson`. The runtime FSM does a
453
  point-in-polygon test, not fresh inference. This is honest about
454
  where foundation models earn their keep: **once, to produce a
455
+ defensible event-level signal. Not per request**.
456
 
457
  ### 7.2 Why TTM r2 runs live
458
 
459
+ TTM r2 is **1.5 M params**. Vastly smaller than Prithvi or Granite
460
  4.1. Inference is millisecond-scale even on CPU. It forecasts only
461
  the residual (surge component) at the Battery, which complements the
462
  NOAA snapshot specialist; it does **not** try to forecast the
 
474
  unless source data changed.
475
  - **Cadence**: NOAA tides update every 6 min; NWS alerts on push;
476
  NWS obs ~hourly; TTM is computed per query (cheap).
477
+ - **Failure mode**: graceful. If NOAA times out, no `noaa_tides`
478
  doc is emitted; the reconciler simply doesn't see it.
479
 
480
  This mirrors how First Street separates Flood Factor (static, 30-yr)
 
551
  compare.html / .js two-address side-by-side
552
  register.html / .js bulk register browser
553
  style.css IBM Plex Sans, Planning Labs idiom
554
+ dist/ Svelte 5 custom-element bundle (committed.
555
  HF Spaces doesn't run a Node build).
556
  Built from web/svelte/ via `npm run build`.
557
 
 
601
  type, electrical hardening, drainage condition), social capacity,
602
  and financial absorption are out of scope.
603
  - **No sub-surface flooding.** Optical satellites can't see basement
604
+ apartments or subway entrances. The dominant Hurricane Ida damage
605
  mode in NYC. Prithvi correctly emits no polygons for Hollis or
606
  Carroll Gardens. That silence is a feature, not a bug.
607
  - **Vintage-bounded.** FEMA NFHL is years stale; DEP Stormwater Maps
 
657
  adds ~25 s)
658
 
659
  The Svelte bundle in `web/static/dist/` is committed, so HF Spaces
660
+ runs no Node build step. Only the Python deps + Ollama install.
661
 
662
  ### 12.2 Local development
663
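
Two of the four deterministic checks listed in the ARCHITECTURE.md hunks above (`numerics_grounded` and `citations_resolve`) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the regular expressions, the function signatures, and the choice to treat every number as "non-trivial" are hypothetical, and the real logic in `app/mellea_validator.py` may differ.

```python
import re

# Hypothetical patterns; the actual validator may tokenize differently.
CITATION_RE = re.compile(r"\[([A-Za-z0-9_]+)\]")  # matches [doc_id]
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")          # integers and decimals


def citations_resolve(paragraph: str, doc_ids: set[str]) -> bool:
    """True if every cited doc_id is one of the input doc_ids."""
    cited = set(CITATION_RE.findall(paragraph))
    return cited <= doc_ids


def numerics_grounded(paragraph: str, sources: dict[str, str]) -> bool:
    """True if every number in the paragraph appears in some source document.

    Assumption: all numbers count as non-trivial. Citation markers are
    stripped first so digits inside a doc_id are not treated as claims.
    """
    body = CITATION_RE.sub("", paragraph)
    source_numbers = set(NUMBER_RE.findall(" ".join(sources.values())))
    return all(num in source_numbers for num in NUMBER_RE.findall(body))


# Source text taken from the floodnet row of the worked example in the diff.
sources = {"floodnet": "2 sensors within 600 m; 1 trigger event in last 3 yr (peak 14 cm)"}
good = "FloodNet recorded a peak of 14 cm [floodnet]."
bad = "FloodNet recorded a peak of 40 cm [nyc311]."
```

Here `good` passes both checks, while `bad` fails both: the number 40 is in no source, and `nyc311` is not among the input doc_ids. A failure of this kind is what would trigger a reroll in the loop the diff describes.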
 
METHODOLOGY.md CHANGED
@@ -1,4 +1,4 @@
1
- # Riprap — Scoring Methodology
2
 
3
  > Riprap produces a **flood-exposure tier (1–4) per NYC address**, not
4
  > a calibrated damage probability. The tier is a deterministic
@@ -8,7 +8,7 @@
8
  ## 1. Why this design
9
 
10
  Closed-methodology scores (First Street, Jupiter, Fathom) are useful
11
- products but uncitable in civic work — a NYCEM grant writer can't quote
12
  "0.73" in a FEMA BRIC sub-application without a defensible audit trail.
13
  At the same time, an LLM-emitted score would be non-reproducible and
14
  uncalibrated, with documented LLM-as-judge pathologies (Zheng et al.
@@ -25,19 +25,19 @@ The composite construction follows a well-trodden path in the multi-
25
  indicator vulnerability/exposure literature:
26
 
27
  - **Cutter, Boruff & Shirley (2003)**, *Social Science Quarterly* 84(2):
28
- 242–261 — the SoVI hazards-of-place pattern: group indicators
29
  thematically; sum factors with equal weights because there is no
30
  defensible theoretical basis for differential weighting.
31
- - **Tate (2012)**, *Natural Hazards* 63: 325–347 — explicit Monte Carlo
32
  sensitivity analysis showing that hierarchical equal-weighted
33
  composites are the most rank-stable. This is why we use equal weights
34
  *within* sub-indices.
35
  - **Balica, Wright & van der Meulen (2012)**, *Natural Hazards* 64:
36
- 73–105 — Coastal City Flood Vulnerability Index, multiplicative
37
  (Exposure × Susceptibility / Resilience). We adopt only the
38
  override-behavior of multiplicative form, as a "max-empirical floor"
39
  (§4 below), because we have no resilience term.
40
- - **Kim et al. (2019)**, *Scientific Reports* 9:18564 — additive vs
41
  geometric aggregation; additive is more transparent and reproducible
42
  *if* sub-indices are pre-grouped thematically. Done.
43
 
@@ -58,15 +58,15 @@ Binary "inside zone" indicators with weights ordered by agency tiering:
58
 
59
  | Indicator | Weight | Citation |
60
  |---------------------------------|-------:|----------|
61
- | FEMA NFHL 1% (SFHA) | 1.00 | FEMA NFHL — regulatory mandate threshold |
62
- | FEMA NFHL 0.2% | 0.50 | FEMA NFHL — tail scenario |
63
  | NYC DEP Moderate-2050 + 2.5 ft | 0.75 | NYC DEP Stormwater Maps 2021; NPCC4 Ch.3 |
64
- | NYC DEP Extreme-2080 + SLR | 0.50 | NYC DEP Stormwater Maps 2021 — explicitly tail |
65
  | NYC DEP Tidal-2050 | 0.75 | NPCC4 Ch.3 coastal projection |
66
 
67
  Why DEP-2050 outranks DEP-2080: NPCC4 designates the 2080 extreme
68
  scenario as a **tail** projection. Closer-horizon coastal/pluvial
69
- maps β€” those a current planner can act on β€” get the higher weight.
70
 
71
  ### 3.2 Hydrological sub-index
72
 
@@ -99,7 +99,7 @@ Mix of binary observed-extent flags and banded count signals:
99
  | FloodNet trigger (3 yr) | 0.75 | FloodNet NYC; NPCC4 Ch.3 references |
100
 
101
  The 311 and FloodNet weights are capped at 0.75 because both signals
102
- have documented coverage and reporting bias β€” 311 reflects civic
103
  engagement as well as flooding, FloodNet has uneven spatial coverage.
104
  Sandy and HWMs are 1.0 because they're engineered ground-truth
105
  observations.
@@ -116,7 +116,7 @@ Bands for 311 count (200 m buffer, 5-year window):
116
  ## 4. Max-empirical floor
117
 
118
  If **Sandy 2012 inundation** OR **a USGS Ida HWM within 100 m** fired,
119
- the tier is capped at **2 (Elevated)** β€” it cannot be worse, regardless
120
  of the additive composite.
121
 
122
  This recovers the *important* multiplicative behaviour Balica 2012
@@ -126,7 +126,7 @@ floor (a `min(tier, 2)` after composition) rather than a full
126
  multiplicative form so the composite remains additive and auditable.
127
 
128
  The 100 m radius is chosen because USGS HWM positional uncertainty is
129
- typically 5–30 m horizontal β€” 100 m gives ~3Οƒ headroom for a confident
130
  "this address was inundated" signal.
131
 
132
  ## 5. Composite → tier mapping
@@ -177,7 +177,7 @@ Riprap's tier is **not**:
177
 
178
  It **is**:
179
 
180
- - An exposure prior β€” a literature-grounded, deterministic, reproducible
181
  index of how many publicly-documented flood signals overlap this
182
  address.
183
  - Auditable end-to-end: every term has a published source; every weight
@@ -200,7 +200,7 @@ These appear next to the tier badge and in the methodology disclosure:
200
 
201
  > **DEP 2050/2080 and FEMA 0.2% are bounding scenarios, not forecasts.**
202
  > The tier reads them as "if this scenario materialized, this address
203
- > would be inside its footprint" β€” not "this is the expected future."
204
 
205
  > **Compound flooding is not separately modeled.** Concurrence of rain
206
  > + tide + groundwater is the residual research frontier (NPCC4 Ch. 3).
@@ -244,7 +244,7 @@ Kim, S. et al. (2019). "Assessment of Aggregation Frameworks for
244
  Composite Indicators in Measuring Flood Vulnerability to Climate
245
  Change." *Scientific Reports* 9:18564.
246
 
247
- Nobre, A. D. et al. (2011). "Height Above the Nearest Drainage β€” A
248
  Hydrologically Relevant New Terrain Model." *Journal of Hydrology*
249
  404(1–2): 13–29.
250
 
 
1
+ # Riprap scoring methodology
2
 
3
  > Riprap produces a **flood-exposure tier (1–4) per NYC address**, not
4
  > a calibrated damage probability. The tier is a deterministic
 
8
  ## 1. Why this design
9
 
10
  Closed-methodology scores (First Street, Jupiter, Fathom) are useful
11
+ products but uncitable in civic work. A NYCEM grant writer can't quote
12
  "0.73" in a FEMA BRIC sub-application without a defensible audit trail.
13
  At the same time, an LLM-emitted score would be non-reproducible and
14
  uncalibrated, with documented LLM-as-judge pathologies (Zheng et al.
 
25
  indicator vulnerability/exposure literature:
26
 
27
  - **Cutter, Boruff & Shirley (2003)**, *Social Science Quarterly* 84(2):
28
+ 242–261. The SoVI hazards-of-place pattern: group indicators
29
  thematically; sum factors with equal weights because there is no
30
  defensible theoretical basis for differential weighting.
31
+ - **Tate (2012)**, *Natural Hazards* 63: 325–347. Explicit Monte Carlo
32
  sensitivity analysis showing that hierarchical equal-weighted
33
  composites are the most rank-stable. This is why we use equal weights
34
  *within* sub-indices.
35
  - **Balica, Wright & van der Meulen (2012)**, *Natural Hazards* 64:
36
+ 73–105. Coastal City Flood Vulnerability Index, multiplicative
37
  (Exposure × Susceptibility / Resilience). We adopt only the
38
  override-behavior of multiplicative form, as a "max-empirical floor"
39
  (§4 below), because we have no resilience term.
40
+ - **Kim et al. (2019)**, *Scientific Reports* 9:18564. Additive vs
41
  geometric aggregation; additive is more transparent and reproducible
42
  *if* sub-indices are pre-grouped thematically. Done.
43
 
 
58
 
59
  | Indicator | Weight | Citation |
60
  |---------------------------------|-------:|----------|
61
+ | FEMA NFHL 1% (SFHA) | 1.00 | FEMA NFHL. Regulatory mandate threshold |
62
+ | FEMA NFHL 0.2% | 0.50 | FEMA NFHL. Tail scenario |
63
  | NYC DEP Moderate-2050 + 2.5 ft | 0.75 | NYC DEP Stormwater Maps 2021; NPCC4 Ch.3 |
64
+ | NYC DEP Extreme-2080 + SLR | 0.50 | NYC DEP Stormwater Maps 2021. Explicitly tail |
65
  | NYC DEP Tidal-2050 | 0.75 | NPCC4 Ch.3 coastal projection |
66
 
67
  Why DEP-2050 outranks DEP-2080: NPCC4 designates the 2080 extreme
68
  scenario as a **tail** projection. Closer-horizon coastal/pluvial
69
+ maps (those a current planner can act on) get the higher weight.
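
A minimal Python sketch of how the regulatory sub-index in the table above could be tallied. The weights are copied from the table. The aggregation rule (a plain sum of the weights of the indicators that fired), the function name `regulatory_subindex`, and the dictionary keys are assumptions made for illustration; this hunk does not show how Riprap actually combines the indicators.

```python
# Weights copied from the regulatory sub-index table.
# The keys are hypothetical identifiers, not Riprap's own field names.
REGULATORY_WEIGHTS = {
    "fema_nfhl_1pct": 1.00,     # FEMA NFHL 1% (SFHA)
    "fema_nfhl_0_2pct": 0.50,   # FEMA NFHL 0.2%
    "dep_moderate_2050": 0.75,  # NYC DEP Moderate-2050 + 2.5 ft
    "dep_extreme_2080": 0.50,   # NYC DEP Extreme-2080 + SLR
    "dep_tidal_2050": 0.75,     # NYC DEP Tidal-2050
}


def regulatory_subindex(inside_zone: dict[str, bool]) -> float:
    """Sum the weights of every indicator whose zone contains the address.

    ASSUMPTION: plain additive aggregation with no normalisation.
    Indicators missing from `inside_zone` are treated as not fired.
    """
    unknown = set(inside_zone) - set(REGULATORY_WEIGHTS)
    if unknown:
        raise KeyError(f"unknown indicator(s): {sorted(unknown)}")
    return sum(
        (weight for name, weight in REGULATORY_WEIGHTS.items()
         if inside_zone.get(name, False)),
        0.0,
    )


if __name__ == "__main__":
    # An address inside the FEMA 1% zone and the DEP Extreme-2080 footprint.
    print(regulatory_subindex({"fema_nfhl_1pct": True, "dep_extreme_2080": True}))
```

Keeping the weights in one table-shaped dictionary means every term in the score can be traced back to a row of the methodology, which is the auditability property the document argues for.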
70
 
71
  ### 3.2 Hydrological sub-index
72
 
 
99
  | FloodNet trigger (3 yr) | 0.75 | FloodNet NYC; NPCC4 Ch.3 references |
100
 
101
  The 311 and FloodNet weights are capped at 0.75 because both signals
102
+ have documented coverage and reporting bias: 311 reflects civic
103
  engagement as well as flooding, FloodNet has uneven spatial coverage.
104
  Sandy and HWMs are 1.0 because they're engineered ground-truth
105
  observations.
 
116
  ## 4. Max-empirical floor
117
 
118
  If **Sandy 2012 inundation** OR **a USGS Ida HWM within 100 m** fired,
119
+ the tier is capped at **2 (Elevated)**. It cannot be worse, regardless
120
  of the additive composite.
121
 
122
  This recovers the *important* multiplicative behaviour Balica 2012
 
126
  multiplicative form so the composite remains additive and auditable.
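
The floor described above can be sketched in a few lines of Python. The `min(tier, 2)` step is taken from the text; the function name, the argument names, and the reading that a lower tier number means greater exposure (inferred from the use of `min`) are assumptions for illustration only.

```python
def apply_max_empirical_floor(
    tier: int,
    sandy_inundated: bool,
    ida_hwm_within_100m: bool,
) -> int:
    """Apply the max-empirical floor after the additive composite.

    `tier` is the tier (1-4) produced by the additive composite.
    If either empirical signal fired, the result is min(tier, 2);
    otherwise the composite tier passes through unchanged.
    """
    if not 1 <= tier <= 4:
        raise ValueError(f"tier must be in 1..4, got {tier}")
    if sandy_inundated or ida_hwm_within_100m:
        return min(tier, 2)
    return tier


if __name__ == "__main__":
    # Composite says tier 4, but the address sits in the Sandy 2012 extent.
    print(apply_max_empirical_floor(4, sandy_inundated=True, ida_hwm_within_100m=False))
```

Because the override runs as a separate step after composition, the composite itself stays a plain sum that can be audited term by term.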
127
 
128
  The 100 m radius is chosen because USGS HWM positional uncertainty is
129
+ typically 5–30 m horizontal, so 100 m gives ~3σ headroom for a confident
130
  "this address was inundated" signal.
131
 
132
  ## 5. Composite → tier mapping
 
177
 
178
  It **is**:
179
 
180
+ - An exposure prior: a literature-grounded, deterministic, reproducible
181
  index of how many publicly-documented flood signals overlap this
182
  address.
183
  - Auditable end-to-end: every term has a published source; every weight
 
200
 
201
  > **DEP 2050/2080 and FEMA 0.2% are bounding scenarios, not forecasts.**
202
  > The tier reads them as "if this scenario materialized, this address
203
+ > would be inside its footprint", not "this is the expected future."
204
 
205
  > **Compound flooding is not separately modeled.** Concurrence of rain
206
  > + tide + groundwater is the residual research frontier (NPCC4 Ch. 3).
 
244
  Composite Indicators in Measuring Flood Vulnerability to Climate
245
  Change." *Scientific Reports* 9:18564.
246
 
247
+ Nobre, A. D. et al. (2011). "Height Above the Nearest Drainage: A
248
  Hydrologically Relevant New Terrain Model." *Journal of Hydrology*
249
  404(1–2): 13–29.
250
 
MONDAY.md CHANGED
@@ -8,7 +8,7 @@ Demo is **Sunday May 10**.
8
  Eight priorities closed against `audit/2026-05-03-evening-audit.md`:
9
 
10
  1. `pitch/cold_open.md` restored (was accidentally deleted in 1cb5ee6).
11
- 2. Granite Guardian / refusal-classification leftovers removed β€”
12
  Mellea is the sole grounding mechanism, period.
13
  3. **Trace UI is now clickable.** Click any specialist row to reveal
14
  its raw structured output (formatted JSON, copy button,
@@ -23,9 +23,9 @@ Eight priorities closed against `audit/2026-05-03-evening-audit.md`:
23
  NYCHA-centroid) coloured by Sandy exposure with click popups
24
  showing name + `[doc_id]`. NYCHA polygon-fill is queued for when
25
  `geometry_geojson` lands in the dataclass.
26
- 6. **`floodnet_forecast` specialist** β€” TTM r2 forecast on the
27
  nearest FloodNet sensor's flood-event recurrence. Reuses the
28
- (512, 96) singleton already loaded for `ttm_311_forecast` β€”
29
  *no new model class loaded into memory*. The strongest single
30
  TTM win for the NYU CUSP audience.
31
  7. Trace UI groups TTM specialists under one parent node
@@ -60,7 +60,7 @@ fine-tune prep green.
60
 
61
  End-to-end smoke on "Coney Island Brooklyn" produced citations
62
  `[mta_entrance_56]`, `[nycha_dev_239]`, `[nycha_dev_166]` alongside
63
- `[rag_mta]` and `[nyc311]` β€” family-prefix chip routing works.
64
 
65
  Last commit: `86861be` (FSM integration of 4 register specialists).
66
 
@@ -85,7 +85,7 @@ Last commit: `86861be` (FSM integration of 4 register specialists).
85
  GPU with vLLM; both fit on one MI300X.
86
  - **TerraMind manifest is 1028 paired chips**, 2021-05 → 2026-04,
87
  NYC 5-borough hull +5 km, S2-cloud <30%, ≤3-day pair window. One
88
- year (2022-05 β†’ 2023-04) returned 0 due to PC API intermittency β€”
89
  acceptable for the micro-fine-tune.
90
 
91
  ## First thing Monday morning
@@ -112,7 +112,7 @@ Last commit: `86861be` (FSM integration of 4 register specialists).
112
  - A: ship the fine-tuned checkpoint as a Riprap specialist.
113
  - B: publish to HF as `msradam/TerraMind-1.0-NYC` with model card,
114
  don't ship in demo. **Bucket B is fully acceptable** per the
115
- spec β€” civic-tech publication discipline is the durable goal.
116
  - C: discard checkpoint, no public artefact.
117
 
118
  ## Working on Monday
@@ -166,10 +166,10 @@ In rough priority order:
166
  tree.
167
  3. **WCAG 2.2 AA pass.**
168
  4. **Methodology paper draft** (6-8 page PDF). Goal: Saturday May 9.
169
- 5. **Historical-event mode** β€” vintage-cutoff queries. Saturday.
170
  6. **Five Build-in-Public posts** through the week.
171
  7. **5-minute hackathon pitch + 3 demo queries.** Friday rehearsal.
172
- 8. **ASCE talk materials** β€” May 13 (post-hackathon).
173
 
174
  ## Sharp edges to remember
175
 
@@ -196,14 +196,14 @@ In rough priority order:
196
  ## Files to read in order on Monday morning
197
 
198
  1. This file.
199
- 2. `experiments/05_terramind_nyc_finetune/eval/eval_spec.md` β€” the
200
  contract for what training output triggers ship/publish/revert.
201
- 3. `experiments/06_granite_guardian/RESULTS.md` β€” the Guardian β†’
202
  planner pivot decision record (so you know why Guardian is in
203
  the repo but not on the demo path).
204
- 4. `experiments/07_mta_entrances/RESULTS.md` β€” the canonical
205
  register-specialist pattern (the other three follow it).
206
- 5. `CLAUDE.md` β€” fix the AMD droplet IP (165.245.134.44 β†’
207
  129.212.182.52) at the same time as the first edit of the day.
208
 
209
  ## Status as of 2026-05-03 ~12:50 ET
 
8
  Eight priorities closed against `audit/2026-05-03-evening-audit.md`:
9
 
10
  1. `pitch/cold_open.md` restored (was accidentally deleted in 1cb5ee6).
11
+ 2. Granite Guardian / refusal-classification leftovers removed.
12
  Mellea is the sole grounding mechanism, period.
13
  3. **Trace UI is now clickable.** Click any specialist row to reveal
14
  its raw structured output (formatted JSON, copy button,
 
23
  NYCHA-centroid) coloured by Sandy exposure with click popups
24
  showing name + `[doc_id]`. NYCHA polygon-fill is queued for when
25
  `geometry_geojson` lands in the dataclass.
26
+ 6. **`floodnet_forecast` specialist**. TTM r2 forecast on the
27
  nearest FloodNet sensor's flood-event recurrence. Reuses the
28
+ (512, 96) singleton already loaded for `ttm_311_forecast`:
29
  *no new model class loaded into memory*. The strongest single
30
  TTM win for the NYU CUSP audience.
31
  7. Trace UI groups TTM specialists under one parent node
 
60
 
61
  End-to-end smoke on "Coney Island Brooklyn" produced citations
62
  `[mta_entrance_56]`, `[nycha_dev_239]`, `[nycha_dev_166]` alongside
63
+ `[rag_mta]` and `[nyc311]`. Family-prefix chip routing works.
64
 
65
  Last commit: `86861be` (FSM integration of 4 register specialists).
66
 
 
85
  GPU with vLLM; both fit on one MI300X.
86
  - **TerraMind manifest is 1028 paired chips**, 2021-05 → 2026-04,
87
  NYC 5-borough hull +5 km, S2-cloud <30%, ≤3-day pair window. One
88
+ year (2022-05 → 2023-04) returned 0 due to PC API intermittency;
89
  acceptable for the micro-fine-tune.
90
 
91
  ## First thing Monday morning
 
112
  - A: ship the fine-tuned checkpoint as a Riprap specialist.
113
  - B: publish to HF as `msradam/TerraMind-1.0-NYC` with model card,
114
  don't ship in demo. **Bucket B is fully acceptable** per the
115
+ spec. Civic-tech publication discipline is the durable goal.
116
  - C: discard checkpoint, no public artefact.
117
 
118
  ## Working on Monday
 
166
  tree.
167
  3. **WCAG 2.2 AA pass.**
168
  4. **Methodology paper draft** (6-8 page PDF). Goal: Saturday May 9.
169
+ 5. **Historical-event mode**. Vintage-cutoff queries. Saturday.
170
  6. **Five Build-in-Public posts** through the week.
171
  7. **5-minute hackathon pitch + 3 demo queries.** Friday rehearsal.
172
+ 8. **ASCE talk materials**. May 13 (post-hackathon).
173
 
174
  ## Sharp edges to remember
175
 
 
196
  ## Files to read in order on Monday morning
197
 
198
  1. This file.
199
+ 2. `experiments/05_terramind_nyc_finetune/eval/eval_spec.md`. The
200
  contract for what training output triggers ship/publish/revert.
201
+ 3. `experiments/06_granite_guardian/RESULTS.md`. The Guardian →
202
  planner pivot decision record (so you know why Guardian is in
203
  the repo but not on the demo path).
204
+ 4. `experiments/07_mta_entrances/RESULTS.md`. The canonical
205
  register-specialist pattern (the other three follow it).
206
+ 5. `CLAUDE.md`. Fix the AMD droplet IP (165.245.134.44 →
207
  129.212.182.52) at the same time as the first edit of the day.
208
 
209
  ## Status as of 2026-05-03 ~12:50 ET
README.md CHANGED
@@ -11,12 +11,12 @@ pinned: false
11
  <img src="assets/logo@2x.png" width="72" height="72" alt="Riprap dam mark" />
12
  </p>
13
 
14
- # Riprap β€” citation-grounded NYC flood-exposure briefings
15
 
16
- Riprap takes any NYC address (or neighborhood, or development-permit query)
17
- and produces a four-section briefing β€” **Status / Empirical evidence /
18
- Modeled scenarios / Policy context** β€” where every numeric claim is
19
- anchored to a `[doc_id]` citation pointing back into the source document.
20
 
21
  The Capstone reconciler is **Granite 4.1** (8B, served via Ollama on T4
22
  or vLLM on AMD MI300X), wrapped in **Mellea**-validated rejection
@@ -29,11 +29,11 @@ Live demo: <https://lablab-ai-amd-developer-hackathon-riprap-nyc.hf.space>
29
 
30
  ---
31
 
32
- ## How Riprap works β€” the Five Stones
33
 
34
- Behind every briefing, ~25 atomic specialists fan out across NYC datasets,
35
- satellite imagery, sensors, and forecasts. The **Five Stones** are a
36
- re-grouping of those specialists into five legible roles:
37
 
38
  > **Cornerstone** remembers. **Keystone** tallies. **Touchstone**
39
  > watches. **Lodestone** projects. **Capstone** writes it all down with
@@ -41,11 +41,11 @@ re-grouping of those specialists into five legible roles:
41
 
42
  | Stone | Role | What fires |
43
  |---|---|---|
44
- | **Cornerstone** | The Hazard Reader β€” what NYC's ground remembers | Sandy 2012 inundation extent, NYC DEP stormwater scenarios, 2021 Ida USGS high-water marks, baked Prithvi-EO Ida-attributable polygons, USGS 3DEP DEM + HAND/TWI |
45
- | **Keystone** | The Asset Register β€” what's exposed | MTA subway entrances, NYCHA developments, NYC DOE schools, NYS DOH hospitals, **TerraMind-NYC Buildings LoRA** |
46
- | **Touchstone** | The Live Observer β€” current state of the city | FloodNet ultrasonic depth sensors, NYC 311 flood complaints, NWS hourly METAR, NOAA tide-gauge water levels, **Prithvi-EO 2.0 NYC-Pluvial v2**, **TerraMind-NYC LULC LoRA** |
47
- | **Lodestone** | The Projector β€” what's coming | NWS public flood alerts, Granite TTM r2 surge nowcast (zero-shot, 6-min cadence, 9.6 h horizon), per-address 311 weekly forecast, FloodNet sensor recurrence forecast, **Granite-TTM-r2-Battery-Surge fine-tune** (96 h hourly horizon) |
48
- | **Capstone** | The Synthesiser β€” citation-grounded briefing | Granite 4.1 + Mellea rejection sampling |
49
 
50
  The four data-Stones run sequentially per query; the Capstone reconciles
51
  their documents into one cited paragraph.
@@ -55,45 +55,45 @@ their documents into one cited paragraph.
55
  ## NYC-specialised foundation models (Apache 2.0)
56
 
57
  Three NYC-specific fine-tunes built on AMD Instinct MI300X via AMD
58
- Developer Cloud, published under permissive licence:
59
-
60
- - **[`msradam/TerraMind-NYC-Adapters`](https://huggingface.co/msradam/TerraMind-NYC-Adapters)**
61
- β€” LoRA family on TerraMind 1.0 base. LULC mIoU 0.5866 (+6.13 pp over
62
- full-FT baseline), TiM 0.6023, Buildings 0.5511. Trained in ~18 min on
63
- a single MI300X.
64
- - **[`msradam/Prithvi-EO-2.0-NYC-Pluvial`](https://huggingface.co/msradam/Prithvi-EO-2.0-NYC-Pluvial)**
65
- β€” NYC pluvial-flood fine-tune of Prithvi-EO 2.0. Test flood IoU
66
- 0.5979 vs 0.10 on the Sen1Floods11 base β€” a 6Γ— lift. LovΓ‘sz-Softmax
67
- loss + copy-paste augmentation.
68
- - **[`msradam/Granite-TTM-r2-Battery-Surge`](https://huggingface.co/msradam/Granite-TTM-r2-Battery-Surge)**
69
- β€” NYC Battery storm-surge nowcast fine-tune of Granite TimeSeries TTM
70
- r2. Test MAE 0.1091 m, −41% vs persistence and −25% vs zero-shot.
71
-
72
- All three are loaded at runtime by their respective FSM specialists in
 
 
73
  `app/context/` and `app/live/`.
74
 
75
  ---
76
 
77
  ## Architecture pointers
78
 
79
- - `app/stones/` β€” the Stones taxonomy (NAME / TAGLINE / SOURCES /
80
- collect()) over the FSM specialists.
81
- - `app/fsm.py` β€” Burr FSM with one specialist per `@action`.
82
- - `app/reconcile.py` β€” `build_documents()` emits Granite-native
83
  document-role messages in canonical Stone order.
84
- - `app/mellea_validator.py` β€” strict reconcile path (4-check rejection
85
  sampling).
86
- - `app/llm.py` β€” LiteLLM Router shim. Routes to Ollama (T4 / local) or
87
  vLLM (AMD MI300X) without changing caller code.
88
- - `web/main.py` β€” FastAPI + SSE. The stream emits
89
  `plan / step / token / mellea_attempt / final` events plus the
90
  `stone_start / stone_done` envelope around each Stone group.
91
- - `web/sveltekit/` β€” primary UI (SvelteKit + adapter-static).
92
- - `web/svelte/` + `web/static/` β€” legacy custom-element bundle, still
93
- serving `/legacy`, `/single`, `/compare`.
94
- - `experiments/18_terramind_nyc_lora/` /
95
- `experiments/19_prithvi_nyc_v2/` /
96
- `experiments/20_ttm_battery_surge/` β€” full reproduction recipes for
97
  the three HF artifacts above.
98
 
99
  ---
@@ -112,6 +112,9 @@ RIPRAP_LLM_API_KEY=<token> \
112
 
113
  # Programmatic Mellea probe (server must be running)
114
  .venv/bin/python scripts/probe_mellea.py --query "Hollis" --runs 5
 
 
 
115
  ```
116
 
117
  ---
@@ -122,7 +125,7 @@ Apache 2.0 (this repository). The three NYC-specialised models above
122
  are also Apache 2.0; underlying upstream models retain their own
123
  permissive licences (see each `MODEL_CARD.md`).
124
 
125
- Check out the HF Space configuration reference at
126
  <https://huggingface.co/docs/hub/spaces-config-reference>.
127
 
128
  ---
 
11
  <img src="assets/logo@2x.png" width="72" height="72" alt="Riprap dam mark" />
12
  </p>
13
 
14
+ # Riprap: citation-grounded NYC flood-exposure briefings
15
 
16
+ Riprap takes any NYC address (or neighborhood, or development-permit
17
+ query) and produces a four-section briefing (Status, Empirical evidence,
18
+ Modeled scenarios, Policy context). Every numeric claim is anchored to
19
+ a `[doc_id]` citation pointing back into the source document.
20
 
21
  The Capstone reconciler is **Granite 4.1** (8B, served via Ollama on T4
22
  or vLLM on AMD MI300X), wrapped in **Mellea**-validated rejection
 
29
 
30
  ---
31
 
32
+ ## How Riprap works: the Five Stones
33
 
34
+ Behind every briefing, around 25 atomic data probes fan out across NYC
35
+ datasets, satellite imagery, sensors, and forecasts. The **Five Stones**
36
+ group those probes into five legible roles:
37
 
38
  > **Cornerstone** remembers. **Keystone** tallies. **Touchstone**
39
  > watches. **Lodestone** projects. **Capstone** writes it all down with
 
41
 
42
  | Stone | Role | What fires |
43
  |---|---|---|
44
+ | **Cornerstone** | The Hazard Reader. What NYC's ground remembers. | Sandy 2012 inundation extent, NYC DEP stormwater scenarios, 2021 Ida USGS high-water marks, baked Prithvi-EO Ida-attributable polygons, USGS 3DEP DEM + HAND/TWI |
45
+ | **Keystone** | The Asset Register. What's exposed. | MTA subway entrances, NYCHA developments, NYC DOE schools, NYS DOH hospitals, **TerraMind-NYC Buildings LoRA** |
46
+ | **Touchstone** | The Live Observer. Current state of the city. | FloodNet ultrasonic depth sensors, NYC 311 flood complaints, NWS hourly METAR, NOAA tide-gauge water levels, **Prithvi-EO 2.0 NYC-Pluvial v2**, **TerraMind-NYC LULC LoRA** |
47
+ | **Lodestone** | The Projector. What's coming. | NWS public flood alerts, Granite TTM r2 surge nowcast (zero-shot, 6-min cadence, 9.6 h horizon), per-address 311 weekly forecast, FloodNet sensor recurrence forecast, **Granite-TTM-r2-Battery-Surge fine-tune** (96 h hourly horizon) |
48
+ | **Capstone** | The Synthesiser. Citation-grounded briefing. | Granite 4.1 + Mellea rejection sampling |
49
 
50
  The four data-Stones run sequentially per query; the Capstone reconciles
51
  their documents into one cited paragraph.
 
55
  ## NYC-specialised foundation models (Apache 2.0)
56
 
57
  Three NYC-specific fine-tunes built on AMD Instinct MI300X via AMD
58
+ Developer Cloud, published under permissive licence.
59
+
60
+ **[`msradam/TerraMind-NYC-Adapters`](https://huggingface.co/msradam/TerraMind-NYC-Adapters).**
61
+ LoRA family on TerraMind 1.0 base. LULC mIoU 0.5866 (+6.13 pp over
62
+ full-FT baseline), TiM 0.6023, Buildings 0.5511. Trained in around 18
63
+ minutes on a single MI300X.
64
+
65
+ **[`msradam/Prithvi-EO-2.0-NYC-Pluvial`](https://huggingface.co/msradam/Prithvi-EO-2.0-NYC-Pluvial).**
66
+ NYC pluvial-flood fine-tune of Prithvi-EO 2.0. Test flood IoU 0.5979
67
+ vs 0.10 on the Sen1Floods11 base, a 6× lift. Lovász-Softmax loss with
68
+ copy-paste augmentation.
69
+
70
+ **[`msradam/Granite-TTM-r2-Battery-Surge`](https://huggingface.co/msradam/Granite-TTM-r2-Battery-Surge).**
71
+ NYC Battery storm-surge nowcast fine-tune of Granite TimeSeries TTM
72
+ r2. Test MAE 0.1091 m, −41% vs persistence and −25% vs zero-shot.
73
+
74
+ All three are loaded at runtime by their respective FSM probes in
75
  `app/context/` and `app/live/`.
76
 
77
  ---
78
 
79
  ## Architecture pointers
80
 
81
+ - `app/stones/`: the Stones taxonomy (NAME / TAGLINE / SOURCES /
82
+ collect()) over the FSM probes.
83
+ - `app/fsm.py`: Burr FSM with one probe per `@action`.
84
+ - `app/reconcile.py`: `build_documents()` emits Granite-native
85
  document-role messages in canonical Stone order.
86
+ - `app/mellea_validator.py`: strict reconcile path (4-check rejection
87
  sampling).
88
+ - `app/llm.py`: LiteLLM Router shim. Routes to Ollama (T4 / local) or
89
  vLLM (AMD MI300X) without changing caller code.
90
+ - `web/main.py`: FastAPI + SSE. The stream emits
91
  `plan / step / token / mellea_attempt / final` events plus the
92
  `stone_start / stone_done` envelope around each Stone group.
93
+ - `web/sveltekit/`: primary UI (SvelteKit + adapter-static).
94
+ - `experiments/18_terramind_nyc_lora/`,
95
+ `experiments/19_prithvi_nyc_v2/`,
96
+ `experiments/20_ttm_battery_surge/`: full reproduction recipes for
 
 
97
  the three HF artifacts above.
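
For readers who want to consume the stream described under `web/main.py`, here is a minimal standard-library sketch of a Server-Sent Events parser. The event names (`stone_start`, `step`, `final`, and so on) come from the README. The payload contents in the sample and the function name `parse_sse` are hypothetical, and nothing here is taken from Riprap's own client code.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (event_name, data) pairs from an iterable of SSE text lines.

    Follows the basic SSE framing: `event:` sets the name, `data:` lines
    accumulate, and a blank line dispatches the event. Comment lines
    (starting with ':') are skipped. An event with no data is dropped.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Only a single leading space after the colon is stripped.
            data.append(value[1:] if value.startswith(" ") else value)


if __name__ == "__main__":
    # Hypothetical payloads; only the event names come from the README.
    sample = [
        "event: stone_start", "data: Cornerstone", "",
        "event: step", "data: sandy", "",
        "event: stone_done", "data: Cornerstone", "",
        "event: final", "data: done", "",
    ]
    for name, payload in parse_sse(sample):
        print(name, payload)
```

A client can use the `stone_start` / `stone_done` pair to group the intervening `step` events under the Stone they belong to.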
98
 
99
  ---
 
112
 
113
  # Programmatic Mellea probe (server must be running)
114
  .venv/bin/python scripts/probe_mellea.py --query "Hollis" --runs 5
115
+
116
+ # End-to-end address suite (5 NYC addresses, intent-aware checks)
117
+ .venv/bin/python scripts/probe_addresses.py
118
  ```
119
 
120
  ---
 
125
  are also Apache 2.0; underlying upstream models retain their own
126
  permissive licences (see each `MODEL_CARD.md`).
127
 
128
+ HF Space configuration reference:
129
  <https://huggingface.co/docs/hub/spaces-config-reference>.
130
 
131
  ---
RESEARCH.md CHANGED
@@ -1,4 +1,4 @@
1
- # Riprap β€” landscape research
2
 
3
  Captured 2026-05-06 as part of the AMD x lablab.ai hackathon polish
4
  phase. This document underpins the pitch deck (`slides/deck.md`) and
@@ -9,133 +9,133 @@ re-using any specific figure.
9
 
10
  ## What Riprap is, distinctly
11
 
12
- A **citation-grounded LLM that writes audit-quality flood-exposure
13
- briefings for NYC addresses** by fusing 17+ live, historical,
14
- modeled, and projected data sources, with **Mellea rejection
15
- sampling** that refuses to publish a numeric claim it can't cite.
16
- The output isn't a score β€” it's a four-section prose briefing with
17
- `[doc_id]` citations on every numeric assertion, where each
18
- `doc_id` resolves to one specific dataset (Sandy 2012 zone, NYC
19
- DEP scenario, USGS HWM, Sentinel-2 chip, NOAA gauge reading,
20
- NPCC4 SLR projection).
21
-
22
- Granite 4.1 8B drives the prose; Granite Embedding 278M + GLiNER
23
- drive policy-doc retrieval; Prithvi-EO 2.0 + TerraMind LULC/
24
- Buildings + Granite TTM r2 drive the EO and forecast specialists,
25
  with three Apache-2.0 NYC fine-tunes trained on AMD MI300X published
26
  on HF Hub.
27
 
28
- The architectural commitments other tools don't make:
29
 
30
- 1. **Silence over confabulation** β€” when a specialist returns no
31
- data, the briefing omits the section rather than papering over it.
32
- 2. **Five-stone epistemic structure** so the user can see what's
33
  empirical vs modeled vs proxy vs synthetic.
34
- 3. **Fully open-source pipeline** β€” Apache-2.0 end-to-end on
35
- public-record data, no commercial APIs touched at runtime.
36
- 4. **Deployable on either local Ollama or AMD MI300X via vLLM**
37
- with auto-failover.
38
 
39
- Stack as of 2026-05-06: SvelteKit UI on HF Spaces (cpu-basic) at
40
- the AMD-hackathon org, FastAPI agent FSM, two-container droplet
41
- (vLLM + riprap-models) on MI300X, full address probe suite at 5/5
42
- PASS in 5.8–13.1 s end-to-end.
43
 
44
  ---
45
 
46
  ## Landscape map
47
 
48
- ### Direct comps β€” score-based property risk tools
49
 
50
  | Tool | What it gives | Who it serves | Hidden cost |
51
  |---|---|---|---|
52
- | **First Street Risk Factor** (Flood Factor) | Score 1–10 + 30-yr risk narrative; powers Redfin, Realtor.com (until Dec 2025 also Zillow) | Homebuyers; some lenders | Closed model; commercial partnerships; Zillow removed it under industry pressure in Dec 2025 |
53
- | **ClimateCheck** | Score 1–100 + ~30-page property report; 2050 projections | Homeowners + REIT/PE diligence | Subscription tiers; methodology behind paywall |
54
  | **Jupiter ClimateScore Global** | Enterprise SaaS / API; financial metrics (CapEx, OpEx, credit risk) | Banks, insurers, asset managers | Enterprise pricing; not consumer-facing |
55
- | **Cervest / Climate X / ICEYE** | Variants of above for ESG / reinsurance | Corporate finance & insurance | Same |
56
 
57
- Score-based tools all converge on the same shape: **one number,
58
- one chart, an explainer paragraph**. None show what claim is
59
- grounded in which dataset. None expose the audit trail.
60
 
61
  ### NYC-specific government tools
62
 
63
- - **FloodHelpNY** (City + State, IDEO-designed) β€” address lookup
64
- → flood-zone label + insurance + free resiliency audit. Forms-
65
- based; consumer-facing; doesn't fuse live signals.
66
- - **NYC Flood Hazard Mapper** β€” ArcGIS web map of FEMA + NPCC +
67
- Sandy + future scenarios. Static visualization; no narrative.
68
- - **NYC OEM Flood Maps page** β€” index of the above.
69
- - **EJNYC Flood Vulnerability Index** (released 2024-04 by
70
- Mayor's Office of Climate & EJ) β€” first-ever city FVI, used to
71
- direct spending under NY's "Disadvantaged Communities" framework
72
- (35% of climate spend by law).
73
- - **FloodNet NYC** (NYU + CUNY + city) β€” 350+ ultrasonic sensors
74
- at 1-min cadence, growing to 500 by end-2026. Has a public
75
  dashboard but no narrative layer.
76
 
77
  ### Federal / authoritative
78
 
79
- - **FEMA Flood Map Service Center / NFHL** β€” official; covers 90%+
80
- population; static GIS layer + PDFs. The disclosure-of-record but
81
- not a synthesis tool.
82
 
83
  ### Real-estate platforms (the volatile zone)
84
 
85
- - **Redfin** β€” still shows First Street Flood Factor on every
86
  listing.
87
- - **Realtor.com** β€” still shows it on 110M+ listings.
88
- - **Zillow** β€” **removed climate risk display in December 2025**
89
- under California Regional MLS pressure. Still links out but it's
90
- hidden. **This created a vacuum that an open citation-grounded
91
- alternative could fill.**
92
 
93
  ### Closest academic / AI comps
94
 
95
- - **Flood-LLM** (Brisbane, MDPI Sustainability 2026) β€” multi-source
96
  LLM for property-level flood risk, validated on Brisbane against
97
  official labels. Academic, not deployed; no Mellea-style citation
98
  discipline; no live signals.
99
- - **GIS-Integrated Flood LLM** (Tandfonline 2024) β€” LLM constrained
100
- by a flood knowledge graph + GIS interaction. Research artefact.
101
- - **FloodLense** (arXiv 2024) β€” UNet/RDN/ViT + LLM for satellite
102
  flood detection. Research; image-only.
103
 
104
  ---
105
 
106
- ## Where Riprap fits β€” differentiators that demo well
107
 
108
  Ranked by visibility in a 3-minute demo:
109
 
110
  1. **Citation prose vs scores.** Riprap returns *"Hurricane Sandy
111
- flooded this address on October 29–30, 2012, according to the
112
  empirical inundation zone [sandy]. 19 flood-related 311 service
113
- requests were logged within 200 m over five years [nyc311]."* Every
114
- number cites a doc; each doc resolves to a footer source row.
115
- First Street returns "Flood Factor 8/10". This gap is the demo.
116
- 2. **Live + historical + modeled + projected, in one paragraph.**
117
- Sandy 2012 (empirical), DEP 2080 stormwater scenarios (modeled),
118
- 311 last 5 years (proxy), FloodNet last 3 years (empirical,
 
119
  hyperlocal), NPCC4 SLR (projected), Granite TTM r2 surge nowcast
120
  (96-h forecast). No comp combines all four temporal modes.
121
  3. **Open-source NYC fine-tunes.** Three Apache-2.0 models
122
  (`Prithvi-EO-2.0-NYC-Pluvial`, `TerraMind-NYC-Adapters`,
123
- `Granite-TTM-r2-Battery-Surge`) trained on AMD MI300X β€” anyone
124
- can reproduce, fork to other cities, or audit. First Street's
125
- model is closed; ClimateCheck's methodology is behind a paywall.
126
  4. **AMD hardware story.** The whole stack runs on MI300X via vLLM
127
- (LLM) + a sibling ROCm container (specialists). All Apache-2.0.
128
  This is the AMD hackathon track's preferred narrative: open
129
  models, open infra, open data, real GPU acceleration.
130
  5. **Mellea grounding receipts.** The four checks
131
- (`numerics_grounded` / `no_placeholder_tokens` / `citations_dense`
132
- / `citations_resolve`) are the audit. The meta card surfaces
133
- "4/4 grounding checks passed, 1 reroll" β€” that's audit
134
- credibility no consumer comp shows.
135
  6. **Self-aware silence.** Touchstone shows "FloodNet sensor: 0
136
- events in 3 years" with `silent_by_design`. Lodestone shows
137
- "TTM Battery surge forecast: peak |residual| < 0.3 m, omitted."
138
- Most tools always render a value. Riprap's silence is a feature.
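
To make the grounding checks in item 5 concrete, here is a rough Python sketch of two of them. The check names `numerics_grounded` and `citations_resolve` come from the text. The logic below (sentence-level matching with regular expressions) is an assumption about what such checks might do. It is not Mellea's API and not Riprap's validator.

```python
import re

# A citation is a bracketed doc id such as [sandy] or [mta_entrance_56].
CITATION = re.compile(r"\[([A-Za-z0-9_]+)\]")


def citations_resolve(text: str, known_doc_ids: set[str]) -> bool:
    """True if every [doc_id] in the text names a known source document."""
    return all(doc_id in known_doc_ids for doc_id in CITATION.findall(text))


def numerics_grounded(text: str) -> bool:
    """True if every sentence containing a digit also carries a citation.

    ASSUMPTION: sentences end at '.', '!' or '?' followed by whitespace.
    Digits inside the citation tags themselves are ignored.
    """
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        without_tags = CITATION.sub("", sentence)
        if re.search(r"\d", without_tags) and not CITATION.search(sentence):
            return False
    return True


if __name__ == "__main__":
    briefing = (
        "Hurricane Sandy flooded this address on October 29-30, 2012, "
        "according to the empirical inundation zone [sandy]. "
        "19 flood-related 311 service requests were logged within 200 m "
        "over five years [nyc311]."
    )
    print(numerics_grounded(briefing))
    print(citations_resolve(briefing, {"sandy", "nyc311"}))
```

In a rejection-sampling loop, a draft that fails either check would be discarded and regenerated, which is the kind of "reroll" that the meta card in item 5 reports.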
139
 
140
  ---
141
 
@@ -147,19 +147,19 @@ part of the system. These are the demo arcs to rehearse.
147
  ### 1. Resident / homebuyer (the FloodHelpNY swap-in)
148
 
149
  > *"I'm thinking about renting an apartment at 80 Pioneer Street,
150
- > Brooklyn β€” should I worry?"*
151
 
152
  **Demo arc.** Type the address. Watch the planner classify
153
- `single_address`, then 19 specialists fire across the four Stones
154
- in ~13 s. Briefing names Sandy 2012 inundation, 65 311 complaints,
155
- 2 FloodNet sensors with 4 events including a 51 mm peak on a
156
- specific date, Ida 2021 HWM 130 m away, microtopo HAND 3.81 m + TWI
157
- 14.79 (very high saturation propensity). Footer shows 7+ named
158
- primary sources.
159
 
160
  **Demo hook.** "Compare what we just generated to First Street's
161
- number-and-bar-chart for the same address. Which would you trust
162
- to make a $4,000/month decision?"
163
 
164
  ### 2. Real-estate attorney / disclosure compliance
165
 
@@ -168,32 +168,32 @@ to make a $4,000/month decision?"
168
 
169
  **Demo arc.** Same single_address path. Briefing produces a citable
170
  narrative covering FEMA designation, prior flood claims (where
171
- present), terrain, recent complaints. Mellea grounding check is
172
- the *qualifier* β€” "this prose is grounded against four invariants
173
- and passed 4/4."
174
 
175
  **Demo hook.** New York's March-2024 amended Property Condition
176
- Disclosure Statement requires sellers to disclose flood history
177
- and FEMA-floodplain status. RPL §231-b requires every residential
178
- lease to disclose prior flood damage. Riprap is the citable
179
- narrative tool β€” show how the briefing maps line-by-line to the
180
- disclosure requirements.
181
 
182
  ### 3. NYC OEM / DEP planner
183
 
184
  > *"Hollis, Queens"*
185
 
186
- **Demo arc.** Neighborhood intent fires (9 steps), produces NTA-
187
- level briefing β€” 434 flood-related 311 over 3 years (87 catch-
188
  basin clogged, 42 street-flooding), 4.3% of neighborhood projected
189
- to flood under DEP moderate-2050 scenario, 25% of cells with
190
- HAND<1 m. RAG retrieval pulls relevant DEP/NPCC4 policy paragraphs.
191
 
192
  **Demo hook.** DEP just announced a $30B stormwater priority list
193
  (86 locations) and a $68M Brooklyn Bluebelt expansion in Prospect
194
- Park. Riprap supports the prioritization argument with citable
195
- per-NTA evidence. Pair with the EJNYC Flood Vulnerability Index for
196
- the EJ-spending overlay (35%-to-disadvantaged-communities legal
197
  mandate).
198
 
199
  ### 4. Insurance underwriter / actuary
@@ -201,27 +201,27 @@ mandate).
201
  > *"442 East Houston Street, Manhattan"*
202
 
203
  **Demo arc.** Same as resident demo, but emphasize the **provenance
204
- trace** UI β€” every Specialist row, every doc_id, every source URL,
205
  vintage, and tier glyph.
206
 
207
  **Demo hook.** When an underwriter writes a risk memo, the audit
208
  chain matters. First Street's "we used a proprietary catastrophe
209
  model" doesn't survive a regulator review the way "we used FEMA
210
- Sandy 2012 polygon, NYC DEP 2021 stormwater scenario, USGS Ida
211
- HWM Event 312, NOAA gauge 8518750, NWS station KNYC, Granite TTM
212
- r2 fine-tune (test MAE 0.1091 m vs 0.1467 zero-shot, citable)" does.
213
 
214
  ### 5. Climate journalist / advocacy
215
 
216
  > *"Coney Island, Brooklyn"*
217
 
218
- **Demo arc.** Neighborhood briefing β€” 87.5% of NTA in 2012 Sandy
219
  zone, 382 flood complaints over 3 years, 7.8% projected flooded
220
- under 2050 moderate, 38.9% of DEM cells with HAND<1 m, DEP
221
- extreme-2080 → 44.2% flooded.
222
 
223
  **Demo hook.** ProPublica/NYTimes/THE CITY-style data journalism.
224
- Every claim in a Riprap briefing is reproducible β€” anyone can paste
225
  the same query and get a near-identical narrative. The journalist
226
  can publish the briefing as the methods section.
227
 
@@ -229,77 +229,79 @@ can publish the briefing as the methods section.
229
 
230
  > *"What are they building in Gowanus and is it risky"*
231
 
232
- **Demo arc.** Planner classifies `development_check`. FSM pulls
233
- DOB filings + flood layers for the project sites. Briefing comments
234
  on which proposed buildings sit inside Sandy 2012, which intersect
235
  DEP extreme-2080, what the microtopo says.
236
 
237
- **Demo hook.** Pre-design siting check. The Gowanus rezoning is
238
- one of NYC's largest active development zones, well known to flood.
239
- Show how the tool surfaces flood concerns before architects pour
240
  concrete.
241
 
242
  ---
243
 
244
- ## Lateral & unexpected use cases
245
 
246
- 10 bets, ordered roughly from most-buildable to most-speculative.
247
 
248
  1. **Pre-storm cohort briefings.** Subscribe Riprap to NWS flood-
249
  watch alerts. When a watch lands, fan out one briefing per
250
- affected NTA + push to OEM / press / advocacy lists. Citable
251
  evidence on demand for the press cycle that follows.
252
  2. **Climate-grant evidence sections.** HUD CDBG-DR and FEMA BRIC
253
  applications need an auditable evidence base. Riprap auto-
254
  generates the "vulnerability assessment" section so a community
255
- group can apply for resilience funding without hiring a consultant.
 
256
  3. **Local Law disclosure boilerplate.** Plug into a brokerage's
257
- listing flow: when an agent enters an address, auto-generate
258
- the NY RPL §231-b lease addendum or §462(2) disclosure draft.
259
- ROI is high since the law took effect 2024 and many landlords
260
- are still figuring out compliance.
261
  4. **MTA station-hardening prioritization.** Riprap already has the
262
- MTA-entrance specialist (KEY-001 in the demo). Run the FSM
263
- across all subway entrances; rank by exposure × ridership. The
264
- MTA's October-2025 Climate Resilience Roadmap Update is the
265
- policy hook.
266
- 5. **DOE school siting.** When DOE reviews proposed school sites
267
- or selects schools for retrofit, Riprap briefings (with `expect_311_ge`
268
- + Sandy + DEP overlays) would catch flood exposure that form-
269
- style screens miss.
270
- 6. **Time-machine variant.** Re-run the FSM with snapshot data
271
- from a past date β€” *"what would Riprap have said about Hollis
272
- on August 31, 2021, the day before Ida?"* Useful for retrospective
273
- analysis, expert testimony, and stress-testing the system.
274
- 7. **Cross-city scaffold.** The architecture is NYC-specific by
275
- data choice, not by code. Port to Houston (post-Harvey + Hurricane
276
  Beryl 2024), Miami (king tides), Boston (CSO floods), Charleston
277
- (chronic tidal), with a per-city specialist set + RAG corpus.
278
  8. **Federation with FloodNet alerts.** When a sensor triggers a
279
  flood event NOW, fire a Riprap live_now briefing for the
280
  surrounding NTA: *"what's at stake in the next 6 hours."*
281
  Connects FloodNet's hyperlocal sensor reads to the OEM decision
282
  loop.
283
  9. **EJNYC × Riprap pairing.** Rank all 188 NTAs by Riprap-detected
284
- exposure, intersect with state DAC designations. Output: a map
285
- of "underserved + underwater" β€” the most underfunded high-
286
- exposure neighborhoods.
287
  10. **Court testimony / expert witness.** Citable, reproducible
288
- flood narrative as a court exhibit. The Mellea passes-record +
289
- provenance trace are the kind of artefact a regulator or judge
290
- can audit. Especially relevant after the December-2025 Zillow
291
- controversy created public discussion of climate-data integrity.
 
292
 
293
  ---
294
 
295
- ## Risks & framing
296
 
297
  - **Real-estate industry pushback.** December 2025: Zillow removed
298
  First Street's climate scores under MLS pressure because the data
299
  was hurting transaction volume. A free, citation-grounded
300
  alternative could face the same reflex. Riprap's defence is that
301
- it's a *narrative tool for professional analytical work*, not a
302
- buy/don't-buy verdict β€” keep the disclaimer footer prominent.
303
  - **Redlining hazard.** Exposure narratives can be misused by
304
  landlords or insurers to discriminate against high-flood-risk
305
  (often disproportionately disadvantaged) neighborhoods.
@@ -313,11 +315,11 @@ concrete.
313
  law. Don't position it as legal disclosure-of-record without a
314
  real-estate-attorney review.
315
  - **Cold-start latency.** First query after droplet redeploy is
316
- ~30 s while models warm. For demos, ping the Space + run one
317
- warm-up query 5 minutes before showtime.
318
- - **Geocoder edge cases.** "PS 188, Lower East Side" geocoded to
319
- a Brooklyn PS 188 in our test suite. For demos, pick fully-
320
- qualified street addresses; document the disambiguation behavior.
321
 
322
  ---
323
 
@@ -325,42 +327,42 @@ concrete.
325
 
326
  Concrete polish items the research surfaces, ranked by demo value:
327
 
328
- 1. **Sample-query pills on landing.** Six clickable pills below
329
- the search bar β€” one per persona above. Let the audience demo
330
  themselves.
331
- 2. **A "What this is" bar at the top of the landing.** Three-line:
332
  *"Citation-grounded NYC flood briefings. Every number cites a
333
  primary source. Open-source, public data, audit-grade synthesis."*
334
  3. **Compare-mode link from the briefing.** Once Riprap delivers a
335
  single_address briefing, surface "compare with another address"
336
  as a one-click affordance. The compare intent already exists in
337
  the planner.
338
- 4. **EJNYC-FVI overlay** on the map sidebar (#9 above) β€” Riprap's
339
  exposure × DAC designation, two clicks to a powerful editorial
340
  demo.
341
  5. **First-query warm-up message** during the cold start: *"loading
342
- specialists on AMD MI300X β€” first query after redeploy takes
343
- ~30 s; subsequent queries 5–13 s."*
344
 
345
  ---
346
 
347
  ## Sources
348
 
349
- - [First Street Foundation β€” Flood Factor methodology](https://firststreet.org/methodology/flood)
350
- - [FloodHelpNY β€” NYC + IDEO consumer tool](https://www.floodhelpny.org/en)
351
- - [ClimateCheck β€” flood risk methodology](https://climatecheck.com/risks/flood)
352
- - [Jupiter Intelligence β€” ClimateScore Global / FloodScore](https://www.jupiterintel.com/climatescore-global)
353
  - [FEMA Flood Map Service Center](https://msc.fema.gov/)
354
- - [NY State β€” RPL Β§231-b residential lease flood disclosure (2023)](https://www.nysenate.gov/legislation/bills/2021/S5472)
355
- - [NYSBA β€” Property Condition Disclosure flood-risk amendment (Mar 2024)](https://nysba.org/breaking-news-new-rules-on-property-condition-disclosure-and-flood-risk-go-into-effect-today/)
356
- - [CNN β€” Zillow removes climate risk data under industry pressure (Dec 2025)](https://www.cnn.com/2025/12/02/climate/zillow-climate-data-extreme-weather-first-street-redfin)
357
  - [NYC Stormwater Resiliency Plan](https://www.nyc.gov/assets/orr/pdf/publications/stormwater-resiliency-plan.pdf)
358
- - [FloodNet NYC β€” methodology + sensor network](https://www.floodnet.nyc/methodology)
359
- - [FloodNet WRR 2024 β€” peer-reviewed sensor paper](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2023WR036806)
360
- - [EJNYC Report β€” Mayor's Office of Climate & Environmental Justice](https://climate.cityofnewyork.us/ejnyc-report/the-state-of-environmental-justice-in-nyc/)
361
- - [Flood-LLM β€” Brisbane case study (MDPI 2026)](https://www.mdpi.com/2071-1050/18/6/2957)
362
  - [GIS-Integrated Flood LLM (Tandfonline 2024)](https://www.tandfonline.com/doi/full/10.1080/13658816.2024.2306167)
363
- - [THE CITY β€” Disadvantaged Communities flood funding (NY Climate Law)](https://www.thecity.nyc/2022/05/02/billions-ny-climate-law-disadvantaged-communities-flood/)
364
- - [Inman β€” Redfin First Street integration](https://www.inman.com/2021/02/18/redfin-starts-displaying-flood-risk-data-on-listings/)
365
- - [FACTUM β€” citation-hallucination detection in long-form RAG](https://arxiv.org/pdf/2601.05866)
366
- [AMD x lablab.ai Developer Hackathon (May 4–10, 2026)](https://lablab.ai/ai-hackathons/amd-developer)
 
1
+ # Riprap: landscape research
2
 
3
  Captured 2026-05-06 as part of the AMD x lablab.ai hackathon polish
4
  phase. This document underpins the pitch deck (`slides/deck.md`) and
 
9
 
10
  ## What Riprap is, distinctly
11
 
12
+ A citation-grounded LLM that writes audit-quality flood-exposure
13
+ briefings for NYC addresses by fusing live, historical, modeled, and
14
+ projected data sources. Mellea rejection sampling refuses to publish
15
+ a numeric claim it can't cite. The output isn't a score. It's a
16
+ four-section prose briefing with `[doc_id]` citations on every
17
+ numeric assertion, where each `doc_id` resolves to one specific
18
+ dataset (Sandy 2012 zone, NYC DEP scenario, USGS HWM, Sentinel-2
19
+ chip, NOAA gauge reading, NPCC4 SLR projection).
20
+
21
+ Granite 4.1 8B drives the prose. Granite Embedding 278M plus GLiNER
22
+ drive policy-doc retrieval. Prithvi-EO 2.0, TerraMind LULC and
23
+ Buildings, and Granite TTM r2 drive the EO and forecast probes,
 
24
  with three Apache-2.0 NYC fine-tunes trained on AMD MI300X published
25
  on HF Hub.
26
 
27
+ Architectural commitments other tools don't make:
28
 
29
+ 1. **Silence over confabulation.** When a probe returns no data, the
30
+ briefing omits the section rather than papering over it.
31
+ 2. **Five-stone epistemic structure.** The user can see what's
32
  empirical vs modeled vs proxy vs synthetic.
33
+ 3. **Fully open-source pipeline.** Apache-2.0 end-to-end on public-
34
+ record data, no commercial APIs touched at runtime.
35
+ 4. **Deployable on either local Ollama or AMD MI300X via vLLM** with
36
+ auto-failover.
37
 
38
+ Stack as of 2026-05-06: SvelteKit UI on HF Spaces (cpu-basic) at the
39
+ AMD-hackathon org, FastAPI agent FSM, two-container droplet (vLLM
40
+ plus riprap-models) on MI300X, full address probe suite at 5/5 PASS
41
+ in 5.8 to 13.1 s end-to-end.
42
 
43
  ---
44
 
45
  ## Landscape map
46
 
47
+ ### Direct comps: score-based property risk tools
48
 
49
  | Tool | What it gives | Who it serves | Hidden cost |
50
  |---|---|---|---|
51
+ | **First Street Risk Factor** (Flood Factor) | Score 1 to 10 plus 30-yr risk narrative; powers Redfin, Realtor.com (until Dec 2025 also Zillow) | Homebuyers; some lenders | Closed model; commercial partnerships; Zillow removed it under industry pressure in Dec 2025 |
52
+ | **ClimateCheck** | Score 1 to 100 plus around 30-page property report; 2050 projections | Homeowners plus REIT/PE diligence | Subscription tiers; methodology behind paywall |
53
  | **Jupiter ClimateScore Global** | Enterprise SaaS / API; financial metrics (CapEx, OpEx, credit risk) | Banks, insurers, asset managers | Enterprise pricing; not consumer-facing |
54
+ | **Cervest / Climate X / ICEYE** | Variants of above for ESG / reinsurance | Corporate finance and insurance | Same |
55
 
56
+ Score-based tools all converge on the same shape: one number, one
57
+ chart, an explainer paragraph. None show what claim is grounded in
58
+ which dataset. None expose the audit trail.
59
 
60
  ### NYC-specific government tools
61
 
62
+ - **FloodHelpNY** (City plus State, IDEO-designed). Address lookup
63
+ to flood-zone label plus insurance plus free resiliency audit.
64
+ Forms-based, consumer-facing, doesn't fuse live signals.
65
+ - **NYC Flood Hazard Mapper.** ArcGIS web map of FEMA, NPCC, Sandy,
66
+ and future scenarios. Static visualization, no narrative.
67
+ - **NYC OEM Flood Maps page.** Index of the above.
68
+ - **EJNYC Flood Vulnerability Index** (released 2024-04 by Mayor's
69
+ Office of Climate and EJ). First-ever city FVI, used to direct
70
+ spending under NY's "Disadvantaged Communities" framework (35% of
71
+ climate spend by law).
72
+ - **FloodNet NYC** (NYU plus CUNY plus city). Over 350 ultrasonic
73
+ sensors at 1-min cadence, growing to 500 by end-2026. Has a public
74
  dashboard but no narrative layer.
75
 
76
  ### Federal / authoritative
77
 
78
+ - **FEMA Flood Map Service Center / NFHL.** Official; covers 90%+
79
+ of population; static GIS layer plus PDFs. The disclosure-of-
80
+ record but not a synthesis tool.
81
 
82
  ### Real-estate platforms (the volatile zone)
83
 
84
+ - **Redfin.** Still shows First Street Flood Factor on every
85
  listing.
86
+ - **Realtor.com.** Still shows it on 110M+ listings.
87
+ - **Zillow.** Removed climate risk display in December 2025 under
88
+ California Regional MLS pressure. Still links out, but it's
89
+ hidden. This created a vacuum that an open citation-grounded
90
+ alternative could fill.
91
 
92
  ### Closest academic / AI comps
93
 
94
+ - **Flood-LLM** (Brisbane, MDPI Sustainability 2026). Multi-source
95
  LLM for property-level flood risk, validated on Brisbane against
96
  official labels. Academic, not deployed; no Mellea-style citation
97
  discipline; no live signals.
98
+ - **GIS-Integrated Flood LLM** (Tandfonline 2024). LLM constrained
99
+ by a flood knowledge graph plus GIS interaction. Research artefact.
100
+ - **FloodLense** (arXiv 2024). UNet/RDN/ViT plus LLM for satellite
101
  flood detection. Research; image-only.
102
 
103
  ---
104
 
105
+ ## Where Riprap fits: differentiators that demo well
106
 
107
  Ranked by visibility in a 3-minute demo:
108
 
109
  1. **Citation prose vs scores.** Riprap returns *"Hurricane Sandy
110
+ flooded this address on October 29 to 30, 2012, according to the
111
  empirical inundation zone [sandy]. 19 flood-related 311 service
112
+ requests were logged within 200 m over five years [nyc311]."*
113
+ Every number cites a doc; each doc resolves to a footer source
114
+ row. First Street returns "Flood Factor 8/10". This gap is the
115
+ demo.
116
+ 2. **Live, historical, modeled, projected: in one paragraph.** Sandy
117
+ 2012 (empirical), DEP 2080 stormwater scenarios (modeled), 311
118
+ last 5 years (proxy), FloodNet last 3 years (empirical,
119
  hyperlocal), NPCC4 SLR (projected), Granite TTM r2 surge nowcast
120
  (96-h forecast). No comp combines all four temporal modes.
121
  3. **Open-source NYC fine-tunes.** Three Apache-2.0 models
122
  (`Prithvi-EO-2.0-NYC-Pluvial`, `TerraMind-NYC-Adapters`,
123
+ `Granite-TTM-r2-Battery-Surge`) trained on AMD MI300X. Anyone can
124
+ reproduce, fork to other cities, or audit. First Street's model
125
+ is closed; ClimateCheck's methodology is behind a paywall.
126
  4. **AMD hardware story.** The whole stack runs on MI300X via vLLM
127
+ (LLM) plus a sibling ROCm container (probes). All Apache-2.0.
128
  This is the AMD hackathon track's preferred narrative: open
129
  models, open infra, open data, real GPU acceleration.
130
  5. **Mellea grounding receipts.** The four checks
131
+ (`numerics_grounded`, `no_placeholder_tokens`, `citations_dense`,
132
+ `citations_resolve`) are the audit. The meta card surfaces "4/4
133
+ grounding checks passed, 1 reroll". That's audit credibility no
134
+ consumer comp shows.
135
  6. **Self-aware silence.** Touchstone shows "FloodNet sensor: 0
136
+ events in 3 years" with `silent_by_design`. Lodestone shows "TTM
137
+ Battery surge forecast: peak |residual| < 0.3 m, omitted." Most
138
+ tools always render a value. Riprap's silence is a feature.
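
The four check names in item 5 come from the document, but what each one verifies is not spelled out here. The following is a minimal Python sketch under stated assumptions: `citations_resolve` is taken to mean that every `[doc_id]` marker maps to a known source row, and `numerics_grounded` is taken to mean that every sentence containing a number carries at least one marker. The regexes, function bodies, and the `sources` mapping are illustrative and are not Mellea's API.

```python
import re

# Matches citation markers such as [sandy] or [nyc311] (assumed marker format).
CITATION = re.compile(r"\[([A-Za-z0-9_-]+)\]")

def citations_resolve(text, sources):
    """Return True if every [doc_id] in text is a key in sources."""
    return all(doc_id in sources for doc_id in CITATION.findall(text))

def numerics_grounded(text):
    """Return True if every sentence containing a digit also carries a citation."""
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # Strip markers first so digits inside a doc_id (nyc311) are ignored.
        without_markers = CITATION.sub("", sentence)
        if re.search(r"\d", without_markers) and not CITATION.search(sentence):
            return False
    return True
```

A rejection-sampling loop would reroll the draft whenever one of these predicates returns `False`; how the real pipeline wires that up is not shown in this document.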
139
 
140
  ---
141
 
 
147
  ### 1. Resident / homebuyer (the FloodHelpNY swap-in)
148
 
149
  > *"I'm thinking about renting an apartment at 80 Pioneer Street,
150
+ > Brooklyn. Should I worry?"*
151
 
152
  **Demo arc.** Type the address. Watch the planner classify
153
+ `single_address`, then 19 step events fire across the four data
154
+ Stones in around 13 s. Briefing names Sandy 2012 inundation, 65 311
155
+ complaints, 2 FloodNet sensors with 4 events including a 51 mm peak
156
+ on a specific date, Ida 2021 HWM 130 m away, microtopo HAND 3.81 m
157
+ plus TWI 14.79 (very high saturation propensity). Footer shows 7+
158
+ named primary sources.
159
 
160
  **Demo hook.** "Compare what we just generated to First Street's
161
+ number-and-bar-chart for the same address. Which would you trust to
162
+ make a $4,000/month decision?"
163
 
164
  ### 2. Real-estate attorney / disclosure compliance
165
 
 
168
 
169
  **Demo arc.** Same single_address path. Briefing produces a citable
170
  narrative covering FEMA designation, prior flood claims (where
171
+ present), terrain, recent complaints. Mellea grounding check is the
172
+ qualifier: "this prose is grounded against four invariants and
173
+ passed 4/4."
174
 
175
  **Demo hook.** New York's March-2024 amended Property Condition
176
+ Disclosure Statement requires sellers to disclose flood history and
177
+ FEMA-floodplain status. RPL §231-b requires every residential lease
178
+ to disclose prior flood damage. Riprap is the citable narrative
179
+ tool. Show how the briefing maps line-by-line to the disclosure
180
+ requirements.
181
 
182
  ### 3. NYC OEM / DEP planner
183
 
184
  > *"Hollis, Queens"*
185
 
186
+ **Demo arc.** Neighborhood intent fires (9 step events), produces an
187
+ NTA-level briefing. 434 flood-related 311 over 3 years (87 catch-
188
  basin clogged, 42 street-flooding), 4.3% of neighborhood projected
189
+ to flood under DEP moderate-2050 scenario, 25% of cells with HAND<1
190
+ m. RAG retrieval pulls relevant DEP/NPCC4 policy paragraphs.
191
 
192
  **Demo hook.** DEP just announced a $30B stormwater priority list
193
  (86 locations) and a $68M Brooklyn Bluebelt expansion in Prospect
194
+ Park. Riprap supports the prioritization argument with citable per-
195
+ NTA evidence. Pair with the EJNYC Flood Vulnerability Index for the
196
+ EJ-spending overlay (35%-to-disadvantaged-communities legal
197
  mandate).
198
 
199
  ### 4. Insurance underwriter / actuary
 
201
  > *"442 East Houston Street, Manhattan"*
202
 
203
  **Demo arc.** Same as resident demo, but emphasize the **provenance
204
+ trace** UI. Every Stone row, every doc_id, every source URL,
205
  vintage, and tier glyph.
206
 
207
  **Demo hook.** When an underwriter writes a risk memo, the audit
208
  chain matters. First Street's "we used a proprietary catastrophe
209
  model" doesn't survive a regulator review the way "we used FEMA
210
+ Sandy 2012 polygon, NYC DEP 2021 stormwater scenario, USGS Ida HWM
211
+ Event 312, NOAA gauge 8518750, NWS station KNYC, Granite TTM r2
212
+ fine-tune (test MAE 0.1091 m vs 0.1467 zero-shot, citable)" does.
213
 
214
  ### 5. Climate journalist / advocacy
215
 
216
  > *"Coney Island, Brooklyn"*
217
 
218
+ **Demo arc.** Neighborhood briefing. 87.5% of NTA in 2012 Sandy
219
  zone, 382 flood complaints over 3 years, 7.8% projected flooded
220
+ under 2050 moderate, 38.9% of DEM cells with HAND<1 m, DEP extreme-
221
+ 2080 at 44.2% flooded.
222
 
223
  **Demo hook.** ProPublica/NYTimes/THE CITY-style data journalism.
224
+ Every claim in a Riprap briefing is reproducible. Anyone can paste
225
  the same query and get a near-identical narrative. The journalist
226
  can publish the briefing as the methods section.
227
 
 
229
 
230
  > *"What are they building in Gowanus and is it risky"*
231
 
232
+ **Demo arc.** Planner classifies `development_check`. FSM pulls DOB
233
+ filings plus flood layers for the project sites. Briefing comments
234
  on which proposed buildings sit inside Sandy 2012, which intersect
235
  DEP extreme-2080, what the microtopo says.
236
 
237
+ **Demo hook.** Pre-design siting check. The Gowanus rezoning is one
238
+ of NYC's largest active development zones, well known to flood. Show
239
+ how the tool surfaces flood concerns before architects pour
240
  concrete.
241
 
242
  ---
243
 
244
+ ## Lateral and unexpected use cases
245
 
246
+ Ten bets, ordered roughly from most-buildable to most-speculative.
247
 
248
  1. **Pre-storm cohort briefings.** Subscribe Riprap to NWS flood-
249
  watch alerts. When a watch lands, fan out one briefing per
250
+ affected NTA plus push to OEM, press, and advocacy lists. Citable
251
  evidence on demand for the press cycle that follows.
252
  2. **Climate-grant evidence sections.** HUD CDBG-DR and FEMA BRIC
253
  applications need an auditable evidence base. Riprap auto-
254
  generates the "vulnerability assessment" section so a community
255
+ group can apply for resilience funding without hiring a
256
+ consultant.
257
  3. **Local Law disclosure boilerplate.** Plug into a brokerage's
258
+ listing flow. When an agent enters an address, auto-generate the
259
+ NY RPL §231-b lease addendum or §462(2) disclosure draft. ROI is
260
+ high since the law took effect 2024 and many landlords are still
261
+ figuring out compliance.
262
  4. **MTA station-hardening prioritization.** Riprap already has the
263
+ MTA-entrance probe (KEY-001 in the demo). Run the FSM across all
264
+ subway entrances; rank by exposure × ridership. The MTA's
265
+ October-2025 Climate Resilience Roadmap Update is the policy
266
+ hook.
267
+ 5. **DOE school siting.** When DOE reviews proposed school sites or
268
+ selects schools for retrofit, Riprap briefings (with `expect_311_ge`
269
+ plus Sandy plus DEP overlays) would catch flood exposure that
270
+ form-style screens miss.
271
+ 6. **Time-machine variant.** Re-run the FSM with snapshot data from
272
+ a past date. *"What would Riprap have said about Hollis on August
273
+ 31, 2021, the day before Ida?"* Useful for retrospective analysis,
274
+ expert testimony, and stress-testing the system.
275
+ 7. **Cross-city scaffold.** The architecture is NYC-specific by data
276
+ choice, not by code. Port to Houston (post-Harvey plus Hurricane
277
  Beryl 2024), Miami (king tides), Boston (CSO floods), Charleston
278
+ (chronic tidal), with a per-city probe set plus RAG corpus.
279
  8. **Federation with FloodNet alerts.** When a sensor triggers a
280
  flood event NOW, fire a Riprap live_now briefing for the
281
  surrounding NTA: *"what's at stake in the next 6 hours."*
282
  Connects FloodNet's hyperlocal sensor reads to the OEM decision
283
  loop.
284
  9. **EJNYC × Riprap pairing.** Rank all 188 NTAs by Riprap-detected
285
+ exposure, intersect with state DAC designations. Output: a map of
286
+ "underserved plus underwater". The most underfunded high-exposure
287
+ neighborhoods.
288
  10. **Court testimony / expert witness.** Citable, reproducible
289
+ flood narrative as a court exhibit. The Mellea passes-record
290
+ plus provenance trace are the kind of artefact a regulator or
291
+ judge can audit. Especially relevant after the December-2025
292
+ Zillow controversy created public discussion of climate-data
293
+ integrity.
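
The ranking in item 4 (exposure × ridership) reduces to a sort on a product. A minimal sketch follows, assuming a 0-to-1 exposure score and a daily ridership count per entrance. The `Entrance` fields and both scales are hypothetical, since the document does not define how exposure is scored.

```python
from dataclasses import dataclass

@dataclass
class Entrance:
    name: str
    exposure: float   # hypothetical 0..1 flood-exposure score
    ridership: int    # hypothetical average daily riders

def rank_entrances(entrances):
    """Sort entrances by exposure * ridership, highest priority first."""
    return sorted(entrances, key=lambda e: e.exposure * e.ridership, reverse=True)
```

A plain product weights both factors equally, so a lightly exposed but very busy entrance can outrank a highly exposed quiet one. Whether that is the right trade-off is a policy choice, not something the sketch settles.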
294
 
295
  ---
296
 
297
+ ## Risks and framing
298
 
299
  - **Real-estate industry pushback.** December 2025: Zillow removed
300
  First Street's climate scores under MLS pressure because the data
301
  was hurting transaction volume. A free, citation-grounded
302
  alternative could face the same reflex. Riprap's defence is that
303
+ it's a narrative tool for professional analytical work, not a
304
+ buy/don't-buy verdict. Keep the disclaimer footer prominent.
305
  - **Redlining hazard.** Exposure narratives can be misused by
306
  landlords or insurers to discriminate against high-flood-risk
307
  (often disproportionately disadvantaged) neighborhoods.
 
315
  law. Don't position it as legal disclosure-of-record without a
316
  real-estate-attorney review.
317
  - **Cold-start latency.** First query after droplet redeploy is
318
+ around 30 s while models warm. For demos, ping the Space and run
319
+ one warm-up query 5 minutes before showtime.
320
+ - **Geocoder edge cases.** "PS 188, Lower East Side" geocoded to a
321
+ Brooklyn PS 188 in our test suite. For demos, pick fully-qualified
322
+ street addresses; document the disambiguation behavior.
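
The cold-start mitigation above (ping the Space, then run one warm-up query) can be scripted. The sketch below assumes a caller-supplied `fetch(url)` that returns an HTTP status code. The function name, parameters, and retry policy are illustrative, and no real Riprap endpoint is assumed.

```python
import time

def warm_up(fetch, health_url, query_url, retries=5, delay_s=2.0, sleep=time.sleep):
    """Poll health_url until it answers 200, then send one warm-up query.

    fetch is any callable mapping a URL to an HTTP status code.
    Returns True only if both the health check and the query succeed.
    """
    for _ in range(retries):
        if fetch(health_url) == 200:
            return fetch(query_url) == 200
        sleep(delay_s)  # back off before polling again
    return False
```

Injecting `fetch` and `sleep` keeps the helper testable without network access; in practice `fetch` would wrap whatever HTTP client the demo machine already has.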
323
 
324
  ---
325
 
 
327
 
328
  Concrete polish items the research surfaces, ranked by demo value:
329
 
330
+ 1. **Sample-query pills on landing.** Six clickable pills below the
331
+ search bar, one per persona above. Let the audience demo
332
  themselves.
333
+ 2. **A "What this is" bar at the top of the landing.** Three lines:
334
  *"Citation-grounded NYC flood briefings. Every number cites a
335
  primary source. Open-source, public data, audit-grade synthesis."*
336
  3. **Compare-mode link from the briefing.** Once Riprap delivers a
337
  single_address briefing, surface "compare with another address"
338
  as a one-click affordance. The compare intent already exists in
339
  the planner.
340
+ 4. **EJNYC-FVI overlay** on the map sidebar (#9 above). Riprap's
341
  exposure × DAC designation, two clicks to a powerful editorial
342
  demo.
343
  5. **First-query warm-up message** during the cold start: *"loading
344
+ probes on AMD MI300X. First query after redeploy takes around 30
345
+ s; subsequent queries 5 to 13 s."*
346
 
347
  ---
348
 
349
  ## Sources
350
 
351
+ - [First Street Foundation: Flood Factor methodology](https://firststreet.org/methodology/flood)
352
+ - [FloodHelpNY: NYC and IDEO consumer tool](https://www.floodhelpny.org/en)
353
+ - [ClimateCheck: flood risk methodology](https://climatecheck.com/risks/flood)
354
+ - [Jupiter Intelligence: ClimateScore Global / FloodScore](https://www.jupiterintel.com/climatescore-global)
355
  - [FEMA Flood Map Service Center](https://msc.fema.gov/)
356
+ - [NY State: RPL §231-b residential lease flood disclosure (2023)](https://www.nysenate.gov/legislation/bills/2021/S5472)
357
+ - [NYSBA: Property Condition Disclosure flood-risk amendment (Mar 2024)](https://nysba.org/breaking-news-new-rules-on-property-condition-disclosure-and-flood-risk-go-into-effect-today/)
358
+ - [CNN: Zillow removes climate risk data under industry pressure (Dec 2025)](https://www.cnn.com/2025/12/02/climate/zillow-climate-data-extreme-weather-first-street-redfin)
359
  - [NYC Stormwater Resiliency Plan](https://www.nyc.gov/assets/orr/pdf/publications/stormwater-resiliency-plan.pdf)
360
+ - [FloodNet NYC: methodology and sensor network](https://www.floodnet.nyc/methodology)
361
+ - [FloodNet WRR 2024: peer-reviewed sensor paper](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2023WR036806)
362
+ - [EJNYC Report: Mayor's Office of Climate and Environmental Justice](https://climate.cityofnewyork.us/ejnyc-report/the-state-of-environmental-justice-in-nyc/)
363
+ - [Flood-LLM: Brisbane case study (MDPI 2026)](https://www.mdpi.com/2071-1050/18/6/2957)
364
  - [GIS-Integrated Flood LLM (Tandfonline 2024)](https://www.tandfonline.com/doi/full/10.1080/13658816.2024.2306167)
365
+ - [THE CITY: Disadvantaged Communities flood funding (NY Climate Law)](https://www.thecity.nyc/2022/05/02/billions-ny-climate-law-disadvantaged-communities-flood/)
366
+ - [Inman: Redfin First Street integration](https://www.inman.com/2021/02/18/redfin-starts-displaying-flood-risk-data-on-listings/)
367
+ - [FACTUM: citation-hallucination detection in long-form RAG](https://arxiv.org/pdf/2601.05866)
368
+ - [AMD x lablab.ai Developer Hackathon (May 4 to 10, 2026)](https://lablab.ai/ai-hackathons/amd-developer)
services/riprap-models/README.md CHANGED
@@ -1,11 +1,11 @@
1
- # Riprap Models β€” droplet inference service
2
 
3
  GPU inference microservice that runs alongside vLLM on the AMD MI300X
4
  droplet. Exposes one HTTP endpoint per model class consumed by the
5
- Riprap FastAPI app's specialists, so all GPU-accelerable forward
6
- passes (Prithvi-NYC-Pluvial, TerraMind LULC + Buildings, Granite TTM
7
- r2, Granite Embedding 278M, GLiNER) run on the MI300X regardless of
8
- which surface β€” laptop or HF Space β€” hosts the FastAPI process.
9
 
10
  ## Service contract
11
 
@@ -22,7 +22,7 @@ Auth: bearer token on every `/v1/*` route via `RIPRAP_MODELS_API_KEY`.
22
  Same shape as vLLM. `/healthz` is open so liveness probes don't need
23
  auth.
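
The auth rule described here (bearer token on every `/v1/*` route, `/healthz` left open) can be sketched as a pure function. This is an illustration under assumptions, not the service's actual middleware: the function name and the `headers` dict shape are hypothetical, and the real service may treat paths outside `/v1/` differently.

```python
import hmac

def is_authorized(path, headers, api_key):
    """Return True if the request may proceed.

    Only /v1/* routes require a bearer token; other paths such as
    /healthz stay open so liveness probes need no credentials.
    """
    if not path.startswith("/v1/"):
        return True
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking the key through timing.
    return hmac.compare_digest(token.encode(), api_key.encode())
```

In this sketch `api_key` would be read from the `RIPRAP_MODELS_API_KEY` environment variable named in the section above.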
24
 
25
- ## Deploy β€” fresh droplet (recommended)
26
 
27
  Use the one-shot bring-up script. Works on any AMD ROCm GPU droplet
28
  with Docker + GPU device files (`/dev/kfd`, `/dev/dri`) and SSH root
@@ -46,7 +46,7 @@ What it does, in order:
  6. Waits up to 90 s for vLLM `/v1/models` and 60 s for
     riprap-models `/healthz`, exits non-zero if either misses

- Re-running on the same droplet is idempotent — existing containers
  get `docker rm -f`'d and recreated.

  Env knobs:
@@ -65,7 +65,7 @@ After it returns, set the printed env vars in your local shell or HF
  Space variables, run `scripts/probe_addresses.py` to verify, and
  you're live.

- ## Deploy — extend an existing container (legacy)

  If you already have a `terramind` container with the heavy ML deps
  baked in (the bootstrap-droplet path), you can skip the Dockerfile
@@ -92,10 +92,11 @@ mapping was set when the container was created.
  What survives a droplet destruction:

- - `services/riprap-models/Dockerfile` + `requirements-full.txt` —
-   every pinned dep, captured from the bootstrap droplet on 2026-05-05
- - `scripts/deploy_droplet.sh` — the bring-up script
- - HF Hub model artefacts — every fine-tune lives at
    `msradam/Prithvi-EO-2.0-NYC-Pluvial`,
    `msradam/TerraMind-NYC-Adapters`,
    `msradam/Granite-TTM-r2-Battery-Surge`. The Dockerfile pulls them
@@ -104,12 +105,13 @@ What survives a droplet destruction:
  What does NOT survive:

  - The HF cache at `${HF_CACHE_HOST}` (default `/root/hf-cache`) on
-   the droplet — every redeploy re-downloads ~12 GB of weights
-   (Granite 4.1 8b for vLLM ~16 GB, Prithvi v2 ~1.3 GB, TerraMind
-   adapters ~600 MB, Granite Embedding ~600 MB, GLiNER ~400 MB,
-   Granite TTM r2 ~6 MB). First query after redeploy takes ~30 s
-   longer than steady-state because of the lazy model load
- - The bearer token — generate a fresh one when re-deploying

  To redeploy:

 
+ # Riprap Models: droplet inference service

  GPU inference microservice that runs alongside vLLM on the AMD MI300X
  droplet. Exposes one HTTP endpoint per model class consumed by the
+ Riprap FastAPI app's probes, so all GPU-accelerable forward passes
+ (Prithvi-NYC-Pluvial, TerraMind LULC + Buildings, Granite TTM r2,
+ Granite Embedding 278M, GLiNER) run on the MI300X regardless of
+ which surface (laptop or HF Space) hosts the FastAPI process.

  ## Service contract

 
  Same shape as vLLM. `/healthz` is open so liveness probes don't need
  auth.

+ ## Deploy: fresh droplet (recommended)

  Use the one-shot bring-up script. Works on any AMD ROCm GPU droplet
  with Docker + GPU device files (`/dev/kfd`, `/dev/dri`) and SSH root
 
  6. Waits up to 90 s for vLLM `/v1/models` and 60 s for
     riprap-models `/healthz`, exits non-zero if either misses

+ Re-running on the same droplet is idempotent. Existing containers
  get `docker rm -f`'d and recreated.

  Env knobs:
 
  Space variables, run `scripts/probe_addresses.py` to verify, and
  you're live.
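
The readiness checks described above (vLLM `/v1/models` within 90 s, riprap-models `/healthz` within 60 s, bearer token on `/v1/*` only) can be sketched from the client side as follows. This is a minimal illustration using only the Python standard library, not the contents of `scripts/deploy_droplet.sh` or `scripts/probe_addresses.py`; the helper names `build_request` and `wait_until_ready` are hypothetical, and any base URLs passed in are assumptions, since the diff does not show the service ports.

```python
import time
import urllib.error
import urllib.request
from typing import Optional


def build_request(base_url: str, path: str,
                  api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request; attach a bearer token only when one is given."""
    req = urllib.request.Request(base_url.rstrip("/") + path, method="GET")
    if api_key:
        # /v1/* routes expect "Authorization: Bearer <token>"; /healthz does not.
        req.add_header("Authorization", f"Bearer {api_key}")
    return req


def wait_until_ready(req: urllib.request.Request, timeout_s: float,
                     interval_s: float = 2.0,
                     opener=urllib.request.urlopen) -> bool:
    """Poll until the endpoint answers HTTP 200 or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with opener(req, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # service not reachable yet; retry until the deadline
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A caller would pass the token from `RIPRAP_MODELS_API_KEY` for the `/v1/models` check and no token for `/healthz`, for example `wait_until_ready(build_request(vllm_url, "/v1/models", token), 90)`, where `vllm_url` stands in for whatever address the deploy script prints.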

+ ## Deploy: extend an existing container (legacy)

  If you already have a `terramind` container with the heavy ML deps
  baked in (the bootstrap-droplet path), you can skip the Dockerfile
 
  What survives a droplet destruction:

+ - `services/riprap-models/Dockerfile` plus `requirements-full.txt`.
+   Every pinned dep, captured from the bootstrap droplet on
+   2026-05-05.
+ - `scripts/deploy_droplet.sh`. The bring-up script.
+ - HF Hub model artefacts. Every fine-tune lives at
    `msradam/Prithvi-EO-2.0-NYC-Pluvial`,
    `msradam/TerraMind-NYC-Adapters`,
    `msradam/Granite-TTM-r2-Battery-Surge`. The Dockerfile pulls them
 
  What does NOT survive:

  - The HF cache at `${HF_CACHE_HOST}` (default `/root/hf-cache`) on
+   the droplet. Every redeploy re-downloads around 12 GB of weights
+   (Granite 4.1 8b for vLLM around 16 GB, Prithvi v2 around 1.3 GB,
+   TerraMind adapters around 600 MB, Granite Embedding around 600 MB,
+   GLiNER around 400 MB, Granite TTM r2 around 6 MB). First query
+   after redeploy takes around 30 s longer than steady-state because
+   of the lazy model load.
+ - The bearer token. Generate a fresh one when re-deploying.
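
Since the bearer token does not survive a redeploy, one way to mint a replacement is sketched below. This assumes any sufficiently random URL-safe string is acceptable as the token; the diff does not show how `scripts/deploy_droplet.sh` generates or consumes it, so the helper names here are hypothetical and only the `RIPRAP_MODELS_API_KEY` variable name comes from the README.

```python
import secrets


def new_bearer_token(n_bytes: int = 32) -> str:
    """Return a URL-safe random token drawn from the OS CSPRNG."""
    return secrets.token_urlsafe(n_bytes)


def export_line(token: str) -> str:
    """Format a shell line that sets the API key for the current session."""
    return f"export RIPRAP_MODELS_API_KEY={token}"


if __name__ == "__main__":
    # Print a line that can be pasted into a shell or copied into HF Space variables.
    print(export_line(new_bearer_token()))
```

URL-safe base64 output contains only letters, digits, `-`, and `_`, so the value needs no shell quoting.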

  To redeploy:

slides/README.md CHANGED
@@ -1,16 +1,20 @@
- # Riprap — pitch deck

  Marp-rendered slides for the AMD x lablab.ai hackathon submission
- video (May 4–10, 2026). Eight slides, ~30 s of voiceover each, sized
- to leave 3+ minutes for the live demo inside the 5-minute video cap.

  ## Files

- - `deck.md` — slide source (Marp markdown).
- - `riprap.css` — Marp theme that ports the SvelteKit UI's design
-   tokens 1:1: IBM Plex (Sans / Mono / Serif), paper register
-   (`#FAFAF7`), Stone palette, the `▌` accent block.
- - `Makefile` — one-command builds.

  ## Render

@@ -23,9 +27,9 @@ npm install -g @marp-team/marp-cli
  Build any of the three artefacts:

  ```bash
- make pdf    # → deck.pdf (best for sharing)
- make html   # → deck.html (best for live presenting in browser)
- make pptx   # → deck.pptx (best for editing in Keynote / PowerPoint)
  make all    # all three
  make clean
  ```
@@ -41,10 +45,10 @@ marp deck.md --theme riprap.css --pptx --output deck.pptx
  ## Notes

  - The theme `@import`s IBM Plex from Google Fonts, so Marp needs
-   network access on first build. Cache it to `~/.npm/_marp`/etc by
-   building once on a connected machine.
  - The deck targets a 1280×720 16:9 frame.
- - Slide 7 is the demo handoff — leave the slide visible while you
    cut to the live screen capture, then return to slide 8 for the
    closing CTA.

@@ -56,7 +60,7 @@ marp deck.md --theme riprap.css --pptx --output deck.pptx
  | 2 | Problem | Zillow yanked First Street in Dec 2025; black-box scores hit a wall | 0:30 |
  | 3 | What it is | Show the cited paragraph; "every number cites its source or it doesn't appear" | 0:35 |
  | 4 | Stack | Three of four hackathon tracks; MI300X, vLLM, three NYC fine-tunes | 0:40 |
- | 5 | Receipts | 5 of 5 probe pass, 5.8–13.1 s, every claim verified | 0:30 |
- | 6 | Civic impact | NY disclosure law, DEP $30B plan, EJNYC FVI — open-source matters | 0:25 |
  | 7 | Demo handoff | Cut to the live HF Space; type the query; let the FSM speak | ~3:00 |
  | 8 | Closing CTA | github.com/msradam/riprap-nyc | 0:10 |
 
+ # Riprap pitch deck

  Marp-rendered slides for the AMD x lablab.ai hackathon submission
+ video (May 4 to 10, 2026). Eight slides, around 30 s of voiceover
+ each, sized to leave three or more minutes for the live demo inside
+ the five-minute video cap.

  ## Files

+ - `deck.md`. Slide source (Marp markdown).
+ - `riprap.css`. Marp theme that ports the SvelteKit UI's design
+   tokens 1:1: IBM Plex (Sans / Mono / Serif), Civic Hydrology paper
+   register (`#F4F6F9`), Stone palette, the dam mark.
+ - `logo.svg`, `logo-paper.svg`. Cleaned dam mark variants used by
+   the title and CTA slides. The original Noun Project file (with
+   embedded attribution text) lives at `assets/` in the repo root.
+ - `Makefile`. One-command builds.

  ## Render

 
  Build any of the three artefacts:

  ```bash
+ make pdf    # writes deck.pdf (best for sharing)
+ make html   # writes deck.html (best for live presenting in browser)
+ make pptx   # writes deck.pptx (best for editing in Keynote / PowerPoint)
  make all    # all three
  make clean
  ```
 
  ## Notes

  - The theme `@import`s IBM Plex from Google Fonts, so Marp needs
+   network access on first build. Cache it by building once on a
+   connected machine.
  - The deck targets a 1280×720 16:9 frame.
+ - Slide 7 is the demo handoff. Leave the slide visible while you
    cut to the live screen capture, then return to slide 8 for the
    closing CTA.

 
  | 2 | Problem | Zillow yanked First Street in Dec 2025; black-box scores hit a wall | 0:30 |
  | 3 | What it is | Show the cited paragraph; "every number cites its source or it doesn't appear" | 0:35 |
  | 4 | Stack | Three of four hackathon tracks; MI300X, vLLM, three NYC fine-tunes | 0:40 |
+ | 5 | Receipts | 5 of 5 probe pass, 5.8 to 13.1 s, every claim verified | 0:30 |
+ | 6 | Civic impact | NY disclosure law, DEP $30B plan, EJNYC FVI; open-source matters | 0:25 |
  | 7 | Demo handoff | Cut to the live HF Space; type the query; let the FSM speak | ~3:00 |
  | 8 | Closing CTA | github.com/msradam/riprap-nyc | 0:10 |
slides/deck.md CHANGED
@@ -3,7 +3,7 @@ marp: true
  theme: riprap
  paginate: true
  size: 16:9
- title: Riprap — Citation-grounded NYC flood briefings
  description: AMD x lablab.ai Developer Hackathon, May 4–10 2026
  ---

 
  theme: riprap
  paginate: true
  size: 16:9
+ title: Riprap. Citation-grounded NYC flood briefings.
  description: AMD x lablab.ai Developer Hackathon, May 4–10 2026
  ---