Space: lablab-ai-amd-developer-hackathon/riprap-nyc

seriffic Claude Opus 4.7 (1M context) committed 2 days ago

Commit f6423e1 (1 parent: da4b993)

Voice pass: strip em-dashes from user-facing docs

Constraint set:
- No em-dashes (—). Replace with periods, semicolons, commas, or
parens depending on context.
- No "not just X but Y" constructions (none found; nothing to do).
- Drop AI-tells (no "delve / leverage / cutting-edge / robust /
comprehensive / it's worth noting" etc found; nothing to do).

Files touched:
- README.md (rewritten cleanly; new live-demo URL pointing at the
AMD-hackathon org Space; probe_addresses.py mentioned in the
local-development block)
- RESEARCH.md (rewritten cleanly; same content, no em-dashes)
- ARCHITECTURE.md (bulk em-dash sweep; manual fix for two parenthetical
sentences the regex broke: line 105 modes-see-§4 and line 134
Burr-FSM-DAGWorks)
- METHODOLOGY.md (bulk; heading "Riprap. Scoring Methodology" →
"Riprap scoring methodology")
- MONDAY.md (bulk; trailing-space-period sweep)
- services/riprap-models/README.md (bulk + manual: parenthetical bug
on which-surface-laptop-or-HF-Space hosts the FastAPI process; "Deploy"
headings now use ":" instead of "."; "Riprap Models" subtitle now
":" instead of ".")
- slides/deck.md (single em-dash in Marp title metadata fixed)
- slides/README.md (rewritten to match)

Total em-dashes across all eight user-facing docs went from ~150 to
0. Deck PDF/HTML/PPTX rebuilt from the cleaned deck.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
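
Review note: the "~150 to 0" claim in the message above can be spot-checked with a short script. The sketch below is not part of the commit. It assumes the eight paths listed under "Files touched" are relative to the repo root, and the names `DOCS`, `count_em_dashes`, and `sweep` are hypothetical.

```python
from pathlib import Path

# U+2014, written as an escape so this script stays clean under its own check.
EM_DASH = "\u2014"

# The eight user-facing docs named in the commit message
# (paths assumed to be relative to the repo root).
DOCS = [
    "README.md",
    "RESEARCH.md",
    "ARCHITECTURE.md",
    "METHODOLOGY.md",
    "MONDAY.md",
    "services/riprap-models/README.md",
    "slides/deck.md",
    "slides/README.md",
]


def count_em_dashes(text: str) -> int:
    """Return how many em-dash characters appear in text."""
    return text.count(EM_DASH)


def sweep(root: str, paths: list[str]) -> dict[str, int]:
    """Map each existing file under root to its em-dash count."""
    counts = {}
    for rel in paths:
        path = Path(root) / rel
        if path.is_file():  # silently skip files that are not present
            counts[rel] = count_em_dashes(path.read_text(encoding="utf-8"))
    return counts


if __name__ == "__main__":
    results = sweep(".", DOCS)
    for name, n in results.items():
        print(f"{name}: {n}")
    print("total:", sum(results.values()))
```

Run from the repo root, a total of 0 would match the message. Files missing from the checkout are skipped rather than reported as clean, so the per-file listing is worth reading alongside the total.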

Files changed (8)

ARCHITECTURE.md +41 -41
METHODOLOGY.md +16 -16
MONDAY.md +12 -12
README.md +46 -43
RESEARCH.md +175 -173
services/riprap-models/README.md +20 -18
slides/README.md +20 -16
slides/deck.md +1 -1

ARCHITECTURE.md

@@ -1,13 +1,13 @@
-# Riprap — Architecture
 > **What it is.** A web tool that takes any NYC address and produces a
-> short, citation-grounded **flood-exposure briefing** — a tier (1–4)
 > with a paragraph of evidence, where every numeric claim links back to
 > the specific dataset, agency report, or model output it came from.
 >
 > **Who it's for.** Urban planners, journalists on deadline, NYCEM
 > grant writers filing FEMA BRIC sub-applications, agency capital
-> planners, researchers under FOIL/IRB constraints — *not* consumers
 > shopping for flood insurance.
 >
 > **Why local foundation models.** A newsroom with FOIL'd documents
@@ -21,56 +21,56 @@
 ## 1. A 60-second primer on NYC flooding
-Skip if you already know this. Most architecture docs assume you do —
-this one doesn't.
 ### 1.1 Three kinds of flood
 NYC gets hit by three flood mechanisms that look completely different
 on a map and are caused by different physics:
-- **Coastal / surge flooding** — The ocean rises into the city.
   Driven by storm surge (wind pushing water against the coast),
   astronomical high tide, and wave run-up. Affects the **shoreline:**
   Brighton Beach, Coney Island, Red Hook, Lower Manhattan, the
   Rockaways, Staten Island east shore. **Hurricane Sandy 2012** is
-  the canonical event — water came over the seawall and flooded
   subway tunnels, hospitals, and electrical substations. Affects
   buildings that were dry that morning.
-- **Pluvial / stormwater flooding** — Rain falls faster than the
   drainage system can carry it away. Affects **inland low points,
   basement apartments, and chronically under-sewered neighborhoods**:
   Hollis (Queens), Carroll Gardens (Brooklyn), Jamaica. **Hurricane
-  Ida 2021** is the canonical event for NYC — most of the deaths
   were in basement apartments far from any coast. Optical satellites
   largely *can't see* this kind of flooding because the water drains
   fast and is often sub-surface.
-- **Compound flooding** — Coastal + pluvial happening at the same
   time, with groundwater rising too. Currently the active research
   frontier (NPCC4 Ch. 3 calls it out explicitly). Most agencies model
   these mechanisms separately; reality combines them.
 A good civic flood tool has to cover all three and be honest about
 what each signal can and cannot see. Riprap surfaces evidence for all
-three but **doesn't predict damage** — see scope below.
 ### 1.2 Empirical vs modeled vs proxy
 Each piece of flood evidence falls into one of three classes, and the
 distinction matters for how much weight to give it:
-- **Empirical** — Something flooded a place and was measured. USGS
   high-water marks (people went out after Hurricane Ida and surveyed
   where water reached on building walls). The 2012 Sandy Inundation
   Zone (mapped by the city after the storm). FloodNet ultrasonic
   sensors that recorded an actual depth. **Highest-confidence**: this
   flood happened here.
-- **Modeled scenarios** — Hydraulic models simulate "what if" cases.
   FEMA's regulatory floodplains (1 % and 0.2 % annual chance). NYC
   DEP's Stormwater Maps (modeled water depth under three rainfall
   scenarios with varying sea-level-rise assumptions). **Useful but
   scenario-bounded**: this could happen here under those conditions.
-- **Proxy signals** — Indirect indicators of flooding. NYC 311
   complaints ("street flooding", "sewer backup") clustering around an
   address. Topographic indices (HAND, TWI) suggesting water *would*
   pool here based on terrain. **Useful but biased**: 311 reflects
@@ -79,7 +79,7 @@ distinction matters for how much weight to give it:
 Riprap surfaces all three classes. The score weights them in that
 order (empirical > modeled > proxy), with empirical hits granted a
-**floor rule** — see [§5](#5-the-scoring-rubric).
 ### 1.3 Hydrology indices used in this app
@@ -87,11 +87,11 @@ Two terrain-derived numbers come up repeatedly. They're cheap to
 compute from a Digital Elevation Model (DEM) and they're the
 hydrological literature's canonical exposure proxies:
-- **HAND (Height Above Nearest Drainage)** — Vertical distance from
   the address up to the nearest river/drainage channel. **<1 m** = at
   drainage level (water *will* reach here in flood). **>10 m** =
   hillslope (very dry). Nobre et al. 2011.
-- **TWI (Topographic Wetness Index)** — `ln(catchment_area / tan
   slope)`. **High TWI** = water tends to accumulate here (large
   contributing area, gentle slope). Beven & Kirkby 1979.
@@ -102,7 +102,7 @@ Neither is a flood prediction; both are exposure indicators that say
 ## 2. What Riprap actually produces
-For a given address (or any of three modes — see [§4](#4-three-user-modes)),
 Riprap returns:
 1. **A tier 1–4** computed by a deterministic, published rubric
@@ -113,11 +113,11 @@ Riprap returns:
    *Status*, *Empirical evidence*, *Modeled scenarios*, *Policy
    context*. A section is omitted entirely if no specialist fired for
    it (silence-over-confabulation contract).
-3. **Evidence cards** — one per fired specialist, with the raw values
    and a link to the source dataset.
-4. **Map overlay** — the address pinned, with the empirical and
    modeled flood extents that overlap it.
-5. **Live "right now" signals** — active NWS flood alerts, current
    tide residual at the Battery, recent precipitation at the nearest
    ASOS, and a Granite TTM short-horizon forecast of the surge
    residual. **These do not modify the tier** (per IPCC AR6 WG II's
@@ -131,7 +131,7 @@ prose.
 ## 3. The Burr FSM and how the specialists chain
-Riprap is a **state machine** — a Burr FSM (DAGWorks) — that walks
 through a fixed list of "specialist" functions in order. Each
 specialist either produces a structured fact or stays silent. At the
 end, the reconciler reads all the produced facts and writes the
@@ -206,7 +206,7 @@ shows live as the FSM runs.
 | 8 | **nws_obs** *(live)*      | Recent precipitation from the nearest airport ASOS station (KNYC / KLGA / KJFK / KEWR / KFRG).                                                                                   | live               |
 | 9 | **ttm_forecast** *(live)* | Granite TTM r2 zero-shot forecast of the surge **residual** at the Battery for the next ~9.6 h. NOAA already publishes the astronomical tide; TTM forecasts the part NOAA doesn't. | live (model-derived) |
 | 10 | **microtopo**            | LiDAR-derived terrain features at the point: elevation, HAND, TWI, local relief percentile.                                                                                      | proxy              |
-| 11 | **ida_hwm**              | USGS Hurricane Ida 2021 high-water marks — actual measured water heights surveyed in the days after the storm.                                                                   | empirical          |
 | 12 | **prithvi**              | NASA/IBM Prithvi-EO 2.0 segmentation of Sentinel-2 imagery for the Ida pre/post pair. Pre-computed offline; serves point-in-polygon queries against the resulting 166 polygons.   | empirical (model-derived) |
 | 13 | **rag**                  | Granite Embedding 278M retrieves the most-relevant paragraphs from 5 NYC policy PDFs (Comptroller, NPCC4, MTA, NYCHA, ConEd) given the address's borough + which scenarios fired. | policy             |
 | 14 | **reconcile**            | Granite 4.1 :3b reads all the documents produced by steps 1–13 and writes the cited briefing paragraph. See [§6](#6-document-grounded-reconciliation).                            | LLM synthesis      |
@@ -219,7 +219,7 @@ address:
 | Step | What it returns |
 |---|---|
 | geocode    | `(40.5780, -73.9617)`, BBL `3-08660-0001`, Brooklyn |
-| sandy      | **YES** — inside the 2012 Sandy Inundation Zone |
 | dep_stormwater | `dep_moderate_2050`: depth 0.4-0.8 ft; `dep_extreme_2080`: depth 0.8-2.0 ft |
 | floodnet   | 2 sensors within 600 m; 1 trigger event in last 3 yr (peak 14 cm) |
 | nyc311     | 11 flood-related complaints in 200 m, 5-yr window |
@@ -267,7 +267,7 @@ saw those headers and didn't invent them.
 |---------------------------------------|------------------|---|
 | `/`                                   | **Single address** | Geocode → run the full FSM → cited paragraph + map. Live demo path. |
 | `/compare`                            | **Compare**      | Two addresses side by side; parallel FSM runs (`asyncio.to_thread`, `OLLAMA_NUM_PARALLEL=2`). Useful for "this site vs the alternative". |
-| `/register/{schools,nycha,mta_entrances}` | **Register** | Pre-computed bulk runs over NYC public-asset registries — 126 schools, 45 NYCHA developments, ~1,900 MTA subway entrances. Loaded from `data/registers/*.json` at boot. |
 Single-address is the live path. Registers are pre-computed because
 running 1,900 reconciler calls at request time is a non-starter; the
@@ -302,14 +302,14 @@ Tier 4, 0 → Tier 0.
 ### 5.2 Max-empirical floor
 If **Sandy 2012 inundation** OR **a USGS Ida HWM within 100 m** fired,
-the tier is capped at **2 (Elevated)** — it cannot be worse,
 regardless of the additive composite.
 This recovers the *important* multiplicative behaviour Balica 2012
 argues for (empirical observations should not be cancelled by
 terrain or modeled scenarios) without giving up additive transparency.
 The 100 m radius is chosen because USGS HWM positional uncertainty is
-typically 5–30 m — 100 m gives ~3σ headroom for a confident "this
 address was inundated" signal.
 ### 5.3 Live signals stay out
@@ -366,7 +366,7 @@ omitted entirely.
 `app/mellea_validator.py` wraps the Granite-via-Ollama call in IBM
 Research's [Mellea](https://github.com/generative-computing/mellea)
-framework — instruct, validate, repair. The synthesis intents call
 `reconcile_strict_streaming(...)` which:
 1. **Streams** each generation attempt's tokens to the user (via the
@@ -374,21 +374,21 @@ framework — instruct, validate, repair. The synthesis intents call
    `progress_q` for the polygon intents).
 2. After each attempt, runs **four deterministic checks** on the
    accumulated paragraph:
-   - **`numerics_grounded`** — every non-trivial number in the output
      appears verbatim in a source document.
-   - **`no_placeholder_tokens`** — output contains no leaked
      `[source]` / `<document>` template markup.
-   - **`citations_dense`** — every non-trivial number has a
      `[doc_id]` citation **somewhere in the same sentence** (sentence
      boundaries: `. ` / `.\n` / end-of-text).
-   - **`citations_resolve`** — cited `doc_id`s are a subset of the
      input doc_ids.
 3. If any check fails, fires a `mellea_attempt` SSE event with the
    failed-requirement names, then **rerolls** with a feedback prompt
    that names the specific failing sentences (the model usually
    responds well to surgical corrections). Loop budget: 3 attempts.
-The frontend renders an inline banner above the briefing — amber on
 reroll (with the failed-req list), green on first-try pass. The final
 reconcile step in the trace shows the `passed: N/4 · rerolls: M`
 metadata for full audit transparency.
@@ -411,12 +411,12 @@ X}fact<|end_of_cite|>` mode. **It's deprecated in 4.x.** Verified:
 - The official Ollama chat template for `granite4.x` has no citation
   branch (the 3.3 / 4.0-preview templates did).
 - `granite_common` ships only `granite3/granite32` and
-  `granite3/granite33` subdirs — no 4.x equivalent.
 - `granite-io` has only `granite_3_2/` and `granite_3_3/` processor
   dirs.
 The base 4.1 weights still contain the cite tokens (training residue),
-so the model emits them as real tokens when nudged — but only as an
 end-of-response list, not inline in prose. IBM's published 4.x
 grounding path is a separate **Citation Generation LoRA** (built on
 `granite-4.0-micro`, not 4.1) requiring HF transformers + LoRA
@@ -452,11 +452,11 @@ drop noise, <1 km² to drop tidal artifacts) into **166 polygons**
 baked into `data/prithvi_ida_2021.geojson`. The runtime FSM does a
 point-in-polygon test, not fresh inference. This is honest about
 where foundation models earn their keep: **once, to produce a
-defensible event-level signal — not per request**.
 ### 7.2 Why TTM r2 runs live
-TTM r2 is **1.5 M params** — vastly smaller than Prithvi or Granite
 4.1. Inference is millisecond-scale even on CPU. It forecasts only
 the residual (surge component) at the Battery, which complements the
 NOAA snapshot specialist; it does **not** try to forecast the
@@ -474,7 +474,7 @@ from static layers and is handled separately:
   unless source data changed.
 - **Cadence**: NOAA tides update every 6 min; NWS alerts on push;
   NWS obs ~hourly; TTM is computed per query (cheap).
-- **Failure mode**: graceful — if NOAA times out, no `noaa_tides`
   doc is emitted; the reconciler simply doesn't see it.
 This mirrors how First Street separates Flood Factor (static, 30-yr)
@@ -551,7 +551,7 @@ riprap-nyc/
       compare.html / .js     two-address side-by-side
       register.html / .js    bulk register browser
       style.css              IBM Plex Sans, Planning Labs idiom
-      dist/                  Svelte 5 custom-element bundle (committed —
                               HF Spaces doesn't run a Node build).
                               Built from web/svelte/ via `npm run build`.
@@ -601,7 +601,7 @@ riprap-nyc/
   type, electrical hardening, drainage condition), social capacity,
   and financial absorption are out of scope.
 - **No sub-surface flooding.** Optical satellites can't see basement
-  apartments or subway entrances — the dominant Hurricane Ida damage
   mode in NYC. Prithvi correctly emits no polygons for Hollis or
   Carroll Gardens. That silence is a feature, not a bug.
 - **Vintage-bounded.** FEMA NFHL is years stale; DEP Stormwater Maps
@@ -657,7 +657,7 @@ weights load and TTM downloads. Warm queries:
   adds ~25 s)
 The Svelte bundle in `web/static/dist/` is committed, so HF Spaces
-runs no Node build step — only the Python deps + Ollama install.
 ### 12.2 Local development

+# Riprap architecture
 > **What it is.** A web tool that takes any NYC address and produces a
+> short, citation-grounded **flood-exposure briefing**. A tier (1–4)
 > with a paragraph of evidence, where every numeric claim links back to
 > the specific dataset, agency report, or model output it came from.
 >
 > **Who it's for.** Urban planners, journalists on deadline, NYCEM
 > grant writers filing FEMA BRIC sub-applications, agency capital
+> planners, researchers under FOIL/IRB constraints. Not consumers
 > shopping for flood insurance.
 >
 > **Why local foundation models.** A newsroom with FOIL'd documents
 ## 1. A 60-second primer on NYC flooding
+Skip if you already know this. Most architecture docs assume you do.
+This one doesn't.
 ### 1.1 Three kinds of flood
 NYC gets hit by three flood mechanisms that look completely different
 on a map and are caused by different physics:
+- **Coastal / surge flooding**. The ocean rises into the city.
   Driven by storm surge (wind pushing water against the coast),
   astronomical high tide, and wave run-up. Affects the **shoreline:**
   Brighton Beach, Coney Island, Red Hook, Lower Manhattan, the
   Rockaways, Staten Island east shore. **Hurricane Sandy 2012** is
+  the canonical event. Water came over the seawall and flooded
   subway tunnels, hospitals, and electrical substations. Affects
   buildings that were dry that morning.
+- **Pluvial / stormwater flooding**. Rain falls faster than the
   drainage system can carry it away. Affects **inland low points,
   basement apartments, and chronically under-sewered neighborhoods**:
   Hollis (Queens), Carroll Gardens (Brooklyn), Jamaica. **Hurricane
+  Ida 2021** is the canonical event for NYC. Most of the deaths
   were in basement apartments far from any coast. Optical satellites
   largely *can't see* this kind of flooding because the water drains
   fast and is often sub-surface.
+- **Compound flooding**. Coastal + pluvial happening at the same
   time, with groundwater rising too. Currently the active research
   frontier (NPCC4 Ch. 3 calls it out explicitly). Most agencies model
   these mechanisms separately; reality combines them.
 A good civic flood tool has to cover all three and be honest about
 what each signal can and cannot see. Riprap surfaces evidence for all
+three but **doesn't predict damage**. See scope below.
 ### 1.2 Empirical vs modeled vs proxy
 Each piece of flood evidence falls into one of three classes, and the
 distinction matters for how much weight to give it:
+- **Empirical**. Something flooded a place and was measured. USGS
   high-water marks (people went out after Hurricane Ida and surveyed
   where water reached on building walls). The 2012 Sandy Inundation
   Zone (mapped by the city after the storm). FloodNet ultrasonic
   sensors that recorded an actual depth. **Highest-confidence**: this
   flood happened here.
+- **Modeled scenarios**. Hydraulic models simulate "what if" cases.
   FEMA's regulatory floodplains (1 % and 0.2 % annual chance). NYC
   DEP's Stormwater Maps (modeled water depth under three rainfall
   scenarios with varying sea-level-rise assumptions). **Useful but
   scenario-bounded**: this could happen here under those conditions.
+- **Proxy signals**. Indirect indicators of flooding. NYC 311
   complaints ("street flooding", "sewer backup") clustering around an
   address. Topographic indices (HAND, TWI) suggesting water *would*
   pool here based on terrain. **Useful but biased**: 311 reflects
 Riprap surfaces all three classes. The score weights them in that
 order (empirical > modeled > proxy), with empirical hits granted a
+**floor rule**. See [§5](#5-the-scoring-rubric).
 ### 1.3 Hydrology indices used in this app
 compute from a Digital Elevation Model (DEM) and they're the
 hydrological literature's canonical exposure proxies:
+- **HAND (Height Above Nearest Drainage)**. Vertical distance from
   the address up to the nearest river/drainage channel. **<1 m** = at
   drainage level (water *will* reach here in flood). **>10 m** =
   hillslope (very dry). Nobre et al. 2011.
+- **TWI (Topographic Wetness Index)**. `ln(catchment_area / tan
   slope)`. **High TWI** = water tends to accumulate here (large
   contributing area, gentle slope). Beven & Kirkby 1979.
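
Review note: the TWI expression quoted in this hunk, `ln(catchment_area / tan slope)`, can be illustrated in a few lines. This is a sketch, not the repo's implementation. The function name `twi`, the radian slope input, and the `min_slope` clamp for flat cells are all assumptions introduced here.

```python
import math


def twi(catchment_area: float, slope_rad: float, min_slope: float = 1e-6) -> float:
    """Topographic Wetness Index: ln(catchment_area / tan(slope)).

    catchment_area: upslope contributing area (units as used by the caller).
    slope_rad: local slope in radians.
    min_slope: assumed clamp so perfectly flat cells do not divide by zero.
    """
    if catchment_area <= 0:
        raise ValueError("catchment_area must be positive")
    slope = max(slope_rad, min_slope)
    return math.log(catchment_area / math.tan(slope))


if __name__ == "__main__":
    gentle = twi(500.0, math.radians(2))
    steep = twi(500.0, math.radians(25))
    # A gentler slope with the same contributing area gives the higher index.
    print(gentle > steep)
```

The comparison matches the prose in the hunk: a large contributing area and a gentle slope both push the index up.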
 ## 2. What Riprap actually produces
+For a given address (or any of three modes; see [§4](#4-three-user-modes)),
 Riprap returns:
 1. **A tier 1–4** computed by a deterministic, published rubric
    *Status*, *Empirical evidence*, *Modeled scenarios*, *Policy
    context*. A section is omitted entirely if no specialist fired for
    it (silence-over-confabulation contract).
+3. **Evidence cards**. One per fired specialist, with the raw values
    and a link to the source dataset.
+4. **Map overlay**. The address pinned, with the empirical and
    modeled flood extents that overlap it.
+5. **Live "right now" signals**. Active NWS flood alerts, current
    tide residual at the Battery, recent precipitation at the nearest
    ASOS, and a Granite TTM short-horizon forecast of the surge
    residual. **These do not modify the tier** (per IPCC AR6 WG II's
 ## 3. The Burr FSM and how the specialists chain
+Riprap is a **state machine**, a Burr FSM (DAGWorks), that walks
 through a fixed list of "specialist" functions in order. Each
 specialist either produces a structured fact or stays silent. At the
 end, the reconciler reads all the produced facts and writes the
 | 8 | **nws_obs** *(live)*      | Recent precipitation from the nearest airport ASOS station (KNYC / KLGA / KJFK / KEWR / KFRG).                                                                                   | live               |
 | 9 | **ttm_forecast** *(live)* | Granite TTM r2 zero-shot forecast of the surge **residual** at the Battery for the next ~9.6 h. NOAA already publishes the astronomical tide; TTM forecasts the part NOAA doesn't. | live (model-derived) |
 | 10 | **microtopo**            | LiDAR-derived terrain features at the point: elevation, HAND, TWI, local relief percentile.                                                                                      | proxy              |
+| 11 | **ida_hwm**              | USGS Hurricane Ida 2021 high-water marks. Actual measured water heights surveyed in the days after the storm.                                                                   | empirical          |
 | 12 | **prithvi**              | NASA/IBM Prithvi-EO 2.0 segmentation of Sentinel-2 imagery for the Ida pre/post pair. Pre-computed offline; serves point-in-polygon queries against the resulting 166 polygons.   | empirical (model-derived) |
 | 13 | **rag**                  | Granite Embedding 278M retrieves the most-relevant paragraphs from 5 NYC policy PDFs (Comptroller, NPCC4, MTA, NYCHA, ConEd) given the address's borough + which scenarios fired. | policy             |
 | 14 | **reconcile**            | Granite 4.1 :3b reads all the documents produced by steps 1–13 and writes the cited briefing paragraph. See [§6](#6-document-grounded-reconciliation).                            | LLM synthesis      |
 | Step | What it returns |
 |---|---|
 | geocode    | `(40.5780, -73.9617)`, BBL `3-08660-0001`, Brooklyn |
+| sandy      | **YES**. Inside the 2012 Sandy Inundation Zone |
 | dep_stormwater | `dep_moderate_2050`: depth 0.4-0.8 ft; `dep_extreme_2080`: depth 0.8-2.0 ft |
 | floodnet   | 2 sensors within 600 m; 1 trigger event in last 3 yr (peak 14 cm) |
 | nyc311     | 11 flood-related complaints in 200 m, 5-yr window |
 |---------------------------------------|------------------|---|
 | `/`                                   | **Single address** | Geocode → run the full FSM → cited paragraph + map. Live demo path. |
 | `/compare`                            | **Compare**      | Two addresses side by side; parallel FSM runs (`asyncio.to_thread`, `OLLAMA_NUM_PARALLEL=2`). Useful for "this site vs the alternative". |
+| `/register/{schools,nycha,mta_entrances}` | **Register** | Pre-computed bulk runs over NYC public-asset registries. 126 schools, 45 NYCHA developments, ~1,900 MTA subway entrances. Loaded from `data/registers/*.json` at boot. |
 Single-address is the live path. Registers are pre-computed because
 running 1,900 reconciler calls at request time is a non-starter; the
 ### 5.2 Max-empirical floor
 If **Sandy 2012 inundation** OR **a USGS Ida HWM within 100 m** fired,
+the tier is capped at **2 (Elevated)**. It cannot be worse,
 regardless of the additive composite.
 This recovers the *important* multiplicative behaviour Balica 2012
 argues for (empirical observations should not be cancelled by
 terrain or modeled scenarios) without giving up additive transparency.
 The 100 m radius is chosen because USGS HWM positional uncertainty is
+typically 5–30 m. 100 m gives ~3σ headroom for a confident "this
 address was inundated" signal.
 ### 5.3 Live signals stay out
 `app/mellea_validator.py` wraps the Granite-via-Ollama call in IBM
 Research's [Mellea](https://github.com/generative-computing/mellea)
+framework. Instruct, validate, repair. The synthesis intents call
 `reconcile_strict_streaming(...)` which:
 1. **Streams** each generation attempt's tokens to the user (via the
    `progress_q` for the polygon intents).
 2. After each attempt, runs **four deterministic checks** on the
    accumulated paragraph:
+   - **`numerics_grounded`**. Every non-trivial number in the output
      appears verbatim in a source document.
+   - **`no_placeholder_tokens`**. Output contains no leaked
      `[source]` / `<document>` template markup.
+   - **`citations_dense`**. Every non-trivial number has a
      `[doc_id]` citation **somewhere in the same sentence** (sentence
      boundaries: `. ` / `.\n` / end-of-text).
+   - **`citations_resolve`**. Cited `doc_id`s are a subset of the
      input doc_ids.
 3. If any check fails, fires a `mellea_attempt` SSE event with the
    failed-requirement names, then **rerolls** with a feedback prompt
    that names the specific failing sentences (the model usually
    responds well to surgical corrections). Loop budget: 3 attempts.
+The frontend renders an inline banner above the briefing. Amber on
 reroll (with the failed-req list), green on first-try pass. The final
 reconcile step in the trace shows the `passed: N/4 · rerolls: M`
 metadata for full audit transparency.
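
Review note: two of the four checks named in this hunk, `citations_resolve` and `citations_dense`, are simple enough to sketch. The code below is an illustration only, not the contents of `app/mellea_validator.py`, which this diff does not show. The regexes, the helper `split_sentences`, and the reading of "non-trivial number" as any digit run are assumptions.

```python
import re

# Assumed citation shape: [doc_id] with word characters only.
CITE_RE = re.compile(r"\[([A-Za-z0-9_]+)\]")
# Assumption: any integer or decimal counts as a "non-trivial number".
NUM_RE = re.compile(r"\d+(?:\.\d+)?")


def split_sentences(text: str) -> list[str]:
    """Split on '. ' or '.\\n'; end-of-text closes the last sentence."""
    parts = re.split(r"\.(?: |\n)", text)
    return [p.strip() for p in parts if p.strip()]


def citations_resolve(text: str, doc_ids: set[str]) -> bool:
    """True if every cited doc_id is one of the input doc_ids."""
    return set(CITE_RE.findall(text)) <= set(doc_ids)


def citations_dense(text: str) -> bool:
    """True if every sentence containing a number also carries a citation."""
    for sentence in split_sentences(text):
        # Strip citations first so digits inside a doc_id are not counted.
        prose = CITE_RE.sub("", sentence)
        if NUM_RE.search(prose) and not CITE_RE.search(sentence):
            return False
    return True


if __name__ == "__main__":
    ok = "There were 11 complaints nearby [nyc311]. The site is in the zone [sandy]."
    bad = "Peak depth was 14 cm. Sensors are nearby [floodnet]."
    print(citations_dense(ok), citations_dense(bad))
    print(citations_resolve(ok, {"nyc311", "sandy"}))
```

Splitting on a period followed by a space or newline keeps decimals such as `0.4` inside one sentence, which is consistent with the boundary rule quoted in the hunk.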
 - The official Ollama chat template for `granite4.x` has no citation
   branch (the 3.3 / 4.0-preview templates did).
 - `granite_common` ships only `granite3/granite32` and
+  `granite3/granite33` subdirs. No 4.x equivalent.
 - `granite-io` has only `granite_3_2/` and `granite_3_3/` processor
   dirs.
 The base 4.1 weights still contain the cite tokens (training residue),
+so the model emits them as real tokens when nudged. But only as an
 end-of-response list, not inline in prose. IBM's published 4.x
 grounding path is a separate **Citation Generation LoRA** (built on
 `granite-4.0-micro`, not 4.1) requiring HF transformers + LoRA
 baked into `data/prithvi_ida_2021.geojson`. The runtime FSM does a
 point-in-polygon test, not fresh inference. This is honest about
 where foundation models earn their keep: **once, to produce a
+defensible event-level signal. Not per request**.
 ### 7.2 Why TTM r2 runs live
+TTM r2 is **1.5 M params**. Vastly smaller than Prithvi or Granite
 4.1. Inference is millisecond-scale even on CPU. It forecasts only
 the residual (surge component) at the Battery, which complements the
 NOAA snapshot specialist; it does **not** try to forecast the
   unless source data changed.
 - **Cadence**: NOAA tides update every 6 min; NWS alerts on push;
   NWS obs ~hourly; TTM is computed per query (cheap).
+- **Failure mode**: graceful. If NOAA times out, no `noaa_tides`
   doc is emitted; the reconciler simply doesn't see it.
 This mirrors how First Street separates Flood Factor (static, 30-yr)
       compare.html / .js     two-address side-by-side
       register.html / .js    bulk register browser
       style.css              IBM Plex Sans, Planning Labs idiom
+      dist/                  Svelte 5 custom-element bundle (committed.
                               HF Spaces doesn't run a Node build).
                               Built from web/svelte/ via `npm run build`.
   type, electrical hardening, drainage condition), social capacity,
   and financial absorption are out of scope.
 - **No sub-surface flooding.** Optical satellites can't see basement
+  apartments or subway entrances. The dominant Hurricane Ida damage
   mode in NYC. Prithvi correctly emits no polygons for Hollis or
   Carroll Gardens. That silence is a feature, not a bug.
 - **Vintage-bounded.** FEMA NFHL is years stale; DEP Stormwater Maps
   adds ~25 s)
 The Svelte bundle in `web/static/dist/` is committed, so HF Spaces
+runs no Node build step. Only the Python deps + Ollama install.
 ### 12.2 Local development

METHODOLOGY.md

@@ -1,4 +1,4 @@
-# Riprap — Scoring Methodology
 > Riprap produces a **flood-exposure tier (1–4) per NYC address**, not
 > a calibrated damage probability. The tier is a deterministic
@@ -8,7 +8,7 @@
 ## 1. Why this design
 Closed-methodology scores (First Street, Jupiter, Fathom) are useful
-products but uncitable in civic work — a NYCEM grant writer can't quote
 "0.73" in a FEMA BRIC sub-application without a defensible audit trail.
 At the same time, an LLM-emitted score would be non-reproducible and
 uncalibrated, with documented LLM-as-judge pathologies (Zheng et al.
@@ -25,19 +25,19 @@ The composite construction follows a well-trodden path in the multi-
 indicator vulnerability/exposure literature:
 - **Cutter, Boruff & Shirley (2003)**, *Social Science Quarterly* 84(2):
-  242–261 — the SoVI hazards-of-place pattern: group indicators
   thematically; sum factors with equal weights because there is no
   defensible theoretical basis for differential weighting.
-- **Tate (2012)**, *Natural Hazards* 63: 325–347 — explicit Monte Carlo
   sensitivity analysis showing that hierarchical equal-weighted
   composites are the most rank-stable. This is why we use equal weights
   *within* sub-indices.
 - **Balica, Wright & van der Meulen (2012)**, *Natural Hazards* 64:
-  73–105 — Coastal City Flood Vulnerability Index, multiplicative
   (Exposure × Susceptibility / Resilience). We adopt only the
   override-behavior of multiplicative form, as a "max-empirical floor"
   (§4 below), because we have no resilience term.
-- **Kim et al. (2019)**, *Scientific Reports* 9:18564 — additive vs
   geometric aggregation; additive is more transparent and reproducible
   *if* sub-indices are pre-grouped thematically. Done.
@@ -58,15 +58,15 @@ Binary "inside zone" indicators with weights ordered by agency tiering:
 | Indicator                       | Weight | Citation |
 |---------------------------------|-------:|----------|
-| FEMA NFHL 1% (SFHA)             | 1.00   | FEMA NFHL — regulatory mandate threshold |
-| FEMA NFHL 0.2%                  | 0.50   | FEMA NFHL — tail scenario |
 | NYC DEP Moderate-2050 + 2.5 ft  | 0.75   | NYC DEP Stormwater Maps 2021; NPCC4 Ch.3 |
-| NYC DEP Extreme-2080 + SLR      | 0.50   | NYC DEP Stormwater Maps 2021 — explicitly tail |
 | NYC DEP Tidal-2050              | 0.75   | NPCC4 Ch.3 coastal projection |
 Why DEP-2050 outranks DEP-2080: NPCC4 designates the 2080 extreme
 scenario as a **tail** projection. Closer-horizon coastal/pluvial
-maps — those a current planner can act on — get the higher weight.
 ### 3.2 Hydrological sub-index
@@ -99,7 +99,7 @@ Mix of binary observed-extent flags and banded count signals:
 | FloodNet trigger (3 yr)    | 0.75   | FloodNet NYC; NPCC4 Ch.3 references |
 The 311 and FloodNet weights are capped at 0.75 because both signals
-have documented coverage and reporting bias — 311 reflects civic
 engagement as well as flooding, FloodNet has uneven spatial coverage.
 Sandy and HWMs are 1.0 because they're engineered ground-truth
 observations.
@@ -116,7 +116,7 @@ Bands for 311 count (200 m buffer, 5-year window):
 ## 4. Max-empirical floor
 If **Sandy 2012 inundation** OR **a USGS Ida HWM within 100 m** fired,
-the tier is capped at **2 (Elevated)** — it cannot be worse, regardless
 of the additive composite.
 This recovers the *important* multiplicative behaviour Balica 2012
@@ -126,7 +126,7 @@ floor (a `min(tier, 2)` after composition) rather than a full
 multiplicative form so the composite remains additive and auditable.
 The 100 m radius is chosen because USGS HWM positional uncertainty is
-typically 5–30 m horizontal — 100 m gives ~3σ headroom for a confident
 "this address was inundated" signal.
 ## 5. Composite → tier mapping
@@ -177,7 +177,7 @@ Riprap's tier is **not**:
 It **is**:
-- An exposure prior — a literature-grounded, deterministic, reproducible
   index of how many publicly-documented flood signals overlap this
   address.
 - Auditable end-to-end: every term has a published source; every weight
@@ -200,7 +200,7 @@ These appear next to the tier badge and in the methodology disclosure:
 > **DEP 2050/2080 and FEMA 0.2% are bounding scenarios, not forecasts.**
 > The tier reads them as "if this scenario materialized, this address
-> would be inside its footprint" — not "this is the expected future."
 > **Compound flooding is not separately modeled.** Concurrence of rain
 > + tide + groundwater is the residual research frontier (NPCC4 Ch. 3).
@@ -244,7 +244,7 @@ Kim, S. et al. (2019). "Assessment of Aggregation Frameworks for
 Composite Indicators in Measuring Flood Vulnerability to Climate
 Change." *Scientific Reports* 9:18564.
-Nobre, A. D. et al. (2011). "Height Above the Nearest Drainage — A
 Hydrologically Relevant New Terrain Model." *Journal of Hydrology*
 404(1–2): 13–29.

+# Riprap scoring methodology
 > Riprap produces a **flood-exposure tier (1–4) per NYC address**, not
 > a calibrated damage probability. The tier is a deterministic
 ## 1. Why this design
 Closed-methodology scores (First Street, Jupiter, Fathom) are useful
+products but uncitable in civic work. A NYCEM grant writer can't quote
 "0.73" in a FEMA BRIC sub-application without a defensible audit trail.
 At the same time, an LLM-emitted score would be non-reproducible and
 uncalibrated, with documented LLM-as-judge pathologies (Zheng et al.
 indicator vulnerability/exposure literature:
 - **Cutter, Boruff & Shirley (2003)**, *Social Science Quarterly* 84(2):
+  242–261. The SoVI hazards-of-place pattern: group indicators
   thematically; sum factors with equal weights because there is no
   defensible theoretical basis for differential weighting.
+- **Tate (2012)**, *Natural Hazards* 63: 325–347. Explicit Monte Carlo
   sensitivity analysis showing that hierarchical equal-weighted
   composites are the most rank-stable. This is why we use equal weights
   *within* sub-indices.
 - **Balica, Wright & van der Meulen (2012)**, *Natural Hazards* 64:
+  73–105. Coastal City Flood Vulnerability Index, multiplicative
   (Exposure × Susceptibility / Resilience). We adopt only the
   override-behavior of multiplicative form, as a "max-empirical floor"
   (§4 below), because we have no resilience term.
+- **Kim et al. (2019)**, *Scientific Reports* 9:18564. Additive vs
   geometric aggregation; additive is more transparent and reproducible
   *if* sub-indices are pre-grouped thematically. Done.
 | Indicator                       | Weight | Citation |
 |---------------------------------|-------:|----------|
+| FEMA NFHL 1% (SFHA)             | 1.00   | FEMA NFHL (regulatory mandate threshold) |
+| FEMA NFHL 0.2%                  | 0.50   | FEMA NFHL (tail scenario) |
 | NYC DEP Moderate-2050 + 2.5 ft  | 0.75   | NYC DEP Stormwater Maps 2021; NPCC4 Ch.3 |
+| NYC DEP Extreme-2080 + SLR      | 0.50   | NYC DEP Stormwater Maps 2021 (explicitly tail) |
 | NYC DEP Tidal-2050              | 0.75   | NPCC4 Ch.3 coastal projection |
 Why DEP-2050 outranks DEP-2080: NPCC4 designates the 2080 extreme
 scenario as a **tail** projection. Closer-horizon coastal/pluvial
+maps (those a current planner can act on) get the higher weight.
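
To make the weight ordering in the table concrete, here is a minimal sketch. It assumes the regulatory sub-index is the plain sum of the weights of the indicators that fired, which this excerpt does not confirm; any normalization the real scorer applies is not shown here. The dictionary keys and the function name `regulatory_subindex` are hypothetical labels, not identifiers from the repository.

```python
# Weights copied from the regulatory table above; the keys are
# hypothetical short names for each indicator.
REGULATORY_WEIGHTS = {
    "fema_nfhl_1pct": 1.00,
    "fema_nfhl_0_2pct": 0.50,
    "dep_moderate_2050": 0.75,
    "dep_extreme_2080": 0.50,
    "dep_tidal_2050": 0.75,
}


def regulatory_subindex(fired: set[str]) -> float:
    """Sum the weights of the binary 'inside zone' indicators that fired.

    Assumption: a plain additive sum with no normalization.
    """
    unknown = fired - set(REGULATORY_WEIGHTS)
    if unknown:
        raise KeyError(f"unknown indicators: {sorted(unknown)}")
    return sum(REGULATORY_WEIGHTS[name] for name in fired)


# An address inside the FEMA 1% zone and the DEP Moderate-2050 footprint.
score = regulatory_subindex({"fema_nfhl_1pct", "dep_moderate_2050"})
```

Under this assumption, an address inside only the DEP Extreme-2080 footprint contributes less than one inside only the DEP Moderate-2050 footprint, which matches the stated rationale for the ordering.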
 ### 3.2 Hydrological sub-index
 | FloodNet trigger (3 yr)    | 0.75   | FloodNet NYC; NPCC4 Ch.3 references |
 The 311 and FloodNet weights are capped at 0.75 because both signals
+have documented coverage and reporting bias: 311 reflects civic
 engagement as well as flooding, FloodNet has uneven spatial coverage.
 Sandy and HWMs are 1.0 because they're engineered ground-truth
 observations.
 ## 4. Max-empirical floor
 If **Sandy 2012 inundation** OR **a USGS Ida HWM within 100 m** fired,
+the tier is capped at **2 (Elevated)**. It cannot be worse, regardless
 of the additive composite.
 This recovers the *important* multiplicative behaviour Balica 2012
 multiplicative form so the composite remains additive and auditable.
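
A minimal sketch of the floor, under stated assumptions: lower tier numbers are read as more exposed (which the `min` form implies), "within 100 m" is treated as inclusive, and the function and parameter names are hypothetical rather than taken from the repository.

```python
from typing import Optional

FLOOR_TIER = 2         # "Elevated"
HWM_RADIUS_M = 100.0   # search radius for USGS Ida high-water marks


def apply_max_empirical_floor(
    composite_tier: int,
    in_sandy_zone: bool,
    nearest_ida_hwm_m: Optional[float],
) -> int:
    """Apply min(tier, 2) when an empirical inundation signal fired.

    nearest_ida_hwm_m is the distance to the closest Ida HWM in metres,
    or None when no mark was found (hypothetical input shape).
    """
    hwm_fired = (
        nearest_ida_hwm_m is not None and nearest_ida_hwm_m <= HWM_RADIUS_M
    )
    if in_sandy_zone or hwm_fired:
        return min(composite_tier, FLOOR_TIER)
    return composite_tier
```

Because the floor runs after composition, the additive composite itself is never modified; only the final tier is adjusted, and a tier that is already 1 stays at 1.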
 The 100 m radius is chosen because USGS HWM positional uncertainty is
+typically 5–30 m horizontal; 100 m gives ~3σ headroom for a confident
 "this address was inundated" signal.
 ## 5. Composite → tier mapping
 It **is**:
+- An exposure prior: a literature-grounded, deterministic, reproducible
   index of how many publicly-documented flood signals overlap this
   address.
 - Auditable end-to-end: every term has a published source; every weight
 > **DEP 2050/2080 and FEMA 0.2% are bounding scenarios, not forecasts.**
 > The tier reads them as "if this scenario materialized, this address
+> would be inside its footprint", not "this is the expected future."
 > **Compound flooding is not separately modeled.** Concurrence of rain
 > + tide + groundwater is the residual research frontier (NPCC4 Ch. 3).
 Composite Indicators in Measuring Flood Vulnerability to Climate
 Change." *Scientific Reports* 9:18564.
+Nobre, A. D. et al. (2011). "Height Above the Nearest Drainage: A
 Hydrologically Relevant New Terrain Model." *Journal of Hydrology*
 404(1–2): 13–29.

MONDAY.md CHANGED

@@ -8,7 +8,7 @@ Demo is **Sunday May 10**.
 Eight priorities closed against `audit/2026-05-03-evening-audit.md`:
 1. `pitch/cold_open.md` restored (was accidentally deleted in 1cb5ee6).
-2. Granite Guardian / refusal-classification leftovers removed —
    Mellea is the sole grounding mechanism, period.
 3. **Trace UI is now clickable.** Click any specialist row to reveal
    its raw structured output (formatted JSON, copy button,
@@ -23,9 +23,9 @@ Eight priorities closed against `audit/2026-05-03-evening-audit.md`:
    NYCHA-centroid) coloured by Sandy exposure with click popups
    showing name + `[doc_id]`. NYCHA polygon-fill is queued for when
    `geometry_geojson` lands in the dataclass.
-6. **`floodnet_forecast` specialist** — TTM r2 forecast on the
    nearest FloodNet sensor's flood-event recurrence. Reuses the
-   (512, 96) singleton already loaded for `ttm_311_forecast` —
    *no new model class loaded into memory*. The strongest single
    TTM win for the NYU CUSP audience.
 7. Trace UI groups TTM specialists under one parent node
@@ -60,7 +60,7 @@ fine-tune prep green.
 End-to-end smoke on "Coney Island Brooklyn" produced citations
 `[mta_entrance_56]`, `[nycha_dev_239]`, `[nycha_dev_166]` alongside
-`[rag_mta]` and `[nyc311]` — family-prefix chip routing works.
 Last commit: `86861be` (FSM integration of 4 register specialists).
@@ -85,7 +85,7 @@ Last commit: `86861be` (FSM integration of 4 register specialists).
   GPU with vLLM; both fit on one MI300X.
 - **TerraMind manifest is 1028 paired chips**, 2021-05 → 2026-04,
   NYC 5-borough hull +5 km, S2-cloud <30%, ≤3-day pair window. One
-  year (2022-05 → 2023-04) returned 0 due to PC API intermittency —
   acceptable for the micro-fine-tune.
 ## First thing Monday morning
@@ -112,7 +112,7 @@ Last commit: `86861be` (FSM integration of 4 register specialists).
    - A: ship the fine-tuned checkpoint as a Riprap specialist.
    - B: publish to HF as `msradam/TerraMind-1.0-NYC` with model card,
      don't ship in demo. **Bucket B is fully acceptable** per the
-     spec — civic-tech publication discipline is the durable goal.
    - C: discard checkpoint, no public artefact.
 ## Working on Monday
@@ -166,10 +166,10 @@ In rough priority order:
    tree.
 3. **WCAG 2.2 AA pass.**
 4. **Methodology paper draft** (6-8 page PDF). Goal: Saturday May 9.
-5. **Historical-event mode** — vintage-cutoff queries. Saturday.
 6. **Five Build-in-Public posts** through the week.
 7. **5-minute hackathon pitch + 3 demo queries.** Friday rehearsal.
-8. **ASCE talk materials** — May 13 (post-hackathon).
 ## Sharp edges to remember
@@ -196,14 +196,14 @@ In rough priority order:
 ## Files to read in order on Monday morning
 1. This file.
-2. `experiments/05_terramind_nyc_finetune/eval/eval_spec.md` — the
    contract for what training output triggers ship/publish/revert.
-3. `experiments/06_granite_guardian/RESULTS.md` — the Guardian →
    planner pivot decision record (so you know why Guardian is in
    the repo but not on the demo path).
-4. `experiments/07_mta_entrances/RESULTS.md` — the canonical
    register-specialist pattern (the other three follow it).
-5. `CLAUDE.md` — fix the AMD droplet IP (165.245.134.44 →
    129.212.182.52) at the same time as the first edit of the day.
 ## Status as of 2026-05-03 ~12:50 ET

 Eight priorities closed against `audit/2026-05-03-evening-audit.md`:
 1. `pitch/cold_open.md` restored (was accidentally deleted in 1cb5ee6).
+2. Granite Guardian / refusal-classification leftovers removed.
    Mellea is the sole grounding mechanism, period.
 3. **Trace UI is now clickable.** Click any specialist row to reveal
    its raw structured output (formatted JSON, copy button,
    NYCHA-centroid) coloured by Sandy exposure with click popups
    showing name + `[doc_id]`. NYCHA polygon-fill is queued for when
    `geometry_geojson` lands in the dataclass.
+6. **`floodnet_forecast` specialist.** TTM r2 forecast on the
    nearest FloodNet sensor's flood-event recurrence. Reuses the
+   (512, 96) singleton already loaded for `ttm_311_forecast`:
    *no new model class loaded into memory*. The strongest single
    TTM win for the NYU CUSP audience.
 7. Trace UI groups TTM specialists under one parent node
 End-to-end smoke on "Coney Island Brooklyn" produced citations
 `[mta_entrance_56]`, `[nycha_dev_239]`, `[nycha_dev_166]` alongside
+`[rag_mta]` and `[nyc311]`. Family-prefix chip routing works.
 Last commit: `86861be` (FSM integration of 4 register specialists).
   GPU with vLLM; both fit on one MI300X.
 - **TerraMind manifest is 1028 paired chips**, 2021-05 → 2026-04,
   NYC 5-borough hull +5 km, S2-cloud <30%, ≤3-day pair window. One
+  year (2022-05 → 2023-04) returned 0 due to PC API intermittency;
   acceptable for the micro-fine-tune.
 ## First thing Monday morning
    - A: ship the fine-tuned checkpoint as a Riprap specialist.
    - B: publish to HF as `msradam/TerraMind-1.0-NYC` with model card,
      don't ship in demo. **Bucket B is fully acceptable** per the
+     spec. Civic-tech publication discipline is the durable goal.
    - C: discard checkpoint, no public artefact.
 ## Working on Monday
    tree.
 3. **WCAG 2.2 AA pass.**
 4. **Methodology paper draft** (6-8 page PDF). Goal: Saturday May 9.
+5. **Historical-event mode.** Vintage-cutoff queries. Saturday.
 6. **Five Build-in-Public posts** through the week.
 7. **5-minute hackathon pitch + 3 demo queries.** Friday rehearsal.
+8. **ASCE talk materials.** May 13 (post-hackathon).
 ## Sharp edges to remember
 ## Files to read in order on Monday morning
 1. This file.
+2. `experiments/05_terramind_nyc_finetune/eval/eval_spec.md`. The
    contract for what training output triggers ship/publish/revert.
+3. `experiments/06_granite_guardian/RESULTS.md`. The Guardian →
    planner pivot decision record (so you know why Guardian is in
    the repo but not on the demo path).
+4. `experiments/07_mta_entrances/RESULTS.md`. The canonical
    register-specialist pattern (the other three follow it).
+5. `CLAUDE.md`. Fix the AMD droplet IP (165.245.134.44 →
    129.212.182.52) at the same time as the first edit of the day.
 ## Status as of 2026-05-03 ~12:50 ET

README.md CHANGED

@@ -11,12 +11,12 @@ pinned: false
   <img src="assets/logo@2x.png" width="72" height="72" alt="Riprap dam mark" />
 </p>
-# Riprap — citation-grounded NYC flood-exposure briefings
-Riprap takes any NYC address (or neighborhood, or development-permit query)
-and produces a four-section briefing — **Status / Empirical evidence /
-Modeled scenarios / Policy context** — where every numeric claim is
-anchored to a `[doc_id]` citation pointing back into the source document.
 The Capstone reconciler is **Granite 4.1** (8B, served via Ollama on T4
 or vLLM on AMD MI300X), wrapped in **Mellea**-validated rejection
@@ -29,11 +29,11 @@ Live demo: <https://lablab-ai-amd-developer-hackathon-riprap-nyc.hf.space>
 ---
-## How Riprap works — the Five Stones
-Behind every briefing, ~25 atomic specialists fan out across NYC datasets,
-satellite imagery, sensors, and forecasts. The **Five Stones** are a
-re-grouping of those specialists into five legible roles:
 > **Cornerstone** remembers. **Keystone** tallies. **Touchstone**
 > watches. **Lodestone** projects. **Capstone** writes it all down with
@@ -41,11 +41,11 @@ re-grouping of those specialists into five legible roles:
 | Stone | Role | What fires |
 |---|---|---|
-| **Cornerstone** | The Hazard Reader — what NYC's ground remembers | Sandy 2012 inundation extent, NYC DEP stormwater scenarios, 2021 Ida USGS high-water marks, baked Prithvi-EO Ida-attributable polygons, USGS 3DEP DEM + HAND/TWI |
-| **Keystone** | The Asset Register — what's exposed | MTA subway entrances, NYCHA developments, NYC DOE schools, NYS DOH hospitals, **TerraMind-NYC Buildings LoRA** |
-| **Touchstone** | The Live Observer — current state of the city | FloodNet ultrasonic depth sensors, NYC 311 flood complaints, NWS hourly METAR, NOAA tide-gauge water levels, **Prithvi-EO 2.0 NYC-Pluvial v2**, **TerraMind-NYC LULC LoRA** |
-| **Lodestone** | The Projector — what's coming | NWS public flood alerts, Granite TTM r2 surge nowcast (zero-shot, 6-min cadence, 9.6 h horizon), per-address 311 weekly forecast, FloodNet sensor recurrence forecast, **Granite-TTM-r2-Battery-Surge fine-tune** (96 h hourly horizon) |
-| **Capstone** | The Synthesiser — citation-grounded briefing | Granite 4.1 + Mellea rejection sampling |
 The four data-Stones run sequentially per query; the Capstone reconciles
 their documents into one cited paragraph.
@@ -55,45 +55,45 @@ their documents into one cited paragraph.
 ## NYC-specialised foundation models (Apache 2.0)
 Three NYC-specific fine-tunes built on AMD Instinct MI300X via AMD
-Developer Cloud, published under permissive licence:
-- **[`msradam/TerraMind-NYC-Adapters`](https://huggingface.co/msradam/TerraMind-NYC-Adapters)**
-  — LoRA family on TerraMind 1.0 base. LULC mIoU 0.5866 (+6.13 pp over
-  full-FT baseline), TiM 0.6023, Buildings 0.5511. Trained in ~18 min on
-  a single MI300X.
-- **[`msradam/Prithvi-EO-2.0-NYC-Pluvial`](https://huggingface.co/msradam/Prithvi-EO-2.0-NYC-Pluvial)**
-  — NYC pluvial-flood fine-tune of Prithvi-EO 2.0. Test flood IoU
-  0.5979 vs 0.10 on the Sen1Floods11 base — a 6× lift. Lovász-Softmax
-  loss + copy-paste augmentation.
-- **[`msradam/Granite-TTM-r2-Battery-Surge`](https://huggingface.co/msradam/Granite-TTM-r2-Battery-Surge)**
-  — NYC Battery storm-surge nowcast fine-tune of Granite TimeSeries TTM
-  r2. Test MAE 0.1091 m, −41% vs persistence and −25% vs zero-shot.
-All three are loaded at runtime by their respective FSM specialists in
 `app/context/` and `app/live/`.
 ---
 ## Architecture pointers
-- `app/stones/` — the Stones taxonomy (NAME / TAGLINE / SOURCES /
-  collect()) over the FSM specialists.
-- `app/fsm.py` — Burr FSM with one specialist per `@action`.
-- `app/reconcile.py` — `build_documents()` emits Granite-native
   document-role messages in canonical Stone order.
-- `app/mellea_validator.py` — strict reconcile path (4-check rejection
   sampling).
-- `app/llm.py` — LiteLLM Router shim. Routes to Ollama (T4 / local) or
   vLLM (AMD MI300X) without changing caller code.
-- `web/main.py` — FastAPI + SSE. The stream emits
   `plan / step / token / mellea_attempt / final` events plus the
   `stone_start / stone_done` envelope around each Stone group.
-- `web/sveltekit/` — primary UI (SvelteKit + adapter-static).
-- `web/svelte/` + `web/static/` — legacy custom-element bundle, still
-  serving `/legacy`, `/single`, `/compare`.
-- `experiments/18_terramind_nyc_lora/` /
-  `experiments/19_prithvi_nyc_v2/` /
-  `experiments/20_ttm_battery_surge/` — full reproduction recipes for
   the three HF artifacts above.
 ---
@@ -112,6 +112,9 @@ RIPRAP_LLM_API_KEY=<token> \
 # Programmatic Mellea probe (server must be running)
 .venv/bin/python scripts/probe_mellea.py --query "Hollis" --runs 5
 ```
 ---
@@ -122,7 +125,7 @@ Apache 2.0 (this repository). The three NYC-specialised models above
 are also Apache 2.0; underlying upstream models retain their own
 permissive licences (see each `MODEL_CARD.md`).
-Check out the HF Space configuration reference at
 <https://huggingface.co/docs/hub/spaces-config-reference>.
 ---

   <img src="assets/logo@2x.png" width="72" height="72" alt="Riprap dam mark" />
 </p>
+# Riprap: citation-grounded NYC flood-exposure briefings
+Riprap takes any NYC address (or neighborhood, or development-permit
+query) and produces a four-section briefing (Status, Empirical evidence,
+Modeled scenarios, Policy context). Every numeric claim is anchored to
+a `[doc_id]` citation pointing back into the source document.
 The Capstone reconciler is **Granite 4.1** (8B, served via Ollama on T4
 or vLLM on AMD MI300X), wrapped in **Mellea**-validated rejection
 ---
+## How Riprap works: the Five Stones
+Behind every briefing, around 25 atomic data probes fan out across NYC
+datasets, satellite imagery, sensors, and forecasts. The **Five Stones**
+group those probes into five legible roles:
 > **Cornerstone** remembers. **Keystone** tallies. **Touchstone**
 > watches. **Lodestone** projects. **Capstone** writes it all down with
 | Stone | Role | What fires |
 |---|---|---|
+| **Cornerstone** | The Hazard Reader. What NYC's ground remembers. | Sandy 2012 inundation extent, NYC DEP stormwater scenarios, 2021 Ida USGS high-water marks, baked Prithvi-EO Ida-attributable polygons, USGS 3DEP DEM + HAND/TWI |
+| **Keystone** | The Asset Register. What's exposed. | MTA subway entrances, NYCHA developments, NYC DOE schools, NYS DOH hospitals, **TerraMind-NYC Buildings LoRA** |
+| **Touchstone** | The Live Observer. Current state of the city. | FloodNet ultrasonic depth sensors, NYC 311 flood complaints, NWS hourly METAR, NOAA tide-gauge water levels, **Prithvi-EO 2.0 NYC-Pluvial v2**, **TerraMind-NYC LULC LoRA** |
+| **Lodestone** | The Projector. What's coming. | NWS public flood alerts, Granite TTM r2 surge nowcast (zero-shot, 6-min cadence, 9.6 h horizon), per-address 311 weekly forecast, FloodNet sensor recurrence forecast, **Granite-TTM-r2-Battery-Surge fine-tune** (96 h hourly horizon) |
+| **Capstone** | The Synthesiser. Citation-grounded briefing. | Granite 4.1 + Mellea rejection sampling |
 The four data-Stones run sequentially per query; the Capstone reconciles
 their documents into one cited paragraph.
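
As an illustration of what "cited" can mean mechanically, the sketch below extracts `[doc_id]` tokens from a paragraph and reports any that do not resolve to a known document. It is a standalone example, not Mellea's API or the project's validator; the regex, the function name, and the set of known ids are assumptions made for the example.

```python
import re

# Assumption: doc ids are bracketed runs of letters, digits, and underscores,
# as in [sandy], [nyc311], or [mta_entrance_56].
CITATION_RE = re.compile(r"\[([A-Za-z0-9_]+)\]")


def unresolved_citations(paragraph: str, known_doc_ids: set[str]) -> list[str]:
    """Return the cited doc ids that are missing from known_doc_ids, sorted."""
    cited = CITATION_RE.findall(paragraph)
    return sorted({doc_id for doc_id in cited if doc_id not in known_doc_ids})


briefing = (
    "Hurricane Sandy flooded this address in 2012 [sandy]. "
    "19 flood-related 311 requests were logged within 200 m [nyc311]. "
    "Projected depth is 0.4 m [made_up_source]."
)
missing = unresolved_citations(briefing, {"sandy", "nyc311"})
```

A rejection-sampling loop could treat a non-empty result as a failed attempt and request another draft; how the real pipeline reacts is not shown in this excerpt.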
 ## NYC-specialised foundation models (Apache 2.0)
 Three NYC-specific fine-tunes built on AMD Instinct MI300X via AMD
+Developer Cloud, published under permissive licence.
+**[`msradam/TerraMind-NYC-Adapters`](https://huggingface.co/msradam/TerraMind-NYC-Adapters).**
+LoRA family on TerraMind 1.0 base. LULC mIoU 0.5866 (+6.13 pp over
+full-FT baseline), TiM 0.6023, Buildings 0.5511. Trained in around 18
+minutes on a single MI300X.
+**[`msradam/Prithvi-EO-2.0-NYC-Pluvial`](https://huggingface.co/msradam/Prithvi-EO-2.0-NYC-Pluvial).**
+NYC pluvial-flood fine-tune of Prithvi-EO 2.0. Test flood IoU 0.5979
+vs 0.10 on the Sen1Floods11 base, a 6× lift. Lovász-Softmax loss with
+copy-paste augmentation.
+**[`msradam/Granite-TTM-r2-Battery-Surge`](https://huggingface.co/msradam/Granite-TTM-r2-Battery-Surge).**
+NYC Battery storm-surge nowcast fine-tune of Granite TimeSeries TTM
+r2. Test MAE 0.1091 m, −41% vs persistence and −25% vs zero-shot.
+All three are loaded at runtime by their respective FSM probes in
 `app/context/` and `app/live/`.
 ---
 ## Architecture pointers
+- `app/stones/`: the Stones taxonomy (NAME / TAGLINE / SOURCES /
+  collect()) over the FSM probes.
+- `app/fsm.py`: Burr FSM with one probe per `@action`.
+- `app/reconcile.py`: `build_documents()` emits Granite-native
   document-role messages in canonical Stone order.
+- `app/mellea_validator.py`: strict reconcile path (4-check rejection
   sampling).
+- `app/llm.py`: LiteLLM Router shim. Routes to Ollama (T4 / local) or
   vLLM (AMD MI300X) without changing caller code.
+- `web/main.py`: FastAPI + SSE. The stream emits
   `plan / step / token / mellea_attempt / final` events plus the
   `stone_start / stone_done` envelope around each Stone group.
+- `web/sveltekit/`: primary UI (SvelteKit + adapter-static).
+- `experiments/18_terramind_nyc_lora/`,
+  `experiments/19_prithvi_nyc_v2/`,
+  `experiments/20_ttm_battery_surge/`: full reproduction recipes for
   the three HF artifacts above.
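
For readers who want to consume the stream described for `web/main.py`, here is a minimal sketch of parsing server-sent events into `(event, payload)` pairs. It assumes the server uses standard `event:` and `data:` fields with LF line endings and JSON payloads; the payload shapes in the sample are hypothetical, since this excerpt names the event types but not their contents.

```python
import json
from typing import Iterator


def parse_sse(raw: str) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, payload) for each blank-line-separated SSE block.

    Assumptions: LF line endings and JSON-encoded data fields.
    """
    for block in raw.strip().split("\n\n"):
        event = "message"  # SSE default when no event field is present
        data_lines = []
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if data_lines:
            yield event, json.loads("\n".join(data_lines))


# Hypothetical sample; real payload fields may differ.
sample = (
    'event: stone_start\ndata: {"stone": "cornerstone"}\n\n'
    'event: token\ndata: {"text": "Hurricane"}\n\n'
    'event: final\ndata: {"ok": true}\n\n'
)
events = list(parse_sse(sample))
```

A client would typically switch on the event name, for example appending `token` payloads to the visible briefing and using `stone_start` / `stone_done` to update progress indicators.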
 ---
 # Programmatic Mellea probe (server must be running)
 .venv/bin/python scripts/probe_mellea.py --query "Hollis" --runs 5
+# End-to-end address suite (5 NYC addresses, intent-aware checks)
+.venv/bin/python scripts/probe_addresses.py
 ```
 ---
 are also Apache 2.0; underlying upstream models retain their own
 permissive licences (see each `MODEL_CARD.md`).
+HF Space configuration reference:
 <https://huggingface.co/docs/hub/spaces-config-reference>.
 ---

RESEARCH.md CHANGED

@@ -1,4 +1,4 @@
-# Riprap — landscape research
 Captured 2026-05-06 as part of the AMD x lablab.ai hackathon polish
 phase. This document underpins the pitch deck (`slides/deck.md`) and
@@ -9,133 +9,133 @@ re-using any specific figure.
 ## What Riprap is, distinctly
-A **citation-grounded LLM that writes audit-quality flood-exposure
-briefings for NYC addresses** by fusing 17+ live, historical,
-modeled, and projected data sources, with **Mellea rejection
-sampling** that refuses to publish a numeric claim it can't cite.
-The output isn't a score — it's a four-section prose briefing with
-`[doc_id]` citations on every numeric assertion, where each
-`doc_id` resolves to one specific dataset (Sandy 2012 zone, NYC
-DEP scenario, USGS HWM, Sentinel-2 chip, NOAA gauge reading,
-NPCC4 SLR projection).
-Granite 4.1 8B drives the prose; Granite Embedding 278M + GLiNER
-drive policy-doc retrieval; Prithvi-EO 2.0 + TerraMind LULC/
-Buildings + Granite TTM r2 drive the EO and forecast specialists,
 with three Apache-2.0 NYC fine-tunes trained on AMD MI300X published
 on HF Hub.
-The architectural commitments other tools don't make:
-1. **Silence over confabulation** — when a specialist returns no
-   data, the briefing omits the section rather than papering over it.
-2. **Five-stone epistemic structure** so the user can see what's
    empirical vs modeled vs proxy vs synthetic.
-3. **Fully open-source pipeline** — Apache-2.0 end-to-end on
-   public-record data, no commercial APIs touched at runtime.
-4. **Deployable on either local Ollama or AMD MI300X via vLLM**
-   with auto-failover.
-Stack as of 2026-05-06: SvelteKit UI on HF Spaces (cpu-basic) at
-the AMD-hackathon org, FastAPI agent FSM, two-container droplet
-(vLLM + riprap-models) on MI300X, full address probe suite at 5/5
-PASS in 5.8–13.1 s end-to-end.
 ---
 ## Landscape map
-### Direct comps — score-based property risk tools
 | Tool | What it gives | Who it serves | Hidden cost |
 |---|---|---|---|
-| **First Street Risk Factor** (Flood Factor) | Score 1–10 + 30-yr risk narrative; powers Redfin, Realtor.com (until Dec 2025 also Zillow) | Homebuyers; some lenders | Closed model; commercial partnerships; Zillow removed it under industry pressure in Dec 2025 |
-| **ClimateCheck** | Score 1–100 + ~30-page property report; 2050 projections | Homeowners + REIT/PE diligence | Subscription tiers; methodology behind paywall |
 | **Jupiter ClimateScore Global** | Enterprise SaaS / API; financial metrics (CapEx, OpEx, credit risk) | Banks, insurers, asset managers | Enterprise pricing; not consumer-facing |
-| **Cervest / Climate X / ICEYE** | Variants of above for ESG / reinsurance | Corporate finance & insurance | Same |
-Score-based tools all converge on the same shape: **one number,
-one chart, an explainer paragraph**. None show what claim is
-grounded in which dataset. None expose the audit trail.
 ### NYC-specific government tools
-- **FloodHelpNY** (City + State, IDEO-designed) — address lookup
-  → flood-zone label + insurance + free resiliency audit. Forms-
-  based; consumer-facing; doesn't fuse live signals.
-- **NYC Flood Hazard Mapper** — ArcGIS web map of FEMA + NPCC +
-  Sandy + future scenarios. Static visualization; no narrative.
-- **NYC OEM Flood Maps page** — index of the above.
-- **EJNYC Flood Vulnerability Index** (released 2024-04 by
-  Mayor's Office of Climate & EJ) — first-ever city FVI, used to
-  direct spending under NY's "Disadvantaged Communities" framework
-  (35% of climate spend by law).
-- **FloodNet NYC** (NYU + CUNY + city) — 350+ ultrasonic sensors
-  at 1-min cadence, growing to 500 by end-2026. Has a public
   dashboard but no narrative layer.
 ### Federal / authoritative
-- **FEMA Flood Map Service Center / NFHL** — official; covers 90%+
-  population; static GIS layer + PDFs. The disclosure-of-record but
-  not a synthesis tool.
 ### Real-estate platforms (the volatile zone)
-- **Redfin** — still shows First Street Flood Factor on every
   listing.
-- **Realtor.com** — still shows it on 110M+ listings.
-- **Zillow** — **removed climate risk display in December 2025**
-  under California Regional MLS pressure. Still links out but it's
-  hidden. **This created a vacuum that an open citation-grounded
-  alternative could fill.**
 ### Closest academic / AI comps
-- **Flood-LLM** (Brisbane, MDPI Sustainability 2026) — multi-source
   LLM for property-level flood risk, validated on Brisbane against
   official labels. Academic, not deployed; no Mellea-style citation
   discipline; no live signals.
-- **GIS-Integrated Flood LLM** (Tandfonline 2024) — LLM constrained
-  by a flood knowledge graph + GIS interaction. Research artefact.
-- **FloodLense** (arXiv 2024) — UNet/RDN/ViT + LLM for satellite
   flood detection. Research; image-only.
 ---
-## Where Riprap fits — differentiators that demo well
 Ranked by visibility in a 3-minute demo:
 1. **Citation prose vs scores.** Riprap returns *"Hurricane Sandy
-   flooded this address on October 29–30, 2012, according to the
    empirical inundation zone [sandy]. 19 flood-related 311 service
-   requests were logged within 200 m over five years [nyc311]."* Every
-   number cites a doc; each doc resolves to a footer source row.
-   First Street returns "Flood Factor 8/10". This gap is the demo.
-2. **Live + historical + modeled + projected, in one paragraph.**
-   Sandy 2012 (empirical), DEP 2080 stormwater scenarios (modeled),
-   311 last 5 years (proxy), FloodNet last 3 years (empirical,
    hyperlocal), NPCC4 SLR (projected), Granite TTM r2 surge nowcast
    (96-h forecast). No comp combines all four temporal modes.
 3. **Open-source NYC fine-tunes.** Three Apache-2.0 models
    (`Prithvi-EO-2.0-NYC-Pluvial`, `TerraMind-NYC-Adapters`,
-   `Granite-TTM-r2-Battery-Surge`) trained on AMD MI300X — anyone
-   can reproduce, fork to other cities, or audit. First Street's
-   model is closed; ClimateCheck's methodology is behind a paywall.
 4. **AMD hardware story.** The whole stack runs on MI300X via vLLM
-   (LLM) + a sibling ROCm container (specialists). All Apache-2.0.
    This is the AMD hackathon track's preferred narrative: open
    models, open infra, open data, real GPU acceleration.
 5. **Mellea grounding receipts.** The four checks
-   (`numerics_grounded` / `no_placeholder_tokens` / `citations_dense`
-   / `citations_resolve`) are the audit. The meta card surfaces
-   "4/4 grounding checks passed, 1 reroll" — that's audit
-   credibility no consumer comp shows.
 6. **Self-aware silence.** Touchstone shows "FloodNet sensor: 0
-   events in 3 years" with `silent_by_design`. Lodestone shows
-   "TTM Battery surge forecast: peak |residual| < 0.3 m, omitted."
-   Most tools always render a value. Riprap's silence is a feature.
 ---
@@ -147,19 +147,19 @@ part of the system. These are the demo arcs to rehearse.
 ### 1. Resident / homebuyer (the FloodHelpNY swap-in)
 > *"I'm thinking about renting an apartment at 80 Pioneer Street,
->  Brooklyn — should I worry?"*
 **Demo arc.** Type the address. Watch the planner classify
-`single_address`, then 19 specialists fire across the four Stones
-in ~13 s. Briefing names Sandy 2012 inundation, 65 311 complaints,
-2 FloodNet sensors with 4 events including a 51 mm peak on a
-specific date, Ida 2021 HWM 130 m away, microtopo HAND 3.81 m + TWI
-14.79 (very high saturation propensity). Footer shows 7+ named
-primary sources.
 **Demo hook.** "Compare what we just generated to First Street's
-number-and-bar-chart for the same address. Which would you trust
-to make a $4,000/month decision?"
 ### 2. Real-estate attorney / disclosure compliance
@@ -168,32 +168,32 @@ to make a $4,000/month decision?"
 **Demo arc.** Same single_address path. Briefing produces a citable
 narrative covering FEMA designation, prior flood claims (where
-present), terrain, recent complaints. Mellea grounding check is
-the *qualifier* — "this prose is grounded against four invariants
-and passed 4/4."
 **Demo hook.** New York's March-2024 amended Property Condition
-Disclosure Statement requires sellers to disclose flood history
-and FEMA-floodplain status. RPL §231-b requires every residential
-lease to disclose prior flood damage. Riprap is the citable
-narrative tool — show how the briefing maps line-by-line to the
-disclosure requirements.
 ### 3. NYC OEM / DEP planner
 > *"Hollis, Queens"*
-**Demo arc.** Neighborhood intent fires (9 steps), produces NTA-
-level briefing — 434 flood-related 311 over 3 years (87 catch-
 basin clogged, 42 street-flooding), 4.3% of neighborhood projected
-to flood under DEP moderate-2050 scenario, 25% of cells with
-HAND<1 m. RAG retrieval pulls relevant DEP/NPCC4 policy paragraphs.
 **Demo hook.** DEP just announced a $30B stormwater priority list
 (86 locations) and a $68M Brooklyn Bluebelt expansion in Prospect
-Park. Riprap supports the prioritization argument with citable
-per-NTA evidence. Pair with the EJNYC Flood Vulnerability Index for
-the EJ-spending overlay (35%-to-disadvantaged-communities legal
 mandate).
 ### 4. Insurance underwriter / actuary
@@ -201,27 +201,27 @@ mandate).
 > *"442 East Houston Street, Manhattan"*
 **Demo arc.** Same as resident demo, but emphasize the **provenance
-trace** UI — every Specialist row, every doc_id, every source URL,
 vintage, and tier glyph.
 **Demo hook.** When an underwriter writes a risk memo, the audit
 chain matters. First Street's "we used a proprietary catastrophe
 model" doesn't survive a regulator review the way "we used FEMA
-Sandy 2012 polygon, NYC DEP 2021 stormwater scenario, USGS Ida
-HWM Event 312, NOAA gauge 8518750, NWS station KNYC, Granite TTM
-r2 fine-tune (test MAE 0.1091 m vs 0.1467 zero-shot, citable)" does.
 ### 5. Climate journalist / advocacy
 > *"Coney Island, Brooklyn"*
-**Demo arc.** Neighborhood briefing — 87.5% of NTA in 2012 Sandy
 zone, 382 flood complaints over 3 years, 7.8% projected flooded
-under 2050 moderate, 38.9% of DEM cells with HAND<1 m, DEP
-extreme-2080 → 44.2% flooded.
 **Demo hook.** ProPublica/NYTimes/THE CITY-style data journalism.
-Every claim in a Riprap briefing is reproducible — anyone can paste
 the same query and get a near-identical narrative. The journalist
 can publish the briefing as the methods section.
@@ -229,77 +229,79 @@ can publish the briefing as the methods section.
 > *"What are they building in Gowanus and is it risky"*
-**Demo arc.** Planner classifies `development_check`. FSM pulls
-DOB filings + flood layers for the project sites. Briefing comments
 on which proposed buildings sit inside Sandy 2012, which intersect
 DEP extreme-2080, what the microtopo says.
-**Demo hook.** Pre-design siting check. The Gowanus rezoning is
-one of NYC's largest active development zones, well known to flood.
-Show how the tool surfaces flood concerns before architects pour
 concrete.
 ---
-## Lateral & unexpected use cases
-10 bets, ordered roughly from most-buildable to most-speculative.
 1. **Pre-storm cohort briefings.** Subscribe Riprap to NWS flood-
    watch alerts. When a watch lands, fan out one briefing per
-   affected NTA + push to OEM / press / advocacy lists. Citable
    evidence on demand for the press cycle that follows.
 2. **Climate-grant evidence sections.** HUD CDBG-DR and FEMA BRIC
    applications need an auditable evidence base. Riprap auto-
    generates the "vulnerability assessment" section so a community
-   group can apply for resilience funding without hiring a consultant.
 3. **Local Law disclosure boilerplate.** Plug into a brokerage's
-   listing flow: when an agent enters an address, auto-generate
-   the NY RPL §231-b lease addendum or §462(2) disclosure draft.
-   ROI is high since the law took effect 2024 and many landlords
-   are still figuring out compliance.
 4. **MTA station-hardening prioritization.** Riprap already has the
-   MTA-entrance specialist (KEY-001 in the demo). Run the FSM
-   across all subway entrances; rank by exposure × ridership. The
-   MTA's October-2025 Climate Resilience Roadmap Update is the
-   policy hook.
-5. **DOE school siting.** When DOE reviews proposed school sites
-   or selects schools for retrofit, Riprap briefings (with `expect_311_ge`
-   + Sandy + DEP overlays) would catch flood exposure that form-
-   style screens miss.
-6. **Time-machine variant.** Re-run the FSM with snapshot data
-   from a past date — *"what would Riprap have said about Hollis
-   on August 31, 2021, the day before Ida?"* Useful for retrospective
-   analysis, expert testimony, and stress-testing the system.
-7. **Cross-city scaffold.** The architecture is NYC-specific by
-   data choice, not by code. Port to Houston (post-Harvey + Hurricane
    Beryl 2024), Miami (king tides), Boston (CSO floods), Charleston
-   (chronic tidal), with a per-city specialist set + RAG corpus.
 8. **Federation with FloodNet alerts.** When a sensor triggers a
    flood event NOW, fire a Riprap live_now briefing for the
    surrounding NTA: *"what's at stake in the next 6 hours."*
    Connects FloodNet's hyperlocal sensor reads to the OEM decision
    loop.
 9. **EJNYC × Riprap pairing.** Rank all 188 NTAs by Riprap-detected
-   exposure, intersect with state DAC designations. Output: a map
-   of "underserved + underwater" — the most underfunded high-
-   exposure neighborhoods.
 10. **Court testimony / expert witness.** Citable, reproducible
-    flood narrative as a court exhibit. The Mellea passes-record +
-    provenance trace are the kind of artefact a regulator or judge
-    can audit. Especially relevant after the December-2025 Zillow
-    controversy created public discussion of climate-data integrity.
 ---
-## Risks & framing
 - **Real-estate industry pushback.** December 2025: Zillow removed
   First Street's climate scores under MLS pressure because the data
   was hurting transaction volume. A free, citation-grounded
   alternative could face the same reflex. Riprap's defence is that
-  it's a *narrative tool for professional analytical work*, not a
-  buy/don't-buy verdict — keep the disclaimer footer prominent.
 - **Redlining hazard.** Exposure narratives can be misused by
   landlords or insurers to discriminate against high-flood-risk
   (often disproportionately disadvantaged) neighborhoods.
@@ -313,11 +315,11 @@ concrete.
   law. Don't position it as legal disclosure-of-record without a
   real-estate-attorney review.
 - **Cold-start latency.** First query after droplet redeploy is
-  ~30 s while models warm. For demos, ping the Space + run one
-  warm-up query 5 minutes before showtime.
-- **Geocoder edge cases.** "PS 188, Lower East Side" geocoded to
-  a Brooklyn PS 188 in our test suite. For demos, pick fully-
-  qualified street addresses; document the disambiguation behavior.
 ---
@@ -325,42 +327,42 @@ concrete.
 Concrete polish items the research surfaces, ranked by demo value:
-1. **Sample-query pills on landing.** Six clickable pills below
-   the search bar — one per persona above. Let the audience demo
    themselves.
-2. **A "What this is" bar at the top of the landing.** Three-line:
    *"Citation-grounded NYC flood briefings. Every number cites a
    primary source. Open-source, public data, audit-grade synthesis."*
 3. **Compare-mode link from the briefing.** Once Riprap delivers a
    single_address briefing, surface "compare with another address"
    as a one-click affordance. The compare intent already exists in
    the planner.
-4. **EJNYC-FVI overlay** on the map sidebar (#9 above) — Riprap's
    exposure × DAC designation, two clicks to a powerful editorial
    demo.
 5. **First-query warm-up message** during the cold start: *"loading
-   specialists on AMD MI300X — first query after redeploy takes
-   ~30 s; subsequent queries 5–13 s."*
 ---
 ## Sources
-- [First Street Foundation — Flood Factor methodology](https://firststreet.org/methodology/flood)
-- [FloodHelpNY — NYC + IDEO consumer tool](https://www.floodhelpny.org/en)
-- [ClimateCheck — flood risk methodology](https://climatecheck.com/risks/flood)
-- [Jupiter Intelligence — ClimateScore Global / FloodScore](https://www.jupiterintel.com/climatescore-global)
 - [FEMA Flood Map Service Center](https://msc.fema.gov/)
-- [NY State — RPL §231-b residential lease flood disclosure (2023)](https://www.nysenate.gov/legislation/bills/2021/S5472)
-- [NYSBA — Property Condition Disclosure flood-risk amendment (Mar 2024)](https://nysba.org/breaking-news-new-rules-on-property-condition-disclosure-and-flood-risk-go-into-effect-today/)
-- [CNN — Zillow removes climate risk data under industry pressure (Dec 2025)](https://www.cnn.com/2025/12/02/climate/zillow-climate-data-extreme-weather-first-street-redfin)
 - [NYC Stormwater Resiliency Plan](https://www.nyc.gov/assets/orr/pdf/publications/stormwater-resiliency-plan.pdf)
-- [FloodNet NYC — methodology + sensor network](https://www.floodnet.nyc/methodology)
-- [FloodNet WRR 2024 — peer-reviewed sensor paper](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2023WR036806)
-- [EJNYC Report — Mayor's Office of Climate & Environmental Justice](https://climate.cityofnewyork.us/ejnyc-report/the-state-of-environmental-justice-in-nyc/)
-- [Flood-LLM — Brisbane case study (MDPI 2026)](https://www.mdpi.com/2071-1050/18/6/2957)
 - [GIS-Integrated Flood LLM (Tandfonline 2024)](https://www.tandfonline.com/doi/full/10.1080/13658816.2024.2306167)
-- [THE CITY — Disadvantaged Communities flood funding (NY Climate Law)](https://www.thecity.nyc/2022/05/02/billions-ny-climate-law-disadvantaged-communities-flood/)
-- [Inman — Redfin First Street integration](https://www.inman.com/2021/02/18/redfin-starts-displaying-flood-risk-data-on-listings/)
-- [FACTUM — citation-hallucination detection in long-form RAG](https://arxiv.org/pdf/2601.05866)
-- [AMD x lablab.ai Developer Hackathon (May 4–10, 2026)](https://lablab.ai/ai-hackathons/amd-developer)

+# Riprap: landscape research
 Captured 2026-05-06 as part of the AMD x lablab.ai hackathon polish
 phase. This document underpins the pitch deck (`slides/deck.md`) and
 ## What Riprap is, distinctly
+A citation-grounded LLM that writes audit-quality flood-exposure
+briefings for NYC addresses by fusing live, historical, modeled, and
+projected data sources. Mellea rejection sampling refuses to publish
+a numeric claim it can't cite. The output isn't a score. It's a
+four-section prose briefing with `[doc_id]` citations on every
+numeric assertion, where each `doc_id` resolves to one specific
+dataset (Sandy 2012 zone, NYC DEP scenario, USGS HWM, Sentinel-2
+chip, NOAA gauge reading, NPCC4 SLR projection).
+Granite 4.1 8B drives the prose. Granite Embedding 278M plus GLiNER
+drive policy-doc retrieval. Prithvi-EO 2.0, TerraMind LULC and
+Buildings, and Granite TTM r2 drive the EO and forecast probes,
 with three Apache-2.0 NYC fine-tunes trained on AMD MI300X published
 on HF Hub.
+Architectural commitments other tools don't make:
+1. **Silence over confabulation.** When a probe returns no data, the
+   briefing omits the section rather than papering over it.
+2. **Five-stone epistemic structure.** The user can see what's
    empirical vs modeled vs proxy vs synthetic.
+3. **Fully open-source pipeline.** Apache-2.0 end-to-end on public-
+   record data, no commercial APIs touched at runtime.
+4. **Deployable on either local Ollama or AMD MI300X via vLLM** with
+   auto-failover.
+Stack as of 2026-05-06: SvelteKit UI on HF Spaces (cpu-basic) at the
+AMD-hackathon org, FastAPI agent FSM, two-container droplet (vLLM
+plus riprap-models) on MI300X, full address probe suite at 5/5 PASS
+in 5.8 to 13.1 s end-to-end.
 ---
 ## Landscape map
+### Direct comps: score-based property risk tools
 | Tool | What it gives | Who it serves | Hidden cost |
 |---|---|---|---|
+| **First Street Risk Factor** (Flood Factor) | Score 1 to 10 plus 30-yr risk narrative; powers Redfin, Realtor.com (until Dec 2025 also Zillow) | Homebuyers; some lenders | Closed model; commercial partnerships; Zillow removed it under industry pressure in Dec 2025 |
+| **ClimateCheck** | Score 1 to 100 plus around 30-page property report; 2050 projections | Homeowners plus REIT/PE diligence | Subscription tiers; methodology behind paywall |
 | **Jupiter ClimateScore Global** | Enterprise SaaS / API; financial metrics (CapEx, OpEx, credit risk) | Banks, insurers, asset managers | Enterprise pricing; not consumer-facing |
+| **Cervest / Climate X / ICEYE** | Variants of above for ESG / reinsurance | Corporate finance and insurance | Same |
+Score-based tools all converge on the same shape: one number, one
+chart, an explainer paragraph. None show which claim is grounded in
+which dataset. None expose the audit trail.
 ### NYC-specific government tools
+- **FloodHelpNY** (City plus State, IDEO-designed). Address lookup
+  to flood-zone label plus insurance plus free resiliency audit.
+  Forms-based, consumer-facing, doesn't fuse live signals.
+- **NYC Flood Hazard Mapper.** ArcGIS web map of FEMA, NPCC, Sandy,
+  and future scenarios. Static visualization, no narrative.
+- **NYC OEM Flood Maps page.** Index of the above.
+- **EJNYC Flood Vulnerability Index** (released 2024-04 by Mayor's
+  Office of Climate and EJ). First-ever city FVI, used to direct
+  spending under NY's "Disadvantaged Communities" framework (35% of
+  climate spend by law).
+- **FloodNet NYC** (NYU plus CUNY plus city). Over 350 ultrasonic
+  sensors at 1-min cadence, growing to 500 by end-2026. Has a public
   dashboard but no narrative layer.
 ### Federal / authoritative
+- **FEMA Flood Map Service Center / NFHL.** Official; covers 90%+
+  of population; static GIS layer plus PDFs. The disclosure-of-
+  record but not a synthesis tool.
 ### Real-estate platforms (the volatile zone)
+- **Redfin.** Still shows First Street Flood Factor on every
   listing.
+- **Realtor.com.** Still shows it on 110M+ listings.
+- **Zillow.** Removed climate risk display in December 2025 under
+  California Regional MLS pressure. Still links out, but it's
+  hidden. This created a vacuum that an open citation-grounded
+  alternative could fill.
 ### Closest academic / AI comps
+- **Flood-LLM** (Brisbane, MDPI Sustainability 2026). Multi-source
   LLM for property-level flood risk, validated on Brisbane against
   official labels. Academic, not deployed; no Mellea-style citation
   discipline; no live signals.
+- **GIS-Integrated Flood LLM** (Tandfonline 2024). LLM constrained
+  by a flood knowledge graph plus GIS interaction. Research artefact.
+- **FloodLense** (arXiv 2024). UNet/RDN/ViT plus LLM for satellite
   flood detection. Research; image-only.
 ---
+## Where Riprap fits: differentiators that demo well
 Ranked by visibility in a 3-minute demo:
 1. **Citation prose vs scores.** Riprap returns *"Hurricane Sandy
+   flooded this address on October 29 to 30, 2012, according to the
    empirical inundation zone [sandy]. 19 flood-related 311 service
+   requests were logged within 200 m over five years [nyc311]."*
+   Every number cites a doc; each doc resolves to a footer source
+   row. First Street returns "Flood Factor 8/10". This gap is the
+   demo.
+2. **Live, historical, modeled, and projected in one paragraph.** Sandy
+   2012 (empirical), DEP 2080 stormwater scenarios (modeled), 311
+   last 5 years (proxy), FloodNet last 3 years (empirical,
    hyperlocal), NPCC4 SLR (projected), Granite TTM r2 surge nowcast
    (96-h forecast). No comp combines all four temporal modes.
 3. **Open-source NYC fine-tunes.** Three Apache-2.0 models
    (`Prithvi-EO-2.0-NYC-Pluvial`, `TerraMind-NYC-Adapters`,
+   `Granite-TTM-r2-Battery-Surge`) trained on AMD MI300X. Anyone can
+   reproduce, fork to other cities, or audit. First Street's model
+   is closed; ClimateCheck's methodology is behind a paywall.
 4. **AMD hardware story.** The whole stack runs on MI300X via vLLM
+   (LLM) plus a sibling ROCm container (probes). All Apache-2.0.
    This is the AMD hackathon track's preferred narrative: open
    models, open infra, open data, real GPU acceleration.
 5. **Mellea grounding receipts.** The four checks
+   (`numerics_grounded`, `no_placeholder_tokens`, `citations_dense`,
+   `citations_resolve`) are the audit. The meta card surfaces "4/4
+   grounding checks passed, 1 reroll". That's audit credibility no
+   consumer comp shows.
 6. **Self-aware silence.** Touchstone shows "FloodNet sensor: 0
+   events in 3 years" with `silent_by_design`. Lodestone shows "TTM
+   Battery surge forecast: peak |residual| < 0.3 m, omitted." Most
+   tools always render a value. Riprap's silence is a feature.
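A minimal sketch of how two of the four checks named in item 5 (`citations_resolve` and `numerics_grounded`) could be expressed, assuming briefings carry inline `[doc_id]` tags as in the quoted example. The function bodies, regular expressions, and naive sentence splitter are illustrative assumptions. They are not Mellea's API and not necessarily how Riprap implements the checks.

```python
import re

# Matches an inline citation tag such as [sandy] or [nyc311].
CITATION = re.compile(r"\[([A-Za-z0-9_\-]+)\]")


def citations_resolve(briefing: str, known_doc_ids: set) -> bool:
    """True when every [doc_id] tag in the prose names a known source."""
    return all(doc_id in known_doc_ids for doc_id in CITATION.findall(briefing))


def numerics_grounded(briefing: str) -> bool:
    """True when every sentence that contains a digit also carries a tag."""
    # Naive split: a sentence ends at . ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", briefing.strip())
    for sentence in sentences:
        # Ignore digits that only appear inside a tag, e.g. [nyc311].
        bare = CITATION.sub("", sentence)
        if re.search(r"\d", bare) and not CITATION.search(sentence):
            return False
    return True
```

A rejection-sampling loop would regenerate the briefing whenever either function returns `False`; the reroll count shown on the meta card would be the number of such retries.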
 ---
 ### 1. Resident / homebuyer (the FloodHelpNY swap-in)
 > *"I'm thinking about renting an apartment at 80 Pioneer Street,
+>  Brooklyn. Should I worry?"*
 **Demo arc.** Type the address. Watch the planner classify
+`single_address`, then 19 step events fire across the four data
+Stones in around 13 s. Briefing names Sandy 2012 inundation, 65 311
+complaints, 2 FloodNet sensors with 4 events including a 51 mm peak
+on a specific date, Ida 2021 HWM 130 m away, microtopo HAND 3.81 m
+plus TWI 14.79 (very high saturation propensity). Footer shows 7+
+named primary sources.
 **Demo hook.** "Compare what we just generated to First Street's
+number-and-bar-chart for the same address. Which would you trust to
+make a $4,000/month decision?"
 ### 2. Real-estate attorney / disclosure compliance
 **Demo arc.** Same single_address path. Briefing produces a citable
 narrative covering FEMA designation, prior flood claims (where
+present), terrain, recent complaints. Mellea grounding check is the
+qualifier: "this prose is grounded against four invariants and
+passed 4/4."
 **Demo hook.** New York's March-2024 amended Property Condition
+Disclosure Statement requires sellers to disclose flood history and
+FEMA-floodplain status. RPL §231-b requires every residential lease
+to disclose prior flood damage. Riprap is the citable narrative
+tool. Show how the briefing maps line-by-line to the disclosure
+requirements.
 ### 3. NYC OEM / DEP planner
 > *"Hollis, Queens"*
+**Demo arc.** Neighborhood intent fires (9 step events) and produces an
+NTA-level briefing: 434 flood-related 311 complaints over 3 years (87 catch-
 basin clogged, 42 street-flooding), 4.3% of neighborhood projected
+to flood under DEP moderate-2050 scenario, 25% of cells with HAND<1
+m. RAG retrieval pulls relevant DEP/NPCC4 policy paragraphs.
 **Demo hook.** DEP just announced a $30B stormwater priority list
 (86 locations) and a $68M Brooklyn Bluebelt expansion in Prospect
+Park. Riprap supports the prioritization argument with citable per-
+NTA evidence. Pair with the EJNYC Flood Vulnerability Index for the
+EJ-spending overlay (35%-to-disadvantaged-communities legal
 mandate).
 ### 4. Insurance underwriter / actuary
 > *"442 East Houston Street, Manhattan"*
 **Demo arc.** Same as resident demo, but emphasize the **provenance
+trace** UI: every Stone row, every doc_id, every source URL,
 vintage, and tier glyph.
 **Demo hook.** When an underwriter writes a risk memo, the audit
 chain matters. First Street's "we used a proprietary catastrophe
 model" doesn't survive a regulator review the way "we used FEMA
+Sandy 2012 polygon, NYC DEP 2021 stormwater scenario, USGS Ida HWM
+Event 312, NOAA gauge 8518750, NWS station KNYC, Granite TTM r2
+fine-tune (test MAE 0.1091 m vs 0.1467 zero-shot, citable)" does.
 ### 5. Climate journalist / advocacy
 > *"Coney Island, Brooklyn"*
+**Demo arc.** Neighborhood briefing: 87.5% of NTA in 2012 Sandy
 zone, 382 flood complaints over 3 years, 7.8% projected flooded
+under 2050 moderate, 38.9% of DEM cells with HAND<1 m, DEP extreme-
+2080 at 44.2% flooded.
 **Demo hook.** ProPublica/NYTimes/THE CITY-style data journalism.
+Every claim in a Riprap briefing is reproducible. Anyone can paste
 the same query and get a near-identical narrative. The journalist
 can publish the briefing as the methods section.
 > *"What are they building in Gowanus and is it risky"*
+**Demo arc.** Planner classifies `development_check`. FSM pulls DOB
+filings plus flood layers for the project sites. Briefing comments
 on which proposed buildings sit inside Sandy 2012, which intersect
 DEP extreme-2080, what the microtopo says.
+**Demo hook.** Pre-design siting check. The Gowanus rezoning is one
+of NYC's largest active development zones, well known to flood. Show
+how the tool surfaces flood concerns before architects pour
 concrete.
 ---
+## Lateral and unexpected use cases
+Ten bets, ordered roughly from most-buildable to most-speculative.
 1. **Pre-storm cohort briefings.** Subscribe Riprap to NWS flood-
    watch alerts. When a watch lands, fan out one briefing per
+   affected NTA and push it to OEM, press, and advocacy lists. Citable
    evidence on demand for the press cycle that follows.
 2. **Climate-grant evidence sections.** HUD CDBG-DR and FEMA BRIC
    applications need an auditable evidence base. Riprap auto-
    generates the "vulnerability assessment" section so a community
+   group can apply for resilience funding without hiring a
+   consultant.
 3. **Local Law disclosure boilerplate.** Plug into a brokerage's
+   listing flow. When an agent enters an address, auto-generate the
+   NY RPL §231-b lease addendum or §462(2) disclosure draft. ROI is
+   high since the law took effect in 2024 and many landlords are still
+   figuring out compliance.
 4. **MTA station-hardening prioritization.** Riprap already has the
+   MTA-entrance probe (KEY-001 in the demo). Run the FSM across all
+   subway entrances; rank by exposure × ridership. The MTA's
+   October-2025 Climate Resilience Roadmap Update is the policy
+   hook.
+5. **DOE school siting.** When DOE reviews proposed school sites or
+   selects schools for retrofit, Riprap briefings (with `expect_311_ge`
+   plus Sandy plus DEP overlays) would catch flood exposure that
+   form-style screens miss.
+6. **Time-machine variant.** Re-run the FSM with snapshot data from
+   a past date. *"What would Riprap have said about Hollis on August
+   31, 2021, the day before Ida?"* Useful for retrospective analysis,
+   expert testimony, and stress-testing the system.
+7. **Cross-city scaffold.** The architecture is NYC-specific by data
+   choice, not by code. Port to Houston (post-Harvey plus Hurricane
    Beryl 2024), Miami (king tides), Boston (CSO floods), Charleston
+   (chronic tidal), with a per-city probe set plus RAG corpus.
 8. **Federation with FloodNet alerts.** When a sensor triggers a
    flood event NOW, fire a Riprap live_now briefing for the
    surrounding NTA: *"what's at stake in the next 6 hours."*
    Connects FloodNet's hyperlocal sensor reads to the OEM decision
    loop.
 9. **EJNYC × Riprap pairing.** Rank all 188 NTAs by Riprap-detected
+   exposure, intersect with state DAC designations. The output is a
+   map of "underserved plus underwater": the most underfunded
+   high-exposure neighborhoods.
 10. **Court testimony / expert witness.** Citable, reproducible
+    flood narrative as a court exhibit. The Mellea passes-record
+    plus provenance trace are the kind of artefact a regulator or
+    judge can audit. Especially relevant after the December-2025
+    Zillow controversy created public discussion of climate-data
+    integrity.
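The ranking rule in bet 4 (exposure × ridership) can be sketched as follows. The `Entrance` record, the 0 to 1 exposure scale, and the ridership field are hypothetical; the source does not say how either quantity is measured or stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entrance:
    entrance_id: str    # hypothetical identifier for a subway entrance
    exposure: float     # hypothetical flood-exposure score in [0, 1]
    daily_riders: int   # hypothetical average daily ridership


def priority(entrance: Entrance) -> float:
    """Exposure weighted by ridership, as the bet describes."""
    return entrance.exposure * entrance.daily_riders


def rank_entrances(entrances):
    """Return entrances sorted from highest to lowest priority."""
    return sorted(entrances, key=priority, reverse=True)
```

The same shape would serve bet 9: swap ridership for a DAC flag and filter before sorting.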
 ---
+## Risks and framing
 - **Real-estate industry pushback.** December 2025: Zillow removed
   First Street's climate scores under MLS pressure because the data
   was hurting transaction volume. A free, citation-grounded
   alternative could face the same reflex. Riprap's defence is that
+  it's a narrative tool for professional analytical work, not a
+  buy/don't-buy verdict. Keep the disclaimer footer prominent.
 - **Redlining hazard.** Exposure narratives can be misused by
   landlords or insurers to discriminate against high-flood-risk
   (often disproportionately disadvantaged) neighborhoods.
   law. Don't position it as legal disclosure-of-record without a
   real-estate-attorney review.
 - **Cold-start latency.** First query after droplet redeploy is
+  around 30 s while models warm. For demos, ping the Space and run
+  one warm-up query 5 minutes before showtime.
+- **Geocoder edge cases.** "PS 188, Lower East Side" geocoded to a
+  Brooklyn PS 188 in our test suite. For demos, pick fully-qualified
+  street addresses; document the disambiguation behavior.
 ---
 Concrete polish items the research surfaces, ranked by demo value:
+1. **Sample-query pills on landing.** Six clickable pills below the
+   search bar, one per persona above. Let the audience demo
    themselves.
+2. **A "What this is" bar at the top of the landing.** Three lines:
    *"Citation-grounded NYC flood briefings. Every number cites a
    primary source. Open-source, public data, audit-grade synthesis."*
 3. **Compare-mode link from the briefing.** Once Riprap delivers a
    single_address briefing, surface "compare with another address"
    as a one-click affordance. The compare intent already exists in
    the planner.
+4. **EJNYC-FVI overlay** on the map sidebar (#9 above): Riprap's
    exposure × DAC designation, two clicks to a powerful editorial
    demo.
 5. **First-query warm-up message** during the cold start: *"loading
+   probes on AMD MI300X. First query after redeploy takes around 30
+   s; subsequent queries 5 to 13 s."*
 ---
 ## Sources
+- [First Street Foundation: Flood Factor methodology](https://firststreet.org/methodology/flood)
+- [FloodHelpNY: NYC and IDEO consumer tool](https://www.floodhelpny.org/en)
+- [ClimateCheck: flood risk methodology](https://climatecheck.com/risks/flood)
+- [Jupiter Intelligence: ClimateScore Global / FloodScore](https://www.jupiterintel.com/climatescore-global)
 - [FEMA Flood Map Service Center](https://msc.fema.gov/)
+- [NY State: RPL §231-b residential lease flood disclosure (2023)](https://www.nysenate.gov/legislation/bills/2021/S5472)
+- [NYSBA: Property Condition Disclosure flood-risk amendment (Mar 2024)](https://nysba.org/breaking-news-new-rules-on-property-condition-disclosure-and-flood-risk-go-into-effect-today/)
+- [CNN: Zillow removes climate risk data under industry pressure (Dec 2025)](https://www.cnn.com/2025/12/02/climate/zillow-climate-data-extreme-weather-first-street-redfin)
 - [NYC Stormwater Resiliency Plan](https://www.nyc.gov/assets/orr/pdf/publications/stormwater-resiliency-plan.pdf)
+- [FloodNet NYC: methodology and sensor network](https://www.floodnet.nyc/methodology)
+- [FloodNet WRR 2024: peer-reviewed sensor paper](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2023WR036806)
+- [EJNYC Report: Mayor's Office of Climate and Environmental Justice](https://climate.cityofnewyork.us/ejnyc-report/the-state-of-environmental-justice-in-nyc/)
+- [Flood-LLM: Brisbane case study (MDPI 2026)](https://www.mdpi.com/2071-1050/18/6/2957)
 - [GIS-Integrated Flood LLM (Tandfonline 2024)](https://www.tandfonline.com/doi/full/10.1080/13658816.2024.2306167)
+- [THE CITY: Disadvantaged Communities flood funding (NY Climate Law)](https://www.thecity.nyc/2022/05/02/billions-ny-climate-law-disadvantaged-communities-flood/)
+- [Inman: Redfin First Street integration](https://www.inman.com/2021/02/18/redfin-starts-displaying-flood-risk-data-on-listings/)
+- [FACTUM: citation-hallucination detection in long-form RAG](https://arxiv.org/pdf/2601.05866)
+- [AMD x lablab.ai Developer Hackathon (May 4 to 10, 2026)](https://lablab.ai/ai-hackathons/amd-developer)

services/riprap-models/README.md CHANGED Viewed

@@ -1,11 +1,11 @@
-# Riprap Models — droplet inference service
 GPU inference microservice that runs alongside vLLM on the AMD MI300X
 droplet. Exposes one HTTP endpoint per model class consumed by the
-Riprap FastAPI app's specialists, so all GPU-accelerable forward
-passes (Prithvi-NYC-Pluvial, TerraMind LULC + Buildings, Granite TTM
-r2, Granite Embedding 278M, GLiNER) run on the MI300X regardless of
-which surface — laptop or HF Space — hosts the FastAPI process.
 ## Service contract
@@ -22,7 +22,7 @@ Auth: bearer token on every `/v1/*` route via `RIPRAP_MODELS_API_KEY`.
 Same shape as vLLM. `/healthz` is open so liveness probes don't need
 auth.
-## Deploy — fresh droplet (recommended)
 Use the one-shot bring-up script. Works on any AMD ROCm GPU droplet
 with Docker + GPU device files (`/dev/kfd`, `/dev/dri`) and SSH root
@@ -46,7 +46,7 @@ What it does, in order:
 6. Waits up to 90 s for vLLM `/v1/models` and 60 s for
    riprap-models `/healthz`, exits non-zero if either misses
-Re-running on the same droplet is idempotent — existing containers
 get `docker rm -f`'d and recreated.
 Env knobs:
@@ -65,7 +65,7 @@ After it returns, set the printed env vars in your local shell or HF
 Space variables, run `scripts/probe_addresses.py` to verify, and
 you're live.
-## Deploy — extend an existing container (legacy)
 If you already have a `terramind` container with the heavy ML deps
 baked in (the bootstrap-droplet path), you can skip the Dockerfile
@@ -92,10 +92,11 @@ mapping was set when the container was created.
 What survives a droplet destruction:
-- `services/riprap-models/Dockerfile` + `requirements-full.txt` —
-  every pinned dep, captured from the bootstrap droplet on 2026-05-05
-- `scripts/deploy_droplet.sh` — the bring-up script
-- HF Hub model artefacts — every fine-tune lives at
   `msradam/Prithvi-EO-2.0-NYC-Pluvial`,
   `msradam/TerraMind-NYC-Adapters`,
   `msradam/Granite-TTM-r2-Battery-Surge`. The Dockerfile pulls them
@@ -104,12 +105,13 @@ What survives a droplet destruction:
 What does NOT survive:
 - The HF cache at `${HF_CACHE_HOST}` (default `/root/hf-cache`) on
-  the droplet — every redeploy re-downloads ~12 GB of weights
-  (Granite 4.1 8b for vLLM ~16 GB, Prithvi v2 ~1.3 GB, TerraMind
-  adapters ~600 MB, Granite Embedding ~600 MB, GLiNER ~400 MB,
-  Granite TTM r2 ~6 MB). First query after redeploy takes ~30 s
-  longer than steady-state because of the lazy model load
-- The bearer token — generate a fresh one when re-deploying
 To redeploy:

+# Riprap Models: droplet inference service
 GPU inference microservice that runs alongside vLLM on the AMD MI300X
 droplet. Exposes one HTTP endpoint per model class consumed by the
+Riprap FastAPI app's probes, so all GPU-accelerable forward passes
+(Prithvi-NYC-Pluvial, TerraMind LULC + Buildings, Granite TTM r2,
+Granite Embedding 278M, GLiNER) run on the MI300X regardless of
+which surface (laptop or HF Space) hosts the FastAPI process.
 ## Service contract
 Same shape as vLLM. `/healthz` is open so liveness probes don't need
 auth.
+## Deploy: fresh droplet (recommended)
 Use the one-shot bring-up script. Works on any AMD ROCm GPU droplet
 with Docker + GPU device files (`/dev/kfd`, `/dev/dri`) and SSH root
 6. Waits up to 90 s for vLLM `/v1/models` and 60 s for
    riprap-models `/healthz`, exits non-zero if either misses
+Re-running on the same droplet is idempotent. Existing containers
 get `docker rm -f`'d and recreated.
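The readiness wait in step 6 (up to 90 s for vLLM `/v1/models`, up to 60 s for riprap-models `/healthz`) follows a poll-until-deadline pattern. The sketch below shows that pattern in Python for illustration only; the bring-up script itself is a shell script, and `wait_until_ready`, `http_ok`, and the polling interval are hypothetical names and values, not taken from the repository.

```python
import http.client
import time
import urllib.request
from typing import Callable, Optional


def http_ok(url: str, token: Optional[str] = None) -> bool:
    """Return True when a GET to url answers with HTTP 200."""
    request = urllib.request.Request(url)
    if token:
        # Assumes the usual "Authorization: Bearer ..." header shape.
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and non-2xx responses.
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll probe until it succeeds or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller would wrap each endpoint in a lambda, for example `wait_until_ready(lambda: http_ok(base + "/healthz"), timeout_s=60)`, and exit non-zero if the result is `False`. The `clock` and `sleep` parameters exist only so the loop can be tested without real delays.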
 Env knobs:
 Space variables, run `scripts/probe_addresses.py` to verify, and
 you're live.
+## Deploy: extend an existing container (legacy)
 If you already have a `terramind` container with the heavy ML deps
 baked in (the bootstrap-droplet path), you can skip the Dockerfile
 What survives a droplet destruction:
+- `services/riprap-models/Dockerfile` plus `requirements-full.txt`.
+  Every pinned dep, captured from the bootstrap droplet on
+  2026-05-05.
+- `scripts/deploy_droplet.sh`. The bring-up script.
+- HF Hub model artefacts. Every fine-tune lives at
   `msradam/Prithvi-EO-2.0-NYC-Pluvial`,
   `msradam/TerraMind-NYC-Adapters`,
   `msradam/Granite-TTM-r2-Battery-Surge`. The Dockerfile pulls them
 What does NOT survive:
 - The HF cache at `${HF_CACHE_HOST}` (default `/root/hf-cache`) on
+  the droplet. Every redeploy re-downloads around 12 GB of weights
+  (Granite 4.1 8b for vLLM around 16 GB, Prithvi v2 around 1.3 GB,
+  TerraMind adapters around 600 MB, Granite Embedding around 600 MB,
+  GLiNER around 400 MB, Granite TTM r2 around 6 MB). First query
+  after redeploy takes around 30 s longer than steady-state because
+  of the lazy model load.
+- The bearer token. Generate a fresh one when re-deploying.
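One way to mint a fresh bearer token at redeploy time, assuming any high-entropy opaque string is acceptable as the value of `RIPRAP_MODELS_API_KEY` (the source does not specify a token format). The helper names are hypothetical.

```python
import secrets


def new_bearer_token(n_bytes: int = 32) -> str:
    """Generate a URL-safe random token from n_bytes of entropy."""
    return secrets.token_urlsafe(n_bytes)


def auth_header(token: str) -> dict:
    """Build the header a client would send on /v1/* routes (assumed shape)."""
    return {"Authorization": f"Bearer {token}"}
```

The generated value would be exported as `RIPRAP_MODELS_API_KEY` on the droplet and mirrored in the local shell or HF Space variables.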
 To redeploy:

slides/README.md CHANGED Viewed

@@ -1,16 +1,20 @@
-# Riprap — pitch deck
 Marp-rendered slides for the AMD x lablab.ai hackathon submission
-video (May 4–10, 2026). Eight slides, ~30 s of voiceover each, sized
-to leave 3+ minutes for the live demo inside the 5-minute video cap.
 ## Files
-- `deck.md` — slide source (Marp markdown).
-- `riprap.css` — Marp theme that ports the SvelteKit UI's design
-  tokens 1:1: IBM Plex (Sans / Mono / Serif), paper register
-  (`#FAFAF7`), Stone palette, the `▌` accent block.
-- `Makefile` — one-command builds.
 ## Render
@@ -23,9 +27,9 @@ npm install -g @marp-team/marp-cli
 Build any of the three artefacts:
 ```bash
-make pdf    # → deck.pdf      (best for sharing)
-make html   # → deck.html     (best for live presenting in browser)
-make pptx   # → deck.pptx     (best for editing in Keynote / PowerPoint)
 make all    # all three
 make clean
 ```
@@ -41,10 +45,10 @@ marp deck.md --theme riprap.css --pptx --output deck.pptx
 ## Notes
 - The theme `@import`s IBM Plex from Google Fonts, so Marp needs
-  network access on first build. Cache it to `~/.npm/_marp`/etc by
-  building once on a connected machine.
 - The deck targets a 1280×720 16:9 frame.
-- Slide 7 is the demo handoff — leave the slide visible while you
   cut to the live screen capture, then return to slide 8 for the
   closing CTA.
@@ -56,7 +60,7 @@ marp deck.md --theme riprap.css --pptx --output deck.pptx
 | 2 | Problem | Zillow yanked First Street in Dec 2025; black-box scores hit a wall | 0:30 |
 | 3 | What it is | Show the cited paragraph; "every number cites its source or it doesn't appear" | 0:35 |
 | 4 | Stack | Three of four hackathon tracks; MI300X, vLLM, three NYC fine-tunes | 0:40 |
-| 5 | Receipts | 5 of 5 probe pass, 5.8–13.1 s, every claim verified | 0:30 |
-| 6 | Civic impact | NY disclosure law, DEP $30B plan, EJNYC FVI — open-source matters | 0:25 |
 | 7 | Demo handoff | Cut to the live HF Space; type the query; let the FSM speak | ~3:00 |
 | 8 | Closing CTA | github.com/msradam/riprap-nyc | 0:10 |

+# Riprap pitch deck
 Marp-rendered slides for the AMD x lablab.ai hackathon submission
+video (May 4 to 10, 2026). Eight slides, around 30 s of voiceover
+each, sized to leave three or more minutes for the live demo inside
+the five-minute video cap.
 ## Files
+- `deck.md`. Slide source (Marp markdown).
+- `riprap.css`. Marp theme that ports the SvelteKit UI's design
+  tokens 1:1: IBM Plex (Sans / Mono / Serif), Civic Hydrology paper
+  register (`#F4F6F9`), Stone palette, the dam mark.
+- `logo.svg`, `logo-paper.svg`. Cleaned dam mark variants used by
+  the title and CTA slides. The original Noun Project file (with
+  embedded attribution text) lives at `assets/` in the repo root.
+- `Makefile`. One-command builds.
 ## Render
 Build any of the three artefacts:
 ```bash
+make pdf    # writes deck.pdf      (best for sharing)
+make html   # writes deck.html     (best for live presenting in browser)
+make pptx   # writes deck.pptx     (best for editing in Keynote / PowerPoint)
 make all    # all three
 make clean
 ```
 ## Notes
 - The theme `@import`s IBM Plex from Google Fonts, so Marp needs
+  network access on first build. Cache it by building once on a
+  connected machine.
 - The deck targets a 1280×720 16:9 frame.
+- Slide 7 is the demo handoff. Leave the slide visible while you
   cut to the live screen capture, then return to slide 8 for the
   closing CTA.
 | 2 | Problem | Zillow yanked First Street in Dec 2025; black-box scores hit a wall | 0:30 |
 | 3 | What it is | Show the cited paragraph; "every number cites its source or it doesn't appear" | 0:35 |
 | 4 | Stack | Three of four hackathon tracks; MI300X, vLLM, three NYC fine-tunes | 0:40 |
+| 5 | Receipts | 5 of 5 probe pass, 5.8 to 13.1 s, every claim verified | 0:30 |
+| 6 | Civic impact | NY disclosure law, DEP $30B plan, EJNYC FVI; open-source matters | 0:25 |
 | 7 | Demo handoff | Cut to the live HF Space; type the query; let the FSM speak | ~3:00 |
 | 8 | Closing CTA | github.com/msradam/riprap-nyc | 0:10 |

slides/deck.md CHANGED

@@ -3,7 +3,7 @@ marp: true
 theme: riprap
 paginate: true
 size: 16:9
-title: Riprap — Citation-grounded NYC flood briefings
 description: AMD x lablab.ai Developer Hackathon, May 4–10 2026
 ---

 theme: riprap
 paginate: true
 size: 16:9
+title: Riprap. Citation-grounded NYC flood briefings.
 description: AMD x lablab.ai Developer Hackathon, May 4–10 2026
 ---