File size: 12,487 Bytes
e943cf5
 
 
 
 
 
 
 
 
da4b993
 
 
 
caa28aa
2e8df60
f6423e1
caa28aa
 
 
 
 
 
 
 
2e8df60
 
 
 
 
da4b993
2e8df60
 
 
b2f95f6
 
 
 
caa28aa
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
b2f95f6
 
 
f6423e1
2e8df60
f6423e1
 
 
2e8df60
 
 
 
 
 
 
f6423e1
 
 
 
 
2e8df60
caa28aa
 
2e8df60
 
 
 
 
 
f6423e1
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
caa28aa
 
2e8df60
 
 
caa28aa
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2e8df60
f6423e1
 
 
 
2e8df60
f6423e1
2e8df60
caa28aa
 
f6423e1
2e8df60
 
f6423e1
caa28aa
 
2e8df60
 
 
caa28aa
2e8df60
caa28aa
 
 
2e8df60
caa28aa
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2e8df60
caa28aa
f6423e1
caa28aa
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2e8df60
 
 
 
 
 
caa28aa
 
 
2e8df60
f6423e1
2e8df60
da4b993
 
 
caa28aa
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
---
title: Riprap Nyc
emoji: 😻
colorFrom: red
colorTo: yellow
sdk: docker
pinned: false
---

<p align="left">
  <img src="assets/logo@2x.png" width="72" height="72" alt="Riprap dam mark" />
</p>

# Riprap β€” citation-grounded NYC flood-exposure briefings

Riprap takes any NYC address (or neighborhood, or development-permit
query) and produces a four-section briefing β€” Status, Empirical
evidence, Modeled scenarios, Policy context. Every numeric claim is
anchored to a `[doc_id]` citation that resolves to a named primary
public-record source. If the model cannot cite a number, the model
does not publish that number.

The Capstone reconciler is **IBM Granite 4.1 8B**, served via Ollama
on T4 or vLLM on AMD MI300X, wrapped in **Mellea**-validated rejection
sampling. Sentences that fail one of four grounding checks
(`numerics_grounded`, `no_placeholder_tokens`, `citations_dense`,
`citations_resolve`) are rerolled with surgical feedback until the
budget is exhausted.

Live demo: <https://lablab-ai-amd-developer-hackathon-riprap-nyc.hf.space>

---

## What it looks like

![Riprap flood-exposure briefing for 80 Pioneer Street, Brooklyn](assets/screenshots/hero.png)

*A citation-grounded flood-exposure briefing for 80 Pioneer Street in
Red Hook. Generated in roughly 7 seconds against AMD MI300X. Every
numeric claim cites a primary public-record source.*

---

## Quickstart

Three ways to use Riprap, in increasing order of self-host:

### 1. Try the live demo

The hosted Space runs the full pipeline against a live AMD MI300X
inference backend. Type any NYC address.

<https://lablab-ai-amd-developer-hackathon-riprap-nyc.hf.space>

### 2. Run locally with Docker

```bash
git clone https://github.com/msradam/riprap-nyc
cd riprap-nyc
cp .env.example .env
# edit .env to point RIPRAP_LLM_BASE_URL / RIPRAP_ML_BASE_URL at
# either the live demo's backends or your own self-hosted instance
docker compose up
```

Visit `http://localhost:7860`.

To self-host the GPU inference half (vLLM + the ML specialist
service) on an AMD ROCm or NVIDIA CUDA box, run:

```bash
docker compose --profile with-models up
```

Full single-command MI300X bring-up via DigitalOcean: see
[`docs/DROPLET-RUNBOOK.md`](docs/DROPLET-RUNBOOK.md).

### 3. Develop

```bash
# Python 3.12 venv via uv
uv venv && uv pip install -r requirements.txt

# SvelteKit frontend (committed pre-built; only rebuild if sources change)
cd web/sveltekit && npm ci && npm run build && cd ../..

# Local server (Ollama primary)
.venv/bin/uvicorn web.main:app --host 127.0.0.1 --port 7860

# Local server pointed at AMD MI300X (vLLM primary, Ollama fallback)
RIPRAP_LLM_PRIMARY=vllm \
RIPRAP_LLM_BASE_URL=http://<droplet-ip>:8000/v1 \
RIPRAP_LLM_API_KEY=<token> \
.venv/bin/uvicorn web.main:app --host 127.0.0.1 --port 7860

# End-to-end address suite (5 NYC addresses, intent-aware checks)
.venv/bin/python scripts/probe_addresses.py
```

---

## How Riprap works: the Five Stones

Behind every briefing, around 25 atomic data probes fan out across NYC
datasets, satellite imagery, sensors, and forecasts. The **Five Stones**
group those probes into five legible roles:

> **Cornerstone** remembers. **Keystone** tallies. **Touchstone**
> watches. **Lodestone** projects. **Capstone** writes it all down with
> citations.

| Stone | Role | What fires |
|---|---|---|
| **Cornerstone** | The Hazard Reader. What NYC's ground remembers. | Sandy 2012 inundation extent, NYC DEP stormwater scenarios, 2021 Ida USGS high-water marks, baked Prithvi-EO Ida-attributable polygons, USGS 3DEP DEM + HAND/TWI |
| **Keystone** | The Asset Register. What's exposed. | MTA subway entrances, NYCHA developments, NYC DOE schools, NYS DOH hospitals, **TerraMind-NYC Buildings LoRA** |
| **Touchstone** | The Live Observer. Current state of the city. | FloodNet ultrasonic depth sensors, NYC 311 flood complaints, NWS hourly METAR, NOAA tide-gauge water levels, **Prithvi-EO 2.0 NYC-Pluvial v2**, **TerraMind-NYC LULC LoRA** |
| **Lodestone** | The Projector. What's coming. | NWS public flood alerts, Granite TTM r2 surge nowcast (zero-shot, 6-min cadence, 9.6 h horizon), per-address 311 weekly forecast, FloodNet sensor recurrence forecast, **Granite-TTM-r2-Battery-Surge fine-tune** (96 h hourly horizon) |
| **Capstone** | The Synthesiser. Citation-grounded briefing. | Granite 4.1 + Mellea rejection sampling |

The four data-Stones run sequentially per query; the Capstone
reconciles their documents into one cited paragraph.

---

## NYC-specialised foundation models (Apache 2.0)

Three NYC-specific fine-tunes built on AMD Instinct MI300X via AMD
Developer Cloud, published under permissive licence.

**[`msradam/TerraMind-NYC-Adapters`](https://huggingface.co/msradam/TerraMind-NYC-Adapters).**
LoRA family on TerraMind 1.0 base. LULC mIoU 0.5866 (+6.13 pp over
full-FT baseline), TiM 0.6023, Buildings 0.5511. Trained in around 18
minutes on a single MI300X.

**[`msradam/Prithvi-EO-2.0-NYC-Pluvial`](https://huggingface.co/msradam/Prithvi-EO-2.0-NYC-Pluvial).**
NYC pluvial-flood fine-tune of Prithvi-EO 2.0. Test flood IoU 0.5979
vs 0.10 on the Sen1Floods11 base, a 6Γ— lift. LovΓ‘sz-Softmax loss with
copy-paste augmentation.

**[`msradam/Granite-TTM-r2-Battery-Surge`](https://huggingface.co/msradam/Granite-TTM-r2-Battery-Surge).**
NYC Battery storm-surge nowcast fine-tune of Granite TimeSeries TTM
r2. Test MAE 0.1091 m, βˆ’41% vs persistence and βˆ’25% vs zero-shot.

All three are loaded at runtime by their respective FSM probes in
`app/context/` and `app/live/`. Reproduction recipes live under
`experiments/18..21/`.

---

## Architecture

```
NYC address ──► Granite 4.1 3B planner ──► Plan{intent, targets, specialists}
                                                  β”‚
                                                  β–Ό
                            Five-Stone Burr FSM (one @action per probe)
                            β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                            β–Ό           β–Ό           β–Ό           β–Ό          β–Ό
                       Cornerstone  Keystone   Touchstone   Lodestone  (cont.)
                       (hazard)    (assets)    (live)       (forecast)
                            β”‚           β”‚           β”‚           β”‚
                            β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                              β–Ό
                         build_documents() β€” Granite-native
                         role="document <doc_id>" messages
                                              β–Ό
                  Capstone: Granite 4.1 8B + Mellea rejection sampling
                  ──► 4-check grounding loop, surgical feedback rerolls
                                              β–Ό
                       Four-section briefing with [doc_id] citations
                                              β–Ό
                       SSE stream β†’ SvelteKit UI (briefing, trace, map)
```

LLM inference is dispatched through `app/llm.py`, a LiteLLM Router
shim with two backends: **Ollama** (T4 / local) and **vLLM** (AMD
MI300X). Same `chat()` signature in both directions; vLLM is primary
for the demo, Ollama is the auto-failover.

Source-of-truth pointers:

- `app/stones/`: the Stones taxonomy (NAME / TAGLINE / SOURCES /
  collect()) over the FSM probes.
- `app/fsm.py`: Burr FSM with one probe per `@action`.
- `app/reconcile.py`: `build_documents()` emits Granite-native
  document-role messages in canonical Stone order.
- `app/mellea_validator.py`: strict reconcile path (4-check rejection
  sampling).
- `app/llm.py`: LiteLLM Router shim. Routes to Ollama or vLLM without
  changing caller code.
- `web/main.py`: FastAPI + SSE. The stream emits
  `plan / step / token / mellea_attempt / final` events plus the
  `stone_start / stone_done` envelope around each Stone group.
- `web/sveltekit/`: primary UI (SvelteKit + adapter-static).

For the long-form architecture document, see [`ARCHITECTURE.md`](ARCHITECTURE.md).

---

## Data sources

Riprap contacts only public-record federal, state, and city sources at
runtime. No commercial APIs, no proprietary scores, no opaque
aggregators.

| Source | Hosting agency | Used for |
|---|---|---|
| Hurricane Sandy 2012 inundation zone | NYC OTI / NOAA Office for Coastal Management | Cornerstone hazard memory |
| NYC DEP Stormwater Flood Maps | NYC Department of Environmental Protection | DEP modeled-scenario layers |
| Hurricane Ida 2021 USGS high-water marks | USGS Short-Term Network | Empirical validation points |
| FloodNet ultrasonic sensor network | NYU CUSP / FloodNet | Live water-depth observations |
| NYC 311 flood complaints | NYC Open Data | Empirical complaint history |
| NOAA tide gauge β€” The Battery | NOAA CO-OPS | Live tide and surge level |
| NWS METAR | National Weather Service | Hourly precipitation |
| NWS public flood alerts | National Weather Service | Active warnings and watches |
| MTA subway entrances | MTA / NYC Open Data | Transit asset register |
| NYCHA developments | NYC Housing Authority | Public-housing exposure |
| NYC DOE schools | NYC Department of Education | Education-asset exposure |
| NYS DOH hospitals | New York State Department of Health | Critical-facility exposure |
| USGS 3DEP 1 m DEM | USGS National Map | HAND / TWI microtopography |
| NYC DOB filings | NYC Department of Buildings | Development-check intent |
| NPCC4 SLR projections | NYC Mayor's Office of Climate & Environmental Justice | Policy-context corpus (RAG) |
| Sentinel-2 MSI imagery | ESA / Copernicus | Prithvi + TerraMind inputs |

The full data licence map and vintage table is enumerated in
[`ARCHITECTURE.md`](ARCHITECTURE.md).

---

## What Riprap is not

The civil engineer carries the stamp. Riprap surfaces the evidence the
engineer judges.

- **Not a hydraulic model.** Riprap does not replace HEC-RAS, SWMM, or
  ICM. It synthesises evidence from completed modelling work; it does
  not produce new flow or stage estimates.
- **Not a stamped deliverable.** The briefing is a starting point for
  a memo, not the memo itself. Professional judgment, field
  reconnaissance, and the engineer's stamp are required for any
  actionable output.
- **Not a substitute for site investigation.** Microtopography is from
  1 m USGS 3DEP LiDAR, appropriate for screening, not for design.
- **Not a risk score.** Riprap does not output a 1–10 or 1–100 number.
  Score-based tools (First Street, ClimateCheck, Jupiter) are
  different products for different audiences. Riprap is the evidence
  audit trail behind any such judgment.

---

## Citation

If you reference Riprap in academic or professional work:

```bibtex
@software{riprap_nyc_2026,
  author       = {Rahman, Adam Munawar},
  title        = {Riprap: Citation-Grounded NYC Flood-Exposure Briefings},
  year         = {2026},
  url          = {https://github.com/msradam/riprap-nyc},
  version      = {v0.5.0},
  note         = {Built for the AMD x lablab.ai Developer Hackathon}
}
```

---

## License

Apache 2.0 (this repository). The three NYC-specialised fine-tunes
above are also Apache 2.0; underlying upstream models retain their
own permissive licences (see each `MODEL_CARD.md`).

HF Space configuration reference:
<https://huggingface.co/docs/hub/spaces-config-reference>.

---

## Acknowledgments

- **AMD Developer Cloud** β€” MI300X compute that made the three
  Apache-2.0 NYC fine-tunes feasible.
- **AMD x lablab.ai Developer Hackathon** β€” the venue.
- **IBM Research** β€” Granite 4.1, Granite Embedding 278M, Granite TTM
  r2, Mellea, and the rest of the open-source Granite ecosystem.
- **NASA / IBM Prithvi-EO 2.0** and **IBM / ESA TerraMind 1.0** β€” the
  geospatial foundation models behind the NYC fine-tunes.
- **NYU CUSP / FloodNet** β€” the public sensor network whose data
  Riprap reads live.
- **Andrew Hicks** β€” civil-engineering review of the methodology, and
  framing for what Riprap is not.
- **The Riprap dam mark** β€” ["Dam" by Chintuza](https://thenounproject.com/icon/dam-4516918/)
  via the Noun Project, licensed CC-BY 3.0. The original SVG embedded
  the attribution as on-canvas text; Riprap's `assets/logo*.svg`
  strips the embedded text and carries the credit here in body copy
  instead, per the Creative Commons attribution requirement.