seriffic Claude Opus 4.7 (1M context) committed on
Commit 62af342 · 1 parent: d2e48df

Self-contained droplet redeploy: Dockerfile + bring-up script

Three artifacts that turn the bootstrap-droplet's hand-built
container into a reproducible bring-up on any AMD ROCm GPU node:

- services/riprap-models/Dockerfile β€” extends the public
rocm/pytorch:rocm7.2.3_ubuntu24.04_py3.12_pytorch_release_2.9.1
image (same minor torch version as the bootstrap droplet's
bespoke +git build) with our pinned terratorch / granite-tsfm /
transformers / peft / sentence-transformers / gliner stack. Bakes
in the MI300X tuning env (HIP_FORCE_DEV_KERNARG=1, etc.) so a fresh
container doesn't need the "remember to set these" incantation.

- services/riprap-models/requirements-full.txt β€” exact pip pins
captured from the running terramind container on 2026-05-05.
Curated: only the leaves the Dockerfile installs on top of the
ROCm PyTorch base; transitive deps resolve from these. Excludes
torch / torchvision / torchaudio / amd-* (provided by the base).
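
The curation described above can be re-captured later if the bootstrap container drifts. A minimal sketch, assuming the container is still named `terramind`; the filter matches the base-provided packages with an exact `==` so that `torchgeo` / `torchmetrics` (which start with `torch`) survive:

```shell
# Hedged sketch: regenerate the curated pin file from a live container.
# Drop only what the ROCm PyTorch base already provides: torch /
# torchvision / torchaudio / amd-*. Exact "==" match so torchgeo,
# torchmetrics, and friends are NOT dropped by a bare ^torch prefix.
filter_base_pins() {
  grep -vE '^(torch|torchvision|torchaudio)==|^amd-'
}

# Intended usage (on the droplet, container name is illustrative):
#   docker exec terramind pip freeze | filter_base_pins > requirements-full.txt
```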

- scripts/deploy_droplet.sh β€” idempotent one-shot bring-up. Takes
IP + bearer token, verifies SSH + GPU device files, pulls vLLM,
builds riprap-models, starts both containers with --restart
unless-stopped, and waits for /v1/models + /healthz. Safe to
re-run on the same droplet (containers get rm -f'd and
recreated). Exits non-zero on healthcheck failure so it's
CI-wrappable.
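
The script's source sync uses tar piped over SSH rather than rsync, since a minimal droplet image may not ship rsync. The core pattern, sketched with both ends local for illustration:

```shell
# Hedged local sketch of the tar-pipe sync deploy_droplet.sh performs:
# pack a subtree relative to one root, unpack it under another root.
# No rsync needed -- tar ships with any Linux base.
sync_dir() {  # sync_dir <src-root> <subdir> <dest-root>
  tar -C "$1" -cf - "$2" | tar -C "$3" -xf -
}
```

In the real script the extracting side of the pipe runs on the droplet, wrapped as `$SSH "tar -C /workspace/riprap-build -xf -"`, which is what turns this into a remote copy.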

README runbook covers the destroy + redeploy flow end-to-end:
spin up new droplet, run the script, update HF Space env vars,
restart Space, run probe_addresses.py against the new stack. What
survives destruction: this repo, HF Hub fine-tune artefacts. What
doesn't: the HF cache (weights are re-downloaded on first request) and
the bearer token (generate a fresh one).
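
The healthcheck waits follow a generic poll-until-deadline pattern that is also handy when spot-checking the new stack by hand. A minimal sketch; the curl lines are illustrative, with `$IP` and `$TOKEN` coming from your deploy:

```shell
# Hedged sketch of the readiness polling deploy_droplet.sh does inline:
# retry a command until it succeeds or a deadline passes.
wait_for() {  # wait_for <timeout-seconds> <cmd...>
  local deadline=$((SECONDS + $1)); shift
  while (( SECONDS < deadline )); do
    "$@" > /dev/null 2>&1 && return 0
    sleep 1
  done
  return 1  # deadline missed
}

# e.g. after a deploy (hypothetical $IP / $TOKEN):
#   wait_for 90 curl -sf -H "Authorization: Bearer $TOKEN" "http://$IP:8001/v1/models"
#   wait_for 60 curl -sf "http://$IP:7860/healthz"
```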

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

scripts/deploy_droplet.sh ADDED
@@ -0,0 +1,200 @@
+#!/usr/bin/env bash
+# Riprap GPU-droplet bring-up — vLLM + riprap-models, idempotent.
+#
+# Designed for a fresh AMD MI300X droplet (DigitalOcean GPU droplet,
+# AMD Developer Cloud node, etc.) with nothing more than:
+#   - Ubuntu 22.04 / 24.04
+#   - Docker + AMD ROCm GPU drivers (kfd / dri device files)
+#   - SSH root access
+#
+# The script SSHes to the droplet, ensures the right images are
+# pulled, builds the riprap-models container from this repo, starts
+# both services, and runs healthchecks. Re-running on the same
+# droplet is idempotent: existing containers are removed and
+# recreated cleanly.
+#
+# Usage:
+#   scripts/deploy_droplet.sh <droplet-ip> <bearer-token>
+#
+# Example:
+#   scripts/deploy_droplet.sh 129.212.181.238 "$(cat /tmp/riprap/vllm_token.txt)"
+#
+# Env knobs (optional, all have sensible defaults):
+#   SSH_USER       default "root"
+#   SSH_KEY        path to ssh key; default uses ssh-agent
+#   VLLM_IMAGE     default "vllm/vllm-openai-rocm:v0.17.1"
+#   VLLM_PORT      default 8001 (host) → 8000 (container)
+#   MODELS_PORT    default 7860 (host) → 7860 (container)
+#   MODEL_REPO     default "ibm-granite/granite-4.1-8b"
+#   HF_CACHE_HOST  default "/root/hf-cache" on droplet
+#   SKIP_BUILD     "1" to skip building riprap-models image
+#                  (assume it's already present on droplet)
+#
+# Exits non-zero on any step that fails — including the final
+# healthcheck — so this is safe to wrap in CI.
+set -euo pipefail
+
+if [ "$#" -lt 2 ]; then
+  echo "Usage: $0 <droplet-ip> <bearer-token>" >&2
+  exit 64
+fi
+
+DROPLET_IP="$1"
+TOKEN="$2"
+
+SSH_USER="${SSH_USER:-root}"
+SSH_KEY_FLAG=""
+if [ -n "${SSH_KEY:-}" ]; then
+  SSH_KEY_FLAG="-i $SSH_KEY"
+fi
+SSH="ssh $SSH_KEY_FLAG -o StrictHostKeyChecking=accept-new -o ConnectTimeout=10 ${SSH_USER}@${DROPLET_IP}"
+SCP="scp $SSH_KEY_FLAG -o StrictHostKeyChecking=accept-new"
+
+VLLM_IMAGE="${VLLM_IMAGE:-vllm/vllm-openai-rocm:v0.17.1}"
+VLLM_PORT="${VLLM_PORT:-8001}"
+MODELS_PORT="${MODELS_PORT:-7860}"
+MODEL_REPO="${MODEL_REPO:-ibm-granite/granite-4.1-8b}"
+HF_CACHE_HOST="${HF_CACHE_HOST:-/root/hf-cache}"
+SKIP_BUILD="${SKIP_BUILD:-0}"
+
+REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
+
+echo "==> Riprap droplet bring-up"
+echo "    droplet ip:  $DROPLET_IP"
+echo "    vllm port:   $VLLM_PORT"
+echo "    models port: $MODELS_PORT"
+echo "    model repo:  $MODEL_REPO"
+echo "    repo root:   $REPO_ROOT"
+echo
+
+# ---- 1. Verify SSH + droplet readiness ----------------------------------
+echo "==> 1. SSH connectivity + GPU device check"
+$SSH bash -s <<'REMOTE'
+set -e
+if ! command -v docker > /dev/null; then
+  echo "[droplet] docker not installed; aborting" >&2
+  exit 1
+fi
+if [ ! -e /dev/kfd ] || [ ! -e /dev/dri ]; then
+  echo "[droplet] no AMD GPU device files (/dev/kfd or /dev/dri); aborting" >&2
+  exit 1
+fi
+echo "[droplet] docker + AMD GPU device files present"
+docker --version
+REMOTE
+
+# ---- 2. Pull vLLM image ---------------------------------------------------
+echo
+echo "==> 2. Pull vLLM image (if not cached)"
+$SSH "docker image inspect $VLLM_IMAGE > /dev/null 2>&1 || docker pull $VLLM_IMAGE"
+
+# ---- 3. Sync riprap-models source to droplet -----------------------------
+echo
+echo "==> 3. Sync riprap-models source"
+$SSH "mkdir -p /workspace/riprap-models /workspace/riprap-build"
+# Sync Dockerfile + sources via tar over SSH (rsync may be missing on
+# a minimal droplet; tar is part of any Linux base).
+tar -C "$REPO_ROOT" -cf - services/riprap-models | \
+  $SSH "tar -C /workspace/riprap-build -xf -"
+
+# ---- 4. Build riprap-models image ----------------------------------------
+if [ "$SKIP_BUILD" = "1" ]; then
+  echo
+  echo "==> 4. Skipping image build (SKIP_BUILD=1)"
+else
+  echo
+  echo "==> 4. Build riprap-models image"
+  echo "    (this takes ~10-20 min on first build; subsequent builds"
+  echo "     reuse layer cache and are < 1 min)"
+  $SSH "cd /workspace/riprap-build && \
+    docker build \
+      -t riprap-models:latest \
+      -f services/riprap-models/Dockerfile \
+      ."
+fi
+
+# ---- 5. Start vLLM container ---------------------------------------------
+echo
+echo "==> 5. Start vLLM container"
+$SSH bash -s <<REMOTE
+set -e
+docker rm -f vllm > /dev/null 2>&1 || true
+mkdir -p ${HF_CACHE_HOST}
+docker run -d --name vllm \\
+  --device=/dev/kfd --device=/dev/dri --group-add=video \\
+  --ipc=host --shm-size=16g \\
+  -p ${VLLM_PORT}:8000 \\
+  -v ${HF_CACHE_HOST}:/root/.cache/huggingface \\
+  -e GLOO_SOCKET_IFNAME=eth0 -e VLLM_HOST_IP=127.0.0.1 \\
+  --restart unless-stopped \\
+  ${VLLM_IMAGE} \\
+  --model ${MODEL_REPO} \\
+  --host 0.0.0.0 --port 8000 --api-key "${TOKEN}" \\
+  --max-model-len 8192 --served-model-name granite-4.1-8b
+echo "[droplet] vllm container started"
+REMOTE
+
+# ---- 6. Start riprap-models container ------------------------------------
+echo
+echo "==> 6. Start riprap-models container"
+$SSH bash -s <<REMOTE
+set -e
+docker rm -f riprap-models > /dev/null 2>&1 || true
+docker run -d --name riprap-models \\
+  --device=/dev/kfd --device=/dev/dri --group-add=video \\
+  --ipc=host --shm-size=8g \\
+  -p ${MODELS_PORT}:7860 \\
+  -v ${HF_CACHE_HOST}:/root/.cache/huggingface \\
+  -e RIPRAP_MODELS_API_KEY="${TOKEN}" \\
+  --restart unless-stopped \\
+  riprap-models:latest
+echo "[droplet] riprap-models container started"
+REMOTE
+
+# ---- 7. Healthchecks -----------------------------------------------------
+echo
+echo "==> 7. Healthchecks"
+echo "    waiting up to 90s for vLLM to expose /v1/models..."
+DEADLINE=$((SECONDS + 90))
+while (( SECONDS < DEADLINE )); do
+  if curl -sf --max-time 5 "http://${DROPLET_IP}:${VLLM_PORT}/v1/models" \
+      -H "Authorization: Bearer ${TOKEN}" > /tmp/vllm-models.json 2>/dev/null; then
+    echo "    vLLM ready: $(head -c 200 /tmp/vllm-models.json)..."
+    break
+  fi
+  sleep 3
+done
+if (( SECONDS >= DEADLINE )); then
+  echo "    vLLM did not become ready in 90s; tailing container logs:" >&2
+  $SSH "docker logs --tail 30 vllm" >&2
+  exit 1
+fi
+
+echo "    waiting up to 60s for riprap-models /healthz..."
+DEADLINE=$((SECONDS + 60))
+while (( SECONDS < DEADLINE )); do
+  if curl -sf --max-time 5 "http://${DROPLET_IP}:${MODELS_PORT}/healthz" \
+      > /tmp/models-health.json 2>/dev/null; then
+    echo "    riprap-models ready: $(cat /tmp/models-health.json)"
+    break
+  fi
+  sleep 2
+done
+if (( SECONDS >= DEADLINE )); then
+  echo "    riprap-models did not become ready in 60s; tailing container logs:" >&2
+  $SSH "docker logs --tail 30 riprap-models" >&2
+  exit 1
+fi
+
+echo
+echo "==> DONE"
+echo "    vLLM           http://${DROPLET_IP}:${VLLM_PORT}/v1/models"
+echo "    riprap-models  http://${DROPLET_IP}:${MODELS_PORT}/healthz"
+echo
+echo "Set these in your local env or HF Space variables:"
+echo "  RIPRAP_LLM_PRIMARY=vllm"
+echo "  RIPRAP_LLM_BASE_URL=http://${DROPLET_IP}:${VLLM_PORT}/v1"
+echo "  RIPRAP_LLM_API_KEY=${TOKEN}"
+echo "  RIPRAP_ML_BACKEND=remote"
+echo "  RIPRAP_ML_BASE_URL=http://${DROPLET_IP}:${MODELS_PORT}"
+echo "  RIPRAP_ML_API_KEY=${TOKEN}"
services/riprap-models/Dockerfile ADDED
@@ -0,0 +1,63 @@
+# Riprap Models — droplet inference service.
+#
+# Self-contained ROCm + PyTorch image that runs every GPU-accelerable
+# specialist Riprap consumes (Prithvi-NYC-Pluvial, TerraMind LULC +
+# Buildings, Granite TTM r2, Granite Embedding 278M, GLiNER).
+#
+# Base: AMD's public ROCm 7.2.3 + Python 3.12 + PyTorch 2.9.1 release
+# image. Same minor torch version as the bespoke MI300X image the
+# bootstrap droplet was hand-built with (`torch==2.9.1+git8907517`),
+# but pulled from a public registry so any fresh droplet can recreate
+# the env without internal AMD wheels. The released 2.9.1 has the
+# kernels we need — none of riprap-models calls into vLLM-specific
+# attention paths, so the dev-build vs release-build delta is
+# inconsequential for our forward passes.
+#
+# Build:  docker build -t riprap-models:latest -f Dockerfile ../..
+# Layout: the build context is the project root so the COPY lines
+#         below can reach `services/riprap-models/`.
+FROM rocm/pytorch:rocm7.2.3_ubuntu24.04_py3.12_pytorch_release_2.9.1
+
+ENV DEBIAN_FRONTEND=noninteractive \
+    PYTHONUNBUFFERED=1 \
+    PIP_NO_CACHE_DIR=1 \
+    PIP_DISABLE_PIP_VERSION_CHECK=1 \
+    HF_HOME=/root/.cache/huggingface \
+    TRANSFORMERS_CACHE=/root/.cache/huggingface \
+    # MI300X tuning the running container uses; baking them in so a
+    # bring-up doesn't require remembering the env-set incantation.
+    HIP_FORCE_DEV_KERNARG=1 \
+    HSA_NO_SCRATCH_RECLAIM=1 \
+    PYTORCH_ROCM_ARCH=gfx942
+
+# git is needed by some HF model-card downloads (terratorch yaml repos
+# pull via the git protocol). curl for healthcheck. libgl1 for
+# rasterio's Pillow path. The base ROCm image is Ubuntu 24.04 and
+# already includes most build-time deps we need.
+RUN apt-get update && apt-get install -y --no-install-recommends \
+        curl git libgl1 libglib2.0-0 \
+    && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /workspace/riprap-models
+
+# Install deps in two layers so a code-only change doesn't bust the
+# heavy ML wheel cache. requirements.txt holds the runtime-narrow
+# packages the service imports; requirements-full.txt is the
+# super-set the FSM specialists pull in transitively (terratorch's
+# kornia / albumentations chain, granite-tsfm's tsfm_public, etc.).
+COPY services/riprap-models/requirements-full.txt /tmp/req-full.txt
+RUN pip install --upgrade pip && \
+    pip install -r /tmp/req-full.txt
+
+# Service code itself. Cheap to invalidate; lands last.
+COPY services/riprap-models/main.py /workspace/riprap-models/main.py
+COPY services/riprap-models/requirements.txt /workspace/riprap-models/requirements.txt
+
+EXPOSE 7860
+
+# `--proxy-headers` so a future LB sees the right client IP. The
+# /healthz route is unauthenticated by design (operators want
+# readiness probes to work without secrets); /v1/* requires the
+# bearer token via RIPRAP_MODELS_API_KEY.
+CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860", \
+     "--log-level", "info", "--proxy-headers"]
services/riprap-models/README.md CHANGED
@@ -22,35 +22,119 @@ Auth: bearer token on every `/v1/*` route via `RIPRAP_MODELS_API_KEY`.
 Same shape as vLLM. `/healthz` is open so liveness probes don't need
 auth.
 
-## Deploy
+## Deploy — fresh droplet (recommended)
 
-The droplet's existing `terramind` container already has
-`torch+ROCm 7.0`, `terratorch 1.2.7`, `granite-tsfm 0.3.6`,
-`transformers 4.57`, `peft`, `safetensors`, `fastapi`, `uvicorn`. The
-service code lands under `/workspace/riprap-models/`; only deltas
-need installing.
+Use the one-shot bring-up script. It works on any AMD ROCm GPU droplet
+with Docker + GPU device files (`/dev/kfd`, `/dev/dri`) and SSH root
+access. No prior container state required.
 
 ```bash
-# Copy code (run from project root)
-ssh root@129.212.181.238 'mkdir -p /workspace/riprap-models'
-rsync -av --delete services/riprap-models/ \
-  root@129.212.181.238:/workspace/riprap-models/
-
-# Install deltas + start uvicorn inside the terramind container
-ssh root@129.212.181.238 bash <<'REMOTE'
+scripts/deploy_droplet.sh <droplet-ip> <bearer-token>
+```
+
+What it does, in order:
+
+1. Verifies SSH + AMD GPU device files on the droplet
+2. Pulls `vllm/vllm-openai-rocm:v0.17.1`
+3. Tar-streams `services/riprap-models/` to `/workspace/riprap-build`
+4. Builds `riprap-models:latest` from `services/riprap-models/Dockerfile`
+   (base: `rocm/pytorch:rocm7.2.3_ubuntu24.04_py3.12_pytorch_release_2.9.1`,
+   ~10–20 min on first build, < 1 min on rebuild)
+5. Starts both containers (`vllm` on host port 8001, `riprap-models`
+   on host port 7860) with `--restart unless-stopped` so they survive
+   reboots
+6. Waits up to 90 s for vLLM `/v1/models` and 60 s for
+   riprap-models `/healthz`; exits non-zero if either misses its deadline
+
+Re-running on the same droplet is idempotent — existing containers
+get `docker rm -f`'d and recreated.
+
+Env knobs:
+
+| Var | Default | Purpose |
+|---|---|---|
+| `SSH_USER` | `root` | SSH login |
+| `SSH_KEY` | (ssh-agent) | path to private key |
+| `VLLM_PORT` | `8001` | host port mapping for vLLM |
+| `MODELS_PORT` | `7860` | host port mapping for riprap-models |
+| `MODEL_REPO` | `ibm-granite/granite-4.1-8b` | LLM repo |
+| `HF_CACHE_HOST` | `/root/hf-cache` | HF cache mount on droplet |
+| `SKIP_BUILD` | `0` | set `1` to skip Dockerfile build |
+
+After it returns, set the printed env vars in your local shell or HF
+Space variables, run `scripts/probe_addresses.py` to verify, and
+you're live.
+
+## Deploy — extend an existing container (legacy)
+
+If you already have a `terramind` container with the heavy ML deps
+baked in (the bootstrap-droplet path), you can skip the Dockerfile
+build and install the runtime deltas only:
+
+```bash
+ssh root@<ip> 'mkdir -p /workspace/riprap-models'
+rsync -av --delete services/riprap-models/ root@<ip>:/workspace/riprap-models/
+ssh root@<ip> bash <<'REMOTE'
 docker cp /workspace/riprap-models terramind:/workspace/
-docker exec -d -e RIPRAP_MODELS_API_KEY="$RIPRAP_MODELS_API_KEY" terramind \
+docker exec -d -e RIPRAP_MODELS_API_KEY="$TOKEN" terramind \
   bash -c "cd /workspace/riprap-models && \
     pip install --no-cache-dir -r requirements.txt && \
-    uvicorn main:app --host 0.0.0.0 --port 7860 --log-level info \
-      > /workspace/riprap-models.log 2>&1"
+    uvicorn main:app --host 0.0.0.0 --port 7860"
 REMOTE
 ```
 
-Service binds inside the container at `:7860`; the host port
-mapping was set when the `terramind` container was created
-(`docker run -p 7860:7860 ...`), so externally the service is at
-`http://129.212.181.238:7860`.
+This path uses `requirements.txt` (deltas only); the Dockerfile path
+above uses `requirements-full.txt` (everything). The service is
+externally reachable at `http://<droplet-ip>:7860`, provided the host
+port mapping was set when the container was created.
+
+## Destroy + redeploy runbook
+
+What survives a droplet destruction:
+
+- `services/riprap-models/Dockerfile` + `requirements-full.txt` —
+  every pinned dep, captured from the bootstrap droplet on 2026-05-05
+- `scripts/deploy_droplet.sh` — the bring-up script
+- HF Hub model artefacts — every fine-tune lives at
+  `msradam/Prithvi-EO-2.0-NYC-Pluvial`,
+  `msradam/TerraMind-NYC-Adapters`,
+  `msradam/Granite-TTM-r2-Battery-Surge`. The service pulls them
+  fresh on first request
+
+What does NOT survive:
+
+- The HF cache at `${HF_CACHE_HOST}` (default `/root/hf-cache`) on
+  the droplet — every redeploy re-downloads all the model weights
+  (Granite 4.1 8b for vLLM ~16 GB, Prithvi v2 ~1.3 GB, TerraMind
+  adapters ~600 MB, Granite Embedding ~600 MB, GLiNER ~400 MB,
+  Granite TTM r2 ~6 MB). The first query after a redeploy takes ~30 s
+  longer than steady-state because of the lazy model load
+- The bearer token — generate a fresh one when re-deploying
+
+To redeploy:
+
+```bash
+# 1. Spin up a new GPU droplet (DigitalOcean / AMD Developer Cloud)
+# 2. Copy your SSH key to it (DO usually does this for you)
+# 3. Run:
+TOKEN=$(openssl rand -base64 24)
+scripts/deploy_droplet.sh <new-ip> "$TOKEN"
+
+# 4. Update HF Space env vars to point at the new IP
+huggingface-cli space variables \
+  lablab-ai-amd-developer-hackathon/riprap-nyc \
+  RIPRAP_LLM_BASE_URL=http://<new-ip>:8001/v1 \
+  RIPRAP_LLM_API_KEY=$TOKEN \
+  RIPRAP_ML_BASE_URL=http://<new-ip>:7860 \
+  RIPRAP_ML_API_KEY=$TOKEN
+
+# 5. Restart the HF Space so it picks up the new env vars
+huggingface-cli space restart lablab-ai-amd-developer-hackathon/riprap-nyc
+
+# 6. Verify end-to-end against the redeployed stack
+.venv/bin/python scripts/probe_addresses.py \
+  --base https://lablab-ai-amd-developer-hackathon-riprap-nyc.hf.space
+```
 
 ## Local app config
 
 
services/riprap-models/requirements-full.txt ADDED
@@ -0,0 +1,65 @@
+# Riprap Models — full runtime requirements
+#
+# Pinned to the exact versions the bootstrap MI300X container ran with,
+# captured via `pip freeze` inside the running `terramind` container on
+# 2026-05-05. Keep these pins until something in the spec needs to
+# change — the AMD ROCm + terratorch + tsfm_public stack has narrow
+# version-compatibility windows.
+#
+# torch / torchvision / torchaudio are NOT pinned here because they
+# come from the base image (rocm/pytorch ROCm 7.2.3 + torch 2.9.1
+# release). Pinning them again would cause pip to attempt a re-install
+# of a different ABI and break the build.
+
+# ---- Core HF / transformers stack ----------------------------------------
+transformers==4.57.6
+peft==0.18.1
+accelerate==1.13.0
+safetensors==0.8.0rc0
+huggingface_hub==0.36.2
+sentence-transformers==5.4.1
+gliner==0.2.26
+
+# ---- IBM Granite TimeSeries TTM r2 (TTM forecast specialists) ------------
+granite-tsfm==0.3.6
+
+# ---- Prithvi-EO / TerraMind serving stack --------------------------------
+# terratorch pulls torchgeo, lightning, jsonargparse, kornia, timm, einops,
+# albumentations, etc. Pinning the leaves explicitly so transitive bumps
+# don't drift the FSM specialists' behaviour silently.
+terratorch==1.2.7
+torchgeo==0.9.0
+torchmetrics==1.9.0
+lightning==2.6.1
+jsonargparse==4.48.0
+albumentations==2.0.8
+albucore==0.0.24
+kornia==0.8.2
+timm==1.0.25
+einops==0.8.2
+
+# ---- Geospatial I/O (used by the NYC-cropping helpers) -------------------
+rasterio==1.5.0
+pyproj==3.7.2
+geopandas==1.1.3
+shapely==2.1.2
+pystac==1.14.3
+pystac-client==0.9.0
+rioxarray==0.22.0
+xarray==2026.4.0
+tifffile==2026.5.2
+imageio==2.37.3
+
+# ---- Numeric core --------------------------------------------------------
+numpy==2.4.4
+pandas==3.0.0
+scipy==1.17.1
+scikit-learn==1.8.0
+pillow==12.1.1
+
+# ---- Web / IO ------------------------------------------------------------
+fastapi==0.135.1
+uvicorn==0.41.0
+pydantic==2.12.5
+httpx==0.28.1
+requests==2.32.5