fix(eo): unblock TerraMind LoRA + Prithvi v2 inference on the L4 Space
The diag wrapper from the previous commit surfaced two real
upstream bugs that the legacy "deps unavailable" / generic 500
masking had been hiding:
1. TerraMind LULC + Buildings: "Expected size 12 but got size 2
for tensor number 1" inside terratorch's tiled_inference.
Root cause: terratorch.tasks.tiled_inference doesn't handle
the 5-D (B, C, T, H, W) modality tensor shape that
backbone_use_temporal=True / backbone_temporal_n_timestamps=4
produces, so it slices/concats incorrectly when fusing the
per-modality patches and trips on the 12 (S2 bands) vs 2 (S1
bands) channel mismatch.
Fix: the canonical chip from app/context/eo_chip_cache.py is
already exactly 224×224 — the model's native input
resolution. Tiling is unnecessary at that size. Mirror the
training-time inference at experiments/18_terramind_nyc_lora/
shared/inference_ensemble.py:155 (`task.model(x)` direct on
the modality dict). tiled_inference is preserved as a
fallback for chips larger than 224×224.
2. Prithvi-NYC-Pluvial v2: "AttributeError: 'list' object has no
attribute 'view'" on first inference.
Root cause: when patching the v2 datamodule's missing
`test_transform`, we replaced its kornia AugmentationSequential
with a Normalize built from .means.view(-1).tolist() and
.stds.view(-1).tolist(). kornia ≥ 0.7 stores those values
as-is and calls .view() on them at augment-apply time;
passing a Python list crashes with the AttributeError above.
Fix: pass the underlying torch.Tensor directly via
.view(-1).detach().clone() — same numeric data, but kornia
gets the type it expects.
Same patch applied to the local fallback at
app/flood_layers/prithvi_live.py for parity (the local path is
unreachable on the cpu-basic UI Space for unrelated reasons but
will be live on any deployment with a working CUDA torch).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@@ -209,9 +209,14 @@ def _ensure_model():
|
|
| 209 |
from albumentations.pytorch import ToTensorV2
|
| 210 |
m.datamodule.test_transform = A.Compose([ToTensorV2()])
|
| 211 |
_old = m.datamodule.aug
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 212 |
m.datamodule.aug = _Ka.AugmentationSequential(
|
| 213 |
-
_Ka.Normalize(_old.means.view(-1).tolist(),
|
| 214 |
-
_old.stds.view(-1).tolist()),
|
| 215 |
data_keys=None)
|
| 216 |
log.info("prithvi_live: patched v2 datamodule transforms "
|
| 217 |
"for IBM inference.py compat")
|
|
|
|
| 209 |
from albumentations.pytorch import ToTensorV2
|
| 210 |
m.datamodule.test_transform = A.Compose([ToTensorV2()])
|
| 211 |
_old = m.datamodule.aug
|
| 212 |
+
# Pass torch.Tensor (not list via .tolist()).
|
| 213 |
+
# kornia 0.7+ stores values as-is and calls
|
| 214 |
+
# .view() on them at apply time; passing a
|
| 215 |
+
# Python list crashes with `AttributeError:
|
| 216 |
+
# 'list' object has no attribute 'view'`.
|
| 217 |
m.datamodule.aug = _Ka.AugmentationSequential(
|
| 218 |
+
_Ka.Normalize(_old.means.view(-1).detach().clone(),
|
| 219 |
+
_old.stds.view(-1).detach().clone()),
|
| 220 |
data_keys=None)
|
| 221 |
log.info("prithvi_live: patched v2 datamodule transforms "
|
| 222 |
"for IBM inference.py compat")
|
|
@@ -148,9 +148,14 @@ def _load_prithvi():
|
|
| 148 |
from albumentations.pytorch import ToTensorV2
|
| 149 |
m.datamodule.test_transform = A.Compose([ToTensorV2()])
|
| 150 |
_old = m.datamodule.aug
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 151 |
m.datamodule.aug = _Ka.AugmentationSequential(
|
| 152 |
-
_Ka.Normalize(_old.means.view(-1).tolist(),
|
| 153 |
-
_old.stds.view(-1).tolist()),
|
| 154 |
data_keys=None)
|
| 155 |
log.info("prithvi: patched v2 datamodule transforms "
|
| 156 |
"for IBM inference.py compat")
|
|
@@ -478,17 +483,35 @@ def _terramind_inference(payload: TerramindIn) -> dict[str, Any]:
|
|
| 478 |
chips["DEM"] = _to_device(_build_chip_tensor(dem))
|
| 479 |
|
| 480 |
import torch
|
| 481 |
-
from terratorch.tasks.tiled_inference import tiled_inference
|
| 482 |
|
| 483 |
-
def _forward(x):
|
| 484 |
out = task.model(x)
|
| 485 |
return out.output if hasattr(out, "output") else out
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 486 |
with torch.no_grad():
|
| 487 |
-
|
| 488 |
-
|
| 489 |
-
|
| 490 |
-
|
| 491 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 492 |
pred = logits.argmax(dim=1).squeeze(0).cpu().numpy().astype("uint8")
|
| 493 |
n = max(int(pred.size), 1)
|
| 494 |
fractions = {
|
|
|
|
| 148 |
from albumentations.pytorch import ToTensorV2
|
| 149 |
m.datamodule.test_transform = A.Compose([ToTensorV2()])
|
| 150 |
_old = m.datamodule.aug
|
| 151 |
+
# Pass torch.Tensor (not Python list via .tolist()) —
|
| 152 |
+
# kornia 0.7+ stores the values as-is and calls .view()
|
| 153 |
+
# on them at apply time. With a list that fails with
|
| 154 |
+
# `AttributeError: 'list' object has no attribute 'view'`.
|
| 155 |
+
# Cloning detaches from the source datamodule's params.
|
| 156 |
m.datamodule.aug = _Ka.AugmentationSequential(
|
| 157 |
+
_Ka.Normalize(_old.means.view(-1).detach().clone(),
|
| 158 |
+
_old.stds.view(-1).detach().clone()),
|
| 159 |
data_keys=None)
|
| 160 |
log.info("prithvi: patched v2 datamodule transforms "
|
| 161 |
"for IBM inference.py compat")
|
|
|
|
| 483 |
chips["DEM"] = _to_device(_build_chip_tensor(dem))
|
| 484 |
|
| 485 |
import torch
|
|
|
|
| 486 |
|
| 487 |
+
def _forward(x):
|
| 488 |
out = task.model(x)
|
| 489 |
return out.output if hasattr(out, "output") else out
|
| 490 |
+
|
| 491 |
+
# Call the model directly — same shape contract as the
|
| 492 |
+
# training-time inference at
|
| 493 |
+
# experiments/18_terramind_nyc_lora/shared/inference_ensemble.py:
|
| 494 |
+
# the canonical chip is already the model's native 224×224 input
|
| 495 |
+
# in (B, C, T, H, W) form, so terratorch's `tiled_inference` is
|
| 496 |
+
# unnecessary and was the cause of the "Expected size 12 but got
|
| 497 |
+
# size 2" 5-D handling regression we hit on the L4 deploy.
|
| 498 |
+
# Tile only when the chip is bigger than the model resolution.
|
| 499 |
+
s2_t = chips["S2L2A"]
|
| 500 |
+
h_chip, w_chip = int(s2_t.shape[-2]), int(s2_t.shape[-1])
|
| 501 |
with torch.no_grad():
|
| 502 |
+
if h_chip == 224 and w_chip == 224:
|
| 503 |
+
logits = _forward(chips)
|
| 504 |
+
else:
|
| 505 |
+
from terratorch.tasks.tiled_inference import tiled_inference
|
| 506 |
+
|
| 507 |
+
def _forward_tile(x, **_extra):
|
| 508 |
+
return _forward(x)
|
| 509 |
+
|
| 510 |
+
logits = tiled_inference(
|
| 511 |
+
_forward_tile, chips, out_channels=spec["num_classes"],
|
| 512 |
+
h_crop=224, w_crop=224, h_stride=128, w_stride=128,
|
| 513 |
+
average_patches=True, blend_overlaps=True, padding="reflect",
|
| 514 |
+
)
|
| 515 |
pred = logits.argmax(dim=1).squeeze(0).cpu().numpy().astype("uint8")
|
| 516 |
n = max(int(pred.size), 1)
|
| 517 |
fractions = {
|