getmokshshah committed on
Commit 8cc8ac9 · 1 Parent(s): db355c8

Pushed project
.github/workflows/sync-to-hf.yml ADDED
@@ -0,0 +1,21 @@
+ name: Sync to Hugging Face Space
+
+ on:
+   push:
+     branches: [main]
+   workflow_dispatch:
+
+ jobs:
+   sync-to-hub:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+         with:
+           fetch-depth: 0
+           lfs: true
+
+       - name: Push to HuggingFace Space
+         env:
+           HF_TOKEN: ${{ secrets.HF_TOKEN }}
+         run: |
+           git push --force https://getmokshshah:$HF_TOKEN@huggingface.co/spaces/getmokshshah/depthlens main
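
The workflow mirrors the repository to the Space with a force push, authenticated by an `HF_TOKEN` repository secret (a Hugging Face access token with write scope). For a one-off sync outside CI, the same result can be had from Python via `huggingface_hub`; a minimal sketch, assuming the token is available in an `HF_TOKEN` environment variable and the Space already exists:

```python
import os

from huggingface_hub import HfApi  # pip install huggingface_hub

api = HfApi(token=os.environ["HF_TOKEN"])

# Upload the current working tree to the Space repository.
api.upload_folder(
    folder_path=".",
    repo_id="getmokshshah/depthlens",
    repo_type="space",
    commit_message="Manual sync",
)
```
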
.gitignore ADDED
@@ -0,0 +1,42 @@
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.egg-info/
+ dist/
+ build/
+ *.egg
+
+ # Virtual environments
+ venv/
+ env/
+ .env/
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Model weights (downloaded at runtime)
+ *.pth
+ *.pt
+ *.onnx
+ hub/
+
+ # Data
+ *.npy
+ results/
+ outputs/
+
+ # Example images (downloaded at runtime)
+ examples/*.jpg
+ examples/*.png
+
+ # Logs
+ *.log
+ runs/
LICENSE CHANGED
@@ -1,6 +1,6 @@
  MIT License
 
- Copyright (c) 2026 getmokshshah
+ Copyright (c) 2026 Moksh Shah
 
  Permission is hereby granted, free of charge, to any person obtaining a copy
  of this software and associated documentation files (the "Software"), to deal
README.md ADDED
@@ -0,0 +1,158 @@
+ ---
+ title: DepthLens
+ emoji: 🌀
+ colorFrom: teal
+ colorTo: yellow
+ sdk: gradio
+ sdk_version: "4.44.1"
+ app_file: app.py
+ pinned: false
+ license: mit
+ ---
+
+ # DepthLens — Monocular Depth Estimation
+
+ Estimate depth from a single image using state-of-the-art deep learning models. Upload any photo and get a detailed depth map showing how far away each part of the scene is.
+
+ **[Try the Live Demo →](https://huggingface.co/spaces/getmokshshah/depthlens)**
+
+ ---
+
+ ## What It Does
+
+ DepthLens takes a single 2D image and predicts a per-pixel depth map — no stereo cameras, no LiDAR, just one photo. The output is a color-mapped visualization showing relative distances: warm colors (red/yellow) for nearby objects and cool colors (blue/purple) for faraway ones.
+
+ This is the same core technique used in autonomous vehicles, AR/VR applications, robotics, and 3D scene reconstruction.
+
+ ## 📁 Project Structure
+
+ ```
+ depthlens/
+ ├── app.py                   # Gradio web app (for HuggingFace Spaces)
+ ├── inference.py             # Standalone inference script
+ ├── requirements.txt         # Python dependencies
+ ├── models/
+ │   └── depth_estimator.py   # Model wrapper with MiDaS integration
+ ├── utils/
+ │   └── visualization.py     # Depth map coloring and overlays
+ └── examples/                # Sample images for testing
+ ```
+
+ ## Quick Start
+
+ ### 1. Install Dependencies
+
+ ```bash
+ git clone https://github.com/getmokshshah/depthlens.git
+ cd depthlens
+ pip install -r requirements.txt
+ ```
+
+ ### 2. Run the Web App Locally
+
+ ```bash
+ python app.py
+ ```
+
+ This launches a Gradio interface at `http://localhost:7860` where you can upload images and view their depth maps.
+
+ ### 3. Run Inference from the Command Line
+
+ ```bash
+ # Single image
+ python inference.py --input photo.jpg --output depth_result.png
+
+ # Folder of images
+ python inference.py --input ./photos/ --output ./results/ --batch
+
+ # Choose model size
+ python inference.py --input photo.jpg --output depth.png --model large
+
+ # Save raw depth as NumPy array
+ python inference.py --input photo.jpg --output depth.npy --save-raw
+ ```
+
+ ## Model Options
+
+ | Model | Speed (CPU) | Quality | Memory | Best For |
+ |-------|-------------|---------|--------|----------|
+ | `small` (default) | ~0.5s/image | Good | ~200MB | Real-time apps, demos |
+ | `large` | ~3s/image | Best | ~1GB | High-quality results |
+
+ The **small** model (MiDaS v2.1 Small) is optimized for mobile and edge devices. It runs fast on CPU while producing accurate relative depth maps. The **large** model (DPT-Large) uses a Vision Transformer backbone for maximum accuracy.
+
+ ## Visualization Modes
+
+ DepthLens generates multiple visualization styles:
+
+ - **Colored Depth Map**: A viridis/inferno/magma colormap applied to the depth prediction, producing a striking false-color image
+ - **Side-by-Side Comparison**: Original image next to its depth map for easy comparison
+ - **Depth Overlay**: Semi-transparent depth map blended on top of the original image
+
+ ## How It Works
+
+ 1. **Preprocessing**: The input image is resized and normalized to match the model's expected input format using MiDaS transforms
+ 2. **Depth Prediction**: The image passes through a deep neural network (CNN or Vision Transformer) that outputs a per-pixel inverse depth map
+ 3. **Normalization**: Raw depth values are normalized to the [0, 1] range for visualization
+ 4. **Colormap Application**: NumPy and Matplotlib apply scientific colormaps to create visually informative depth images
+
+ ### Architecture Details
+
+ The **small** model uses an EfficientNet-Lite backbone with a lightweight decoder, designed for fast inference. The **large** model uses DPT (Dense Prediction Transformer) — a Vision Transformer encoder with convolutional decoder heads that produces sharper depth boundaries and more consistent large-scale depth predictions.
+
+ ## Understanding the Output
+
+ - **Warm colors** (red, orange, yellow) → **close** to the camera
+ - **Cool colors** (blue, purple) → **far** from the camera
+ - **Depth values are relative**, not absolute — the model predicts which parts are closer/farther, not exact distances in meters
+
+ ## Performance
+
+ Benchmarked on a 2-core CPU (HuggingFace Spaces free tier):
+
+ | Model | Resolution | Inference Time | Peak RAM |
+ |-------|-----------|----------------|----------|
+ | Small | 256×256 | ~0.4s | ~350MB |
+ | Small | 512×512 | ~0.8s | ~500MB |
+ | Large | 384×384 | ~2.8s | ~1.2GB |
+
+ ## Configuration
+
+ ### Inference Options
+
+ | Argument | Default | Description |
+ |----------|---------|-------------|
+ | `--input` | required | Path to image or folder |
+ | `--output` | required | Output path for results |
+ | `--model` | `small` | Model size: `small` or `large` |
+ | `--colormap` | `inferno` | Colormap: `inferno`, `magma`, `viridis`, `plasma` |
+ | `--side-by-side` | `False` | Generate side-by-side comparison |
+ | `--overlay` | `False` | Generate depth overlay on original |
+ | `--overlay-alpha` | `0.5` | Transparency for overlay mode |
+ | `--save-raw` | `False` | Save raw depth as .npy file |
+ | `--batch` | `False` | Process a folder of images |
+
+ ## Use Cases
+
+ - **3D Scene Understanding**: Understand spatial layout from a single photo
+ - **Autonomous Systems**: Depth perception for robots and drones
+ - **AR/VR**: Generate depth data for immersive experiences
+ - **Photography**: Create depth-based focus effects (synthetic bokeh)
+ - **Accessibility**: Help describe spatial relationships in scenes
+
+ ## Limitations
+
+ - Depth is **relative**, not metric — objects are ranked near-to-far but without real-world distances
+ - Transparent and reflective surfaces (glass, mirrors, water) can confuse the model
+ - Very dark or overexposed regions may have unreliable depth predictions
+ - The model performs best on natural outdoor scenes and indoor rooms
+
+ ## License
+
+ MIT License — free to use for research or commercial projects.
+
+ ## Credits
+
+ - **MiDaS**: Ranftl et al., "Towards Robust Monocular Depth Estimation" (Intel ISL)
+ - **DPT**: Ranftl et al., "Vision Transformers for Dense Prediction" (ICCV 2021)
+ - **Built with**: PyTorch, OpenCV, Gradio, NumPy, Matplotlib
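
The README documents the CLI and the hosted demo; the same pipeline is also usable as a library. A short sketch of programmatic use, assuming it runs from the repository root (so `models` and `utils` are importable) and with `photo.jpg` standing in for any input image:

```python
from PIL import Image

from models import DepthEstimator
from utils import depth_to_colormap, create_side_by_side

# Load the fast CPU model once; pass model_size="large" for DPT-Large.
estimator = DepthEstimator(model_size="small")

image = Image.open("photo.jpg").convert("RGB")
depth = estimator.predict(image)  # (H, W) float array in [0, 1], higher = closer

colored = depth_to_colormap(depth, "inferno")
create_side_by_side(image, colored).save("comparison.png")
```
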
app.py ADDED
@@ -0,0 +1,144 @@
+ """
+ DepthLens — Monocular Depth Estimation
+ Gradio app for HuggingFace Spaces deployment.
+
+ Estimates depth from a single image using MiDaS models.
+ Optimized for free-tier CPU inference.
+ """
+
+ import time
+
+ import gradio as gr
+ import numpy as np
+ from PIL import Image
+
+ from models import DepthEstimator
+ from utils import depth_to_colormap, create_side_by_side, create_overlay
+ from download_examples import download_examples
+
+ # ──────────────────────────────────────────────
+ # Download example images if missing
+ # ──────────────────────────────────────────────
+ download_examples()
+
+ # ──────────────────────────────────────────────
+ # Load model at startup (small for CPU speed)
+ # ──────────────────────────────────────────────
+ print("Starting DepthLens...")
+ estimator = DepthEstimator(model_size="small")
+ print("Ready!")
+
+
+ def predict(
+     image: Image.Image,
+     colormap: str,
+     output_mode: str,
+     overlay_alpha: float,
+ ) -> tuple:
+     """
+     Run depth estimation and return results.
+
+     Returns:
+         (result_image, depth_colored, stats_string)
+     """
+     if image is None:
+         raise gr.Error("Please upload an image first.")
+
+     start = time.time()
+
+     # Run depth estimation
+     image_rgb = image.convert("RGB")
+     depth = estimator.predict(image_rgb)
+
+     inference_time = time.time() - start
+
+     # Create colormapped depth
+     depth_colored = depth_to_colormap(depth, colormap.lower())
+
+     # Create output based on mode
+     if output_mode == "Side-by-Side":
+         result = create_side_by_side(image_rgb, depth_colored)
+     elif output_mode == "Overlay":
+         result = create_overlay(image_rgb, depth_colored, alpha=overlay_alpha)
+     else:
+         result = depth_colored
+
+     # Stats
+     w, h = image_rgb.size
+     stats = f"{w}×{h} · {inference_time:.2f}s inference · MiDaS Small"
+
+     return result, depth_colored, stats
+
+
+ # ──────────────────────────────────────────────
+ # Gradio Interface
+ # ──────────────────────────────────────────────
+ with gr.Blocks(
+     title="DepthLens — Monocular Depth Estimation",
+     theme=gr.themes.Base(
+         primary_hue="teal",
+         neutral_hue="slate",
+     ),
+ ) as demo:
+     gr.Markdown(
+         """
+         # DepthLens — Monocular Depth Estimation
+         Upload any image to estimate per-pixel depth using MiDaS.
+         Warm colors = close, cool colors = far.
+         """
+     )
+
+     with gr.Row():
+         with gr.Column(scale=1):
+             input_image = gr.Image(type="pil", label="Input Image")
+             colormap = gr.Dropdown(
+                 choices=["Inferno", "Magma", "Viridis", "Plasma"],
+                 value="Inferno",
+                 label="Colormap",
+             )
+             output_mode = gr.Radio(
+                 choices=["Depth Map", "Side-by-Side", "Overlay"],
+                 value="Depth Map",
+                 label="Output Mode",
+             )
+             overlay_alpha = gr.Slider(
+                 minimum=0.2, maximum=0.8, value=0.5, step=0.1,
+                 label="Overlay Opacity",
+                 visible=False,
+             )
+             run_btn = gr.Button("Estimate Depth", variant="primary")
+             stats = gr.Textbox(label="Info", interactive=False)
+
+         with gr.Column(scale=1):
+             result_image = gr.Image(type="pil", label="Result")
+             depth_image = gr.Image(type="pil", label="Depth Map", visible=False)
+
+     # Show/hide overlay slider
+     def toggle_overlay(mode):
+         return gr.update(visible=(mode == "Overlay"))
+
+     output_mode.change(toggle_overlay, output_mode, overlay_alpha)
+
+     # Run prediction
+     run_btn.click(
+         fn=predict,
+         inputs=[input_image, colormap, output_mode, overlay_alpha],
+         outputs=[result_image, depth_image, stats],
+     )
+
+     # Examples
+     gr.Examples(
+         examples=[
+             ["examples/street.jpg", "Inferno", "Side-by-Side", 0.5],
+             ["examples/landscape.jpg", "Magma", "Depth Map", 0.5],
+             ["examples/indoor.jpg", "Viridis", "Overlay", 0.5],
+         ],
+         inputs=[input_image, colormap, output_mode, overlay_alpha],
+         outputs=[result_image, depth_image, stats],
+         fn=predict,
+         cache_examples=False,
+     )
+
+
+ if __name__ == "__main__":
+     demo.launch()
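
If the Space sees bursts of traffic, Gradio's request queue settings can be tuned explicitly rather than left at their defaults. A possible tweak to the launch line, assuming Gradio 4.x as pinned in the README front matter; `max_size=16` is an illustrative cap, not a value from this repository:

```python
# Cap how many requests may wait in the queue, so a 2-core CPU
# is not oversubscribed during traffic spikes.
if __name__ == "__main__":
    demo.queue(max_size=16).launch()
```
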
download_examples.py ADDED
@@ -0,0 +1,36 @@
+ """
+ Downloads example images for the DepthLens demo.
+ Called automatically by app.py on startup if images are missing.
+ Uses direct Unsplash image URLs (free, no API key needed).
+ """
+
+ import os
+ import urllib.request
+
+ EXAMPLES_DIR = os.path.join(os.path.dirname(__file__), "examples")
+
+ EXAMPLE_URLS = {
+     "street.jpg": "https://images.unsplash.com/photo-1477959858617-67f85cf4f1df?w=640&q=80",
+     "landscape.jpg": "https://images.unsplash.com/photo-1506744038136-46273834b3fb?w=640&q=80",
+     "indoor.jpg": "https://images.unsplash.com/photo-1502672260266-1c1ef2d93688?w=640&q=80",
+ }
+
+
+ def download_examples():
+     """Download example images if they don't already exist."""
+     os.makedirs(EXAMPLES_DIR, exist_ok=True)
+
+     for filename, url in EXAMPLE_URLS.items():
+         filepath = os.path.join(EXAMPLES_DIR, filename)
+         if os.path.exists(filepath):
+             continue
+         print(f"Downloading {filename}...")
+         try:
+             urllib.request.urlretrieve(url, filepath)
+             print(f"  Saved to {filepath}")
+         except Exception as e:
+             print(f"  Failed to download {filename}: {e}")
+
+
+ if __name__ == "__main__":
+     download_examples()
examples/.gitkeep ADDED
File without changes
inference.py ADDED
@@ -0,0 +1,136 @@
+ """
+ Standalone inference script for DepthLens.
+
+ Usage:
+     python inference.py --input photo.jpg --output depth.png
+     python inference.py --input ./photos/ --output ./results/ --batch
+     python inference.py --input photo.jpg --output depth.png --model large --colormap magma
+ """
+
+ import argparse
+ import time
+ from pathlib import Path
+
+ import numpy as np
+ from PIL import Image
+
+ from models import DepthEstimator
+ from utils import depth_to_colormap, create_side_by_side, create_overlay
+
+
+ IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp", ".tiff"}
+
+
+ def process_single(
+     estimator: DepthEstimator,
+     input_path: Path,
+     output_path: Path,
+     colormap: str,
+     side_by_side: bool,
+     overlay: bool,
+     overlay_alpha: float,
+     save_raw: bool,
+ ):
+     """Process a single image and save results."""
+     print(f"  Processing: {input_path.name}")
+     start = time.time()
+
+     image = Image.open(input_path).convert("RGB")
+     depth = estimator.predict(image)
+
+     elapsed = time.time() - start
+     print(f"  Inference: {elapsed:.2f}s")
+
+     # Save colormapped depth
+     depth_colored = depth_to_colormap(depth, colormap)
+
+     if side_by_side:
+         result = create_side_by_side(image, depth_colored)
+     elif overlay:
+         result = create_overlay(image, depth_colored, alpha=overlay_alpha)
+     else:
+         result = depth_colored
+
+     # Determine output path
+     out = Path(output_path)
+     if out.suffix.lower() == ".npy" or save_raw:
+         raw_path = out.with_suffix(".npy") if out.suffix else out / (input_path.stem + "_depth.npy")
+         np.save(str(raw_path), depth)
+         print(f"  Saved raw depth: {raw_path}")
+
+     if out.suffix.lower() != ".npy":
+         result.save(str(out))
+         print(f"  Saved: {out}")
+
+
+ def process_batch(
+     estimator: DepthEstimator,
+     input_dir: Path,
+     output_dir: Path,
+     colormap: str,
+     side_by_side: bool,
+     overlay: bool,
+     overlay_alpha: float,
+     save_raw: bool,
+ ):
+     """Process all images in a directory."""
+     output_dir.mkdir(parents=True, exist_ok=True)
+
+     images = sorted(
+         p for p in input_dir.iterdir()
+         if p.suffix.lower() in IMAGE_EXTENSIONS
+     )
+
+     if not images:
+         print(f"No images found in {input_dir}")
+         return
+
+     print(f"Found {len(images)} images in {input_dir}")
+     total_start = time.time()
+
+     for img_path in images:
+         out_name = img_path.stem + "_depth.png"
+         out_path = output_dir / out_name
+         process_single(
+             estimator, img_path, out_path,
+             colormap, side_by_side, overlay, overlay_alpha, save_raw,
+         )
+
+     total = time.time() - total_start
+     avg = total / len(images)
+     print(f"\nDone! {len(images)} images in {total:.1f}s (avg {avg:.2f}s/image)")
+
+
+ def main():
+     parser = argparse.ArgumentParser(description="DepthLens — Monocular Depth Estimation")
+     parser.add_argument("--input", required=True, help="Input image path or directory")
+     parser.add_argument("--output", required=True, help="Output path or directory")
+     parser.add_argument("--model", default="small", choices=["small", "large"], help="Model size")
+     parser.add_argument("--colormap", default="inferno", choices=["inferno", "magma", "viridis", "plasma"])
+     parser.add_argument("--side-by-side", action="store_true", help="Generate side-by-side comparison")
+     parser.add_argument("--overlay", action="store_true", help="Generate depth overlay on original")
+     parser.add_argument("--overlay-alpha", type=float, default=0.5, help="Overlay transparency")
+     parser.add_argument("--save-raw", action="store_true", help="Also save raw depth as .npy")
+     parser.add_argument("--batch", action="store_true", help="Process a folder of images")
+     args = parser.parse_args()
+
+     estimator = DepthEstimator(model_size=args.model)
+
+     input_path = Path(args.input)
+     output_path = Path(args.output)
+
+     if args.batch:
+         process_batch(
+             estimator, input_path, output_path,
+             args.colormap, args.side_by_side, args.overlay, args.overlay_alpha, args.save_raw,
+         )
+     else:
+         output_path.parent.mkdir(parents=True, exist_ok=True)
+         process_single(
+             estimator, input_path, output_path,
+             args.colormap, args.side_by_side, args.overlay, args.overlay_alpha, args.save_raw,
+         )
+
+
+ if __name__ == "__main__":
+     main()
models/__init__.py ADDED
@@ -0,0 +1,3 @@
+ from .depth_estimator import DepthEstimator, MODEL_CONFIGS
+
+ __all__ = ["DepthEstimator", "MODEL_CONFIGS"]
models/depth_estimator.py ADDED
@@ -0,0 +1,120 @@
+ """
+ Depth estimation model wrapper using MiDaS.
+
+ Supports two model sizes:
+     - small: MiDaS v2.1 Small (EfficientNet-Lite backbone, fast CPU inference)
+     - large: DPT-Large (Vision Transformer backbone, highest quality)
+ """
+
+ import torch
+ import numpy as np
+ from PIL import Image
+
+
+ # Model configurations
+ MODEL_CONFIGS = {
+     "small": {
+         "repo": "intel-isl/MiDaS",
+         "model_name": "MiDaS_small",
+         "transform_name": "small_transform",
+         "description": "MiDaS v2.1 Small — Fast CPU inference (~0.5s)",
+     },
+     "large": {
+         "repo": "intel-isl/MiDaS",
+         "model_name": "DPT_Large",
+         "transform_name": "dpt_transform",
+         "description": "DPT-Large — Highest quality depth estimation (~3s)",
+     },
+ }
+
+
+ class DepthEstimator:
+     """Monocular depth estimation using MiDaS models."""
+
+     def __init__(self, model_size: str = "small", device: str = None):
+         """
+         Initialize the depth estimator.
+
+         Args:
+             model_size: 'small' or 'large'
+             device: 'cpu' or 'cuda' (auto-detected if None)
+         """
+         if model_size not in MODEL_CONFIGS:
+             raise ValueError(f"Unknown model size '{model_size}'. Choose from: {list(MODEL_CONFIGS.keys())}")
+
+         self.model_size = model_size
+         self.config = MODEL_CONFIGS[model_size]
+         self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")
+
+         self._load_model()
+
+     def _load_model(self):
+         """Load the MiDaS model and transforms from PyTorch Hub."""
+         print(f"Loading {self.config['description']}...")
+
+         # Load model
+         self.model = torch.hub.load(
+             self.config["repo"],
+             self.config["model_name"],
+             trust_repo=True,
+         )
+         self.model.to(self.device)
+         self.model.eval()
+
+         # Load transforms
+         midas_transforms = torch.hub.load(
+             self.config["repo"],
+             "transforms",
+             trust_repo=True,
+         )
+
+         if self.model_size == "small":
+             self.transform = midas_transforms.small_transform
+         else:
+             self.transform = midas_transforms.dpt_transform
+
+         print(f"Model loaded on {self.device}")
+
+     @torch.no_grad()
+     def predict(self, image: Image.Image) -> np.ndarray:
+         """
+         Predict depth from a PIL Image.
+
+         Args:
+             image: Input PIL Image (RGB)
+
+         Returns:
+             depth_map: Normalized depth array (H, W) with values in [0, 1].
+                 Higher values = closer to camera.
+         """
+         # Convert PIL to numpy RGB
+         img_np = np.array(image.convert("RGB"))
+
+         # Apply MiDaS transform
+         input_tensor = self.transform(img_np).to(self.device)
+
+         # Run inference
+         prediction = self.model(input_tensor)
+
+         # Resize to original dimensions
+         prediction = torch.nn.functional.interpolate(
+             prediction.unsqueeze(1),
+             size=img_np.shape[:2],
+             mode="bicubic",
+             align_corners=False,
+         ).squeeze()
+
+         depth = prediction.cpu().numpy()
+
+         # Normalize to [0, 1]
+         depth_min = depth.min()
+         depth_max = depth.max()
+         if depth_max - depth_min > 1e-6:
+             depth = (depth - depth_min) / (depth_max - depth_min)
+         else:
+             depth = np.zeros_like(depth)
+
+         return depth
+
+     def __repr__(self):
+         return f"DepthEstimator(model_size='{self.model_size}', device='{self.device}')"
requirements.txt ADDED
@@ -0,0 +1,8 @@
+ torch>=1.13.0
+ torchvision>=0.14.0
+ timm>=0.6.12
+ opencv-python-headless>=4.7.0
+ Pillow>=9.4.0
+ numpy>=1.23.0
+ matplotlib>=3.6.0
+ gradio>=4.0.0
utils/__init__.py ADDED
@@ -0,0 +1,15 @@
+ from .visualization import (
+     depth_to_colormap,
+     create_side_by_side,
+     create_overlay,
+     add_depth_legend,
+     COLORMAPS,
+ )
+
+ __all__ = [
+     "depth_to_colormap",
+     "create_side_by_side",
+     "create_overlay",
+     "add_depth_legend",
+     "COLORMAPS",
+ ]
utils/visualization.py ADDED
@@ -0,0 +1,129 @@
+ """
+ Visualization utilities for depth map rendering.
+
+ Provides colormapped depth images, side-by-side comparisons, and overlay modes.
+ """
+
+ import numpy as np
+ from PIL import Image
+ import matplotlib
+ import matplotlib.cm as cm
+
+
+ # Available colormaps for depth visualization
+ COLORMAPS = {
+     "inferno": cm.inferno,
+     "magma": cm.magma,
+     "viridis": cm.viridis,
+     "plasma": cm.plasma,
+ }
+
+
+ def depth_to_colormap(depth: np.ndarray, colormap: str = "inferno") -> Image.Image:
+     """
+     Apply a scientific colormap to a normalized depth array.
+
+     Args:
+         depth: Normalized depth array (H, W), values in [0, 1]
+         colormap: Name of colormap ('inferno', 'magma', 'viridis', 'plasma')
+
+     Returns:
+         Colormapped PIL Image (RGB)
+     """
+     if colormap not in COLORMAPS:
+         raise ValueError(f"Unknown colormap '{colormap}'. Choose from: {list(COLORMAPS.keys())}")
+
+     cmap = COLORMAPS[colormap]
+     colored = cmap(depth)  # Returns (H, W, 4) RGBA float array
+     colored = (colored[:, :, :3] * 255).astype(np.uint8)  # Drop alpha, convert to uint8
+
+     return Image.fromarray(colored)
+
+
+ def create_side_by_side(
+     original: Image.Image,
+     depth_colored: Image.Image,
+     gap: int = 4,
+     bg_color: tuple = (20, 24, 30),
+ ) -> Image.Image:
+     """
+     Create a side-by-side comparison of original image and depth map.
+
+     Args:
+         original: Original PIL Image
+         depth_colored: Colormapped depth PIL Image
+         gap: Pixel gap between images
+         bg_color: Background color for the gap
+
+     Returns:
+         Combined PIL Image
+     """
+     # Resize depth to match original dimensions
+     depth_resized = depth_colored.resize(original.size, Image.LANCZOS)
+
+     w, h = original.size
+     canvas = Image.new("RGB", (w * 2 + gap, h), bg_color)
+     canvas.paste(original, (0, 0))
+     canvas.paste(depth_resized, (w + gap, 0))
+
+     return canvas
+
+
+ def create_overlay(
+     original: Image.Image,
+     depth_colored: Image.Image,
+     alpha: float = 0.5,
+ ) -> Image.Image:
+     """
+     Blend the depth map on top of the original image.
+
+     Args:
+         original: Original PIL Image
+         depth_colored: Colormapped depth PIL Image
+         alpha: Blend factor (0 = only original, 1 = only depth)
+
+     Returns:
+         Blended PIL Image
+     """
+     depth_resized = depth_colored.resize(original.size, Image.LANCZOS)
+     original_rgb = original.convert("RGB")
+
+     blended = Image.blend(original_rgb, depth_resized, alpha)
+     return blended
+
+
+ def add_depth_legend(
+     image: Image.Image,
+     colormap: str = "inferno",
+     bar_height: int = 24,
+     padding: int = 12,
+ ) -> Image.Image:
+     """
+     Add a color legend bar at the bottom of the image.
+
+     Args:
+         image: Input PIL Image
+         colormap: Colormap name for the legend
+         bar_height: Height of the legend bar
+         padding: Padding around the bar
+
+     Returns:
+         Image with legend appended at the bottom
+     """
+     w, h = image.size
+     total_height = h + bar_height + padding * 2
+
+     canvas = Image.new("RGB", (w, total_height), (20, 24, 30))
+     canvas.paste(image, (0, 0))
+
+     # Create gradient bar
+     cmap = COLORMAPS.get(colormap, cm.inferno)
+     gradient = np.linspace(0, 1, w - padding * 2).reshape(1, -1)
+     gradient = np.repeat(gradient, bar_height, axis=0)
+     colored = cmap(gradient)[:, :, :3]
+     colored = (colored * 255).astype(np.uint8)
+     bar_img = Image.fromarray(colored)
+
+     canvas.paste(bar_img, (padding, h + padding))
+
+     return canvas
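
These helpers compose. A sketch that chains the overlay and legend utilities on one prediction, under the same repository-root import assumptions as the earlier examples:

```python
from PIL import Image

from models import DepthEstimator
from utils import depth_to_colormap, create_overlay, add_depth_legend

estimator = DepthEstimator()  # defaults to the small model
image = Image.open("examples/indoor.jpg").convert("RGB")

colored = depth_to_colormap(estimator.predict(image), "viridis")
overlaid = create_overlay(image, colored, alpha=0.4)  # depth tint over the photo
add_depth_legend(overlaid, colormap="viridis").save("overlay_with_legend.png")
```
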