Commit 8cc8ac9
Parent(s): db355c8
Pushed project

Files changed:
- .github/workflows/sync-to-hf.yml  +21 -0
- .gitignore                        +42 -0
- LICENSE                           +1 -1
- README.md                         +158 -0
- app.py                            +144 -0
- download_examples.py              +36 -0
- examples/.gitkeep                 +0 -0
- inference.py                      +136 -0
- models/__init__.py                +3 -0
- models/depth_estimator.py         +120 -0
- requirements.txt                  +8 -0
- utils/__init__.py                 +15 -0
- utils/visualization.py            +129 -0
.github/workflows/sync-to-hf.yml
ADDED
@@ -0,0 +1,21 @@
name: Sync to Hugging Face Space

on:
  push:
    branches: [main]
  workflow_dispatch:

jobs:
  sync-to-hub:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          lfs: true

      - name: Push to HuggingFace Space
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          git push --force https://getmokshshah:$HF_TOKEN@huggingface.co/spaces/getmokshshah/depthlens main
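The workflow expects a repository secret named `HF_TOKEN` holding a Hugging Face access token with write access to the Space. One way to add it, assuming the GitHub CLI is installed and authenticated (the token value is prompted for, not echoed):

```bash
# Create a write-scoped token at https://huggingface.co/settings/tokens first
gh secret set HF_TOKEN
```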
.gitignore
ADDED
@@ -0,0 +1,42 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.egg-info/
dist/
build/
*.egg

# Virtual environments
venv/
env/
.env/

# IDE
.vscode/
.idea/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

# Model weights (downloaded at runtime)
*.pth
*.pt
*.onnx
hub/

# Data
*.npy
results/
outputs/

# Example images (downloaded at runtime)
examples/*.jpg
examples/*.png

# Logs
*.log
runs/
LICENSE
CHANGED
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2026
+Copyright (c) 2026 Moksh Shah
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
README.md
ADDED
@@ -0,0 +1,158 @@
---
title: DepthLens
emoji: 🌀
colorFrom: teal
colorTo: yellow
sdk: gradio
sdk_version: "4.44.1"
app_file: app.py
pinned: false
license: mit
---

# DepthLens — Monocular Depth Estimation

Estimate depth from a single image using state-of-the-art deep learning models. Upload any photo and get a detailed depth map showing how far away each part of the scene is.

**[Try the Live Demo →](https://huggingface.co/spaces/getmokshshah/depthlens)**

---

## What It Does

DepthLens takes a single 2D image and predicts a per-pixel depth map — no stereo cameras, no LiDAR, just one photo. The output is a color-mapped visualization showing relative distances: warm colors (red/yellow) for nearby objects and cool colors (blue/purple) for faraway ones.

This is the same core technique used in autonomous vehicles, AR/VR applications, robotics, and 3D scene reconstruction.

## 📁 Project Structure

```
depthlens/
├── app.py                     # Gradio web app (for HuggingFace Spaces)
├── inference.py               # Standalone inference script
├── download_examples.py       # Fetches sample images at startup
├── requirements.txt           # Python dependencies
├── models/
│   └── depth_estimator.py     # Model wrapper with MiDaS integration
├── utils/
│   └── visualization.py       # Depth map coloring and overlays
└── examples/                  # Sample images for testing
```

## Quick Start

### 1. Install Dependencies

```bash
git clone https://github.com/getmokshshah/depthlens.git
cd depthlens
pip install -r requirements.txt
```

### 2. Run the Web App Locally

```bash
python app.py
```

This opens a Gradio interface at `http://localhost:7860`, where you can upload images and see depth maps.

### 3. Run Inference from the Command Line

```bash
# Single image
python inference.py --input photo.jpg --output depth_result.png

# Folder of images
python inference.py --input ./photos/ --output ./results/ --batch

# Choose model size
python inference.py --input photo.jpg --output depth.png --model large

# Save raw depth as a NumPy array
python inference.py --input photo.jpg --output depth.npy --save-raw
```

## Model Options

| Model | Speed (CPU) | Quality | Memory | Best For |
|-------|-------------|---------|--------|----------|
| `small` (default) | ~0.5s/image | Good | ~200MB | Real-time apps, demos |
| `large` | ~3s/image | Best | ~1GB | High-quality results |

The **small** model (MiDaS v2.1 Small) is optimized for mobile and edge devices. It runs fast on CPU while producing accurate relative depth maps. The **large** model (DPT-Large) uses a Vision Transformer backbone for maximum accuracy.

## Visualization Modes

DepthLens generates multiple visualization styles:

- **Colored Depth Map**: A viridis/inferno/magma colormap applied to the depth prediction, producing a striking false-color image
- **Side-by-Side Comparison**: Original image next to its depth map for easy comparison
- **Depth Overlay**: Semi-transparent depth map blended on top of the original image

## How It Works

1. **Preprocessing**: The input image is resized and normalized to match the model's expected input format using MiDaS transforms
2. **Depth Prediction**: The image passes through a deep neural network (CNN or Vision Transformer) that outputs a per-pixel inverse depth map
3. **Normalization**: Raw depth values are normalized to the [0, 1] range for visualization
4. **Colormap Application**: NumPy and Matplotlib apply scientific colormaps to create visually informative depth images (the four steps are condensed in the sketch below)
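A minimal sketch of these four steps, using the standard MiDaS entry points on PyTorch Hub — the same calls that `models/depth_estimator.py` wraps (`photo.jpg` is a placeholder):

```python
import torch
import numpy as np
from PIL import Image

# 1. Preprocess: load image, apply the matching MiDaS transform
model = torch.hub.load("intel-isl/MiDaS", "MiDaS_small", trust_repo=True).eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms", trust_repo=True)
img = np.array(Image.open("photo.jpg").convert("RGB"))
batch = transforms.small_transform(img)        # (1, 3, H', W') tensor

# 2. Predict: forward pass yields a per-pixel inverse depth map
with torch.no_grad():
    pred = model(batch)                        # (1, H', W')
    pred = torch.nn.functional.interpolate(    # resize back to input size
        pred.unsqueeze(1), size=img.shape[:2],
        mode="bicubic", align_corners=False,
    ).squeeze()

# 3. Normalize to [0, 1]
depth = pred.numpy()
depth = (depth - depth.min()) / (depth.max() - depth.min() + 1e-6)

# 4. Apply a colormap and save
from matplotlib import cm
colored = (cm.inferno(depth)[:, :, :3] * 255).astype(np.uint8)
Image.fromarray(colored).save("depth.png")
```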

### Architecture Details

The **small** model uses an EfficientNet-Lite backbone with a lightweight decoder, designed for fast inference. The **large** model uses DPT (Dense Prediction Transformer) — a Vision Transformer encoder with convolutional decoder heads that produces sharper depth boundaries and more consistent large-scale depth predictions.

## Understanding the Output

- **Warm colors** (red, orange, yellow) → **close** to the camera
- **Cool colors** (blue, purple) → **far** from the camera
- **Depth values are relative**, not absolute — the model predicts which parts of the scene are closer or farther, not exact distances in meters (see the snippet below for inspecting raw values)
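The raw array written by `--save-raw` follows the same convention — normalized inverse depth in [0, 1], where higher means closer — so relative ordering can be inspected directly:

```python
import numpy as np

depth = np.load("depth.npy")   # (H, W) float array in [0, 1]
print(depth.shape, depth.min(), depth.max())

# Higher value = closer: locate the nearest pixel
row, col = np.unravel_index(depth.argmax(), depth.shape)
print(f"Closest point at (row={row}, col={col})")
```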

## Performance

Benchmarked on a 2-core CPU (HuggingFace Spaces free tier):

| Model | Resolution | Inference Time | Peak RAM |
|-------|------------|----------------|----------|
| Small | 256×256 | ~0.4s | ~350MB |
| Small | 512×512 | ~0.8s | ~500MB |
| Large | 384×384 | ~2.8s | ~1.2GB |

## Configuration

### Inference Options

| Argument | Default | Description |
|----------|---------|-------------|
| `--input` | required | Path to image or folder |
| `--output` | required | Output path for results |
| `--model` | `small` | Model size: `small` or `large` |
| `--colormap` | `inferno` | Colormap: `inferno`, `magma`, `viridis`, `plasma` |
| `--side-by-side` | `False` | Generate side-by-side comparison |
| `--overlay` | `False` | Generate depth overlay on original |
| `--overlay-alpha` | `0.5` | Opacity of the depth layer in overlay mode |
| `--save-raw` | `False` | Save raw depth as a `.npy` file |
| `--batch` | `False` | Process a folder of images |

## Use Cases

- **3D Scene Understanding**: Recover spatial layout from a single photo
- **Autonomous Systems**: Depth perception for robots and drones
- **AR/VR**: Generate depth data for immersive experiences
- **Photography**: Create depth-based focus effects (synthetic bokeh)
- **Accessibility**: Help describe spatial relationships in scenes

## Limitations

- Depth is **relative**, not metric — objects are ranked near-to-far, but without real-world distances
- Transparent and reflective surfaces (glass, mirrors, water) can confuse the model
- Very dark or overexposed regions may have unreliable depth predictions
- The model performs best on natural outdoor scenes and indoor rooms

## License

MIT License — free to use for research or commercial projects.

## Credits

- **MiDaS**: Ranftl et al., "Towards Robust Monocular Depth Estimation" (Intel ISL)
- **DPT**: Ranftl et al., "Vision Transformers for Dense Prediction" (ICCV 2021)
- **Built with**: PyTorch, OpenCV, Gradio, NumPy, Matplotlib
app.py
ADDED
@@ -0,0 +1,144 @@
"""
DepthLens — Monocular Depth Estimation
Gradio app for HuggingFace Spaces deployment.

Estimates depth from a single image using MiDaS models.
Optimized for free-tier CPU inference.
"""

import time

import gradio as gr
import numpy as np
from PIL import Image

from models import DepthEstimator
from utils import depth_to_colormap, create_side_by_side, create_overlay
from download_examples import download_examples

# ──────────────────────────────────────────────
# Download example images if missing
# ──────────────────────────────────────────────
download_examples()

# ──────────────────────────────────────────────
# Load model at startup (small for CPU speed)
# ──────────────────────────────────────────────
print("Starting DepthLens...")
estimator = DepthEstimator(model_size="small")
print("Ready!")


def predict(
    image: Image.Image,
    colormap: str,
    output_mode: str,
    overlay_alpha: float,
) -> tuple:
    """
    Run depth estimation and return results.

    Returns:
        (result_image, depth_colored, stats_string)
    """
    if image is None:
        raise gr.Error("Please upload an image first.")

    start = time.time()

    # Run depth estimation
    image_rgb = image.convert("RGB")
    depth = estimator.predict(image_rgb)

    inference_time = time.time() - start

    # Create colormapped depth
    depth_colored = depth_to_colormap(depth, colormap.lower())

    # Create output based on mode
    if output_mode == "Side-by-Side":
        result = create_side_by_side(image_rgb, depth_colored)
    elif output_mode == "Overlay":
        result = create_overlay(image_rgb, depth_colored, alpha=overlay_alpha)
    else:
        result = depth_colored

    # Stats
    w, h = image_rgb.size
    stats = f"{w}×{h} · {inference_time:.2f}s inference · MiDaS Small"

    return result, depth_colored, stats


# ──────────────────────────────────────────────
# Gradio Interface
# ──────────────────────────────────────────────
with gr.Blocks(
    title="DepthLens — Monocular Depth Estimation",
    theme=gr.themes.Base(
        primary_hue="teal",
        neutral_hue="slate",
    ),
) as demo:
    gr.Markdown(
        """
        # DepthLens — Monocular Depth Estimation
        Upload any image to estimate per-pixel depth using MiDaS.
        Warm colors = close, cool colors = far.
        """
    )

    with gr.Row():
        with gr.Column(scale=1):
            input_image = gr.Image(type="pil", label="Input Image")
            colormap = gr.Dropdown(
                choices=["Inferno", "Magma", "Viridis", "Plasma"],
                value="Inferno",
                label="Colormap",
            )
            output_mode = gr.Radio(
                choices=["Depth Map", "Side-by-Side", "Overlay"],
                value="Depth Map",
                label="Output Mode",
            )
            overlay_alpha = gr.Slider(
                minimum=0.2, maximum=0.8, value=0.5, step=0.1,
                label="Overlay Opacity",
                visible=False,
            )
            run_btn = gr.Button("Estimate Depth", variant="primary")
            stats = gr.Textbox(label="Info", interactive=False)

        with gr.Column(scale=1):
            result_image = gr.Image(type="pil", label="Result")
            depth_image = gr.Image(type="pil", label="Depth Map", visible=False)

    # Show/hide overlay slider
    def toggle_overlay(mode):
        return gr.update(visible=(mode == "Overlay"))

    output_mode.change(toggle_overlay, output_mode, overlay_alpha)

    # Run prediction
    run_btn.click(
        fn=predict,
        inputs=[input_image, colormap, output_mode, overlay_alpha],
        outputs=[result_image, depth_image, stats],
    )

    # Examples
    gr.Examples(
        examples=[
            ["examples/street.jpg", "Inferno", "Side-by-Side", 0.5],
            ["examples/landscape.jpg", "Magma", "Depth Map", 0.5],
            ["examples/indoor.jpg", "Viridis", "Overlay", 0.5],
        ],
        inputs=[input_image, colormap, output_mode, overlay_alpha],
        outputs=[result_image, depth_image, stats],
        fn=predict,
        cache_examples=False,
    )


if __name__ == "__main__":
    demo.launch()
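On Spaces, the bare `demo.launch()` is all that is needed. For local testing on a LAN, standard Gradio launch arguments can be passed instead — a sketch, not part of the committed app:

```python
# Hypothetical local variant: bind to all interfaces on the default port.
demo.launch(server_name="0.0.0.0", server_port=7860)
```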
download_examples.py
ADDED
@@ -0,0 +1,36 @@
"""
Downloads example images for the DepthLens demo.
Called automatically by app.py on startup if images are missing.
Uses direct Unsplash image URLs (free, no API key needed).
"""

import os
import urllib.request

EXAMPLES_DIR = os.path.join(os.path.dirname(__file__), "examples")

EXAMPLE_URLS = {
    "street.jpg": "https://images.unsplash.com/photo-1477959858617-67f85cf4f1df?w=640&q=80",
    "landscape.jpg": "https://images.unsplash.com/photo-1506744038136-46273834b3fb?w=640&q=80",
    "indoor.jpg": "https://images.unsplash.com/photo-1502672260266-1c1ef2d93688?w=640&q=80",
}


def download_examples():
    """Download example images if they don't already exist."""
    os.makedirs(EXAMPLES_DIR, exist_ok=True)

    for filename, url in EXAMPLE_URLS.items():
        filepath = os.path.join(EXAMPLES_DIR, filename)
        if os.path.exists(filepath):
            continue
        print(f"Downloading {filename}...")
        try:
            urllib.request.urlretrieve(url, filepath)
            print(f"  Saved to {filepath}")
        except Exception as e:
            print(f"  Failed to download {filename}: {e}")


if __name__ == "__main__":
    download_examples()
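The download is idempotent — existing files are skipped — so it is safe to run manually before the first launch:

```bash
python download_examples.py   # fetches street.jpg, landscape.jpg, indoor.jpg into examples/
```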
examples/.gitkeep
ADDED
File without changes
inference.py
ADDED
@@ -0,0 +1,136 @@
"""
Standalone inference script for DepthLens.

Usage:
    python inference.py --input photo.jpg --output depth.png
    python inference.py --input ./photos/ --output ./results/ --batch
    python inference.py --input photo.jpg --output depth.png --model large --colormap magma
"""

import argparse
import time
from pathlib import Path

import numpy as np
from PIL import Image

from models import DepthEstimator
from utils import depth_to_colormap, create_side_by_side, create_overlay


IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp", ".tiff"}


def process_single(
    estimator: DepthEstimator,
    input_path: Path,
    output_path: Path,
    colormap: str,
    side_by_side: bool,
    overlay: bool,
    overlay_alpha: float,
    save_raw: bool,
):
    """Process a single image and save results."""
    print(f"  Processing: {input_path.name}")
    start = time.time()

    image = Image.open(input_path).convert("RGB")
    depth = estimator.predict(image)

    elapsed = time.time() - start
    print(f"  Inference: {elapsed:.2f}s")

    # Save colormapped depth
    depth_colored = depth_to_colormap(depth, colormap)

    if side_by_side:
        result = create_side_by_side(image, depth_colored)
    elif overlay:
        result = create_overlay(image, depth_colored, alpha=overlay_alpha)
    else:
        result = depth_colored

    # Determine output path
    out = Path(output_path)
    if out.suffix.lower() == ".npy" or save_raw:
        raw_path = out.with_suffix(".npy") if out.suffix else out / (input_path.stem + "_depth.npy")
        np.save(str(raw_path), depth)
        print(f"  Saved raw depth: {raw_path}")

    if out.suffix.lower() != ".npy":
        result.save(str(out))
        print(f"  Saved: {out}")


def process_batch(
    estimator: DepthEstimator,
    input_dir: Path,
    output_dir: Path,
    colormap: str,
    side_by_side: bool,
    overlay: bool,
    overlay_alpha: float,
    save_raw: bool,
):
    """Process all images in a directory."""
    output_dir.mkdir(parents=True, exist_ok=True)

    images = sorted(
        p for p in input_dir.iterdir()
        if p.suffix.lower() in IMAGE_EXTENSIONS
    )

    if not images:
        print(f"No images found in {input_dir}")
        return

    print(f"Found {len(images)} images in {input_dir}")
    total_start = time.time()

    for img_path in images:
        out_name = img_path.stem + "_depth.png"
        out_path = output_dir / out_name
        process_single(
            estimator, img_path, out_path,
            colormap, side_by_side, overlay, overlay_alpha, save_raw,
        )

    total = time.time() - total_start
    avg = total / len(images)
    print(f"\nDone! {len(images)} images in {total:.1f}s (avg {avg:.2f}s/image)")


def main():
    parser = argparse.ArgumentParser(description="DepthLens — Monocular Depth Estimation")
    parser.add_argument("--input", required=True, help="Input image path or directory")
    parser.add_argument("--output", required=True, help="Output path or directory")
    parser.add_argument("--model", default="small", choices=["small", "large"], help="Model size")
    parser.add_argument("--colormap", default="inferno", choices=["inferno", "magma", "viridis", "plasma"])
    parser.add_argument("--side-by-side", action="store_true", help="Generate side-by-side comparison")
    parser.add_argument("--overlay", action="store_true", help="Generate depth overlay on original")
    parser.add_argument("--overlay-alpha", type=float, default=0.5, help="Overlay opacity")
    parser.add_argument("--save-raw", action="store_true", help="Also save raw depth as .npy")
    parser.add_argument("--batch", action="store_true", help="Process a folder of images")
    args = parser.parse_args()

    estimator = DepthEstimator(model_size=args.model)

    input_path = Path(args.input)
    output_path = Path(args.output)

    if args.batch:
        process_batch(
            estimator, input_path, output_path,
            args.colormap, args.side_by_side, args.overlay, args.overlay_alpha, args.save_raw,
        )
    else:
        output_path.parent.mkdir(parents=True, exist_ok=True)
        process_single(
            estimator, input_path, output_path,
            args.colormap, args.side_by_side, args.overlay, args.overlay_alpha, args.save_raw,
        )


if __name__ == "__main__":
    main()
models/__init__.py
ADDED
@@ -0,0 +1,3 @@
from .depth_estimator import DepthEstimator, MODEL_CONFIGS

__all__ = ["DepthEstimator", "MODEL_CONFIGS"]
models/depth_estimator.py
ADDED
@@ -0,0 +1,120 @@
"""
Depth estimation model wrapper using MiDaS.

Supports two model sizes:
- small: MiDaS v2.1 Small (EfficientNet-Lite backbone, fast CPU inference)
- large: DPT-Large (Vision Transformer backbone, highest quality)
"""

import torch
import numpy as np
from PIL import Image


# Model configurations
MODEL_CONFIGS = {
    "small": {
        "repo": "intel-isl/MiDaS",
        "model_name": "MiDaS_small",
        "transform_name": "small_transform",
        "description": "MiDaS v2.1 Small — Fast CPU inference (~0.5s)",
    },
    "large": {
        "repo": "intel-isl/MiDaS",
        "model_name": "DPT_Large",
        "transform_name": "dpt_transform",
        "description": "DPT-Large — Highest quality depth estimation (~3s)",
    },
}


class DepthEstimator:
    """Monocular depth estimation using MiDaS models."""

    def __init__(self, model_size: str = "small", device: str = None):
        """
        Initialize the depth estimator.

        Args:
            model_size: 'small' or 'large'
            device: 'cpu' or 'cuda' (auto-detected if None)
        """
        if model_size not in MODEL_CONFIGS:
            raise ValueError(f"Unknown model size '{model_size}'. Choose from: {list(MODEL_CONFIGS.keys())}")

        self.model_size = model_size
        self.config = MODEL_CONFIGS[model_size]
        self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")

        self._load_model()

    def _load_model(self):
        """Load the MiDaS model and transforms from PyTorch Hub."""
        print(f"Loading {self.config['description']}...")

        # Load model
        self.model = torch.hub.load(
            self.config["repo"],
            self.config["model_name"],
            trust_repo=True,
        )
        self.model.to(self.device)
        self.model.eval()

        # Load the matching input transform (name comes from MODEL_CONFIGS)
        midas_transforms = torch.hub.load(
            self.config["repo"],
            "transforms",
            trust_repo=True,
        )
        self.transform = getattr(midas_transforms, self.config["transform_name"])

        print(f"Model loaded on {self.device}")

    @torch.no_grad()
    def predict(self, image: Image.Image) -> np.ndarray:
        """
        Predict depth from a PIL Image.

        Args:
            image: Input PIL Image (RGB)

        Returns:
            depth_map: Normalized depth array (H, W) with values in [0, 1].
                Higher values = closer to camera.
        """
        # Convert PIL to numpy RGB
        img_np = np.array(image.convert("RGB"))

        # Apply MiDaS transform
        input_tensor = self.transform(img_np).to(self.device)

        # Run inference
        prediction = self.model(input_tensor)

        # Resize to original dimensions
        prediction = torch.nn.functional.interpolate(
            prediction.unsqueeze(1),
            size=img_np.shape[:2],
            mode="bicubic",
            align_corners=False,
        ).squeeze()

        depth = prediction.cpu().numpy()

        # Normalize to [0, 1]
        depth_min = depth.min()
        depth_max = depth.max()
        if depth_max - depth_min > 1e-6:
            depth = (depth - depth_min) / (depth_max - depth_min)
        else:
            depth = np.zeros_like(depth)

        return depth

    def __repr__(self):
        return f"DepthEstimator(model_size='{self.model_size}', device='{self.device}')"
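A minimal usage sketch of the wrapper (assumes a local `photo.jpg`; the first call downloads weights from PyTorch Hub):

```python
from PIL import Image
from models import DepthEstimator

estimator = DepthEstimator(model_size="small")   # or "large"
depth = estimator.predict(Image.open("photo.jpg"))
print(estimator, depth.shape, depth.min(), depth.max())  # (H, W) array in [0, 1]
```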
requirements.txt
ADDED
@@ -0,0 +1,8 @@
torch>=1.13.0
torchvision>=0.14.0
timm>=0.6.12
opencv-python-headless>=4.7.0
Pillow>=9.4.0
numpy>=1.23.0
matplotlib>=3.6.0
gradio>=4.0.0
utils/__init__.py
ADDED
@@ -0,0 +1,15 @@
from .visualization import (
    depth_to_colormap,
    create_side_by_side,
    create_overlay,
    add_depth_legend,
    COLORMAPS,
)

__all__ = [
    "depth_to_colormap",
    "create_side_by_side",
    "create_overlay",
    "add_depth_legend",
    "COLORMAPS",
]
utils/visualization.py
ADDED
@@ -0,0 +1,129 @@
"""
Visualization utilities for depth map rendering.

Provides colormapped depth images, side-by-side comparisons, and overlay modes.
"""

import numpy as np
from PIL import Image
import matplotlib.cm as cm


# Available colormaps for depth visualization
COLORMAPS = {
    "inferno": cm.inferno,
    "magma": cm.magma,
    "viridis": cm.viridis,
    "plasma": cm.plasma,
}


def depth_to_colormap(depth: np.ndarray, colormap: str = "inferno") -> Image.Image:
    """
    Apply a scientific colormap to a normalized depth array.

    Args:
        depth: Normalized depth array (H, W), values in [0, 1]
        colormap: Name of colormap ('inferno', 'magma', 'viridis', 'plasma')

    Returns:
        Colormapped PIL Image (RGB)
    """
    if colormap not in COLORMAPS:
        raise ValueError(f"Unknown colormap '{colormap}'. Choose from: {list(COLORMAPS.keys())}")

    cmap = COLORMAPS[colormap]
    colored = cmap(depth)  # Returns (H, W, 4) RGBA float array
    colored = (colored[:, :, :3] * 255).astype(np.uint8)  # Drop alpha, convert to uint8

    return Image.fromarray(colored)


def create_side_by_side(
    original: Image.Image,
    depth_colored: Image.Image,
    gap: int = 4,
    bg_color: tuple = (20, 24, 30),
) -> Image.Image:
    """
    Create a side-by-side comparison of the original image and its depth map.

    Args:
        original: Original PIL Image
        depth_colored: Colormapped depth PIL Image
        gap: Pixel gap between images
        bg_color: Background color for the gap

    Returns:
        Combined PIL Image
    """
    # Resize depth to match original dimensions
    depth_resized = depth_colored.resize(original.size, Image.LANCZOS)

    w, h = original.size
    canvas = Image.new("RGB", (w * 2 + gap, h), bg_color)
    canvas.paste(original, (0, 0))
    canvas.paste(depth_resized, (w + gap, 0))

    return canvas


def create_overlay(
    original: Image.Image,
    depth_colored: Image.Image,
    alpha: float = 0.5,
) -> Image.Image:
    """
    Blend the depth map on top of the original image.

    Args:
        original: Original PIL Image
        depth_colored: Colormapped depth PIL Image
        alpha: Blend factor (0 = only original, 1 = only depth)

    Returns:
        Blended PIL Image
    """
    depth_resized = depth_colored.resize(original.size, Image.LANCZOS)
    original_rgb = original.convert("RGB")

    blended = Image.blend(original_rgb, depth_resized, alpha)
    return blended


def add_depth_legend(
    image: Image.Image,
    colormap: str = "inferno",
    bar_height: int = 24,
    padding: int = 12,
) -> Image.Image:
    """
    Add a color legend bar at the bottom of the image.

    Args:
        image: Input PIL Image
        colormap: Colormap name for the legend
        bar_height: Height of the legend bar
        padding: Padding around the bar

    Returns:
        Image with legend appended at the bottom
    """
    w, h = image.size
    total_height = h + bar_height + padding * 2

    canvas = Image.new("RGB", (w, total_height), (20, 24, 30))
    canvas.paste(image, (0, 0))

    # Create a horizontal gradient bar colored with the chosen colormap
    cmap = COLORMAPS.get(colormap, cm.inferno)
    gradient = np.linspace(0, 1, w - padding * 2).reshape(1, -1)
    gradient = np.repeat(gradient, bar_height, axis=0)
    colored = cmap(gradient)[:, :, :3]
    colored = (colored * 255).astype(np.uint8)
    bar_img = Image.fromarray(colored)

    canvas.paste(bar_img, (padding, h + padding))

    return canvas
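The helpers compose; a short sketch tying them together with the model wrapper (assumes a local `photo.jpg`):

```python
from PIL import Image
from models import DepthEstimator
from utils import depth_to_colormap, create_side_by_side, add_depth_legend

image = Image.open("photo.jpg").convert("RGB")
depth = DepthEstimator(model_size="small").predict(image)

colored = depth_to_colormap(depth, "viridis")
panel = create_side_by_side(image, colored)
panel = add_depth_legend(panel, colormap="viridis")  # legend bar: left = far, right = close
panel.save("comparison.png")
```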