Commit d547008 by zeyuren2002 (verified) · 1 parent: b38fd83

Add files using upload-large-folder tool

Files changed (50)
  1. .gitmodules +0 -0
  2. LICENSE +28 -0
  3. PHASE0_EVALMDE_HANDOFF.md +245 -0
  4. README.md +90 -0
  5. compute_metrics_example.py +12 -0
  6. evalmde/__init__.py +0 -0
  7. evalmde/__pycache__/__init__.cpython-310.pyc +0 -0
  8. evalmde/metrics/__init__.py +0 -0
  9. evalmde/metrics/__pycache__/boundary.cpython-310.pyc +0 -0
  10. evalmde/metrics/boundary.py +346 -0
  11. evalmde/metrics/rel_normal.py +231 -0
  12. evalmde/metrics/sawa_h.py +45 -0
  13. evalmde/metrics/standard.py +214 -0
  14. evalmde/metrics/triangle.py +93 -0
  15. evalmde/utils/__init__.py +0 -0
  16. evalmde/utils/__pycache__/__init__.cpython-310.pyc +0 -0
  17. evalmde/utils/__pycache__/blender.cpython-310.pyc +0 -0
  18. evalmde/utils/__pycache__/common.cpython-310.pyc +0 -0
  19. evalmde/utils/__pycache__/constants.cpython-310.pyc +0 -0
  20. evalmde/utils/__pycache__/depth.cpython-310.pyc +0 -0
  21. evalmde/utils/__pycache__/depth_to_mesh.cpython-310.pyc +0 -0
  22. evalmde/utils/__pycache__/downsample.cpython-310.pyc +0 -0
  23. evalmde/utils/__pycache__/image.cpython-310.pyc +0 -0
  24. evalmde/utils/__pycache__/np_and_th.cpython-310.pyc +0 -0
  25. evalmde/utils/__pycache__/proj.cpython-310.pyc +0 -0
  26. evalmde/utils/__pycache__/torch.cpython-310.pyc +0 -0
  27. evalmde/utils/blender.py +213 -0
  28. evalmde/utils/common.py +60 -0
  29. evalmde/utils/constants.py +2 -0
  30. evalmde/utils/depth.py +132 -0
  31. evalmde/utils/depth_to_mesh.py +150 -0
  32. evalmde/utils/downsample.py +72 -0
  33. evalmde/utils/image.py +45 -0
  34. evalmde/utils/np_and_th.py +27 -0
  35. evalmde/utils/proj.py +41 -0
  36. evalmde/utils/torch.py +26 -0
  37. evalmde/visualization/__init__.py +14 -0
  38. evalmde/visualization/cfg.py +54 -0
  39. evalmde/visualization/render_contour_line.py +256 -0
  40. evalmde/visualization/render_textureless_relighting.py +130 -0
  41. induce_valid_triangle_from_gt_depth.py +29 -0
  42. infinigen5_12612.log +256 -0
  43. infinigen_all_12900.log +0 -0
  44. setup.py +39 -0
  45. smoke_all_12114.log +218 -0
  46. smoke_all_12115.log +207 -0
  47. smoke_all_12351.log +235 -0
  48. smoke_evalmde_12112.log +2 -0
  49. smoke_evalmde_12113.log +34 -0
  50. smoke_lotus_v1_12348.log +20 -0
.gitmodules ADDED
File without changes
LICENSE ADDED
@@ -0,0 +1,28 @@
BSD 3-Clause License

Copyright (c) 2025, Princeton Vision & Learning Lab

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its
   contributors may be used to endorse or promote products derived from
   this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
PHASE0_EVALMDE_HANDOFF.md ADDED
@@ -0,0 +1,245 @@
# Phase 0 EvalMDE Adaptation: Handoff

**Date:** 2026-05-14
**Status:** EvalMDE workspace bootstrapped; main eval script and sbatch still to write.

---

## Goal

Run the 7 MoGe-Phase-0 models on the **95 Infinigen scenes** under the **EvalMDE protocol**
(raw native input, no homography warp), producing **RelNormal + SAWA-H + standard metrics**.

EvalMDE and MoGe are independent workflows. The EvalMDE workspace is at `/home/ywan0794/EvalMDE/`.
Model wrappers are *copied* from MoGe (the single source of truth remains `MoGe/baselines/`),
because the wrappers' `infer(image, intrinsics)` API does not depend on MoGe's eval pipeline.

---

## What's done

### 1. EvalMDE env (Python 3.10): built and verified
The `evalmde` conda env has: torch 2.7.0+cu126, opencv, scipy, utils3d, pipeline, the evalmde package, and
bpy 4.0 (Blender Python, for textureless-relighting visualization).
Sample run `python compute_metrics_example.py` outputs `sawa_h=1.268, rel_normal=0.390` ✓.

### 2. 7 baselines (model wrappers): copied from MoGe and verified
`/home/ywan0794/EvalMDE/baselines/`:
- `depth_pro.py` → emits `depth_metric` (+ `intrinsics` from FOV head)
- `marigold.py` → emits `depth_affine_invariant` (paper: scale_inv+shift_inv → affine)
- `lotus.py` → emits `disparity_affine_invariant` when `--disparity` is set
- `depthmaster.py` → emits `depth_affine_invariant`
- `ppd.py` → emits `depth_affine_invariant` (training quantile normalization)
- `da3_mono.py` → emits `depth_scale_invariant`
- `fe2e.py` → emits `depth_affine_invariant` (Lpred clamped to [0,1])

`MGEBaselineInterface` copied to `/home/ywan0794/EvalMDE/test/baseline.py`.

### 3. EvalMDE-native dataloader skeleton: written
`/home/ywan0794/EvalMDE/scripts/dataloader.py` (`EvalMDELoaderPipeline`):
- Reads `<scene>/rgb.png` + `<scene>/gt_depth.npz` (keys: `depth (H,W)`, `intr (4,) [fx,fy,cx,cy]` in pixels, `valid (H,W) bool`)
- Converts pixel intrinsics to a 3×3 normalized matrix with entries `fx/W, fy/H, cx/W, cy/H` (MoGe convention)
- Computes a 3D pointmap from depth + native pixel intrinsics
- Replaces NaN/invalid pixels with `1.0` (matches the `evalmde/utils/depth.py:load_data` convention)
- Returns a dict with: `image [3,H,W] float [0,1]`, `depth`, `depth_mask`, `intrinsics (3,3)`,
  `points (H,W,3)`, `is_metric=True`, `_intr_px (4,)` (pixel intrinsics as stored in the raw npz, for EvalMDE metrics)

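The intrinsics conversion and pointmap step listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `dataloader.py`: the helper names `normalized_intrinsics` and `depth_to_points` are hypothetical, and the sketch assumes z-depth (not ray distance) and a pixel-center convention of `u + 0.5`, neither of which is confirmed by this handoff.

```python
import numpy as np


def normalized_intrinsics(intr_px, width, height):
    """Build a 3x3 matrix from [fx, fy, cx, cy] in pixels, divided by image size."""
    fx, fy, cx, cy = intr_px
    return np.array(
        [[fx / width, 0.0, cx / width],
         [0.0, fy / height, cy / height],
         [0.0, 0.0, 1.0]],
        dtype=np.float32,
    )


def depth_to_points(depth, intr_px):
    """Unproject an (H, W) z-depth map to an (H, W, 3) camera-space pointmap."""
    fx, fy, cx, cy = intr_px
    h, w = depth.shape
    # Assumption: pixel (row v, col u) has its center at (u + 0.5, v + 0.5).
    u, v = np.meshgrid(np.arange(w) + 0.5, np.arange(h) + 0.5)
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1)
```

If the real loader uses integer pixel coordinates instead of pixel centers, only the `+ 0.5` offsets would change.
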
### 4. Infinigen download: IN PROGRESS (background)
- Source: Princeton GDrive `1amzb6KyF2USFQ5W4CeYKFCh1F-yOQsmp`
- Target: `/home/ywan0794/EvalMDE/data/infinigen/`
- Log: `/tmp/dl_infinigen.log`
- Estimated size: 50-100 GB
- Check state: `du -sh /home/ywan0794/EvalMDE/data/infinigen/`

### 5. Production MoGe-protocol eval: independent track, already running
- `sbatch eval_scripts/eval_all_slurm.sh` submitted earlier (job 12110 etc.)
- 5 models pending (Marigold/Lotus/DepthMaster/PPD/FE2E), 2 already done (DA3-Mono/Depth Pro)
- Results in `/home/ywan0794/MoGe/eval_output/<model>_<TS>.json`
- **The EvalMDE adaptation is a separate effort and does not block the production MoGe eval.**

---

## TODO (was 4 items, now 2 remain)

### ✅ TODO-1: Fix baseline imports (SUPERSEDED by the sys.path approach in run_inference.py)
`EvalMDE/baselines/*.py` still have `from moge.test.baseline import MGEBaselineInterface`.
**Resolved via Option A**: `scripts/run_inference.py` does `sys.path.insert(0, '/home/ywan0794/MoGe')`,
so baselines still resolve their interface from MoGe. No sed needed.

### ✅ TODO-2 (inference driver): `scripts/run_inference.py` (WRITTEN)
`/home/ywan0794/EvalMDE/scripts/run_inference.py`:
- Click CLI with `--baseline /path/to/baselines/<m>.py --data-root <infinigen> --output-root <out> --model-name <name>`
- Passes remaining click args through to the baseline's `load.main(ctx.args)`
- For each scene with `rgb.png + gt_depth.npz`: loads rgb, builds the normalized 3×3 K from GT pixel intr,
  calls `baseline.infer_for_evaluation(image, K_norm)`, picks depth in priority order
  (`depth_metric > depth_scale_invariant > depth_affine_invariant > 1/disparity_affine_invariant`),
  writes `<out>/<model>/<scene>/pred_depth.npz` with EvalMDE keys `{depth, intr (4,) px, valid}`
- For pred intrinsics: uses model-predicted intr if present (Depth Pro), else GT intr

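The depth-selection priority described for `run_inference.py` could look roughly like the sketch below. It is an illustration only: `pick_depth` and the `eps` floor are hypothetical, and it assumes the wrapper outputs have already been converted to NumPy arrays.

```python
import numpy as np

# Priority order as stated in the handoff; key names follow the output key contract.
DEPTH_KEY_PRIORITY = ("depth_metric", "depth_scale_invariant", "depth_affine_invariant")


def pick_depth(output, eps=1e-6):
    """Return (depth, source_key) from a baseline output dict."""
    for key in DEPTH_KEY_PRIORITY:
        if key in output:
            return np.asarray(output[key], dtype=np.float32), key
    if "disparity_affine_invariant" in output:
        disp = np.asarray(output["disparity_affine_invariant"], dtype=np.float32)
        # Assumption: floor the disparity at eps so 1/disp stays finite.
        return 1.0 / np.maximum(disp, eps), "disparity_affine_invariant"
    raise KeyError("baseline output contains no known depth or disparity key")
```

Returning the source key alongside the depth lets the metric stage know which alignment the prediction still needs.
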
### ❗ The original TODO-2 (`scripts/eval.py`) was REWORKED into 2 stages: inference + metric.
This is cleaner: inference runs in the per-model env, and the metric stage runs in the evalmde env.

### TODO-3: Write `scripts/compute_metrics.py` (run in evalmde env)
Reads each model's `pred_depth.npz` plus the GT `gt_depth.npz`, then computes EvalMDE metrics and standard MDE metrics.

Pseudocode:
```python
import json
from pathlib import Path

import click
import numpy as np

from evalmde.utils.depth import load_data
from evalmde.metrics.rel_normal import compute_rel_normal
from evalmde.metrics.sawa_h import compute_sawa_h

METRIC_KEYS = ['sawa_h', 'rel_normal', 'abs_rel', 'delta1']


@click.command()
@click.option('--gt-root', required=True, type=click.Path())    # Infinigen root
@click.option('--pred-root', required=True, type=click.Path())  # output of run_inference.py
@click.option('--model-name', required=True, type=str)
@click.option('--output', required=True, type=click.Path())
def main(gt_root, pred_root, model_name, output):
    gt_root = Path(gt_root)
    pred_root = Path(pred_root) / model_name
    scenes = sorted(d.name for d in pred_root.iterdir() if (d / 'pred_depth.npz').exists())

    results = []
    for scene in scenes:
        # Pass str paths, as compute_metrics_example.py does
        gt_d, gt_intr, gt_v = load_data(str(gt_root / scene / 'gt_depth.npz'))
        pr_d, pr_intr, pr_v = load_data(str(pred_root / scene / 'pred_depth.npz'))

        # SAWA-H aligns internally (affine via least-squares). RelNormal uses surface normals
        # which are invariant to scale but NOT to shift: for affine-invariant preds, the
        # shift will skew normals at far depths. Acceptable caveat in Phase 0; document it.
        sawa = compute_sawa_h(pr_d, pr_intr, pr_v, gt_d, gt_intr, gt_v)
        rnorm = compute_rel_normal(pr_d, pr_intr, pr_v, gt_d, gt_intr, gt_v)

        # Standard AbsRel + δ1 after affine alignment
        mask = gt_v & pr_v
        gtm, prm = gt_d[mask], pr_d[mask]
        # Fit gtm ≈ a * prm + b by least squares
        A = np.stack([prm, np.ones_like(prm)], axis=-1)
        a, b = np.linalg.lstsq(A, gtm, rcond=None)[0]
        # Clip so the ratio test below never divides by a non-positive depth
        am = np.maximum(a * prm + b, 1e-6)
        gtm = np.maximum(gtm, 1e-6)
        abs_rel = np.mean(np.abs(am - gtm) / gtm)
        delta1 = np.mean(np.maximum(am / gtm, gtm / am) < 1.25)

        results.append({'scene': scene, 'sawa_h': float(sawa), 'rel_normal': float(rnorm),
                        'abs_rel': float(abs_rel), 'delta1': float(delta1)})

    # Per-scene results plus aggregate mean
    summary = {'per_scene': results,
               'mean': {k: float(np.mean([r[k] for r in results])) for k in METRIC_KEYS}}
    with open(output, 'w') as f:
        json.dump(summary, f, indent=2)


if __name__ == '__main__':
    main()
```

**Note on alignment**: `compute_sawa_h` aligns internally (via `align_depth_least_square` + `align_affine_lstsq`),
so passing the RAW (affine-invariant) pred is correct. `compute_rel_normal` does NOT align, so its
inputs should be in a comparable depth scale. For Phase 0 simplicity, pass the raw pred and document
the affine-shift caveat in the analysis. For a stricter eval, affine-align the pred before RelNormal.

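For the stricter variant mentioned in the note, a pre-alignment step might be sketched as below. The helper name `affine_align` is hypothetical and this is not the alignment used inside `compute_sawa_h`; it is a plain least-squares fit of scale and shift over the jointly valid pixels.

```python
import numpy as np


def affine_align(pred, gt, mask):
    """Fit gt ≈ a * pred + b over masked pixels and return (aligned_pred, a, b)."""
    x = pred[mask].astype(np.float64)
    y = gt[mask].astype(np.float64)
    # Design matrix [pred, 1] for the two unknowns (scale a, shift b)
    A = np.stack([x, np.ones_like(x)], axis=-1)
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return a * pred + b, float(a), float(b)
```

The aligned map can contain non-positive values where the prediction is poor, so a caller would likely want to update the validity mask before passing it to `compute_rel_normal`.
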
### TODO-4: Scene list / config
Once the Infinigen download succeeds (currently blocked, see issue below), `run_inference.py`
auto-discovers all scene dirs under `--data-root`. If a subset is wanted, write
`scenes.txt` and add filtering in run_inference.py (~3 lines).

### TODO-5: sbatch `eval_scripts/eval_evalmde_all_slurm.sh`
Same pattern as MoGe's `sanity_all_slurm.sh`: single sbatch, single H100, serial per-model.
For each of the 7 models: `conda activate <env>; python scripts/run_inference.py --baseline baselines/<m>.py ...`
Then, after all 7 inference runs finish: `conda activate evalmde; for m in ...; do python scripts/compute_metrics.py --model-name $m ...; done`

Per-model envs do **not** need `evalmde` pip-installed: they only run inference (which needs torch + model wrapper deps).
Only the metric-aggregation stage imports `evalmde.metrics`, and it runs in the evalmde env, so the per-model envs need no extra install.

### TODO-3: Scene list / config
Once Infinigen download finishes, inspect actual layout:
```bash
ls /home/ywan0794/EvalMDE/data/infinigen/ | head -20
```
If scenes are `scene_001/`, `scene_002/`, ...: dataloader auto-discovers them.
If grouped under sub-folders or different naming: may need a manual `scenes.txt` split file.

### TODO-4: sbatch `EvalMDE/eval_scripts/eval_evalmde_all_slurm.sh`
Mirror MoGe's `sanity_all_slurm.sh` structure:
- Single sbatch, single H100, serial per-model
- For each model: activate model's conda env, run `python scripts/eval.py --baseline baselines/<m>.py --data-root data/infinigen --output results/<m>.json`
- After all inference done, optionally re-aggregate in evalmde env for cross-model summary

Per-model env mapping same as MoGe:
| model | env |
|---|---|
| depth_pro | depth-pro |
| marigold | marigold |
| lotus | lotus |
| depthmaster | depthmaster |
| ppd | ppd |
| da3_mono | da3 |
| fe2e | fe2e |

Plus: each env needs `evalmde` package installed (`pip install -e /home/ywan0794/EvalMDE`)
so `from evalmde.metrics.* import compute_rel_normal, compute_sawa_h` works inside model envs.

---

## Paper-canonical inference parameters (locked, confirmed against each repo)

| Model | Args | Source |
|---|---|---|
| Depth Pro | `--precision fp32` | `create_model_and_transforms()` default |
| Marigold | v1-1 + `--denoise_steps 4 --ensemble_size 1` | (user decision: balanced speed) |
| Lotus | g-v2-1-disparity + `--mode generation --disparity --timestep 999 --fp16 --seed 42` | `Lotus/eval.sh` |
| DepthMaster | `--processing_res 768` | `DepthMaster/scripts/infer.sh` |
| PPD | `--semantics_model MoGe2 --semantics_pth checkpoints/moge2.pt --model_pth checkpoints/ppd_moge.pth --sampling_steps 4` | `PPD/ppd/configs/eval.yaml` |
| DA3-Mono | `--hf_id depth-anything/DA3MONO-LARGE` | DA3 README |
| FE2E | `--prompt_type empty --single_denoise --cfg_guidance 6.0 --size_level 768` | `FE2E/README.md` eval block |

---

## Key insights to preserve

1. **EvalMDE protocol uses raw native input, no homography warp.** MoGe's eval pipeline
   does aggressive canonical-view warping (`dataloader.py:_process_instance:119-180`).
   That is MoGe-paper-specific; EvalMDE explicitly uses raw inputs (see `compute_metrics_example.py`).

2. **Output key contract** (per MGEBaselineInterface):
   - `depth_metric` → metric depth in meters (Depth Pro)
   - `depth_scale_invariant` → scale-invariant relative depth (DA3-Mono)
   - `depth_affine_invariant` → affine-invariant depth (Marigold/DepthMaster/PPD/FE2E)
   - `disparity_affine_invariant` → affine-invariant disparity (Lotus disparity ckpts)

3. **Pre-alignment for SAWA-H/RelNormal**: SAWA-H does affine alignment internally
   (`evalmde/metrics/sawa_h.py:compute_sawa_h` uses `align_depth_least_square` + `align_affine_lstsq`),
   so you can pass RAW pred depth to SAWA-H. RelNormal works on normals, which are
   scale-invariant in the limit, but **a shift in depth space WILL skew normals at far depths**,
   so for affine-invariant pred models, do an affine align before passing to `compute_rel_normal`.

4. **MoGe's eval can run in parallel with EvalMDE work.** The production `eval_all_slurm.sh`
   is already running. Don't disturb it.

5. **Lotus disparity ckpt inversion was numerically unstable** (1/disp blows up near
   disparity=0). For EvalMDE, only emit `disparity_affine_invariant` from Lotus, then
   convert: `aligned_disp = scale*disp + shift` (fit in disp space), `aligned_depth = 1/aligned_disp.clamp(1/gt_depth_max)`.
   Reference: `moge/test/metrics.py:202-218`, the `disparity_affine_invariant` block.

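The disparity conversion in insight 5 can be sketched in NumPy as follows. This is an illustrative reading of the two formulas above, not the code in `moge/test/metrics.py`; the function name `disparity_to_aligned_depth` is hypothetical, and the fit is assumed to be an ordinary least-squares fit against `1/gt_depth` on valid pixels.

```python
import numpy as np


def disparity_to_aligned_depth(pred_disp, gt_depth, mask):
    """Align predicted disparity to GT disparity, then invert with a lower clamp."""
    gt_disp = 1.0 / gt_depth[mask]
    x = pred_disp[mask]
    # Fit gt_disp ≈ scale * pred_disp + shift in disparity space
    A = np.stack([x, np.ones_like(x)], axis=-1)
    (scale, shift), *_ = np.linalg.lstsq(A, gt_disp, rcond=None)
    aligned_disp = scale * pred_disp + shift
    # Clamp at 1 / max GT depth so the inversion cannot exceed the GT depth range
    min_disp = 1.0 / gt_depth[mask].max()
    return 1.0 / np.maximum(aligned_disp, min_disp)
```

Clamping before the division is what keeps near-zero or negative aligned disparities from producing huge or negative depths.
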
---

## Resume instructions

1. `cd /home/ywan0794/EvalMDE`
2. Check Infinigen download: `du -sh data/infinigen; tail /tmp/dl_infinigen.log`
3. Fix imports (TODO-1):
   ```bash
   sed -i 's|from moge.test.baseline|from test.baseline|g' baselines/*.py
   ```
4. Write `scripts/eval.py` (TODO-2) using the pseudocode above.
5. Test on 1 scene with depth_pro: `python scripts/eval.py --baseline baselines/depth_pro.py --data-root data/infinigen --output /tmp/test.json --repo /home/ywan0794/EvalMDE/ml-depth-pro --checkpoint /home/ywan0794/EvalMDE/ml-depth-pro/checkpoints/depth_pro.pt`
6. Inspect `/tmp/test.json`. If sane (rel_normal in [0, 1] rad, sawa_h plausible),
   proceed to write sbatch (TODO-4).

---

**End of handoff.**
README.md ADDED
@@ -0,0 +1,90 @@
# Toward A Better Understanding of Monocular Depth Evaluation
This repository contains the source code for our paper:

[Toward A Better Understanding of Monocular Depth Evaluation](https://arxiv.org/abs/2510.19814)<br/>
[Siyang Wu](https://nj-wusiyang.github.io/), Jack Nugent, Willow Yang, [Jia Deng](https://www.cs.princeton.edu/~jiadeng/)

```
@article{wu2025evaluate,
  title={Toward A Better Understanding of Monocular Depth Evaluation},
  author={Wu, Siyang and Nugent, Jack and Yang, Willow and Deng, Jia},
  journal={arXiv preprint arXiv:2510.19814},
  year={2025}
}
```

## Installation Instructions
Under `EvalMDE`, run:
```bash
conda create -n evalmde python=3.10 -y
conda activate evalmde

pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu126
pip install -e .
pip install bpy==4.0.0 --extra-index-url https://download.blender.org/pypi/
```

## Data Format
### Depth map file
This repository takes depth map files in `*.npz` format, with keys `depth`, `intr`, `valid`:
+ `depth`: `(H,W)`-shaped numpy array that stores depth values;
+ `intr`: `(4,)`-shaped numpy array that stores camera intrinsics `[fx, fy, cx, cy]`, in pixels;
+ `valid`: `(H,W)`-shaped boolean numpy array that stores whether the depth value of each pixel is valid (e.g. a pixel with `inf`, `nan`, or an extreme depth value is invalid).

`sample_data/gt_depth.npz`, `sample_data/curv_low_freq__0.200_10.0.npz`, `sample_data_2/gt_depth.npz`, `sample_data_2/depthpro_gt_focal.npz` provide examples of depth map files.
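
As a rough illustration of the format described above, the sketch below writes and reads back a tiny depth map file. The helper names `make_valid_mask` and `save_depth_npz` and the `max_depth` cutoff are hypothetical, not part of this repository; the repository's own loader is `evalmde.utils.depth.load_data`.

```python
import io

import numpy as np


def make_valid_mask(depth, max_depth=1e3):
    """Mark finite, positive, non-extreme depth values as valid."""
    # Assumption: max_depth is an arbitrary cutoff for "extreme" values.
    with np.errstate(invalid="ignore"):
        return np.isfinite(depth) & (depth > 0) & (depth < max_depth)


def save_depth_npz(file, depth, intr, valid):
    """Write a depth map file with the keys depth, intr, valid."""
    depth = np.asarray(depth, dtype=np.float32)
    intr = np.asarray(intr, dtype=np.float32)
    valid = np.asarray(valid, dtype=bool)
    assert depth.ndim == 2 and valid.shape == depth.shape and intr.shape == (4,)
    np.savez_compressed(file, depth=depth, intr=intr, valid=valid)


# Round trip through an in-memory buffer; a file path works the same way.
depth = np.array([[1.0, np.inf], [np.nan, 2.5]], dtype=np.float32)
buf = io.BytesIO()
save_depth_npz(buf, depth, [500.0, 500.0, 1.0, 1.0], make_valid_mask(depth))
buf.seek(0)
loaded = np.load(buf)
print(loaded["depth"].shape, loaded["intr"], loaded["valid"])
```
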
### Valid triangle (Required For Textureless Relighting Visualization)
In textureless relighting, we induce a textureless mesh from the depth map and camera intrinsics. The mesh consists of triangle faces with vertices `(i,j), (i+1,j), (i,j+1)` and `(i+1,j+1), (i+1,j), (i,j+1)`.
Triangle faces across occlusion boundaries should be excluded.

`valid_triangle.npz` specifies which triangles are included (`True`) and which are not (`False`).
It has a single key, `valid_triangle`, a `(H-1,W-1,2)`-shaped boolean numpy array, where `valid_triangle[i,j,0]` and `valid_triangle[i,j,1]` indicate whether the triangles `(i,j), (i+1,j), (i,j+1)` and `(i+1,j+1), (i+1,j), (i,j+1)`, respectively, are included (`True`) or excluded (`False`).
`sample_data/valid_triangle.npz` provides an example.

**Induce valid triangle from ground truth depth.** Valid triangles can be induced from ground truth depth by detecting occlusion boundaries with a heuristic.
In `induce_valid_triangle_from_gt_depth.py`, we provide an example script that detects occlusion boundaries from the relative depth between neighboring pixels and marks triangles across occlusion boundaries as invalid.
Running `python induce_valid_triangle_from_gt_depth.py` generates `sample_data_2/valid_triangle.npz`.

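One possible form of such a heuristic is sketched below. It is not the logic of `induce_valid_triangle_from_gt_depth.py`: the function name `induce_valid_triangle`, the max/min depth-ratio test, and the `rel_thresh=1.05` value are all assumptions chosen for illustration. Only the triangle layout and output shape follow the description above.

```python
import numpy as np


def induce_valid_triangle(depth, valid, rel_thresh=1.05):
    """Return a (H-1, W-1, 2) boolean array of triangles to keep."""
    # Corner views of every 2x2 pixel cell
    d00, d10 = depth[:-1, :-1], depth[1:, :-1]  # (i, j), (i+1, j)
    d01, d11 = depth[:-1, 1:], depth[1:, 1:]    # (i, j+1), (i+1, j+1)
    v00, v10 = valid[:-1, :-1], valid[1:, :-1]
    v01, v11 = valid[:-1, 1:], valid[1:, 1:]

    def keep(da, db, dc, va, vb, vc):
        # A triangle is kept if all vertices are valid and their depths are
        # within a relative factor of each other (no occlusion jump).
        stack = np.stack([da, db, dc])
        ratio = stack.max(axis=0) / np.maximum(stack.min(axis=0), 1e-12)
        return va & vb & vc & (ratio < rel_thresh)

    tri0 = keep(d00, d10, d01, v00, v10, v01)  # (i,j), (i+1,j), (i,j+1)
    tri1 = keep(d11, d10, d01, v11, v10, v01)  # (i+1,j+1), (i+1,j), (i,j+1)
    return np.stack([tri0, tri1], axis=-1)
```

The result could be saved with `np.savez(path, valid_triangle=...)` to match the key described above.
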
## Compute Metric
Please refer to `compute_metrics_example.py`.

## Visualization
### Projected Contours

<img src="images/projected_contours.png">

```bash
ROOT=sample_data  # Path to directory where rgb.png is located
# ROOT=sample_data_2
DEPTH_F=gt_depth.npz  # Path to depth map to draw visualization, relative to $ROOT
# DEPTH_F=curv_low_freq__0.200_10.0.npz  # when ROOT=sample_data
# DEPTH_F=depthpro_gt_focal.npz  # when ROOT=sample_data_2
python evalmde/visualization/render_contour_line.py $ROOT --depth_f $DEPTH_F
```
Running the above command generates the projected contours visualization under `sample_data/contour_line` or `sample_data_2/contour_line`.
Projected contours of different densities along different axes are generated.
### Textureless Relighting

<img src="images/textureless_relighting.png">

```bash
ROOT=sample_data  # Path to directory where rgb.png is located
# ROOT=sample_data_2
DEPTH_F=gt_depth.npz  # Path to depth map to draw visualization, relative to $ROOT
# DEPTH_F=curv_low_freq__0.200_10.0.npz  # when ROOT=sample_data
# DEPTH_F=depthpro_gt_focal.npz  # when ROOT=sample_data_2
LIGHT_L=0  # specifies light direction
LIGHT_R=5  # specifies light direction
python evalmde/visualization/render_textureless_relighting.py $ROOT --depth_f $DEPTH_F --light_l $LIGHT_L --light_r $LIGHT_R
```
Running the above command generates the textureless relighting visualization under `sample_data/visualization`.
By default, the script renders the visualization on the GPU. Add `--cpu` to run everything on the CPU.

`ROT_LIGHT_NUM_LIGHT` and `ROT_LIGHT_NUM_LOOP` in `evalmde/visualization/__init__.py` specify the light configuration.
The `ROT_LIGHT_NUM_LIGHT` directional light source locations are equally spaced along a path that spirals up from `(0,0,-1)` to `(0,0,1)` on the surface of a unit sphere, rotating around the `z`-axis `ROT_LIGHT_NUM_LOOP` times.
The above command renders the textureless mesh under the `i`-th directional light source for each `LIGHT_L<=i<LIGHT_R`.

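To make the spiral concrete, here is one way such light positions could be generated. This is a sketch under assumptions: the function name `spiral_light_positions` is hypothetical, and it assumes the points are uniformly spaced in a path parameter with height `z` linear in that parameter. The repository's actual parameterization may differ.

```python
import numpy as np


def spiral_light_positions(num_light, num_loop):
    """Return (num_light, 3) points on a unit-sphere spiral from (0,0,-1) to (0,0,1)."""
    t = np.linspace(0.0, 1.0, num_light)              # path parameter
    z = -1.0 + 2.0 * t                                # height rises linearly
    r = np.sqrt(np.clip(1.0 - z ** 2, 0.0, None))     # radius keeps points on the sphere
    phi = 2.0 * np.pi * num_loop * t                  # num_loop full turns around z
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=-1)


# Select the lights with LIGHT_L <= i < LIGHT_R, as in the command above.
# The counts here are illustrative values, not the repository defaults.
lights = spiral_light_positions(num_light=9, num_loop=2)
selected = lights[0:5]
```
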
## Dataset
The dataset can be accessed [here](https://drive.google.com/drive/folders/1amzb6KyF2USFQ5W4CeYKFCh1F-yOQsmp?usp=sharing).

## Acknowledgments
This repository uses open source projects. We especially thank the authors of [MoGe](https://github.com/microsoft/MoGe), [Marigold](https://github.com/prs-eth/Marigold), and [DepthPro](https://github.com/apple/ml-depth-pro).
compute_metrics_example.py ADDED
@@ -0,0 +1,12 @@
from evalmde.utils.depth import load_data
# gt_depth, gt_intr, gt_valid = load_data('sample_data/gt_depth.npz')
# pr_depth, pr_intr, pr_valid = load_data('sample_data/curv_low_freq__0.200_10.0.npz')
gt_depth, gt_intr, gt_valid = load_data('sample_data_2/gt_depth.npz')
pr_depth, pr_intr, pr_valid = load_data('sample_data_2/depthpro_gt_focal.npz')


from evalmde.metrics.rel_normal import compute_rel_normal
from evalmde.metrics.sawa_h import compute_sawa_h
sawa_h = compute_sawa_h(pr_depth, pr_intr, pr_valid, gt_depth, gt_intr, gt_valid)
rel_normal = compute_rel_normal(pr_depth, pr_intr, pr_valid, gt_depth, gt_intr, gt_valid)
print(f'{sawa_h=}, {rel_normal=}')
evalmde/__init__.py ADDED
File without changes
evalmde/__pycache__/__init__.cpython-310.pyc ADDED
Binary file (135 Bytes).
evalmde/metrics/__init__.py ADDED
File without changes
evalmde/metrics/__pycache__/boundary.cpython-310.pyc ADDED
Binary file (10 kB).
evalmde/metrics/boundary.py ADDED
@@ -0,0 +1,346 @@
# source https://github.com/apple/ml-depth-pro/blob/main/src/depth_pro/eval/boundary_metrics.py

from typing import List, Tuple

import numpy as np
import torch


def connected_component(r: np.ndarray, c: np.ndarray) -> List[List[int]]:
    """Find connected components in the given row and column indices.

    Args:
    ----
        r (np.ndarray): Row indices.
        c (np.ndarray): Column indices.

    Yields:
    ------
        List[int]: Indices of connected components.

    """
    indices = [0]
    for i in range(1, r.size):
        if r[i] == r[indices[-1]] and c[i] == c[indices[-1]] + 1:
            indices.append(i)
        else:
            yield indices
            indices = [i]
    yield indices


def nms_horizontal(ratio: np.ndarray, threshold: float) -> np.ndarray:
    """Apply Non-Maximum Suppression (NMS) horizontally on the given ratio matrix.

    Args:
    ----
        ratio (np.ndarray): Input ratio matrix.
        threshold (float): Threshold for NMS.

    Returns:
    -------
        np.ndarray: Binary mask after applying NMS.

    """
    mask = np.zeros_like(ratio, dtype=bool)
    r, c = np.nonzero(ratio > threshold)
    if len(r) == 0:
        return mask
    for ids in connected_component(r, c):
        values = [ratio[r[i], c[i]] for i in ids]
        mi = np.argmax(values)
        mask[r[ids[mi]], c[ids[mi]]] = True
    return mask


def nms_vertical(ratio: np.ndarray, threshold: float) -> np.ndarray:
    """Apply Non-Maximum Suppression (NMS) vertically on the given ratio matrix.

    Args:
    ----
        ratio (np.ndarray): Input ratio matrix.
        threshold (float): Threshold for NMS.

    Returns:
    -------
        np.ndarray: Binary mask after applying NMS.

    """
    return np.transpose(nms_horizontal(np.transpose(ratio), threshold))


def fgbg_depth(
    d: np.ndarray, t: float
) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
    """Find foreground-background relations between neighboring pixels.

    Args:
    ----
        d (np.ndarray): Depth matrix.
        t (float): Threshold for comparison.

    Returns:
    -------
        Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]: Four matrices indicating
        left, top, right, and bottom foreground-background relations.

    """
    right_is_big_enough = (d[..., :, 1:] / d[..., :, :-1]) > t
    left_is_big_enough = (d[..., :, :-1] / d[..., :, 1:]) > t
    bottom_is_big_enough = (d[..., 1:, :] / d[..., :-1, :]) > t
    top_is_big_enough = (d[..., :-1, :] / d[..., 1:, :]) > t
    return (
        left_is_big_enough,
        top_is_big_enough,
        right_is_big_enough,
        bottom_is_big_enough,
    )


def fgbg_depth_thinned(
    d: np.ndarray, t: float
) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
    """Find foreground-background relations between neighboring pixels with Non-Maximum Suppression.

    Args:
    ----
        d (np.ndarray): Depth matrix.
        t (float): Threshold for NMS.

    Returns:
    -------
        Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]: Four matrices indicating
        left, top, right, and bottom foreground-background relations with NMS applied.

    """
    right_is_big_enough = nms_horizontal(d[..., :, 1:] / d[..., :, :-1], t)
    left_is_big_enough = nms_horizontal(d[..., :, :-1] / d[..., :, 1:], t)
    bottom_is_big_enough = nms_vertical(d[..., 1:, :] / d[..., :-1, :], t)
    top_is_big_enough = nms_vertical(d[..., :-1, :] / d[..., 1:, :], t)
    return (
        left_is_big_enough,
        top_is_big_enough,
        right_is_big_enough,
        bottom_is_big_enough,
    )


def fgbg_binary_mask(
    d: np.ndarray,
) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
    """Find foreground-background relations between neighboring pixels in binary masks.

    Args:
    ----
        d (np.ndarray): Binary depth matrix.

    Returns:
    -------
        Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]: Four matrices indicating
        left, top, right, and bottom foreground-background relations in binary masks.

    """
    assert d.dtype == bool
    right_is_big_enough = d[..., :, 1:] & ~d[..., :, :-1]
    left_is_big_enough = d[..., :, :-1] & ~d[..., :, 1:]
    bottom_is_big_enough = d[..., 1:, :] & ~d[..., :-1, :]
    top_is_big_enough = d[..., :-1, :] & ~d[..., 1:, :]
    return (
        left_is_big_enough,
        top_is_big_enough,
        right_is_big_enough,
        bottom_is_big_enough,
    )


def edge_recall_matting(pr: np.ndarray, gt: np.ndarray, t: float) -> float:
    """Calculate edge recall for image matting.

    Args:
    ----
        pr (np.ndarray): Predicted depth matrix.
        gt (np.ndarray): Ground truth binary mask.
        t (float): Threshold for NMS.

    Returns:
    -------
        float: Edge recall value.

    """
    assert gt.dtype == bool
    ap, bp, cp, dp = fgbg_depth_thinned(pr, t)
    ag, bg, cg, dg = fgbg_binary_mask(gt)
    return 0.25 * (
        np.count_nonzero(ap & ag) / max(np.count_nonzero(ag), 1)
        + np.count_nonzero(bp & bg) / max(np.count_nonzero(bg), 1)
        + np.count_nonzero(cp & cg) / max(np.count_nonzero(cg), 1)
        + np.count_nonzero(dp & dg) / max(np.count_nonzero(dg), 1)
    )


def _boundary_f1(
    pr: np.ndarray,
    gt: np.ndarray,
    t: float,
    return_p: bool = False,
    return_r: bool = False,
) -> float:
    """Calculate Boundary F1 score.

    Args:
    ----
        pr (np.ndarray): Predicted depth matrix.
        gt (np.ndarray): Ground truth depth matrix.
        t (float): Threshold for comparison.
        return_p (bool, optional): If True, return precision. Defaults to False.
        return_r (bool, optional): If True, return recall. Defaults to False.

    Returns:
    -------
        float: Boundary F1 score, or precision, or recall depending on the flags.

    """
    ap, bp, cp, dp = fgbg_depth(pr, t)
    ag, bg, cg, dg = fgbg_depth(gt, t)

    r = 0.25 * (
        np.count_nonzero(ap & ag) / max(np.count_nonzero(ag), 1)
        + np.count_nonzero(bp & bg) / max(np.count_nonzero(bg), 1)
        + np.count_nonzero(cp & cg) / max(np.count_nonzero(cg), 1)
        + np.count_nonzero(dp & dg) / max(np.count_nonzero(dg), 1)
    )
    p = 0.25 * (
        np.count_nonzero(ap & ag) / max(np.count_nonzero(ap), 1)
        + np.count_nonzero(bp & bg) / max(np.count_nonzero(bp), 1)
        + np.count_nonzero(cp & cg) / max(np.count_nonzero(cp), 1)
        + np.count_nonzero(dp & dg) / max(np.count_nonzero(dp), 1)
    )
    if r + p == 0:
        return 0.0
    if return_p:
        return p
    if return_r:
        return r
    return 2 * (r * p) / (r + p)


227
+ def get_thresholds_and_weights(
228
+ t_min: float, t_max: float, N: int
229
+ ) -> Tuple[np.ndarray, np.ndarray]:
230
+ """Generate thresholds and weights for the given range.
231
+
232
+ Args:
233
+ ----
234
+ t_min (float): Minimum threshold.
235
+ t_max (float): Maximum threshold.
236
+ N (int): Number of thresholds.
237
+
238
+ Returns:
239
+ -------
240
+ Tuple[np.ndarray, np.ndarray]: Array of thresholds and corresponding weights.
241
+
242
+ """
243
+ thresholds = np.linspace(t_min, t_max, N)
244
+ weights = thresholds / thresholds.sum()
245
+ return thresholds, weights
246
+
247
+
248
+ def invert_depth(depth: np.ndarray, eps: float = 1e-6) -> np.ndarray:
249
+ """Inverts a depth map with numerical stability.
250
+
251
+ Args:
252
+ ----
253
+ depth (np.ndarray): Depth map to be inverted.
254
+ eps (float): Minimum value to avoid division by zero (default is 1e-6).
255
+
256
+ Returns:
257
+ -------
258
+ np.ndarray: Inverted depth map.
259
+
260
+ """
261
+ inverse_depth = 1.0 / depth.clip(min=eps)
262
+ return inverse_depth
263
+
264
+
265
+ def SI_boundary_F1(
266
+ predicted_depth: np.ndarray,
267
+ target_depth: np.ndarray,
268
+ t_min: float = 1.05,
269
+ t_max: float = 1.25,
270
+ N: int = 10,
271
+ ) -> float:
272
+ """Calculate Scale-Invariant Boundary F1 Score for depth-based ground-truth.
273
+
274
+ Args:
275
+ ----
276
+ predicted_depth (np.ndarray): Predicted depth matrix.
277
+ target_depth (np.ndarray): Ground truth depth matrix.
278
+ t_min (float, optional): Minimum threshold. Defaults to 1.05.
279
+ t_max (float, optional): Maximum threshold. Defaults to 1.25.
280
+ N (int, optional): Number of thresholds. Defaults to 10.
281
+
282
+ Returns:
283
+ -------
284
+ float: Scale-Invariant Boundary F1 Score.
285
+
286
+ """
287
+ assert predicted_depth.ndim == target_depth.ndim == 2
288
+ thresholds, weights = get_thresholds_and_weights(t_min, t_max, N)
289
+ f1_scores = np.array(
290
+ [
291
+ _boundary_f1(invert_depth(predicted_depth), invert_depth(target_depth), t)
292
+ for t in thresholds
293
+ ]
294
+ )
295
+ return np.sum(f1_scores * weights)
296
+
297
+
298
+ def SI_boundary_Recall(
299
+ predicted_depth: np.ndarray,
300
+ target_mask: np.ndarray,
301
+ t_min: float = 1.05,
302
+ t_max: float = 1.25,
303
+ N: int = 10,
304
+ alpha_threshold: float = 0.1,
305
+ ) -> float:
306
+ """Calculate Scale-Invariant Boundary Recall Score for mask-based ground-truth.
307
+
308
+ Args:
309
+ ----
310
+ predicted_depth (np.ndarray): Predicted depth matrix.
311
+ target_mask (np.ndarray): Ground truth binary mask.
312
+ t_min (float, optional): Minimum threshold. Defaults to 1.05.
313
+ t_max (float, optional): Maximum threshold. Defaults to 1.25.
314
+ N (int, optional): Number of thresholds. Defaults to 10.
315
+ alpha_threshold (float, optional): Threshold for alpha masking. Defaults to 0.1.
316
+
317
+ Returns:
318
+ -------
319
+ float: Scale-Invariant Boundary Recall Score.
320
+
321
+ """
322
+ assert predicted_depth.ndim == target_mask.ndim == 2
323
+ thresholds, weights = get_thresholds_and_weights(t_min, t_max, N)
324
+ thresholded_target = target_mask > alpha_threshold
325
+
326
+ recall_scores = np.array(
327
+ [
328
+ edge_recall_matting(
329
+ invert_depth(predicted_depth), thresholded_target, t=float(t)
330
+ )
331
+ for t in thresholds
332
+ ]
333
+ )
334
+ weighted_recall = np.sum(recall_scores * weights)
335
+ return weighted_recall
336
+
337
+ def boundary_f1(pred, target, mask):
338
+ # set masked values to NaN
339
+ pred = torch.where(mask, pred, torch.nan)
340
+ target = torch.where(mask, target, torch.nan)
341
+
342
+ f1 = SI_boundary_F1(pred.cpu().numpy(), target.cpu().numpy())
343
+
344
+ return None, (1 - f1).item()
345
+
346
+
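The threshold weighting used by `get_thresholds_and_weights` and consumed by `SI_boundary_F1` can be sketched on its own. This is a minimal standalone sketch and not part of the commit: `thresholds_and_weights` and `weighted_score` are hypothetical names introduced for illustration, and the values 1.05, 1.25 and 10 are simply the defaults that appear in `SI_boundary_F1`.

```python
import numpy as np


def thresholds_and_weights(t_min: float, t_max: float, n: int):
    """Linearly spaced thresholds, each weighted in proportion to its own value."""
    thresholds = np.linspace(t_min, t_max, n)
    # Dividing by the sum makes the weights add up to 1, so the weighted
    # sum of per-threshold scores is a weighted average.
    weights = thresholds / thresholds.sum()
    return thresholds, weights


def weighted_score(scores, weights) -> float:
    """Weighted average of per-threshold scores (hypothetical helper)."""
    return float(np.sum(np.asarray(scores, dtype=float) * weights))


thresholds, weights = thresholds_and_weights(1.05, 1.25, 10)
```

Because the weights grow with the threshold, scores at the stricter (larger) ratio thresholds contribute slightly more to the final number than scores at the smaller ones.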
evalmde/metrics/rel_normal.py ADDED
@@ -0,0 +1,231 @@
1
+ from typing import List
2
+ from math import floor
3
+
4
+ import numpy as np
5
+ import torch
6
+ import torch.nn.functional as F
7
+
8
+ from evalmde.utils.torch import reformat_as_torch_tensor
9
+ from evalmde.utils.downsample import downsample
10
+ from evalmde.utils.proj import depth_to_xyz
11
+
12
+ DEFAULT_CONFIG={
13
+ 'scales': [1, 2, 4, 8],
14
+ 'num_sample': int(1e6),
15
+ 'radius': 32,
16
+ 'min_radius': 3,
17
+ 'invalid': 'penalty',
18
+ }
19
+
20
+ @torch.no_grad()
21
+ def _fetch_pixel_val(x: torch.Tensor, vertex_slice):
22
+ '''
23
+ :param x: shape (H, W, ...)
24
+ :param vertex_slice:
25
+ :return: x indexed by vertex_slice; shape (H - 2, W - 2, ...) for the 2-pixel-spaced slices used in this module
26
+ '''
27
+ return x[vertex_slice[0], vertex_slice[1]]
28
+
29
+
30
+ @torch.no_grad()
31
+ def get_triangle_valid(valid: torch.Tensor):
32
+ '''
33
+ a triangle is valid if all vertices are valid
34
+ :param valid: shape (H, W)
35
+ :return: triangle_valid
36
+ triangle_valid: shape (H - 2, W - 2, NUM_TRIANGLE)
37
+ '''
38
+ H, W = valid.shape
39
+ device = valid.device
40
+ ret = torch.empty((H - 2, W - 2, NUM_TRIANGLE), dtype=torch.bool, device=device)
41
+ for i, TRIANGLE_SLICE in enumerate(TRIANGLE_SLICES):
42
+ ret[..., i] = _fetch_pixel_val(valid, TRIANGLE_SLICE[0]) & \
43
+ _fetch_pixel_val(valid, TRIANGLE_SLICE[1]) & \
44
+ _fetch_pixel_val(valid, TRIANGLE_SLICE[2])
45
+ return ret
46
+
47
+ TRIANGLE_SLICES=((
48
+ (slice(None, -2), slice(None, -2)),
49
+ (slice(2, None), slice(None, -2)),
50
+ (slice(None, -2), slice(2, None)),
51
+ ),)
52
+ NUM_TRIANGLE = 1
53
+ @torch.no_grad()
54
+ def get_triangle_normal(xyz: torch.Tensor):
55
+ '''
56
+ Normal computation method 2: 2-pixel spacing
57
+ :param xyz: shape (H, W, 3)
58
+ :return: normal, normal_valid
59
+ normal: shape (H - 2, W - 2, NUM_TRIANGLE, 3)
60
+ normal_valid: shape (H - 2, W - 2, NUM_TRIANGLE)
61
+ '''
62
+ H, W = xyz.shape[:2]
63
+ device = xyz.device
64
+ dtype = xyz.dtype
65
+ normal = torch.empty((H - 2, W - 2, NUM_TRIANGLE, 3), dtype=dtype, device=device)
66
+ normal_valid = torch.empty((H - 2, W - 2, NUM_TRIANGLE), dtype=torch.bool, device=device)
67
+ for i, TRIANGLE_SLICE in enumerate(TRIANGLE_SLICES):
68
+ normal[..., i, :] = torch.linalg.cross(
69
+ F.normalize(_fetch_pixel_val(xyz, TRIANGLE_SLICE[1]) - _fetch_pixel_val(xyz, TRIANGLE_SLICE[0]), dim=-1),
70
+ F.normalize(_fetch_pixel_val(xyz, TRIANGLE_SLICE[2]) - _fetch_pixel_val(xyz, TRIANGLE_SLICE[0]), dim=-1),
71
+ dim=-1
72
+ )
73
+ vec_norm = torch.norm(normal[..., i, :], dim=-1)
74
+ normal_valid[..., i] = vec_norm > 1e-5
75
+ normal[..., i, :] /= vec_norm.clamp(min=1e-5).unsqueeze(-1)
76
+ return normal, normal_valid
77
+
78
+ @torch.no_grad()
79
+ def get_triangle_normal_and_valid(xyz: torch.Tensor, valid: torch.Tensor, flatten: bool = True):
80
+ '''
81
+ compute per-triangle normals; a triangle is valid only if its normal is well-defined and all three of its vertices are valid
82
+ :param xyz:
83
+ :param valid:
84
+ :param flatten:
85
+ :return: normal, valid
86
+ '''
87
+ normal, normal_valid = get_triangle_normal(xyz)
88
+ tri_valid = get_triangle_valid(valid)
89
+ valid = normal_valid & tri_valid
90
+ if flatten:
91
+ normal = normal.reshape(-1, 3)
92
+ valid = valid.reshape(-1)
93
+ return normal, valid
94
+
95
+
96
+ @torch.no_grad()
97
+ def get_angle_between(n1: torch.Tensor, n2: torch.Tensor) -> torch.Tensor:
98
+ '''
99
+ :param n1: shape (..., 3), norm > 0
100
+ :param n2: shape (..., 3), norm > 0
101
+ :return: shape (...)
102
+ '''
103
+ return torch.acos((F.normalize(n1, dim=-1) * F.normalize(n2, dim=-1)).sum(dim=-1).clamp(-1, 1))
104
+
105
+ @torch.no_grad()
106
+ def get_pair_pxl(H: int, W: int, num_sample: int, radius: int, device):
107
+ radius = min(radius, max(H, W))
108
+ i1 = torch.empty((num_sample,), dtype=torch.long, device=device)
109
+ j1 = torch.empty((num_sample,), dtype=torch.long, device=device)
110
+ i2 = torch.empty((num_sample,), dtype=torch.long, device=device)
111
+ j2 = torch.empty((num_sample,), dtype=torch.long, device=device)
112
+
113
+ n = 0
114
+ s = torch.quasirandom.SobolEngine(4)
115
+ while n < num_sample:
116
+ samples = s.draw(floor(num_sample * 1.1)).to(device)
117
+ samples[:,0] *= H
118
+ samples[:,1] *= W
119
+ samples[:,2] *= radius * 2
120
+ samples[:,2] -= radius
121
+ samples[:,3] *= radius * 2
122
+ samples[:,3] -= radius
123
+ points = torch.cat([samples[:,:2], samples[:,:2] + samples[:,2:]], dim=1)
124
+ points = torch.floor(points)
125
+
126
+ valid = (points[:,[0,2]] < H).all(dim=-1) & (points[:,[1,3]] < W).all(dim=-1) & (0 <= points[:,[0,2]]).all(dim=-1) & (0 <= points[:,[1,3]]).all(dim=-1)
127
+ points = points[valid]
128
+ m = min(len(points), num_sample - n)
129
+ i1[n:n+m] = points[:m,0]
130
+ j1[n:n+m] = points[:m,1]
131
+ i2[n:n+m] = points[:m,2]
132
+ j2[n:n+m] = points[:m,3]
133
+ n += m
134
+
135
+ return i1, j1, i2, j2
136
+
137
+
138
+ @torch.no_grad()
139
+ def get_rel_normal_err_heatmap_idx(gt_xyz: torch.Tensor, gt_valid: torch.Tensor,
140
+ pred_xyz: torch.Tensor, pred_valid: torch.Tensor,
141
+ num_sample: int, radius: int):
142
+ '''
143
+ :param gt_xyz:
144
+ :param gt_valid:
145
+ :param pred_xyz:
146
+ :param pred_valid:
147
+ :param num_sample:
148
+ :param radius:
149
+ :return: rel_normal_err, gt_pair_valid, pred_pair_valid, (i1, j1, i2, j2)
150
+ rel_normal_err: shape (-1,)
151
+ gt_pair_valid: shape (-1,)
152
+ pred_pair_valid: shape (-1,)
153
+ '''
154
+ gt_normal, gt_normal_valid = get_triangle_normal_and_valid(gt_xyz, gt_valid, flatten=False)
155
+ pred_normal, pred_normal_valid = get_triangle_normal_and_valid(pred_xyz, pred_valid, flatten=False)
156
+
157
+ H, W = gt_normal.shape[:2]
158
+ i1, j1, i2, j2 = get_pair_pxl(H, W, num_sample, radius, gt_xyz.device)
159
+
160
+ gt_rel_normal = get_angle_between(gt_normal[i1, j1], gt_normal[i2, j2])
161
+ gt_pair_valid = gt_normal_valid[i1, j1] & gt_normal_valid[i2, j2]
162
+ pred_rel_normal = get_angle_between(pred_normal[i1, j1], pred_normal[i2, j2])
163
+ pred_pair_valid = pred_normal_valid[i1, j1] & pred_normal_valid[i2, j2]
164
+ rel_normal_err = torch.abs(gt_rel_normal - pred_rel_normal) # [0, pi]
165
+ return rel_normal_err, gt_pair_valid, pred_pair_valid, (i1,j1,i2,j2)
166
+
167
+
168
+
169
+ def get_multi_scale_rel_normal_err(gt_xyz: torch.Tensor, gt_valid: torch.Tensor,
170
+ pred_xyz: torch.Tensor, pred_valid: torch.Tensor,
171
+ scales: List[int], num_sample: int, radius: int, min_radius: int, invalid):
172
+ '''
173
+ :param gt_xyz:
174
+ :param gt_valid:
175
+ :param pred_xyz:
176
+ :param pred_valid:
177
+ :param scales: list of down-sample scales
178
+ :param num_sample:
179
+ :param radius:
180
+ :param min_radius: lower bound on the sampling radius after it is divided by the scale
181
+ :return: list of avg relative normal errors under each scale
182
+ '''
183
+ ret = []
184
+ for sc in scales:
185
+ ds_gt_valid, ds_gt_xyz, ds_pred_valid, ds_pred_xyz = downsample(sc, gt_valid, [gt_xyz, pred_valid, pred_xyz])
186
+ err, gt_pair_valid, pred_pair_valid, _ = get_rel_normal_err_heatmap_idx(ds_gt_xyz, ds_gt_valid, ds_pred_xyz, ds_pred_valid, num_sample, max(radius // sc, min_radius))
187
+ match invalid:
188
+ case 'penalty':
189
+ err = torch.where(gt_pair_valid & ~pred_pair_valid, torch.pi, err)
190
+ err = err[gt_pair_valid]
191
+ case 'ignore':
192
+ err = err[gt_pair_valid & pred_pair_valid]
193
+ case _:
194
+ raise ValueError(f"invalid must be 'penalty' or 'ignore', got {invalid!r}")
195
+
196
+ if err.shape[0] > 0:
197
+ scalar_err = err.mean().item()
198
+ ret.append(scalar_err)
199
+ if len(ret) == 0:
200
+ ret = [0]
201
+ return ret
202
+
203
+
204
+ def rel_normal(gt_xyz, gt_valid, pred_xyz, pred_valid, cfg=None, **kwargs):
205
+ if cfg is None:
206
+ cfg = DEFAULT_CONFIG | kwargs
207
+ device_args = {k:v for k,v in cfg.items() if k == 'device'}
208
+ cfg = {k: v for k, v in cfg.items() if k != 'device'}  # copy, so a caller-supplied cfg is not mutated
209
+ gt_xyz = reformat_as_torch_tensor(gt_xyz, **device_args)
210
+ gt_valid = reformat_as_torch_tensor(gt_valid, **device_args)
211
+ pred_xyz = reformat_as_torch_tensor(pred_xyz, **device_args)
212
+ pred_valid = reformat_as_torch_tensor(pred_valid, **device_args)
213
+ return np.mean(get_multi_scale_rel_normal_err(gt_xyz, gt_valid, pred_xyz, pred_valid, **cfg))
214
+
215
+
216
+ def compute_rel_normal(pred_depth: np.ndarray, pred_intr: np.ndarray, pred_valid: np.ndarray,
217
+ gt_depth: np.ndarray, gt_intr: np.ndarray, gt_valid: np.ndarray) -> float:
218
+ '''
219
+ :param pred_depth: shape (H, W)
220
+ :param pred_intr: shape (4,), [fx, fy, cx, cy]
221
+ :param pred_valid: shape (H, W), dtype: np.bool_
222
+ :param gt_depth: shape (H, W)
223
+ :param gt_intr: shape (4,), [fx, fy, cx, cy]
224
+ :param gt_valid: shape (H, W), dtype: np.bool_
225
+ :return: relative normal error in radians, averaged over the configured scales
226
+ '''
227
+ err = rel_normal(
228
+ depth_to_xyz(gt_intr, gt_depth), gt_valid,
229
+ depth_to_xyz(pred_intr, pred_depth), pred_valid,
230
+ )
231
+ return err
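The per-pair quantity computed in `get_rel_normal_err_heatmap_idx` (the absolute difference between the ground-truth and predicted angle for one pair of normals) can be illustrated without PyTorch. The following is a minimal NumPy sketch under that assumption; `angle_between` here is a NumPy stand-in for the torch function of the same name, and `rel_normal_pair_error` is a hypothetical helper that does not exist in the repository.

```python
import numpy as np


def angle_between(n1: np.ndarray, n2: np.ndarray) -> np.ndarray:
    """Angle in radians between vectors along the last axis (norms must be > 0)."""
    u1 = n1 / np.linalg.norm(n1, axis=-1, keepdims=True)
    u2 = n2 / np.linalg.norm(n2, axis=-1, keepdims=True)
    # Clip guards against rounding pushing the cosine just outside [-1, 1].
    return np.arccos(np.clip((u1 * u2).sum(axis=-1), -1.0, 1.0))


def rel_normal_pair_error(gt_a, gt_b, pred_a, pred_b) -> np.ndarray:
    """|angle(gt_a, gt_b) - angle(pred_a, pred_b)|, a value in [0, pi]."""
    return np.abs(angle_between(gt_a, gt_b) - angle_between(pred_a, pred_b))


# Ground truth: two perpendicular surfaces. Prediction: both surfaces parallel.
gt_a, gt_b = np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, 0.0])
pred_a, pred_b = np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, 2.0])
err = rel_normal_pair_error(gt_a, gt_b, pred_a, pred_b)
```

The error depends only on the angle between the two normals in each depth map, so a prediction that is rotated as a whole relative to the ground truth is not penalized by this term.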
evalmde/metrics/sawa_h.py ADDED
@@ -0,0 +1,45 @@
1
+ import numpy as np
2
+
3
+ from evalmde.utils.proj import depth_to_xyz
4
+ from evalmde.utils.depth import align
5
+ from evalmde.utils.torch import reformat_as_torch_tensor
6
+ from evalmde.metrics.standard import rel_depth, delta0125
7
+ from evalmde.metrics.boundary import boundary_f1
8
+ from evalmde.metrics.rel_normal import rel_normal as rel_normal_func
9
+
10
+
11
+ def compute_sawa_h(pred_depth: np.ndarray, pred_intr: np.ndarray, pred_valid: np.ndarray,
12
+ gt_depth: np.ndarray, gt_intr: np.ndarray, gt_valid: np.ndarray) -> float:
13
+ '''
14
+ :param pred_depth: shape (H, W)
15
+ :param pred_intr: shape (4,), [fx, fy, cx, cy]
16
+ :param pred_valid: shape (H, W), dtype: np.bool_
17
+ :param gt_depth: shape (H, W)
18
+ :param gt_intr: shape (4,), [fx, fy, cx, cy]
19
+ :param gt_valid: shape (H, W), dtype: np.bool_
20
+ :return: SAWA-H value
21
+ '''
22
+ wkdr__no_align = 1 - rel_depth(pred_depth, gt_depth, gt_valid)[1]
23
+ delta0125__disparity_af_clip_by_0 = 1 - delta0125(align(
24
+ 1 / reformat_as_torch_tensor(pred_depth),
25
+ reformat_as_torch_tensor(gt_depth),
26
+ reformat_as_torch_tensor(gt_valid),
27
+ 'disparity_affine_clip_by_0'
28
+ ), gt_depth, gt_valid)[1]
29
+ delta0125__depth_af_lst_sq_clip_by_0 = 1 - delta0125(align(
30
+ reformat_as_torch_tensor(pred_depth),
31
+ reformat_as_torch_tensor(gt_depth),
32
+ reformat_as_torch_tensor(gt_valid),
33
+ 'depth_affine_lst_sq_clip_by_0'
34
+ ), gt_depth, gt_valid)[1]
35
+ boundary__no_align = boundary_f1(
36
+ reformat_as_torch_tensor(pred_depth),
37
+ reformat_as_torch_tensor(gt_depth),
38
+ reformat_as_torch_tensor(gt_valid)
39
+ )[1]
40
+ rel_normal = rel_normal_func(
41
+ depth_to_xyz(gt_intr, gt_depth), gt_valid,
42
+ depth_to_xyz(pred_intr, pred_depth), pred_valid,
43
+ )
44
+ err = 3.65 * wkdr__no_align + 0.18 * delta0125__disparity_af_clip_by_0 + 0.01 * delta0125__depth_af_lst_sq_clip_by_0 + 0.20 * boundary__no_align + 1.94 * rel_normal
45
+ return err
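The final line of `compute_sawa_h` is a fixed linear combination of five error terms. A minimal sketch of just that combination step follows; the weights are copied from the line above, while the dictionary keys and the function `combine_sawa_h` are hypothetical names chosen for illustration and are not part of the package.

```python
# Weights copied from compute_sawa_h; the key names are illustrative only.
SAWA_H_WEIGHTS = {
    "wkdr": 3.65,
    "delta0125_disparity_affine": 0.18,
    "delta0125_depth_affine": 0.01,
    "boundary": 0.20,
    "rel_normal": 1.94,
}


def combine_sawa_h(errors: dict) -> float:
    """Weighted sum of the five component errors (lower is better)."""
    missing = SAWA_H_WEIGHTS.keys() - errors.keys()
    if missing:
        raise KeyError(f"missing component errors: {sorted(missing)}")
    return sum(SAWA_H_WEIGHTS[name] * errors[name] for name in SAWA_H_WEIGHTS)


perfect = combine_sawa_h({name: 0.0 for name in SAWA_H_WEIGHTS})
all_ones = combine_sawa_h({name: 1.0 for name in SAWA_H_WEIGHTS})
```

Note that the components do not share a scale: the first four are of the form `1 - score` and lie in [0, 1], whereas the relative normal term is an angle in radians and can reach pi.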
evalmde/metrics/standard.py ADDED
@@ -0,0 +1,214 @@
1
+ # source: https://github.com/YvanYin/Metric3D/blob/main/mono/utils/avg_meter.py
2
+ import torch
3
+
4
+
5
+ def reformat_input(x):
6
+ if not isinstance(x, torch.Tensor):
7
+ x = torch.from_numpy(x)
8
+ x = x.to(torch.float)
9
+ device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
10
+ x = x.to(device)
11
+ return x
12
+
13
+
14
+ def absrel_pnt(pred, target, mask):
15
+ pred, target, mask = reformat_input(pred), reformat_input(target), reformat_input(mask)
16
+ assert pred.dim() == 3 and target.dim() == 3 and mask.dim() == 2
17
+ if mask.sum() == 0:
18
+ return None, None
19
+
20
+ dist_gt = torch.norm(target, dim=-1)
21
+ dist_err = torch.norm(pred - target, dim=-1)
22
+ err_heatmap = dist_err / (dist_gt + (1e-10)) * mask
23
+ err_heatmap[mask < .5] = 0
24
+ err = err_heatmap.sum() / mask.sum()
25
+ return err_heatmap.cpu().numpy(), err.item()
26
+
27
+
28
+ def rel_depth(pred, target, mask):
29
+ pred, target, mask = reformat_input(pred), reformat_input(target), reformat_input(mask)
30
+ assert pred.dim() == 2 and target.dim() == 2 and mask.dim() == 2
31
+ if mask.sum() == 0:
32
+ return None, None
33
+ mask = mask > .5
34
+ p, t = pred[mask], target[mask]
35
+ device = p.device
36
+ N = p.shape[0]
37
+ M = int(1e7)
38
+ i = torch.randint(0, N, (M,), device=device, dtype=torch.long)
39
+ j = torch.randint(0, N, (M,), device=device, dtype=torch.long)
40
+ correct = (p[i] < p[j]) == (t[i] < t[j])
41
+ return None, correct.float().mean().item()
42
+
43
+
44
+ def absrel(pred, target, mask):
45
+ pred, target, mask = reformat_input(pred), reformat_input(target), reformat_input(mask)
46
+ assert pred.dim() == 2 and target.dim() == 2 and mask.dim() == 2
47
+ if mask.sum() == 0:
48
+ return None, None
49
+
50
+ t_m = target * mask
51
+ p_m = pred * mask
52
+ t_m[mask < .5] = 0
53
+ p_m[mask < .5] = 0
54
+
55
+ err_heatmap = torch.abs(t_m - p_m) / (t_m + 1e-10) # (H, W)
56
+ err = err_heatmap.sum() / mask.sum()
57
+ return err_heatmap.cpu().numpy(), err.item()
58
+
59
+
60
+ def rmse(pred, target, mask):
61
+ pred, target, mask = reformat_input(pred), reformat_input(target), reformat_input(mask)
62
+ assert pred.dim() == 2 and target.dim() == 2 and mask.dim() == 2
63
+ if mask.sum() == 0:
64
+ return None, None
65
+
66
+ t_m = target * mask
67
+ p_m = pred * mask
68
+ t_m[mask < .5] = 0
69
+ p_m[mask < .5] = 0
70
+
71
+ err_heatmap = (t_m - p_m) ** 2 # (H, W)
72
+ err = torch.sqrt(err_heatmap.sum() / mask.sum())
73
+ return err_heatmap.cpu().numpy(), err.item()
74
+
75
+
76
+ def rmse_log(pred, target, mask):
77
+ pred, target, mask = reformat_input(pred), reformat_input(target), reformat_input(mask)
78
+ assert pred.dim() == 2 and target.dim() == 2 and mask.dim() == 2
79
+ if mask.sum() == 0:
80
+ return None, None
81
+
82
+ t_m = target * mask
83
+ p_m = pred * mask
84
+ t_m[mask < .5] = 0
85
+ p_m[mask < .5] = 0
86
+
87
+ err_heatmap = ((torch.log10(p_m+1e-10) - torch.log10(t_m+1e-10)) * mask) ** 2 # (H, W)
88
+ err = torch.sqrt(err_heatmap.sum() / mask.sum())
89
+ return err_heatmap.cpu().numpy(), err.item()
90
+
91
+
92
+ def delta1(pred, target, mask):
93
+ pred, target, mask = reformat_input(pred), reformat_input(target), reformat_input(mask)
94
+ assert pred.dim() == 2 and target.dim() == 2 and mask.dim() == 2
95
+ if mask.sum() == 0:
96
+ return None, None
97
+
98
+ t_m = target * mask
99
+ p_m = pred
100
+
101
+ gt_pred = t_m / (p_m + 1e-10) # (H, W)
102
+ pred_gt = p_m / (t_m + 1e-10) # (H, W)
103
+ gt_pred_gt = torch.stack([gt_pred, pred_gt], dim=-1) # (H, W, 2)
104
+ ratio_max = torch.amax(gt_pred_gt, dim=-1) # (H, W)
105
+ err_heatmap = (ratio_max - 1) * mask # (H, W)
106
+ ratio_max[mask < .5] = 99999
107
+
108
+ delta_1_sum = torch.sum(ratio_max < 1.25)
109
+ delta_2_sum = torch.sum(ratio_max < 1.25 ** 2)
110
+ delta_3_sum = torch.sum(ratio_max < 1.25 ** 3)
111
+ return err_heatmap.cpu().numpy(), (delta_1_sum / mask.sum()).item()
112
+
113
+
114
+ def delta0125(pred, target, mask):
115
+ pred, target, mask = reformat_input(pred), reformat_input(target), reformat_input(mask)
116
+ assert pred.dim() == 2 and target.dim() == 2 and mask.dim() == 2
117
+ if mask.sum() == 0:
118
+ return None, None
119
+
120
+ t_m = target * mask
121
+ p_m = pred
122
+
123
+ gt_pred = t_m / (p_m + 1e-10) # (H, W)
124
+ pred_gt = p_m / (t_m + 1e-10) # (H, W)
125
+ gt_pred_gt = torch.stack([gt_pred, pred_gt], dim=-1) # (H, W, 2)
126
+ ratio_max = torch.amax(gt_pred_gt, dim=-1) # (H, W)
127
+ err_heatmap = (ratio_max - 1) * mask # (H, W)
128
+ ratio_max[mask < .5] = 99999
129
+
130
+ delta_sum = torch.sum(ratio_max < 1.25 ** 0.125)
131
+ return err_heatmap.cpu().numpy(), (delta_sum / mask.sum()).item()
132
+
133
+
134
+ def delta2(pred, target, mask):
135
+ pred, target, mask = reformat_input(pred), reformat_input(target), reformat_input(mask)
136
+ assert pred.dim() == 2 and target.dim() == 2 and mask.dim() == 2
137
+ if mask.sum() == 0:
138
+ return None, None
139
+
140
+ t_m = target * mask
141
+ p_m = pred
142
+
143
+ gt_pred = t_m / (p_m + 1e-10) # (H, W)
144
+ pred_gt = p_m / (t_m + 1e-10) # (H, W)
145
+ gt_pred_gt = torch.stack([gt_pred, pred_gt], dim=-1) # (H, W, 2)
146
+ ratio_max = torch.amax(gt_pred_gt, dim=-1) # (H, W)
147
+ err_heatmap = (ratio_max - 1) * mask # (H, W)
148
+ ratio_max[mask < .5] = 99999
149
+
150
+ delta_1_sum = torch.sum(ratio_max < 1.25)
151
+ delta_2_sum = torch.sum(ratio_max < 1.25 ** 2)
152
+ delta_3_sum = torch.sum(ratio_max < 1.25 ** 3)
153
+ return err_heatmap.cpu().numpy(), (delta_2_sum / mask.sum()).item()
154
+
155
+
156
+ def delta3(pred, target, mask):
157
+ pred, target, mask = reformat_input(pred), reformat_input(target), reformat_input(mask)
158
+ assert pred.dim() == 2 and target.dim() == 2 and mask.dim() == 2
159
+ if mask.sum() == 0:
160
+ return None, None
161
+
162
+ t_m = target * mask
163
+ p_m = pred
164
+
165
+ gt_pred = t_m / (p_m + 1e-10) # (H, W)
166
+ pred_gt = p_m / (t_m + 1e-10) # (H, W)
167
+ gt_pred_gt = torch.stack([gt_pred, pred_gt], dim=-1) # (H, W, 2)
168
+ ratio_max = torch.amax(gt_pred_gt, dim=-1) # (H, W)
169
+ err_heatmap = (ratio_max - 1) * mask # (H, W)
170
+ ratio_max[mask < .5] = 99999
171
+
172
+ delta_1_sum = torch.sum(ratio_max < 1.25)
173
+ delta_2_sum = torch.sum(ratio_max < 1.25 ** 2)
174
+ delta_3_sum = torch.sum(ratio_max < 1.25 ** 3)
175
+ return err_heatmap.cpu().numpy(), (delta_3_sum / mask.sum()).item()
176
+
177
+
178
+ def log10(pred, target, mask):
179
+ pred, target, mask = reformat_input(pred), reformat_input(target), reformat_input(mask)
180
+ assert pred.dim() == 2 and target.dim() == 2 and mask.dim() == 2
181
+ if mask.sum() == 0:
182
+ return None, None
183
+
184
+ t_m = target * mask
185
+ p_m = pred * mask
186
+ t_m[mask < .5] = 0
187
+ p_m[mask < .5] = 0
188
+
189
+ err_heatmap = torch.abs((torch.log10(p_m+1e-10) - torch.log10(t_m+1e-10)) * mask)
190
+ err = err_heatmap.sum() / mask.sum()
191
+ return err_heatmap.cpu().numpy(), err.item()
192
+
193
+
194
+ def rmse_log_si(pred, target, mask): # RMSE (log, scale-invariant)
195
+ # https://github.com/prs-eth/Marigold/blob/main/src/util/metric.py#L175
196
+ depth_pred, depth_gt, valid_mask = reformat_input(pred), reformat_input(target), reformat_input(mask)
197
+ assert depth_pred.dim() == 2 and depth_gt.dim() == 2 and valid_mask.dim() == 2
198
+ if valid_mask.sum() == 0:
199
+ return None, None
200
+
201
+ valid_mask = valid_mask > .5
202
+ diff = torch.log(depth_pred) - torch.log(depth_gt)
203
+ if valid_mask is not None:
204
+ diff[~valid_mask] = 0
205
+ n = valid_mask.sum((-1, -2))
206
+ else:
207
+ n = depth_gt.shape[-2] * depth_gt.shape[-1]
208
+
209
+ diff2 = torch.pow(diff, 2)
210
+
211
+ first_term = torch.sum(diff2, (-1, -2)) / n
212
+ second_term = torch.pow(torch.sum(diff, (-1, -2)), 2) / (n**2)
213
+ loss = torch.sqrt(torch.mean(first_term - second_term))
214
+ return None, loss.item()
evalmde/metrics/triangle.py ADDED
@@ -0,0 +1,93 @@
1
+ import torch
2
+ import torch.nn.functional as F
3
+
4
+
5
+ '''
6
+ VERTEX_SLICES:
7
+ 0 2
8
+ 1 3
9
+ '''
10
+ VERTEX_SLICES = [
11
+ (slice(None, -1), slice(None, -1)),
12
+ (slice(1, None), slice(None, -1)),
13
+ (slice(None, -1), slice(1, None)),
14
+ (slice(1, None), slice(1, None)),
15
+ ]
16
+ TRIANGLE_SLICES = [
17
+ [VERTEX_SLICES[0], VERTEX_SLICES[1], VERTEX_SLICES[2]],
18
+ [VERTEX_SLICES[2], VERTEX_SLICES[0], VERTEX_SLICES[3]],
19
+ [VERTEX_SLICES[0], VERTEX_SLICES[1], VERTEX_SLICES[3]],
20
+ [VERTEX_SLICES[2], VERTEX_SLICES[1], VERTEX_SLICES[3]],
21
+ ]
22
+ NUM_TRIANGLE = len(TRIANGLE_SLICES)
23
+
24
+
25
+ @torch.no_grad()
26
+ def _fetch_pixel_val(x: torch.Tensor, vertex_slice):
27
+ '''
28
+ :param x: shape (H, W, ...)
29
+ :param vertex_slice:
30
+ :return: shape (H - 1, W - 1, ...)
31
+ '''
32
+ return x[vertex_slice[0], vertex_slice[1]]
33
+
34
+
35
+ @torch.no_grad()
36
+ def get_triangle_valid(valid: torch.Tensor):
37
+ '''
38
+ a triangle is valid if all vertices are valid
39
+ :param valid: shape (H, W)
40
+ :return: triangle_valid
41
+ triangle_valid: shape (H - 1, W - 1, NUM_TRIANGLE)
42
+ '''
43
+ H, W = valid.shape
44
+ device = valid.device
45
+ ret = torch.empty((H - 1, W - 1, NUM_TRIANGLE), dtype=torch.bool, device=device)
46
+ for i, TRIANGLE_SLICE in enumerate(TRIANGLE_SLICES):
47
+ ret[..., i] = _fetch_pixel_val(valid, TRIANGLE_SLICE[0]) & \
48
+ _fetch_pixel_val(valid, TRIANGLE_SLICE[1]) & \
49
+ _fetch_pixel_val(valid, TRIANGLE_SLICE[2])
50
+ return ret
51
+
52
+
53
+ @torch.no_grad()
54
+ def get_triangle_normal(xyz: torch.Tensor):
55
+ '''
56
+ :param xyz: shape (H, W, 3)
57
+ :return: normal, normal_valid
58
+ normal: shape (H - 1, W - 1, NUM_TRIANGLE, 3)
59
+ normal_valid: shape (H - 1, W - 1, NUM_TRIANGLE)
60
+ '''
61
+ H, W = xyz.shape[:2]
62
+ device = xyz.device
63
+ dtype = xyz.dtype
64
+ normal = torch.empty((H - 1, W - 1, NUM_TRIANGLE, 3), dtype=dtype, device=device)
65
+ normal_valid = torch.empty((H - 1, W - 1, NUM_TRIANGLE), dtype=torch.bool, device=device)
66
+ for i, TRIANGLE_SLICE in enumerate(TRIANGLE_SLICES):
67
+ normal[..., i, :] = torch.linalg.cross(
68
+ F.normalize(_fetch_pixel_val(xyz, TRIANGLE_SLICE[1]) - _fetch_pixel_val(xyz, TRIANGLE_SLICE[0]), dim=-1),
69
+ F.normalize(_fetch_pixel_val(xyz, TRIANGLE_SLICE[2]) - _fetch_pixel_val(xyz, TRIANGLE_SLICE[0]), dim=-1),
70
+ dim=-1
71
+ )
72
+ vec_norm = torch.norm(normal[..., i, :], dim=-1) # (H - 1, W - 1)
73
+ normal_valid[..., i] = vec_norm > 1e-5
74
+ normal[..., i, :] /= vec_norm.clamp(min=1e-5).unsqueeze(-1)
75
+ return normal, normal_valid
76
+
77
+
78
+ @torch.no_grad()
79
+ def get_triangle_normal_and_valid(xyz: torch.Tensor, valid: torch.Tensor, flatten: bool = True):
80
+ '''
81
+ compute per-triangle normals; a triangle is valid only if its normal is well-defined and all three of its vertices are valid
82
+ :param xyz:
83
+ :param valid:
84
+ :param flatten:
85
+ :return: normal, valid
86
+ '''
87
+ normal, normal_valid = get_triangle_normal(xyz)
88
+ tri_valid = get_triangle_valid(valid)
89
+ valid = normal_valid & tri_valid
90
+ if flatten:
91
+ normal = normal.reshape(-1, 3)
92
+ valid = valid.reshape(-1)
93
+ return normal, valid
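For a single triangle, the computation in `get_triangle_normal` reduces to a cross product of two normalized edge vectors, followed by a validity check on the length of the result. The sketch below shows that step for one triangle in NumPy. It is a simplified illustration under stated assumptions: `triangle_normal` is a hypothetical scalar version of the batched torch code, and the `1e-12` guard only approximates what `F.normalize` does for near-zero edges.

```python
import numpy as np


def triangle_normal(p0, p1, p2, eps: float = 1e-5):
    """Unit normal of triangle (p0, p1, p2) and a flag telling whether it is usable.

    Both edges are normalized first, so the cross product's length equals the
    sine of the angle between them; near-collinear vertices give a length
    below eps and are flagged as invalid.
    """
    e1 = (p1 - p0) / max(np.linalg.norm(p1 - p0), 1e-12)
    e2 = (p2 - p0) / max(np.linalg.norm(p2 - p0), 1e-12)
    n = np.cross(e1, e2)
    length = np.linalg.norm(n)
    return n / max(length, eps), bool(length > eps)


p0 = np.array([0.0, 0.0, 1.0])
normal, ok = triangle_normal(p0, np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 1.0]))
# Collinear vertices: the cross product vanishes, so the triangle is rejected.
_, degenerate_ok = triangle_normal(p0, np.array([1.0, 0.0, 1.0]), np.array([2.0, 0.0, 1.0]))
```

Normalizing the edges before the cross product makes the `1e-5` threshold independent of the triangle's size, so the check rejects thin, sliver-like triangles rather than merely small ones.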
evalmde/utils/__init__.py ADDED
File without changes
evalmde/utils/__pycache__/__init__.cpython-310.pyc ADDED
Binary file (141 Bytes). View file
 
evalmde/utils/__pycache__/blender.cpython-310.pyc ADDED
Binary file (7.32 kB). View file
 
evalmde/utils/__pycache__/common.cpython-310.pyc ADDED
Binary file (1.71 kB). View file
 
evalmde/utils/__pycache__/constants.cpython-310.pyc ADDED
Binary file (199 Bytes). View file
 
evalmde/utils/__pycache__/depth.cpython-310.pyc ADDED
Binary file (3.49 kB). View file
 
evalmde/utils/__pycache__/depth_to_mesh.cpython-310.pyc ADDED
Binary file (4.51 kB). View file
 
evalmde/utils/__pycache__/downsample.cpython-310.pyc ADDED
Binary file (2.99 kB). View file
 
evalmde/utils/__pycache__/image.cpython-310.pyc ADDED
Binary file (1.4 kB). View file
 
evalmde/utils/__pycache__/np_and_th.cpython-310.pyc ADDED
Binary file (1 kB). View file
 
evalmde/utils/__pycache__/proj.cpython-310.pyc ADDED
Binary file (1.6 kB). View file
 
evalmde/utils/__pycache__/torch.cpython-310.pyc ADDED
Binary file (1.08 kB). View file
 
evalmde/utils/blender.py ADDED
@@ -0,0 +1,213 @@
1
+ import os
2
+ import shutil
3
+
4
+ import bpy
5
+ import mathutils
6
+ import numpy as np
7
+ import OpenEXR
8
+ import Imath
9
+ from scipy.spatial.transform import Rotation as scipy_Rotation
10
+
11
+ from evalmde.utils.constants import VALID_DEPTH_LB, VALID_DEPTH_UB
12
+ from evalmde.utils.common import pathlib_file
13
+ from evalmde.utils.depth import get_depth_valid
14
+ from evalmde.utils.image import imread_rgb, imwrite_rgb
15
+
16
+
17
+ def bpy_set_tmp_dir(tmp_dir):
18
+ tmp_dir = pathlib_file(tmp_dir)
19
+ tmp_dir.mkdir(parents=True, exist_ok=True)
20
+ bpy.context.preferences.filepaths.temporary_directory = str(tmp_dir)
21
+
22
+
23
+ def bpy_create_cam(cam_name, cam_pose, fx, fy, cx, cy, w, h):
24
+ cam_data = bpy.data.cameras.new(name=cam_name)
25
+ cam_data.sensor_height = cam_data.sensor_width * h / w
26
+ cam_data.lens = (fx + fy) / 2 * cam_data.sensor_width / w
27
+ cam_data.shift_x = (w / 2 - cx) / w
28
+ cam_data.shift_y = (cy - h / 2) / h
29
+
30
+ cam_object = bpy.data.objects.new(cam_name, cam_data)
31
+ bpy.context.collection.objects.link(cam_object)
32
+ cam_object.matrix_world = mathutils.Matrix([cam_pose[0], cam_pose[1], cam_pose[2], cam_pose[3]])
33
+ return cam_object
34
+
35
+
36
+ def bpy_add_ambient_light(energy=1.0):
37
+ world = bpy.data.worlds.new("AmbientWorld")
38
+ bpy.context.scene.world = world
39
+ world.use_nodes = True
40
+ bg = world.node_tree.nodes["Background"]
41
+ bg.inputs[0].default_value = (1, 1, 1, 1)
42
+ bg.inputs[1].default_value = energy
43
+
44
+
45
+ def bpy_enable_gpu(device_type="CUDA"):
46
+ prefs = bpy.context.preferences
47
+ cprefs = prefs.addons['cycles'].preferences
48
+ cprefs.compute_device_type = device_type # "CUDA", "OPTIX", "METAL", "HIP"
49
+ cprefs.get_devices() # Initialize devices
50
+ for device in cprefs.devices:
51
+ device.use = True
52
+ bpy.context.scene.cycles.device = 'GPU'
53
+
54
+
55
+ def bpy_setup_rgb_render():
56
+ bpy.context.scene.use_nodes = True
57
+ tree = bpy.context.scene.node_tree
58
+ tree.nodes.clear()
59
+ render_layers = tree.nodes.new(type='CompositorNodeRLayers')
60
+ rgb_output = tree.nodes.new(type='CompositorNodeOutputFile')
61
+ rgb_output.label = 'RGB Output'
62
+ rgb_output.base_path = ''
63
+ rgb_output.format.file_format = 'PNG'
64
+ rgb_output.file_slots[0].use_node_format = True
65
+ rgb_output.file_slots[0].save_as_render = True
66
+ tree.links.new(render_layers.outputs['Image'], rgb_output.inputs[0])
67
+ return rgb_output
68
+
69
+
+ def bpy_setup_depth_render():
+     bpy.context.scene.use_nodes = True
+     tree = bpy.context.scene.node_tree
+     tree.nodes.clear()
+     render_layers = tree.nodes.new(type='CompositorNodeRLayers')
+     depth_output = tree.nodes.new(type='CompositorNodeOutputFile')
+     depth_output.label = 'Depth Output'
+     depth_output.base_path = ''
+     depth_output.format.file_format = 'OPEN_EXR'
+     depth_output.file_slots[0].use_node_format = True
+     depth_output.file_slots[0].save_as_render = True
+     bpy.context.view_layer.use_pass_z = True
+     bpy.context.scene.view_layers["ViewLayer"].use_pass_z = True
+     tree.links.new(render_layers.outputs['Depth'], depth_output.inputs[0])
+     return depth_output
+
+
+ def bpy_setup_rgbd_render():
+     bpy.context.scene.use_nodes = True
+     tree = bpy.context.scene.node_tree
+     tree.nodes.clear()
+
+     # Add Render Layers node to get passes
+     render_layers = tree.nodes.new(type='CompositorNodeRLayers')
+
+     # Add Output File node to save the EXR
+     depth_output = tree.nodes.new(type='CompositorNodeOutputFile')
+     depth_output.label = 'Depth Output'
+     depth_output.base_path = ''
+     depth_output.format.file_format = 'OPEN_EXR'
+     depth_output.file_slots[0].use_node_format = True
+     depth_output.file_slots[0].save_as_render = True
+
+     # Output for RGB
+     rgb_output = tree.nodes.new(type='CompositorNodeOutputFile')
+     rgb_output.label = 'RGB Output'
+     rgb_output.base_path = ''
+     rgb_output.format.file_format = 'PNG'
+     rgb_output.file_slots[0].use_node_format = True
+     rgb_output.file_slots[0].save_as_render = True
+
+     # Enable the Z (Depth) pass
+     bpy.context.view_layer.use_pass_z = True
+     bpy.context.scene.view_layers["ViewLayer"].use_pass_z = True
+
+     # Link the depth and image passes from the render layers node to the outputs
+     tree.links.new(render_layers.outputs['Depth'], depth_output.inputs[0])
+     tree.links.new(render_layers.outputs['Image'], rgb_output.inputs[0])
+
+     return depth_output, rgb_output
+
+
+ def save_depth_from_exr(filepath, h, w):
+     exr_file = OpenEXR.InputFile(filepath)
+
+     dw = exr_file.header()['dataWindow']
+     size = (dw.max.x - dw.min.x + 1, dw.max.y - dw.min.y + 1)
+     assert size == (w, h), f"Expected {(w, h)}, got {size}"
+
+     pt = Imath.PixelType(Imath.PixelType.FLOAT)
+     channels = exr_file.channels(["R", "G", "B"], pt)
+     depth = [np.frombuffer(c, dtype=np.float32).reshape(size[1], size[0]) for c in channels]
+     assert np.all(depth[0] == depth[1]) and np.all(depth[0] == depth[2])
+     return depth[0]
+
+
+ def bpy_create_directional_light(src: np.ndarray, dst: np.ndarray, energy=5.0, name='Sun'):
+     light_data = bpy.data.lights.new(name=name, type='SUN')
+     light_data.energy = energy
+     light_obj = bpy.data.objects.new(name=name, object_data=light_data)
+     bpy.context.collection.objects.link(light_obj)
+     light_obj.location = (float(src[0]), float(src[1]), float(src[2]))
+
+     direction = dst - src
+     rot_axis = np.cross(np.array([0, 0, -1.]), direction)
+     if np.linalg.norm(rot_axis) < 1e-5:
+         rot_axis = np.array([1., 0, 0])
+     rot_axis /= np.linalg.norm(rot_axis)
+     rot_ang = np.arccos(np.clip(((direction / np.linalg.norm(direction)) * np.array([0, 0, -1.])).sum(), -1, 1))
+     rot_euler = scipy_Rotation.from_rotvec(rot_ang * rot_axis, degrees=False).as_euler('xyz', degrees=False)
+     light_obj.rotation_euler = (float(rot_euler[0]), float(rot_euler[1]), float(rot_euler[2]))
+
+
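`bpy_create_directional_light` above builds an axis-angle rotation that turns the direction (0, 0, -1) onto `dst - src`. The sketch below reproduces that construction in plain Python and checks it with Rodrigues' formula instead of SciPy; all helper names are hypothetical and the direction vector is an arbitrary example.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)


def axis_angle_from_minus_z(direction):
    """Axis and angle of a rotation taking (0, 0, -1) onto `direction`."""
    forward = (0.0, 0.0, -1.0)
    d = normalize(direction)
    axis = cross(forward, d)
    if math.sqrt(dot(axis, axis)) < 1e-5:
        # Parallel or anti-parallel: fall back to the x axis, as the diff does.
        axis = (1.0, 0.0, 0.0)
    axis = normalize(axis)
    angle = math.acos(max(-1.0, min(1.0, dot(d, forward))))
    return axis, angle


def rotate(v, axis, angle):
    """Rodrigues' rotation of v about a unit axis."""
    c, s = math.cos(angle), math.sin(angle)
    kxv = cross(axis, v)
    kdv = dot(axis, v)
    return tuple(v[i] * c + kxv[i] * s + axis[i] * kdv * (1 - c) for i in range(3))


target = (1.0, 2.0, -2.0)
axis, angle = axis_angle_from_minus_z(target)
rotated = rotate((0.0, 0.0, -1.0), axis, angle)  # should equal normalize(target)
```

The fallback axis matters only when the light points straight along +Z or -Z, where the cross product vanishes.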
+ def bpy_render_rgb(cam_object, h, w, num_sample, rgb_node, output_root, out_name):
+     bpy.context.scene.cycles.samples = num_sample
+     bpy.context.scene.render.resolution_x = w
+     bpy.context.scene.render.resolution_y = h
+
+     bpy.context.scene.camera = cam_object
+     rgb_node.base_path = str(output_root)
+     rgb_node.file_slots[0].path = f"image_{out_name}-"
+     bpy.ops.render.render(write_still=True)
+
+
+ def bpy_render_rgbd(cam_object, h, w, num_sample, depth_node, rgb_node, output_root, out_name):
+     bpy.context.scene.cycles.samples = num_sample
+     bpy.context.scene.render.resolution_x = w
+     bpy.context.scene.render.resolution_y = h
+
+     bpy.context.scene.camera = cam_object
+     depth_node.base_path = str(output_root)
+     depth_node.file_slots[0].path = f"depth_{out_name}_"
+     rgb_node.base_path = str(output_root)
+     rgb_node.file_slots[0].path = f"image_{out_name}-"
+     bpy.ops.render.render(write_still=True)
+     exr_path = os.path.join(str(output_root), f"depth_{out_name}_0001.exr")
+     depth_np = save_depth_from_exr(exr_path, h, w)
+     np.save(os.path.join(str(output_root), f"depth_{out_name}.npy"), depth_np)
+     os.remove(exr_path)
+
+
+ def bpy_render_rgb_and_filter_invalid(cam_object, h, w, num_sample, depth_node, rgb_node, output_root, out_name, bkg_color, valid_depth_lb=VALID_DEPTH_LB, valid_depth_ub=VALID_DEPTH_UB, save_depth=False):
+     '''
+     Render RGB-D, then paint pixels with invalid depth in the background color.
+     :param cam_object: Blender camera object to render from
+     :param h: render height in pixels
+     :param w: render width in pixels
+     :param num_sample: number of Cycles samples
+     :param depth_node: compositor file-output node for depth
+     :param rgb_node: compositor file-output node for RGB
+     :param output_root: output directory
+     :param out_name: name used in the output file names
+     :param bkg_color: list of 3 integers, [0, 255]
+     :param valid_depth_lb: lower bound of valid depth
+     :param valid_depth_ub: upper bound of valid depth
+     :param save_depth: if True, keep depth_{out_name}.npy in output_root
+     :return:
+     '''
+     bpy_render_rgbd(cam_object, h, w, num_sample, depth_node, rgb_node, output_root, out_name)
+
+     img_f = pathlib_file(os.path.join(str(output_root), f"image_{out_name}-0001.png"))
+     depth_f = pathlib_file(os.path.join(str(output_root), f"depth_{out_name}.npy"))
+     output_root = pathlib_file(output_root)
+     tmp_dir = output_root.parent / f'{output_root.name}__tmp'
+     tmp_dir.mkdir(parents=True, exist_ok=True)
+
+     img = imread_rgb(img_f)
+     depth = np.load(depth_f)
+     shutil.move(img_f, tmp_dir / img_f.name)
+     if not save_depth:
+         shutil.move(depth_f, tmp_dir / depth_f.name)
+     img[~get_depth_valid(depth, valid_depth_lb, valid_depth_ub)] = np.array(bkg_color)
+     imwrite_rgb(output_root / f'image_{out_name}.png', img)
+     os.remove(tmp_dir / img_f.name)
+     if not save_depth:
+         # the depth file is only moved to tmp_dir when it is not kept
+         os.remove(tmp_dir / depth_f.name)
evalmde/utils/common.py ADDED
@@ -0,0 +1,60 @@
+ from pathlib import Path
+ from datetime import datetime
+
+ import shortuuid
+ from omegaconf import DictConfig
+
+
+ def flatten_dict_cfg(cfg):  # cfg: dict | DictConfig -> DictConfig
+     ret = {}
+     if isinstance(cfg, dict):
+         cfg = DictConfig(cfg)
+     for k, v in cfg.items():
+         if isinstance(v, DictConfig):
+             ret_v = flatten_dict_cfg(v)
+             for _k, _v in ret_v.items():
+                 ret[f'{k}_{_k}'] = _v
+         else:
+             ret[k] = v
+     return DictConfig(ret)
+
+
+ def current_time():
+     current_time = datetime.now()
+     readable_time = current_time.strftime("%Y-%m-%d-%H:%M:%S")
+     return readable_time
+
+
+ def uuid(length=8):
+     """
+     https://github.com/wandb/client/blob/master/wandb/util.py#L677
+     """
+
+     # ~3t run ids (36**8)
+     run_gen = shortuuid.ShortUUID(alphabet=list("0123456789abcdefghijklmnopqrstuvwxyz"))
+     return run_gen.random(length)
+
+
+ def pathlib_file(file_name):
+     if isinstance(file_name, str):
+         file_name = Path(file_name)
+     elif not isinstance(file_name, Path):
+         raise TypeError(f'Please check the type of the filename:{file_name}')
+     return file_name
+
+
+ def assign_item_to_dict(d: dict, ks: list, v):
+     '''
+     run d[ks[0]][ks[1]]...[ks[-1]] = v, creating missing intermediate dicts
+     :param d: dict to modify in place
+     :param ks: non-empty list of nested keys
+     :param v: value to assign
+     :return:
+     '''
+     k = ks[0]
+     if len(ks) == 1:
+         d[k] = v
+     else:
+         if k not in d:
+             d[k] = dict()
+         assign_item_to_dict(d[k], ks[1:], v)
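The nested-assignment and flattening helpers in this file are inverses of a sort: one builds a nested dict from a key path, the other collapses it into `parent_child` keys. A rough sketch of how the two fit together, using plain dicts instead of `DictConfig`; the names `assign_nested` and `flatten`, and the dataset and metric keys, are made up for illustration.

```python
def assign_nested(d, keys, value):
    """Set d[keys[0]][keys[1]]...[keys[-1]] = value, creating missing dicts."""
    for k in keys[:-1]:
        d = d.setdefault(k, {})
    d[keys[-1]] = value


def flatten(cfg, sep='_'):
    """Flatten nested dicts by joining parent and child keys with `sep`."""
    flat = {}
    for k, v in cfg.items():
        if isinstance(v, dict):
            for sub_k, sub_v in flatten(v, sep).items():
                flat[f'{k}{sep}{sub_k}'] = sub_v
        else:
            flat[k] = v
    return flat


metrics = {}
assign_nested(metrics, ['nyu', 'abs_rel'], 0.05)  # illustrative keys and values
assign_nested(metrics, ['nyu', 'delta1'], 0.97)
flat = flatten(metrics)
```

Note that joining with `_` can collide when keys themselves contain underscores; the sketch does not guard against that.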
evalmde/utils/constants.py ADDED
@@ -0,0 +1,2 @@
+ VALID_DEPTH_LB = 1e-2
+ VALID_DEPTH_UB = 1e4
evalmde/utils/depth.py ADDED
@@ -0,0 +1,132 @@
+ from typing import Tuple
+
+ import numpy as np
+ import torch
+
+ from evalmde.utils.constants import VALID_DEPTH_LB, VALID_DEPTH_UB
+ from evalmde.utils.torch import reformat_as_torch_tensor
+
+
+ def align_depth_least_square(
+     gt_arr: np.ndarray,
+     pred_arr: np.ndarray,
+     valid_mask_arr: np.ndarray,
+     return_scale_shift=True,
+     max_resolution=None,
+ ):
+     # https://github.com/prs-eth/Marigold/blob/62413d56099d36573b2de1eb8/src/util/alignment.py#L8
+     ori_shape = pred_arr.shape  # input shape
+
+     gt = gt_arr.squeeze()  # [H, W]
+     pred = pred_arr.squeeze()
+     valid_mask = valid_mask_arr.squeeze()
+
+     # Downsample
+     if max_resolution is not None:
+         scale_factor = np.min(max_resolution / np.array(ori_shape[-2:]))
+         if scale_factor < 1:
+             downscaler = torch.nn.Upsample(scale_factor=scale_factor, mode="nearest")
+             gt = downscaler(torch.as_tensor(gt).unsqueeze(0)).numpy()
+             pred = downscaler(torch.as_tensor(pred).unsqueeze(0)).numpy()
+             valid_mask = (
+                 downscaler(torch.as_tensor(valid_mask).unsqueeze(0).float())
+                 .bool()
+                 .numpy()
+             )
+
+     assert (
+         gt.shape == pred.shape == valid_mask.shape
+     ), f"{gt.shape}, {pred.shape}, {valid_mask.shape}"
+
+     gt_masked = gt[valid_mask].reshape((-1, 1))
+     pred_masked = pred[valid_mask].reshape((-1, 1))
+
+     # numpy solver
+     _ones = np.ones_like(pred_masked)
+     A = np.concatenate([pred_masked, _ones], axis=-1)
+     X = np.linalg.lstsq(A, gt_masked, rcond=None)[0]
+     scale, shift = X
+
+     aligned_pred = pred_arr * scale + shift
+
+     # restore dimensions
+     aligned_pred = aligned_pred.reshape(ori_shape)
+
+     if return_scale_shift:
+         return aligned_pred, scale, shift
+     else:
+         return aligned_pred
+
+
+ def align_affine_lstsq(x: torch.Tensor, y: torch.Tensor, w: torch.Tensor = None) -> Tuple[torch.Tensor, torch.Tensor]:
+     # https://github.com/microsoft/MoGe/blob/a8c37341bc0325ca99b9d57981cc3bb2bd3e255b/moge/utils/alignment.py#L399
+     """
+     Solve `min sum_i w_i * (a * x_i + b - y_i ) ^ 2`, where `a` and `b` are scalars, with respect to `a` and `b` using least squares.
+
+     ### Parameters:
+     - `x: torch.Tensor` of shape (..., N)
+     - `y: torch.Tensor` of shape (..., N)
+     - `w: torch.Tensor` of shape (..., N)
+
+     ### Returns:
+     - `a: torch.Tensor` of shape (...,)
+     - `b: torch.Tensor` of shape (...,)
+     """
+     w_sqrt = torch.ones_like(x) if w is None else w.sqrt()
+     # weight both columns so the residual is sqrt(w_i) * (a * x_i + b - y_i); unchanged when w is None
+     A = torch.stack([w_sqrt * x, w_sqrt], dim=-1)
+     B = (w_sqrt * y)[..., None]
+     a, b = torch.linalg.lstsq(A, B)[0].squeeze(-1).unbind(-1)
+     return a, b
+
+
+ def get_depth_valid(depth, valid_depth_lb=VALID_DEPTH_LB, valid_depth_ub=VALID_DEPTH_UB):
+     if isinstance(depth, np.ndarray):
+         return (~np.isnan(depth)) & (~np.isinf(depth)) & (depth >= valid_depth_lb) & (depth <= valid_depth_ub)
+     elif isinstance(depth, torch.Tensor):
+         return (~torch.isnan(depth)) & (~torch.isinf(depth)) & (depth >= valid_depth_lb) & (depth <= valid_depth_ub)
+     else:
+         raise ValueError(f'{type(depth)=}')
+
+
+ def load_data(depth_f, as_torch=False):
+     data = np.load(depth_f)
+     depth, intr, valid = data['depth'], data['intr'], data['valid']
+     depth[~valid] = 1
+     if as_torch:
+         depth = reformat_as_torch_tensor(depth)
+         intr = reformat_as_torch_tensor(intr)
+         valid = reformat_as_torch_tensor(valid)
+     return depth, intr, valid
+
+
+ def align(pred, gt, gt_valid, method, return_align_param=False, eps=1e-4):
+     if method == 'no':
+         if return_align_param:
+             return pred, None
+         return pred
+
+     if method == 'depth_affine_lst_sq_clip_by_0':
+         # pred: affine-invariant depth
+         # gt: gt depth
+         # return: aligned depth
+         ret, scale, shift = align_depth_least_square(gt.cpu().numpy(), pred.cpu().numpy(), gt_valid.cpu().numpy())
+         ret = torch.from_numpy(ret).to(device=pred.device, dtype=pred.dtype).clamp_min(eps)
+         if return_align_param:
+             return ret, (float(scale), float(shift))
+         return ret
+
+     if method in ['disparity_affine', 'disparity_affine_clip_by_0']:
+         # pred: predicted affine-invariant disparity
+         # gt: gt depth
+         # return: aligned depth
+         scale, shift = align_affine_lstsq(pred[gt_valid], 1 / gt[gt_valid])
+         pred_disp = pred * scale + shift
+         if method == 'disparity_affine':
+             ret = 1 / pred_disp.clamp_min(1 / gt[gt_valid].max().item())
+         else:
+             ret = 1 / pred_disp.clamp_min(eps)
+         if return_align_param:
+             return ret, (float(scale), float(shift))
+         return ret
+
+     raise NotImplementedError(f'{method=}')
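Both alignment routines above fit a scale and a shift by least squares. For two scalar unknowns the normal equations have a closed form, which the sketch below spells out in plain Python as an illustration of the fitted objective; the function name `fit_scale_shift` and the sample values are assumptions, not part of this repository.

```python
def fit_scale_shift(x, y, w=None):
    """Closed-form minimiser of sum_i w_i * (a * x_i + b - y_i)^2 over scalars a, b."""
    if w is None:
        w = [1.0] * len(x)
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx * sx  # zero when all x values are identical
    a = (sw * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b


# Synthetic check: the "ground truth" is an exact affine function of the prediction.
pred = [0.1, 0.4, 0.5, 0.9]
gt = [2.0 * p + 0.5 for p in pred]
scale, shift = fit_scale_shift(pred, gt)
aligned = [scale * p + shift for p in pred]
```

In the disparity variants of `align`, the same fit is applied to `1 / gt` rather than to `gt`, and the result is inverted back to depth afterwards.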
evalmde/utils/depth_to_mesh.py ADDED
@@ -0,0 +1,150 @@
+ import numpy as np
+ import open3d as o3d
+ import trimesh
+
+ from evalmde.utils.proj import depth_to_xyz, apply_SE3
+
+
+ def gen_triangle_v_idx(H, W):
+     pxl_idx = np.arange(H * W).reshape(H, W)
+     triangle_v_idx = np.stack([
+         np.stack([pxl_idx[:-1, :-1], pxl_idx[1:, :-1], pxl_idx[:-1, 1:]], axis=-1),  # (H - 1, W - 1, 3)
+         np.stack([pxl_idx[1:, 1:], pxl_idx[:-1, 1:], pxl_idx[1:, :-1]], axis=-1),  # (H - 1, W - 1, 3)
+     ], axis=-2)  # (H - 1, W - 1, 2, 3)
+     return triangle_v_idx
+
+
+ def gen_trimesh_mesh(vs, cs, triangles):
+     mesh_vertices = o3d.utility.Vector3dVector(vs.reshape(-1, 3))
+     mesh_faces = o3d.utility.Vector3iVector(triangles)
+     mesh = o3d.geometry.TriangleMesh(mesh_vertices, mesh_faces)
+     mesh.compute_vertex_normals()
+
+     trimesh_mesh = trimesh.Trimesh(
+         vertices=np.asarray(mesh.vertices),
+         faces=np.asarray(mesh.triangles),
+         vertex_normals=np.asarray(mesh.vertex_normals),
+         vertex_colors=cs.reshape(-1, 3),
+         process=False
+     )
+     material = trimesh.visual.material.PBRMaterial(
+         vertexColors=True,
+         doubleSided=True
+     )
+     trimesh_mesh.visual.material = material
+     return trimesh_mesh
+
+
+ def concatenate_mesh_data(mesh_datas):
+     n = 0
+     vs, cs, fs = [], [], []
+     for v, c, f in mesh_datas:
+         vs.append(v)
+         cs.append(c)
+         fs.append(f + n)
+         n += v.shape[0]
+     return np.concatenate(vs, axis=0), np.concatenate(cs, axis=0), np.concatenate(fs, axis=0)
+
+
+ def gen_mesh_and_pcd(intr, depth, depth_valid, SE3=np.eye(4), rgb=None, valid_triangle=None, crop_region=None):
+     '''
+     :param intr: shape (4,)
+     :param depth: shape (H, W)
+     :param depth_valid: shape (H, W), bool mask of valid depth pixels
+     :param SE3: shape (4, 4), points coords: apply_SE3(SE3, depth_to_xyz(intr, depth))
+     :param rgb:
+         if rgb.dtype == np.uint8:
+             use rgb / 255
+         else:
+             assert rgb.dtype == np.float32
+             use rgb
+     :param valid_triangle:
+     :param crop_region: [lb_i, ub_i, lb_j, ub_j]
+     :return:
+     '''
+     depth = depth.astype(np.float32)
+     SE3 = SE3.astype(np.float32)
+
+     H, W = depth.shape
+
+     if crop_region is not None and len(crop_region) > 0:
+         lb_i, ub_i, lb_j, ub_j = crop_region
+         region_valid = np.zeros_like(depth_valid)
+         region_valid[lb_i:ub_i, lb_j:ub_j] = True
+         depth_valid = depth_valid & region_valid
+
+     xyz = apply_SE3(SE3, depth_to_xyz(intr, depth))
+
+     # create triangles
+     triangle_v_idx = gen_triangle_v_idx(H, W)
+
+     # compute validity based on xyz validity
+     valid_flattened = depth_valid.reshape(-1)
+     xyz_flattened = xyz.reshape(-1, 3)
+     valid_triangle_vertex = \
+         valid_flattened[triangle_v_idx[..., 0]] & \
+         valid_flattened[triangle_v_idx[..., 1]] & \
+         valid_flattened[triangle_v_idx[..., 2]]  # (H - 1, W - 1, 2)
+     if valid_triangle is None:
+         valid_triangle = valid_triangle_vertex
+     else:
+         valid_triangle = valid_triangle_vertex & valid_triangle
+
+     if rgb is None:
+         vertex_colors = .7 * np.ones_like(xyz_flattened)
+     else:
+         if rgb.dtype == np.uint8:
+             vertex_colors = rgb.reshape(-1, 3).astype(np.float32) / 255.
+         else:
+             assert rgb.dtype == np.float32
+             vertex_colors = rgb.reshape(-1, 3)
+
+     pxl_displayed = np.zeros((H, W), dtype=np.bool_)
+     pxl_displayed[:-1, :-1] |= valid_triangle[..., 0]
+     pxl_displayed[1:, :-1] |= valid_triangle[..., 0]
+     pxl_displayed[:-1, 1:] |= valid_triangle[..., 0]
+     pxl_displayed[1:, 1:] |= valid_triangle[..., 1]
+     pxl_displayed[1:, :-1] |= valid_triangle[..., 1]
+     pxl_displayed[:-1, 1:] |= valid_triangle[..., 1]
+     invisible_to_display = depth_valid & (~pxl_displayed)
+
+     def get_up_xyz(depth):
+         fx, fy, cx, cy = intr[0], intr[1], intr[2], intr[3]
+         v, u = np.meshgrid(np.arange(depth.shape[0]), np.arange(depth.shape[1]), indexing='ij')
+         up_xyz = apply_SE3(SE3, np.stack([
+             np.stack([((u - 1) - cx) / fx * depth, ((v - 1) - cy) / fy * depth, depth], axis=-1),
+             np.stack([((u + 1) - cx) / fx * depth, ((v - 1) - cy) / fy * depth, depth], axis=-1),
+             np.stack([((u - 1) - cx) / fx * depth, ((v + 1) - cy) / fy * depth, depth], axis=-1),
+             np.stack([((u + 1) - cx) / fx * depth, ((v + 1) - cy) / fy * depth, depth], axis=-1),
+         ], axis=-2).reshape(H, W, 2, 2, 3))
+         return up_xyz
+
+     depth_range = 1 / (.5 * (intr[0] + intr[1]))
+     up_xyz_fnt = get_up_xyz((1 - depth_range) * depth)
+     up_xyz_bck = get_up_xyz((1 + depth_range) * depth)
+
+     up_xyz = np.stack([up_xyz_fnt, up_xyz_bck], axis=2).reshape(H, W, 8, 3)  # (H, W, 8, 3)
+     up_vertex_idx = np.arange(H * W * 8).reshape(H, W, 8)
+     up_triangles_to_stack = []
+     for v1, v2, v3, v4 in [
+         [0, 2, 3, 1],
+         [0, 4, 6, 2],
+         [0, 1, 5, 4],
+         [7, 5, 1, 3],
+         [7, 3, 2, 6],
+         [7, 6, 4, 5],
+     ]:
+         up_triangles_to_stack.append(up_vertex_idx[..., [v1, v2, v3]])
+         up_triangles_to_stack.append(up_vertex_idx[..., [v3, v4, v1]])
+     up_triangles = np.stack(up_triangles_to_stack, axis=-2)  # (H, W, -1, 3)
+     up_vertex_colors = np.repeat(vertex_colors.reshape(H, W, 1, 3), 8, axis=-2).reshape(-1, 3)
+
+     xyz_flattened[~valid_flattened] = 0
+     up_xyz[~depth_valid] = 0
+
+     trimesh_mesh = gen_trimesh_mesh(*concatenate_mesh_data([
+         (xyz_flattened, vertex_colors, triangle_v_idx[valid_triangle]),
+         (up_xyz.reshape(-1, 3), up_vertex_colors, up_triangles[invisible_to_display].reshape(-1, 3))
+     ]))
+
+     pcd = gen_trimesh_mesh(up_xyz, up_vertex_colors, up_triangles[depth_valid].reshape(-1, 3))
+     return trimesh_mesh, pcd
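The mesh construction above splits every 2x2 pixel quad into two triangles and keeps a triangle only if all three of its vertices are valid. A loop-based sketch of that indexing in plain Python may make the vertex ordering easier to follow; `grid_triangles` and `valid_triangles` are hypothetical names, and the validity mask is an arbitrary example.

```python
def grid_triangles(H, W):
    """Two triangles per 2x2 pixel quad, as flat (row-major) vertex indices."""
    tris = []
    for i in range(H - 1):
        for j in range(W - 1):
            tl, tr = i * W + j, i * W + j + 1
            bl, br = (i + 1) * W + j, (i + 1) * W + j + 1
            tris.append((tl, bl, tr))  # same vertex order as the first stacked triangle
            tris.append((br, tr, bl))  # same vertex order as the second stacked triangle
    return tris


def valid_triangles(tris, valid):
    """Keep triangles whose three vertices are all valid (valid is a flat list of bools)."""
    return [t for t in tris if all(valid[v] for v in t)]


tris = grid_triangles(2, 3)
# Pixel 2 (top-right corner) is marked invalid, so both triangles of the right quad drop out.
kept = valid_triangles(tris, [True, True, False, True, True, True])
```

Pixels that are valid but belong to no surviving triangle are the ones `gen_mesh_and_pcd` renders as small boxes instead.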
evalmde/utils/downsample.py ADDED
@@ -0,0 +1,72 @@
+ from typing import List
+
+ import torch
+ import torch.nn.functional as F
+
+ from evalmde.utils.proj import th_uv_grid
+
+
+ def pad(x: torch.Tensor, sc: int) -> torch.Tensor:
+     '''
+     pad x to bottom and right with 0, so that H % sc == 0 and W % sc == 0
+     :param x: shape (H, W, ...)
+     :param sc: int
+     :return: pad_x
+     '''
+     H, W, C_shape = x.shape[0], x.shape[1], x.shape[2:]
+     x = x.reshape(H, W, -1).permute(2, 0, 1)  # (-1, H, W)
+     pad_H = (sc - H % sc) % sc
+     pad_W = (sc - W % sc) % sc
+     x = F.pad(x, (0, pad_W, 0, pad_H), value=0)  # (-1, H', W')
+     return x.permute(1, 2, 0).reshape((x.shape[-2], x.shape[-1]) + C_shape)
+
+
+ def patchify(x: torch.Tensor, sc: int):
+     '''
+     reshape (H, W, ...) to (sc, sc, H / sc, W / sc, ...)
+     :param x: shape (H, W, ...)
+     :param sc: int
+     :return: patched_x
+     '''
+     H, W, C_shape = x.shape[0], x.shape[1], x.shape[2:]
+     assert H % sc == 0 and W % sc == 0, f'can\'t patchify ({x.shape=}, {sc=})'
+     _H, _W = H // sc, W // sc
+     x = x.reshape(_H, sc, _W, sc, -1).permute(1, 3, 0, 2, 4)
+     return x.reshape((sc, sc, _H, _W) + C_shape)
+
+
+ def gather(x: torch.Tensor, idx: torch.Tensor):
+     '''
+     :param x: shape (sc, sc, H / sc, W / sc, ...)
+     :param idx: shape (H / sc, W / sc)
+     :return: x[idx[i,j] // sc, idx[i,j] % sc, i, j, ...]
+     '''
+     sc, _, H, W, C_shape = x.shape[0], x.shape[1], x.shape[2], x.shape[3], x.shape[4:]
+     x = x.reshape(sc * sc, H, W, -1)
+     idx = idx[None, :, :, None].repeat(1, 1, 1, x.shape[-1])  # (1, H / sc, W / sc, -1)
+     return torch.gather(x, 0, idx).reshape((H, W) + C_shape)
+
+
+ def downsample(ds_sc: int, valid: torch.Tensor, tensors: List[torch.Tensor]) -> List[torch.Tensor]:
+     '''
+     :param ds_sc: downsample scale
+     :param valid: (H, W), dtype: torch.bool
+     :param tensors: list of tensors of shape (H, W, ...)
+     :return: [ds_valid, *ds_tensors]
+         ds_valid: (ds_H, ds_W)
+         ds_tensors: list of tensors of shape (ds_H, ds_W, ...)
+     '''
+     tensor_kwargs = dict(device=valid.device, dtype=torch.float)
+     H, W = valid.shape
+     uv = th_uv_grid(H, W, **tensor_kwargs)  # (H, W, 2)
+     uv = patchify(pad(uv, ds_sc), ds_sc)  # (sc, sc, H / sc, W / sc, 2)
+     ds_H, ds_W = uv.shape[2], uv.shape[3]
+     patch_center = th_uv_grid(ds_H, ds_W, **tensor_kwargs) * ds_sc + .5 * (ds_sc - 1)  # (H / sc, W / sc, 2)
+     valid = patchify(pad(valid, ds_sc), ds_sc)  # (sc, sc, H / sc, W / sc)
+     uv_dst = (uv - patch_center[None, None]).norm(dim=-1)  # (sc, sc, H / sc, W / sc)
+     uv_dst[~valid] = torch.inf  # mask out invalid pixels
+     uv_dst = uv_dst.reshape(-1, uv_dst.shape[-2], uv_dst.shape[-1])  # (sc * sc, H / sc, W / sc)
+     ds_pxl = torch.argmin(uv_dst, dim=0)  # (H / sc, W / sc)
+     valid = gather(valid, ds_pxl)
+     tensors = [gather(patchify(pad(x, ds_sc), ds_sc), ds_pxl) for x in tensors]
+     return [valid] + tensors
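`downsample` above keeps, for each `ds_sc` x `ds_sc` patch, the valid pixel closest to the patch centre. The following loop-based sketch shows that selection rule on nested lists. It is a simplification under stated assumptions: the name `downsample_nearest_valid` is hypothetical, and a patch with no valid pixel returns 0.0 here rather than whatever value the tensor version gathers.

```python
import math


def downsample_nearest_valid(values, valid, sc):
    """For each sc x sc patch, keep the valid pixel closest to the patch centre."""
    H, W = len(values), len(values[0])
    out_vals, out_valid = [], []
    for pi in range(math.ceil(H / sc)):
        row_vals, row_valid = [], []
        for pj in range(math.ceil(W / sc)):
            # Patch centre in pixel coordinates, as in `patch_center` above.
            ci, cj = pi * sc + 0.5 * (sc - 1), pj * sc + 0.5 * (sc - 1)
            best, best_dist = None, math.inf
            for i in range(pi * sc, min((pi + 1) * sc, H)):
                for j in range(pj * sc, min((pj + 1) * sc, W)):
                    dist = math.hypot(i - ci, j - cj)
                    if valid[i][j] and dist < best_dist:
                        best, best_dist = (i, j), dist
            row_valid.append(best is not None)
            row_vals.append(values[best[0]][best[1]] if best else 0.0)
        out_vals.append(row_vals)
        out_valid.append(row_valid)
    return out_vals, out_valid


values = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
valid = [[False, True, True, True],
         [True, True, True, True],
         [False, False, True, True],
         [False, False, True, True]]
ds_vals, ds_valid = downsample_nearest_valid(values, valid, 2)
```

Picking an actual pixel, rather than averaging, avoids blending depths across an occlusion boundary.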
evalmde/utils/image.py ADDED
@@ -0,0 +1,45 @@
+ import cv2
+
+ from evalmde.utils.common import pathlib_file
+
+
+ def imread_rgb(img_f):
+     return cv2.imread(str(pathlib_file(img_f)))[..., ::-1].copy()
+
+
+ def imwrite_rgb(img_f, img, verbose=False):
+     img_f = pathlib_file(img_f)
+     img_f.parent.mkdir(parents=True, exist_ok=True)
+     cv2.imwrite(str(img_f), img[..., ::-1])
+     if verbose:
+         print(f'Saved to {img_f.resolve()}')
+
+
+ def resize(img, H=None, W=None, interpolation=cv2.INTER_NEAREST, return_sc=False):
+     '''
+     resize while keeping aspect ratio; if both H and W are specified, the smaller of the two scale factors is used
+     :param img: shape (cur_H, cur_W, ...)
+     :param H: target height
+     :param W: target width
+     :param interpolation: cv2 interpolation flag
+     :param return_sc: if True, also return the height scale factor
+     :return: img, or (img, sc) if return_sc
+     '''
+     cur_H, cur_W = img.shape[:2]
+     if (H is not None) and (W is not None):
+         H = int(H)
+         W = int(W)
+         if H / cur_H < W / cur_W:
+             W = None
+         else:
+             H = None
+     if H is not None:
+         H = int(H)
+         img = cv2.resize(img, (int(img.shape[1] / img.shape[0] * H), H), interpolation=interpolation)
+     if W is not None:
+         W = int(W)
+         img = cv2.resize(img, (W, int(img.shape[0] / img.shape[1] * W)), interpolation=interpolation)
+     if return_sc:
+         sc = img.shape[0] / cur_H
+         return img, sc
+     return img
evalmde/utils/np_and_th.py ADDED
@@ -0,0 +1,27 @@
+ import numpy as np
+ import torch
+
+
+ def get_shifted_data(data, di, dj):
+     # returns shifted_data with shifted_data[i, j] = data[i + di, j + dj], zero-filled where out of bounds
+     H, W = data.shape
+     shifted_data = data[max(di, 0): H + min(di, 0), max(dj, 0): W + min(dj, 0)]
+     if isinstance(data, np.ndarray):
+         if di < 0:
+             shifted_data = np.concatenate([np.zeros_like(shifted_data[di:]), shifted_data], axis=0)
+         if di > 0:
+             shifted_data = np.concatenate([shifted_data, np.zeros_like(shifted_data[:di])], axis=0)
+         if dj < 0:
+             shifted_data = np.concatenate([np.zeros_like(shifted_data[:, dj:]), shifted_data], axis=1)
+         if dj > 0:
+             shifted_data = np.concatenate([shifted_data, np.zeros_like(shifted_data[:, :dj])], axis=1)
+     elif isinstance(data, torch.Tensor):
+         shifted_data = data[max(di, 0): H + min(di, 0), max(dj, 0): W + min(dj, 0)]
+         if di < 0:
+             shifted_data = torch.cat([torch.zeros_like(shifted_data[di:]), shifted_data], dim=0)
+         if di > 0:
+             shifted_data = torch.cat([shifted_data, torch.zeros_like(shifted_data[:di])], dim=0)
+         if dj < 0:
+             shifted_data = torch.cat([torch.zeros_like(shifted_data[:, dj:]), shifted_data], dim=1)
+         if dj > 0:
+             shifted_data = torch.cat([shifted_data, torch.zeros_like(shifted_data[:, :dj])], dim=1)
+     return shifted_data
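The slicing and concatenation in `get_shifted_data` amount to reading each pixel's neighbour at offset `(di, dj)` and filling with zeros at the border. A small list-based sketch of that reading of the code, for small offsets; `shift2d` is a hypothetical name.

```python
def shift2d(data, di, dj, fill=0):
    """Return out with out[i][j] = data[i + di][j + dj], or `fill` when out of bounds."""
    H, W = len(data), len(data[0])
    return [
        [data[i + di][j + dj] if 0 <= i + di < H and 0 <= j + dj < W else fill
         for j in range(W)]
        for i in range(H)
    ]


grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
lower_right_neighbor = shift2d(grid, 1, 1)   # each pixel sees the value at (i + 1, j + 1)
upper_neighbor = shift2d(grid, -1, 0)        # each pixel sees the value at (i - 1, j)
```

Looping `di` and `dj` over `[-1, 0, 1]` therefore visits the full 3x3 neighbourhood of every pixel, which is how the contour renderer fills invalid pixels with neighbour means.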
evalmde/utils/proj.py ADDED
@@ -0,0 +1,41 @@
+ import numpy as np
+ import torch
+ import torch.nn.functional as F
+
+
+ def th_uv_grid(H: int, W: int, **tensor_kwargs) -> torch.Tensor:
+     '''
+     :param H: int
+     :param W: int
+     :param tensor_kwargs:
+     :return: (H, W, 2)
+     '''
+     # indexing='ij' is the historical default; passing it explicitly avoids the deprecation warning
+     v, u = torch.meshgrid(torch.arange(H).to(**tensor_kwargs), torch.arange(W).to(**tensor_kwargs), indexing='ij')
+     return torch.stack([u, v], dim=-1)
+
+
+ def depth_to_xyz(intr, depth):
+     '''
+     :param intr: shape (4,)
+     :param depth: shape (H, W)
+     :return: shape (H, W, 3)
+     '''
+     fx, fy, cx, cy = intr[0], intr[1], intr[2], intr[3]
+     if isinstance(depth, np.ndarray):
+         v, u = np.meshgrid(np.arange(depth.shape[0]), np.arange(depth.shape[1]), indexing='ij')
+         x = (u - cx) / fx * depth
+         y = (v - cy) / fy * depth
+         return np.stack([x, y, depth], axis=-1)
+     elif isinstance(depth, torch.Tensor):
+         tensor_kwargs = dict(device=depth.device, dtype=depth.dtype)
+         v, u = torch.meshgrid(torch.arange(depth.shape[0]).to(**tensor_kwargs), torch.arange(depth.shape[1]).to(**tensor_kwargs), indexing='ij')
+         x = (u - cx) / fx * depth
+         y = (v - cy) / fy * depth
+         return torch.stack([x, y, depth], dim=-1)
+     else:
+         raise ValueError(f'{type(depth)=}')
+
+
+ def apply_SE3(SE3, pnt):
+     assert SE3.shape == (4, 4) and pnt.shape[-1] == 3
+     return (SE3[:3, :3] @ pnt[..., None])[..., 0] + SE3[:3, -1]
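`depth_to_xyz` above is the standard pinhole back-projection, with `intr` ordered as `(fx, fy, cx, cy)`. A single-pixel sketch with a round trip through the forward projection shows the two are inverses; `backproject` and `project` are hypothetical names and the intrinsics are example values.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Pixel (u, v) at the given depth -> camera-space point (x, y, z)."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return x, y, depth


def project(x, y, z, fx, fy, cx, cy):
    """Camera-space point -> pixel coordinates (u, v)."""
    return fx * x / z + cx, fy * y / z + cy


point = backproject(u=400, v=300, depth=2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
pixel = project(*point, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Here `u` indexes columns and `v` indexes rows, matching the `(v, u)` meshgrid order used in the file.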
evalmde/utils/torch.py ADDED
@@ -0,0 +1,26 @@
+ from typing import List
+
+ import numpy as np
+ import torch
+ import torch.nn.functional as F
+
+
+ @torch.no_grad()
+ def get_angle_between(n1: torch.Tensor, n2: torch.Tensor) -> torch.Tensor:
+     '''
+     :param n1: shape (..., 3), norm > 0
+     :param n2: shape (..., 3), norm > 0
+     :return: shape (...), in radians
+     '''
+     return torch.acos((F.normalize(n1, dim=-1) * F.normalize(n2, dim=-1)).sum(dim=-1).clamp(-1, 1))
+
+
+ def reformat_as_torch_tensor(x, device=torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')):
+     if isinstance(x, List):
+         return torch.tensor(x, device=device)
+     elif isinstance(x, np.ndarray):
+         return torch.from_numpy(x).to(device=device)
+     elif isinstance(x, torch.Tensor):
+         return x.to(device=device)
+     else:
+         raise ValueError(f'Unsupported type: {type(x)}')
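`get_angle_between` above normalizes both vectors, takes the dot product, clamps it to [-1, 1], and applies `acos`. The clamp guards against rounding pushing the cosine slightly outside the valid domain. A scalar sketch of the same steps, with a hypothetical name `angle_between`:

```python
import math


def angle_between(n1, n2):
    """Angle in radians between two non-zero 3-vectors."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm1 = math.sqrt(sum(a * a for a in n1))
    norm2 = math.sqrt(sum(b * b for b in n2))
    # Clamp so that rounding error cannot take the cosine outside acos's domain.
    cos = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.acos(cos)


right_angle = angle_between((1.0, 0.0, 0.0), (0.0, 3.0, 0.0))
same_dir = angle_between((1.0, 1.0, 1.0), (2.0, 2.0, 2.0))
```

Because `acos` is steep near 1, nearly parallel vectors can report a tiny non-zero angle; comparisons against zero should use a tolerance.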
evalmde/visualization/__init__.py ADDED
@@ -0,0 +1,14 @@
+ import numpy as np
+
+
+ ROT_LIGHT_NUM_LIGHT = 30
+ ROT_LIGHT_NUM_LOOP = 3
+
+
+ def gen_rot_light__light_pos(num_light, num_loop):
+     theta = np.linspace(0, np.pi, num_light)
+     phi = np.linspace(0, 2 * np.pi * num_loop, num_light)
+     x = np.sin(theta) * np.cos(phi)
+     z = np.sin(theta) * np.sin(phi)
+     y = np.cos(theta)
+     return np.stack([x, y, z], axis=-1)
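`gen_rot_light__light_pos` above samples a spiral on the unit sphere: the polar angle sweeps from the +Y pole to the -Y pole while the azimuth winds `num_loop` times around the Y axis. A NumPy-free sketch of the same parameterisation, assuming `num_light >= 2`; the name `rot_light_positions` is hypothetical.

```python
import math


def rot_light_positions(num_light, num_loop):
    """Spiral of unit vectors from (0, 1, 0) to (0, -1, 0), winding num_loop times."""
    positions = []
    for k in range(num_light):
        t = k / (num_light - 1)          # same spacing as np.linspace(0, 1, num_light)
        theta = math.pi * t              # polar angle measured from +Y
        phi = 2 * math.pi * num_loop * t
        positions.append((math.sin(theta) * math.cos(phi),
                          math.cos(theta),
                          math.sin(theta) * math.sin(phi)))
    return positions


lights = rot_light_positions(30, 3)
```

Every sample has unit length, so the result can be scaled by any radius to place lights around a scene.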
evalmde/visualization/cfg.py ADDED
@@ -0,0 +1,54 @@
+ import numpy as np
+
+ from evalmde.utils.common import uuid, pathlib_file
+ from evalmde.utils.image import imread_rgb
+
+
+ def get_intermediate_mesh_f(args):
+     if args.mesh_dir:
+         return args.mesh_dir / f'mesh_{uuid(12)}.glb'
+     return args.root / f'mesh_{uuid(12)}.glb'
+
+
+ def get_vis_root(args):
+     root = args.root
+     valid_triangle_name = 'none'
+     if args.valid_triangle_f:
+         # strip the 4-character extension here, before any suffix is appended
+         valid_triangle_name = str((root / args.valid_triangle_f).resolve().relative_to(root.resolve()))[:-4]
+     if args.filter_quad:
+         valid_triangle_name = valid_triangle_name + '--filter_quad'
+     return pathlib_file(root) / 'visualization' / valid_triangle_name / str((root / args.depth_f).resolve().relative_to(root.resolve()))[:-4].replace('/', '_')
+
+
+ def get_crop_region(args):
+     if len(args.crop_region) == 0:
+         return []
+     elif len(args.crop_region) == 4:
+         return args.crop_region
+     else:
+         print(f'Warning: invalid length of crop region (expected 4), {args.crop_region=}. Using [] instead.')
+         return []
+
+
+ def get_mesh_vertex_col(args, img_shape):
+     '''
+     :param args:
+     :param img_shape: (H, W), used for the gray fallback when no rgb is given
+     :return: in [0, 1]
+     '''
+     if getattr(args, 'rgb_f', None):
+         rgb = imread_rgb(args.root / args.rgb_f).astype(np.float32) / 255
+     else:
+         rgb = .7 * np.ones(img_shape + (3,), dtype=np.float32)
+         print('no rgb, use gray')
+     return rgb
+
+
+ def get_valid_triangle(args, img_shape):
+     if getattr(args, 'valid_triangle_f', None):
+         ret = np.load(args.root / args.valid_triangle_f)['valid_triangle']
+         if args.filter_quad:
+             ret[..., 0] &= ret[..., 1]
+             ret[..., 1] &= ret[..., 0]
+         return ret
+     else:
+         return np.ones((img_shape[0] - 1, img_shape[1] - 1, 2), dtype=np.bool_)
evalmde/visualization/render_contour_line.py ADDED
@@ -0,0 +1,256 @@
+ import argparse
+ import math
+ from pathlib import Path
+ import json
+
+ from PIL import Image
+ import cv2
+ import torch
+ from torchvision import transforms as torch_trans
+ import numpy as np
+
+ from evalmde.utils.proj import depth_to_xyz
+ from evalmde.utils.common import assign_item_to_dict, pathlib_file
+ from evalmde.utils.image import resize
+ from evalmde.utils.image import imread_rgb
+ from evalmde.utils.depth import load_data
+ from evalmde.utils.np_and_th import get_shifted_data
+
+
+ @torch.no_grad()
+ def compute_grid_lb_ub(data, i, j):
+     '''
+     .           .           .
+
+           -------------
+           |(0,0)|(0,1)|
+     .     ------.------     .
+           |(1,0)|(1,1)|
+           -------------
+
+     .           .           .
+     '''
+     if i == 0 and j == 0:
+         x00 = .25 * (data[:-1, :-1] + data[:-1, 1:] + data[1:, :-1] + data[1:, 1:])
+         x01 = .5 * (data[:-1, 1:] + data[1:, 1:])
+         x10 = .5 * (data[1:, :-1] + data[1:, 1:])
+         x11 = 1. * data[1:, 1:]
+     elif i == 0 and j == 1:
+         x00 = .5 * (data[:-1, :-1] + data[1:, :-1])
+         x01 = .25 * (data[:-1, :-1] + data[:-1, 1:] + data[1:, :-1] + data[1:, 1:])
+         x10 = 1. * data[1:, :-1]
+         x11 = .5 * (data[1:, :-1] + data[1:, 1:])
+     elif i == 1 and j == 0:
+         x00 = .5 * (data[:-1, :-1] + data[:-1, 1:])
+         x01 = 1. * data[:-1, 1:]
+         x10 = .25 * (data[:-1, :-1] + data[:-1, 1:] + data[1:, :-1] + data[1:, 1:])
+         x11 = .5 * (data[:-1, 1:] + data[1:, 1:])
+     else:
+         x00 = 1. * data[:-1, :-1]
+         x01 = .5 * (data[:-1, :-1] + data[:-1, 1:])
+         x10 = .5 * (data[:-1, :-1] + data[1:, :-1])
+         x11 = .25 * (data[:-1, :-1] + data[:-1, 1:] + data[1:, :-1] + data[1:, 1:])
+     x = torch.stack([x00, x01, x10, x11], dim=-1)
+     lb, ub = x.min(dim=-1).values, x.max(dim=-1).values  # (H - 1, W - 1), (H - 1, W - 1)
+     del x
+     return lb, ub
+
+
+ @torch.no_grad()
+ def compute_high_res_idx(high_res_shape, data_low_res, valid, valid_high_res, gap, val_lb):
+     '''
+     :param high_res_shape: (Hu, Wu)
+     :param data_low_res: shape (Hl, Wl)
+     :param valid_high_res: shape (Hu, Wu)
+     :param gap: spacing between adjacent contour levels
+     :param val_lb: value of contour level 0 (level k lies at val_lb + k * gap)
+     :return: res_high_res
+         res_high_res: shape (Hu, Wu)
+     '''
+     Hu, Wu = high_res_shape
+
+     # fill invalid pixels with neighbor means
+     data_low_res = data_low_res.clone()
+     nb_data_sum = torch.zeros_like(data_low_res)
+     nb_data_cnt = torch.zeros_like(data_low_res)
+     for di in [-1, 0, 1]:
+         for dj in [-1, 0, 1]:
+             nb_valid = get_shifted_data(valid, di, dj)
+             nb_data = get_shifted_data(data_low_res, di, dj)
+             nb_data_sum[nb_valid] += nb_data[nb_valid]
+             nb_data_cnt[nb_valid] += 1
+     nb_data_sum[nb_data_cnt < .5] = 0
+     data_low_res[~valid] = (nb_data_sum / nb_data_cnt.clamp(min=1))[~valid]
+
+     data_high_res = torch_trans.functional.resize(data_low_res[None], (Hu, Wu), torch_trans.InterpolationMode.BILINEAR)[0]
+     res_high_res = -torch.ones((Hu, Wu), dtype=torch.int32, device=data_high_res.device)
+
+     for i in range(2):
+         for j in range(2):
+             lb, ub = compute_grid_lb_ub(data_high_res, i, j)
+             lb_i = torch.clip(torch.ceil((lb - val_lb) / gap), min=0).to(res_high_res.dtype)
+             ub_i = torch.clip(torch.floor((ub - val_lb) / gap), max=2e9).to(res_high_res.dtype)
+
+             multi_line_mask = (lb_i < ub_i) | ((lb_i == ub_i) & (res_high_res[1 - i: Hu - i, 1 - j: Wu - j] != -1))
+             single_line_mask = (lb_i == ub_i) & (res_high_res[1 - i: Hu - i, 1 - j: Wu - j] == -1)
+
+             res_high_res[1 - i: Hu - i, 1 - j: Wu - j][single_line_mask] = lb_i[single_line_mask]
+
+             multi_line_upd_idx = torch.clip(torch.round((data_high_res[1 - i: Hu - i, 1 - j: Wu - j] - val_lb) / gap), min=0, max=2e9).to(res_high_res.dtype)
+             multi_line_upd_idx = torch.where(multi_line_upd_idx < lb_i, lb_i, multi_line_upd_idx)
+             multi_line_upd_idx = torch.where(multi_line_upd_idx > ub_i, ub_i, multi_line_upd_idx)
+             multi_line_upd_mask = ((res_high_res[1 - i: Hu - i, 1 - j: Wu - j] == -1) | (
+                 torch.abs(data_high_res[1 - i: Hu - i, 1 - j: Wu - j] - (res_high_res[1 - i: Hu - i, 1 - j: Wu - j] * gap + val_lb)) >
+                 torch.abs(data_high_res[1 - i: Hu - i, 1 - j: Wu - j] - (multi_line_upd_idx * gap + val_lb))
+             )) & multi_line_mask
+             res_high_res[1 - i: Hu - i, 1 - j: Wu - j][multi_line_upd_mask] = multi_line_upd_idx[multi_line_upd_mask]
+     res_high_res[~valid_high_res] = -1
+     return res_high_res
+
+
+ def get_contour_line_gap(data: torch.Tensor, valid: torch.Tensor, num_gap, qt):
+     if not valid.any():
+         return 1
+     qt_lb = data[valid].quantile(qt).item()
+     qt_ub = data[valid].quantile(1 - qt).item()
+     gap = (qt_ub - qt_lb) / (num_gap * (1 - qt * 2))
+     return gap
+
+
+ @torch.no_grad()
+ def gen_contour_line(rgb_high_res, data, valid, valid_high_res, is_z, num_gap, shift, thickness=0, qt=0.05, colormap=cv2.COLORMAP_JET):
+     device = 'cuda' if torch.cuda.is_available() else 'cpu'
+     data = torch.from_numpy(data).to(device)
+     valid = torch.from_numpy(valid).to(device)
+     valid_high_res = torch.from_numpy(valid_high_res).to(device)
+     if is_z:
+         data = 1 / data
+
+     gap = get_contour_line_gap(data, valid, num_gap, qt)
+     data_lb = data[valid].min().item() if valid.any() else 0
+     val_lb = data_lb + gap * shift
+
+     high_res_shape = rgb_high_res.shape[:2]
+     res_high_res = compute_high_res_idx(high_res_shape, data, valid, valid_high_res, gap, val_lb)
+
+     res = res_high_res.clone()
+     dlt_rng = int(math.floor(thickness))
+     for di in range(-dlt_rng, dlt_rng + 1):
+         for dj in range(-dlt_rng, dlt_rng + 1):
+             if di * di + dj * dj > thickness * thickness:
+                 continue
+             nb_res = get_shifted_data(res_high_res, di, dj)
+             upd_mask = get_shifted_data(valid_high_res, di, dj) & (res == -1) & (nb_res != -1) & valid_high_res
+             res[upd_mask] = nb_res[upd_mask]
+
+     if (res != -1).any():
+         res[res != -1] -= res[res != -1].min()
+     res = res.cpu().numpy()
+     num_val = max(2, res.max().item() + 1)
+     valid_high_res = valid_high_res.cpu().numpy()
+
+     base_col = cv2.applyColorMap(np.arange(256, dtype=np.uint8)[None], colormap)[0].astype(np.float32)  # (256, 3)
+     idx = np.arange(num_val, dtype=np.float32) / (num_val - 1) * 255  # (itr,)
+     idx_lb = np.floor(idx).astype(np.int32)  # (itr,)
+     coef_lb = (idx_lb.astype(np.float32) + 1 - idx)[:, None]  # (itr, 1)
+     col = base_col[idx_lb] * coef_lb + base_col[np.clip(idx_lb + 1, a_min=None, a_max=255)] * (1 - coef_lb)  # (itr, 3)
+     col = np.round(col).astype(np.uint8)
+
+     img = np.zeros_like(rgb_high_res)
+     non_colored_mask = valid_high_res & (res == -1)
+     img[non_colored_mask] = rgb_high_res[non_colored_mask]
+     colored_mask = valid_high_res & (res != -1)
+     img[colored_mask] = col[res[colored_mask]]
+     return img, colored_mask, col
+
+
+ def pil_ds(img: np.ndarray, H, W):
+     pil_img = Image.fromarray(img, mode='RGB')
+     pil_img = pil_img.resize((W, H), Image.Resampling.LANCZOS)
+     return np.array(pil_img)
+
+
+ def render_contour_line_imgs(xyz: np.ndarray, valid: np.ndarray, rgb_low_res: np.ndarray, save_shape, out_root):
+     '''
+     :param xyz:
+     :param valid:
+     :param rgb_low_res:
+     :param save_shape: (H, W)
+     :param out_root:
+     :return:
+     '''
+     # hyperparams
+     texture_strength = 0.8
+     draw_dim_lb = 4 * np.linalg.norm([1920, 1080])
+
+     out_root = pathlib_file(out_root)
+
+     dim = np.linalg.norm(rgb_low_res.shape[:2])
+     us_sc = int(math.ceil(draw_dim_lb / dim))
+     us_shape = (us_sc * rgb_low_res.shape[0], us_sc * rgb_low_res.shape[1])
+     rgb_high_res = np.round(texture_strength * cv2.resize(rgb_low_res, (us_shape[1], us_shape[0]))).astype(np.uint8)
+     valid_high_res = torch_trans.functional.resize(torch.from_numpy(valid)[None], rgb_high_res.shape[:2], torch_trans.InterpolationMode.NEAREST_EXACT)[0].numpy()
+
+     summary = {}
+     for thickness in [5 * np.linalg.norm(rgb_high_res.shape[:2]) / (4 * np.linalg.norm([1920, 1080]))]:
+         for rel_num_gap in [0.015, 0.03, 0.06, 0.09, 0.12, 0.24, 0.42, 0.6]:
+             num_gap = int(dim * rel_num_gap)
+             for shift in [0.5]:
+                 imgs, colored_masks, col_maps = {}, {}, {}
+                 for i, name in enumerate(['x', 'y', 'z']):
+                     imgs[name], colored_masks[name], col_maps[name] = \
+                         gen_contour_line(rgb_high_res, xyz[..., i], valid, valid_high_res, name == 'z',
+                                          num_gap, shift, thickness)
+
+                     out_f = out_root / name / f'thickness__{thickness:.1f}___num_gap__{num_gap}___shift__{shift:.2f}.png'
+                     out_f.parent.mkdir(parents=True, exist_ok=True)
+                     cv2.imwrite(out_f.as_posix(), pil_ds(imgs[name][us_sc:-us_sc, us_sc:-us_sc, ::-1].copy(), save_shape[0], save_shape[1]))
+                     print(f'Saved to {out_f.resolve()}')
+                     assign_item_to_dict(summary, [name, thickness, num_gap, shift], str(out_f.resolve().relative_to(out_root.resolve())))
+
+                 img_xy = rgb_high_res.copy()
+                 img_xy[np.logical_and(colored_masks['x'], colored_masks['y'])] = np.round(.5 * (imgs['x'].astype(np.float32) + imgs['y'].astype(np.float32))).astype(np.uint8)[np.logical_and(colored_masks['x'], colored_masks['y'])]
+                 img_xy[np.logical_and(colored_masks['x'], np.logical_not(colored_masks['y']))] = imgs['x'][np.logical_and(colored_masks['x'], np.logical_not(colored_masks['y']))]
+                 img_xy[np.logical_and(np.logical_not(colored_masks['x']), colored_masks['y'])] = imgs['y'][np.logical_and(np.logical_not(colored_masks['x']), colored_masks['y'])]
+                 img_xy[~valid_high_res] = 0
+                 # img_xy = caption_img_xy(img_xy, col_maps)
+                 out_f = out_root / 'xy' / f'thickness__{thickness:.1f}___num_gap__{num_gap}___shift__{shift:.2f}.png'
+                 out_f.parent.mkdir(parents=True, exist_ok=True)
+                 cv2.imwrite(out_f.as_posix(), pil_ds(img_xy[us_sc:-us_sc, us_sc:-us_sc, ::-1].copy(), save_shape[0], save_shape[1]))
+                 print(f'Saved to {out_f.resolve()}')
+                 assign_item_to_dict(summary, ['xy', thickness, num_gap, shift], str(out_f.resolve().relative_to(out_root.resolve())))
+     with (out_root / 'summary.json').open('w') as F:
+         json.dump(summary, F)
+
+
+ def get_out_dir(work_dir, depth_f):
+     return work_dir / 'contour_line' / str((work_dir / depth_f).resolve().relative_to(work_dir.resolve()))[:-4].replace('/', '_')
+
+
+ def main(args):
+     save_dim_ub = args.save_dim_ub
+
+     root = args.root
+     rgb_f = root / args.rgb_f
+     data_f = root / args.depth_f
+
+     raw_rgb = imread_rgb(rgb_f)
+
+     save_sc = int(math.floor(save_dim_ub / np.linalg.norm(raw_rgb.shape[:2])))
+     save_shape = (save_sc * raw_rgb.shape[0], save_sc * raw_rgb.shape[1])
+
+     depth, intr, valid = load_data(data_f)
+     xyz = depth_to_xyz(intr, depth)
+
+     render_contour_line_imgs(xyz, valid, raw_rgb, save_shape, get_out_dir(root, args.depth_f))
+
+
+ if __name__ == "__main__":
+     parser = argparse.ArgumentParser()
+     parser.add_argument("root", type=Path)
+     parser.add_argument("--depth_f", type=str, required=True, help='Path to depth file, relative to root.')
+     parser.add_argument('--rgb_f', type=str, nargs='?', const=None, default='rgb.png', help='Path to rgb file, relative to root.')
+     parser.add_argument("--save_dim_ub", type=float, default=np.linalg.norm([1920, 1080]))
+     args = parser.parse_args()
+
+     main(args)
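The contour spacing used in render_contour_line.py can be illustrated without torch. The sketch below is a hypothetical pure-Python stand-in for `get_contour_line_gap`, not the repository's code: `linear_quantile`, `contour_gap` and `nearest_level` are names invented here, and the hand-written quantile assumes linear interpolation between order statistics (the default mode of `torch.quantile`). It shows how the inter-quantile range is stretched by `1 / (1 - 2 * qt)` before being split into `num_gap` intervals, and how a value maps to the index of its nearest contour level.

```python
import math


def linear_quantile(sorted_vals, q):
    """Quantile with linear interpolation between the two nearest order statistics."""
    pos = q * (len(sorted_vals) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def contour_gap(values, num_gap, qt=0.05):
    """Spacing between contour levels; returns 1 when there are no samples,
    as get_contour_line_gap does when no pixel is valid."""
    if not values:
        return 1
    vals = sorted(values)
    q_lo = linear_quantile(vals, qt)
    q_hi = linear_quantile(vals, 1 - qt)
    # The range between the two quantiles is divided by (1 - 2 * qt), which
    # roughly extrapolates it to the full value range, then split into num_gap parts.
    return (q_hi - q_lo) / (num_gap * (1 - 2 * qt))


def nearest_level(value, val_lb, gap):
    """Index k of the level val_lb + k * gap closest to value, clipped at 0."""
    return max(0, round((value - val_lb) / gap))


samples = [float(v) for v in range(101)]   # 0.0, 1.0, ..., 100.0
gap = contour_gap(samples, num_gap=10)      # (95 - 5) / (10 * 0.9), about 10
val_lb = min(samples) + 0.5 * gap           # shift = 0.5, as in render_contour_line_imgs
print(gap, nearest_level(42.0, val_lb, gap))
```

Using quantiles rather than the raw minimum and maximum keeps a few outlier pixels from inflating the spacing. Note that the sketch, like the original, does not guard against `num_gap == 0` or a zero inter-quantile range.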
evalmde/visualization/render_textureless_relighting.py ADDED
@@ -0,0 +1,130 @@
+ import os
+ import argparse
+ import json
+ from pathlib import Path
+
+ import bpy
+ import mathutils
+ import numpy as np
+
+ from evalmde.utils.image import imread_rgb, resize, imwrite_rgb
+ from evalmde.utils.proj import apply_SE3
+ from evalmde.visualization import gen_rot_light__light_pos, ROT_LIGHT_NUM_LIGHT, ROT_LIGHT_NUM_LOOP
+ from evalmde.visualization.cfg import (get_intermediate_mesh_f, get_vis_root,
+                                        get_crop_region, get_mesh_vertex_col, get_valid_triangle)
+ from evalmde.utils.common import pathlib_file, current_time
+ from evalmde.utils.depth_to_mesh import gen_mesh_and_pcd
+ from evalmde.utils.depth import load_data
+ from evalmde.utils.blender import (bpy_create_cam, bpy_add_ambient_light, bpy_set_tmp_dir, bpy_create_directional_light,
+                                    bpy_setup_rgbd_render, bpy_enable_gpu, bpy_render_rgb_and_filter_invalid)
+
+
+ def render(mesh_f, output_root,
+            base_cam_pose, cam_intr_params, ds_ratio, num_sample,
+            light_i, light_src, overwrite, save_blend,
+            ambient, crop_region, cpu):
+     cam_pose = base_cam_pose.copy()
+     light_src_in_cam = light_src.copy()
+     light_src_in_world = apply_SE3(cam_pose, light_src_in_cam)
+     light_dst_in_world = apply_SE3(cam_pose, np.array([0, 0, 0.]))
+
+     cam_pose[..., 1:3] *= -1
+
+     output_root = pathlib_file(output_root)
+     output_root.mkdir(parents=True, exist_ok=True)
+     h, w, fx, fy, cx, cy = cam_intr_params
+
+     bpy.ops.wm.read_factory_settings(use_empty=True)
+     bpy_set_tmp_dir(output_root.parent / f'{output_root.name}__tmp')
+     if not cpu:
+         bpy_enable_gpu()
+
+     assert mesh_f.exists(), mesh_f
+     bpy.ops.import_scene.gltf(filepath=str(mesh_f))
+     for obj in bpy.context.scene.objects:
+         if obj.type == 'MESH':
+             obj.location = (0, 0, 0)
+             obj.scale = (1, 1, 1)
+             obj.rotation_mode = 'XYZ'
+             obj.rotation_euler = mathutils.Euler((-np.pi / 2, 0, 0), 'XYZ')
+
+     # Set render engine and resolution
+     bpy.context.scene.render.engine = 'CYCLES'
+     bpy.context.scene.render.resolution_percentage = 100
+     bpy_create_directional_light(light_src_in_world, light_dst_in_world)
+     bpy_add_ambient_light(ambient)
+
+     depth_node, rgb_node = bpy_setup_rgbd_render()
+     if (not overwrite) and (output_root / f'image_{light_i:06}.png').exists() and \
+             (output_root / f'metadata_{light_i:06}.json').exists():
+         try:
+             with (output_root / f'metadata_{light_i:06}.json').open('r') as F:
+                 metadata = json.load(F)
+             if metadata['num_sample'] == num_sample:
+                 return
+         except Exception as E:
+             print(f'{light_i=}, {E=}')
+
+     cam_object = bpy_create_cam(f"cam_{light_i:06}", cam_pose,
+                                 int(fx), int(fy), int(cx), int(cy), int(w), int(h))
+     bpy_render_rgb_and_filter_invalid(cam_object, int(h), int(w), num_sample, depth_node,
+                                       rgb_node, str(output_root), f'{light_i:06}', [0, 0, 0], save_depth=False)
+
+     if (output_root / f'image_{light_i:06}.png').exists():
+         img = imread_rgb(output_root / f'image_{light_i:06}.png')
+         if crop_region is not None and len(crop_region) > 0:
+             lb_i, ub_i, lb_j, ub_j = crop_region
+             img = img[lb_i:ub_i, lb_j:ub_j]
+         img = resize(img, H=ds_ratio * img.shape[0])
+         imwrite_rgb(output_root / f'image_{light_i:06}.png', img)
+         with (output_root / f'metadata_{light_i:06}.json').open('w') as F:
+             json.dump({'num_sample': num_sample, 'time': current_time()}, F)
+
+     if save_blend and light_i == 0:
+         out_f = output_root / f'{mesh_f.stem}.blend'
+         bpy.ops.wm.save_as_mainfile(filepath=str(out_f))
+         print(f'Saved to {out_f.resolve()}')
+
+
+ if __name__ == "__main__":
+     parser = argparse.ArgumentParser()
+     parser.add_argument('root', type=Path)
+     parser.add_argument('--num_sample', type=int, default=256)
+     parser.add_argument('--depth_f', type=str, nargs='?', const=None, default='gt_depth.npz', help='Path to depth file, relative to root.')
+     parser.add_argument('--valid_triangle_f', type=str, nargs='?', const=None, default='valid_triangle.npz', help='Path to valid triangle file, relative to root.')
+     parser.add_argument('--overwrite', action='store_true')
+     parser.add_argument('--save_blend', action='store_true')
+     parser.add_argument('--filter_quad', action='store_true', help='Filter out the neighboring square if any of its triangles is invalid')
+     parser.add_argument('--ds_ratio', type=float, default=1)
+     parser.add_argument('--ambient', type=float, default=0.2)
+     parser.add_argument('--light_l', type=int, required=True)
+     parser.add_argument('--light_r', type=int, required=True)
+     parser.add_argument('--crop_region', nargs='*', type=int, default=[], help='Specify 4 integers: lb_i, ub_i, lb_j, ub_j, and only render mesh of [lb_i, ub_i)x[lb_j, ub_j)')
+     parser.add_argument('--mesh_dir', type=Path, nargs='?', const=None, default=None)
+     parser.add_argument('--cpu', action='store_true')
+     args = parser.parse_args()
+
+     root = args.root
+     mesh_f = get_intermediate_mesh_f(args)
+     vis_root = get_vis_root(args)
+
+     crop_region = get_crop_region(args)
+     depth, intr, valid = load_data(root / args.depth_f)
+     rgb = get_mesh_vertex_col(args, depth.shape)
+     valid_triangle = get_valid_triangle(args, depth.shape)
+
+     mesh, pcd = gen_mesh_and_pcd(intr, depth, valid, rgb=rgb, valid_triangle=valid_triangle, crop_region=crop_region)
+     del pcd
+
+     light_pos = gen_rot_light__light_pos(ROT_LIGHT_NUM_LIGHT, ROT_LIGHT_NUM_LOOP)
+
+     mesh_f.parent.mkdir(parents=True, exist_ok=True)
+     # mesh.show()
+     mesh.export(mesh_f)
+     # print(f'Mesh saved to {mesh_f.resolve()}')
+     for light_i in range(args.light_l, args.light_r):
+         render(mesh_f, vis_root / 'textureless_relighting',
+                np.eye(4), list(depth.shape) + intr.tolist(), args.ds_ratio, args.num_sample,
+                light_i, light_pos[light_i], args.overwrite, args.save_blend,
+                args.ambient, crop_region, args.cpu)
+     os.remove(mesh_f)
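In `render`, the line `cam_pose[..., 1:3] *= -1` negates the second and third columns of the 4x4 camera pose after the light positions have been computed from the unflipped pose. A plausible reading, which is an assumption and not stated in the source, is that this converts a camera-to-world pose from the common computer-vision convention (y down, z forward) to Blender's camera convention (y up, looking along -z). The sketch below uses plain nested lists and a hypothetical helper `flip_yz_axes` to show the effect on a pose with a non-zero translation.

```python
def flip_yz_axes(cam_to_world):
    """Negate columns 1 and 2 (the camera's Y and Z basis vectors) of a 4x4 pose.

    Mirrors `pose[..., 1:3] *= -1` on a NumPy array: the X axis (column 0)
    and the translation (column 3) are left untouched.
    """
    return [[-v if c in (1, 2) else v for c, v in enumerate(row)]
            for row in cam_to_world]


# Identity rotation with translation (2, 3, 4); integers avoid printing -0.0.
pose = [[1, 0, 0, 2],
        [0, 1, 0, 3],
        [0, 0, 1, 4],
        [0, 0, 0, 1]]

flipped = flip_yz_axes(pose)
for row in flipped:
    print(row)
```

The camera position is unchanged; only the orientation axes are mirrored, and applying the flip twice restores the original pose.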
induce_valid_triangle_from_gt_depth.py ADDED
@@ -0,0 +1,29 @@
+ from pathlib import Path
+ gt_depth_f = Path('sample_data_2/gt_depth.npz')
+ valid_triangle_f = Path('sample_data_2/valid_triangle.npz')
+
+
+ THRESH = 1.1
+
+ import numpy as np
+ def induce_valid_triangle_from_gt_depth(gt_depth: np.ndarray, valid: np.ndarray):
+     '''
+     :param gt_depth: shape (H, W)
+     :param valid: shape (H, W)
+     :return: valid_triangle, shape (H - 1, W - 1, 2)
+     '''
+     min_d_0 = np.min(np.stack([gt_depth[:-1, :-1], gt_depth[1:, :-1], gt_depth[:-1, 1:]], axis=0), axis=0)
+     max_d_0 = np.max(np.stack([gt_depth[:-1, :-1], gt_depth[1:, :-1], gt_depth[:-1, 1:]], axis=0), axis=0)
+     valid_0 = valid[:-1, :-1] & valid[:-1, 1:] & valid[1:, :-1] & (max_d_0 <= THRESH * min_d_0)
+
+     min_d_1 = np.min(np.stack([gt_depth[1:, 1:], gt_depth[1:, :-1], gt_depth[:-1, 1:]], axis=0), axis=0)
+     max_d_1 = np.max(np.stack([gt_depth[1:, 1:], gt_depth[1:, :-1], gt_depth[:-1, 1:]], axis=0), axis=0)
+     valid_1 = valid[1:, 1:] & valid[:-1, 1:] & valid[1:, :-1] & (max_d_1 <= THRESH * min_d_1)
+     return np.stack([valid_0, valid_1], axis=-1)
+
+
+ from evalmde.utils.depth import load_data
+ gt_depth, gt_intr, gt_valid = load_data(gt_depth_f)
+ valid_triangle = induce_valid_triangle_from_gt_depth(gt_depth, gt_valid)
+ np.savez(valid_triangle_f, valid_triangle=valid_triangle)
+ print(f'Saved to {valid_triangle_f.resolve()}')
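The triangle test in induce_valid_triangle_from_gt_depth.py splits each 2x2 pixel block into two triangles that share the anti-diagonal, and keeps a triangle only if all three corners are valid and the largest corner depth is at most `THRESH` times the smallest. The following pure-Python sketch restates that rule on nested lists for a single block; `triangle_ok` and `valid_triangles` are illustrative names, not functions from the repository.

```python
THRESH = 1.1  # same depth-ratio threshold as the script above


def triangle_ok(depths, valids, thresh=THRESH):
    """Keep a triangle if all corners are valid and max depth <= thresh * min depth."""
    return all(valids) and max(depths) <= thresh * min(depths)


def valid_triangles(depth, valid, thresh=THRESH):
    """Return an (H - 1) x (W - 1) grid of [upper_left_ok, lower_right_ok] flags."""
    H, W = len(depth), len(depth[0])
    out = []
    for r in range(H - 1):
        row = []
        for c in range(W - 1):
            # Both triangles share the bottom-left (r + 1, c) and top-right (r, c + 1) corners.
            shared = [(r + 1, c), (r, c + 1)]
            tri0 = [(r, c)] + shared          # triangle containing the top-left corner
            tri1 = [(r + 1, c + 1)] + shared  # triangle containing the bottom-right corner
            row.append([
                triangle_ok([depth[i][j] for i, j in tri],
                            [valid[i][j] for i, j in tri], thresh)
                for tri in (tri0, tri1)
            ])
        out.append(row)
    return out


depth = [[1.0, 1.05],
         [1.0, 2.00]]   # the bottom-right pixel lies on a much farther surface
valid = [[True, True],
         [True, True]]
print(valid_triangles(depth, valid))  # [[[True, False]]]
```

The upper-left triangle spans depths 1.0 to 1.05 and passes the 1.1 ratio test, while the lower-right triangle spans 1.0 to 2.0 and is rejected, which is how faces stretched across a depth discontinuity get dropped.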
infinigen5_12612.log ADDED
@@ -0,0 +1,256 @@
1
+ ============================================
2
+ infinigen5 started at Thu May 14 06:28:59 PM AEST 2026
3
+ Data: /home/ywan0794/EvalMDE/data/infinigen/test_scenes_release_cleaned_final Output: /home/ywan0794/EvalMDE/output/infinigen5
4
+ ============================================
5
+ Thu May 14 18:28:59 2026
6
+ +-----------------------------------------------------------------------------------------+
7
+ | NVIDIA-SMI 550.163.01 Driver Version: 550.163.01 CUDA Version: 12.4 |
8
+ |-----------------------------------------+------------------------+----------------------+
9
+ | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
10
+ | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
11
+ | | | MIG M. |
12
+ |=========================================+========================+======================|
13
+ | 0 NVIDIA H100 NVL Off | 00000000:61:00.0 Off | 0 |
14
+ | N/A 51C P0 98W / 400W | 14MiB / 95830MiB | 0% Default |
15
+ | | | Disabled |
16
+ +-----------------------------------------+------------------------+----------------------+
17
+
18
+ +-----------------------------------------------------------------------------------------+
19
+ | Processes: |
20
+ | GPU GI CI PID Type Process name GPU Memory |
21
+ | ID ID Usage |
22
+ |=========================================================================================|
23
+ | 0 N/A N/A 4274 G /usr/lib/xorg/Xorg 4MiB |
24
+ +-----------------------------------------------------------------------------------------+
25
+
26
+ ============================================
27
+ [depth_pro inference] Thu May 14 06:28:59 PM AEST 2026 env=depth-pro
28
+ ============================================
29
+ Found 5 scenes
30
+ [1/5] indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: shape=(720, 1280)
31
+ Saved 5 predictions to /home/ywan0794/EvalMDE/output/infinigen5/depth_pro
32
+ [INF-OK] depth_pro
33
+
34
+ ============================================
35
+ [marigold inference] Thu May 14 06:29:39 PM AEST 2026 env=marigold
36
+ ============================================
37
+ The config attributes {'prediction_type': 'depth'} were passed to MarigoldDepthPipeline, but are not expected and will be ignored. Please verify your model_index.json configuration file.
38
+ Keyword arguments {'prediction_type': 'depth'} are not expected by MarigoldDepthPipeline and will be ignored.
39
+
40
+
41
+
42
+ Found 5 scenes
43
+ [1/5] indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: shape=(720, 1280)
44
+ Saved 5 predictions to /home/ywan0794/EvalMDE/output/infinigen5/marigold
45
+ [INF-OK] marigold
46
+
47
+ ============================================
48
+ [lotus inference] Thu May 14 06:29:57 PM AEST 2026 env=lotus
49
+ ============================================
50
+
51
+ Found 5 scenes
52
+ [1/5] indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: shape=(720, 1280)
53
+ Saved 5 predictions to /home/ywan0794/EvalMDE/output/infinigen5/lotus
54
+ [INF-OK] lotus
55
+
56
+ ============================================
57
+ [depthmaster inference] Thu May 14 06:30:10 PM AEST 2026 env=depthmaster
58
+ ============================================
59
+ The config attributes {'default_denoising_steps': 10, 'scheduler': ['diffusers', 'DDIMScheduler']} were passed to DepthMasterPipeline, but are not expected and will be ignored. Please verify your model_index.json configuration file.
60
+ Keyword arguments {'default_denoising_steps': 10, 'scheduler': ['diffusers', 'DDIMScheduler']} are not expected by DepthMasterPipeline and will be ignored.
61
+
62
+
63
+ An error occurred while trying to fetch /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet: Error no file named diffusion_pytorch_model.safetensors found in directory /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet.
64
+ Defaulting to unsafe serialization. Pass `allow_pickle=False` to raise an error instead.
65
+ Some weights of the model checkpoint at /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet were not used when initializing UNet2DConditionModel:
66
+ ['fftblock.conv_s1.weight, fftblock.norm.weight, fftblock.conv_f4.bias, fftblock.conv_f3.bias, fftblock.conv_f1.weight, fftblock.conv_f1.bias, fftblock.conv_f2.weight, fftblock.norm.bias, fftblock.fuse.weight, fftblock.conv_f4.weight, fftblock.fuse.bias, fftblock.conv_s2.bias, fftblock.conv_f3.weight, fftblock.conv_s2.weight, fftblock.conv_f2.bias, fftblock.conv_s1.bias']
67
+
68
+ Expected types for unet: (<class 'depthmaster.modules.unet_2d_condition_s2.UNet2DConditionModel'>,), got <class 'diffusers.models.unets.unet_2d_condition.UNet2DConditionModel'>.
69
+ An error occurred while trying to fetch /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet: Error no file named diffusion_pytorch_model.safetensors found in directory /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet.
70
+ Defaulting to unsafe serialization. Pass `allow_pickle=False` to raise an error instead.
71
+ Found 5 scenes
72
+ [1/5] indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: shape=(720, 1280)
73
+ Saved 5 predictions to /home/ywan0794/EvalMDE/output/infinigen5/depthmaster
74
+ [INF-OK] depthmaster
75
+
76
+ ============================================
77
+ [ppd inference] Thu May 14 06:30:28 PM AEST 2026 env=ppd
78
+ ============================================
79
+ xFormers not available
80
+ xFormers not available
81
+ Found 5 scenes
82
+ [1/5] indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: shape=(720, 1280)
83
+ Saved 5 predictions to /home/ywan0794/EvalMDE/output/infinigen5/ppd
84
+ [INF-OK] ppd
85
+
86
+ ============================================
87
+ [da3_mono inference] Thu May 14 06:31:08 PM AEST 2026 env=da3
88
+ ============================================
89
+ [WARN ] Dependency `gsplat` is required for rendering 3DGS. Install via: pip install git+https://github.com/nerfstudio-project/gsplat.git@0b4dddf04cb687367602c01196913cde6a743d70
90
+ Found 5 scenes
91
+ [1/5] indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: shape=(720, 1280)
92
+ Saved 5 predictions to /home/ywan0794/EvalMDE/output/infinigen5/da3_mono
93
+ [INF-OK] da3_mono
94
+
95
+ ============================================
96
+ [fe2e inference] Thu May 14 06:31:30 PM AEST 2026 env=fe2e
97
+ ============================================
98
+ [INFO] prompt_type=empty, skipping Qwen model loading
99
+ create LoRA network from weights
100
+ train all blocks only
101
+ create LoRA for DIT all blocks: 304 modules.
102
+ enable LoRA for U-Net
103
+ weights are merged
104
+ Found 5 scenes
105
+ [1/5] indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: shape=(720, 1280)
106
+ Saved 5 predictions to /home/ywan0794/EvalMDE/output/infinigen5/fe2e
107
+ [INF-OK] fe2e
108
+
109
+ ============================================
110
+ Stage 2: metric aggregation (evalmde env)
111
+ ============================================
112
+ --- metric: depth_pro ---
113
+ Found 5 scenes for depth_pro
114
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
115
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
116
+ indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: sawa_h raw=1.295 aln=1.334 | relnorm raw=0.206 aln=0.240 | boundF1_err raw=0.874 aln=0.735
117
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___142efe82: sawa_h raw=1.010 aln=1.015 | relnorm raw=0.272 aln=0.264 | boundF1_err raw=0.693 aln=0.736
118
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___27c2f1fd: sawa_h raw=0.706 aln=0.698 | relnorm raw=0.208 aln=0.199 | boundF1_err raw=0.625 aln=0.666
119
+ nature__arctic-long_focal-2025-09-06-14-10-00___4bc42ce8: sawa_h raw=0.695 aln=0.831 | relnorm raw=0.189 aln=0.203 | boundF1_err raw=0.833 aln=0.803
120
+ nature__desert-long_focal-2025-09-07-11-43-28___cbe875d: sawa_h raw=0.443 aln=0.544 | relnorm raw=0.075 aln=0.083 | boundF1_err raw=0.892 aln=0.779
121
+
122
+ Mean RAW : {'wkdr_no_align': 0.06662386655807495, 'delta0125_disparity_affine_err': 0.3108652591705322, 'delta0125_depth_affine_err': 0.5642925873398781, 'boundary_f1_err': 0.7832580580591701, 'rel_normal': 0.1899009395071665, 'sawa_h': 0.8298352197168052}
123
+ Mean ALIGNED: {'wkdr_no_align': 0.06664336919784546, 'delta0125_disparity_affine_err': 0.5713996738195419, 'delta0125_depth_affine_err': 0.5642925873398781, 'boundary_f1_err': 0.7437403012403084, 'rel_normal': 0.1979266966601841, 'sawa_h': 0.8844690165018712}
124
+ Saved → /home/ywan0794/EvalMDE/output/infinigen5/depth_pro_metrics.json
125
+ [METRIC-OK] depth_pro
126
+ --- metric: marigold ---
127
+ Found 5 scenes for marigold
128
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
129
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
130
+ indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: sawa_h raw=2.214 aln=1.792 | relnorm raw=0.378 aln=0.233 | boundF1_err raw=0.979 aln=0.973
131
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___142efe82: sawa_h raw=1.646 aln=1.294 | relnorm raw=0.532 aln=0.393 | boundF1_err raw=0.903 aln=0.858
132
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___27c2f1fd: sawa_h raw=1.332 aln=0.943 | relnorm raw=0.401 aln=0.265 | boundF1_err raw=0.845 aln=0.920
133
+ nature__arctic-long_focal-2025-09-06-14-10-00___4bc42ce8: sawa_h raw=1.548 aln=1.536 | relnorm raw=0.417 aln=0.415 | boundF1_err raw=0.984 aln=0.963
134
+ nature__desert-long_focal-2025-09-07-11-43-28___cbe875d: sawa_h raw=1.120 aln=1.097 | relnorm raw=0.248 aln=0.254 | boundF1_err raw=0.933 aln=0.927
135
+
136
+ Mean RAW : {'wkdr_no_align': 0.12133046388626098, 'delta0125_disparity_affine_err': 0.9506231024861336, 'delta0125_depth_affine_err': 0.5405407793819904, 'boundary_f1_err': 0.9286170091147612, 'rel_normal': 0.39523183786670485, 'sawa_h': 1.5718469267105362}
137
+ Mean ALIGNED: {'wkdr_no_align': 0.12138602733612061, 'delta0125_disparity_affine_err': 0.5170192375779152, 'delta0125_depth_affine_err': 0.5405403502285481, 'boundary_f1_err': 0.9283054545742395, 'rel_normal': 0.31193256845236983, 'sawa_h': 1.332338139755596}
138
+ Saved → /home/ywan0794/EvalMDE/output/infinigen5/marigold_metrics.json
139
+ [METRIC-OK] marigold
140
+ --- metric: lotus ---
141
+ Found 5 scenes for lotus
142
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
143
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
144
+ indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: sawa_h raw=1.433 aln=1.027 | relnorm raw=0.348 aln=0.209 | boundF1_err raw=0.955 aln=0.905
145
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___142efe82: sawa_h raw=1.705 aln=1.539 | relnorm raw=0.478 aln=0.384 | boundF1_err raw=0.922 aln=0.859
146
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___27c2f1fd: sawa_h raw=1.029 aln=0.674 | relnorm raw=0.296 aln=0.196 | boundF1_err raw=0.833 aln=0.715
147
+ nature__arctic-long_focal-2025-09-06-14-10-00___4bc42ce8: sawa_h raw=1.131 aln=1.119 | relnorm raw=0.307 aln=0.304 | boundF1_err raw=0.969 aln=0.945
148
+ nature__desert-long_focal-2025-09-07-11-43-28___cbe875d: sawa_h raw=0.872 aln=0.860 | relnorm raw=0.160 aln=0.159 | boundF1_err raw=0.969 aln=0.936
149
+
150
+ Mean RAW : {'wkdr_no_align': 0.06971110105514526, 'delta0125_disparity_affine_err': 0.9483198569156229, 'delta0125_depth_affine_err': 0.6215518534183502, 'boundary_f1_err': 0.9296553861535948, 'rel_normal': 0.31790587652021685, 'sawa_h': 1.2340270893102154}
151
+ Mean ALIGNED: {'wkdr_no_align': 0.06982688903808594, 'delta0125_disparity_affine_err': 0.6784043271094561, 'delta0125_depth_affine_err': 0.6215518534183502, 'boundary_f1_err': 0.8718338372591947, 'rel_normal': 0.250615008987667, 'sawa_h': 1.0437563272908121}
152
+ Saved → /home/ywan0794/EvalMDE/output/infinigen5/lotus_metrics.json
153
+ [METRIC-OK] lotus
154
+ --- metric: depthmaster ---
155
+ Found 5 scenes for depthmaster
156
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
157
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
158
+ indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: sawa_h raw=4.224 aln=1.305 | relnorm raw=0.530 aln=0.136 | boundF1_err raw=0.997 aln=0.850
159
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___142efe82: sawa_h raw=4.541 aln=1.198 | relnorm raw=0.401 aln=0.341 | boundF1_err raw=0.999 aln=0.743
160
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___27c2f1fd: sawa_h raw=4.862 aln=1.305 | relnorm raw=0.666 aln=0.306 | boundF1_err raw=1.000 aln=0.885
161
+ nature__arctic-long_focal-2025-09-06-14-10-00___4bc42ce8: sawa_h raw=4.517 aln=1.027 | relnorm raw=0.302 aln=0.273 | boundF1_err raw=1.000 aln=0.996
162
+ nature__desert-long_focal-2025-09-07-11-43-28___cbe875d: sawa_h raw=4.075 aln=0.810 | relnorm raw=0.097 aln=0.148 | boundF1_err raw=0.999 aln=0.975
163
+
164
+ Mean RAW : {'wkdr_no_align': 0.9019776806235313, 'delta0125_disparity_affine_err': 0.9493729234673083, 'delta0125_depth_affine_err': 0.6568526294082403, 'boundary_f1_err': 0.9990700138797344, 'rel_normal': 0.39917262599449505, 'sawa_h': 4.443883083999355}
165
+ Mean ALIGNED: {'wkdr_no_align': 0.0985716462135315, 'delta0125_disparity_affine_err': 0.651621462404728, 'delta0125_depth_affine_err': 0.6576177909970283, 'boundary_f1_err': 0.8899633566556169, 'rel_normal': 0.24088768568456684, 'sawa_h': 1.1289693313813944}
166
+ Saved → /home/ywan0794/EvalMDE/output/infinigen5/depthmaster_metrics.json
167
+ [METRIC-OK] depthmaster
168
+ --- metric: ppd ---
169
+ Found 5 scenes for ppd
170
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
171
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
172
+ indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: sawa_h raw=3.384 aln=2.552 | relnorm raw=1.066 aln=0.680 | boundF1_err raw=0.959 aln=0.878
173
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___142efe82: sawa_h raw=2.301 aln=1.944 | relnorm raw=0.897 aln=0.760 | boundF1_err raw=0.866 aln=0.760
174
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___27c2f1fd: sawa_h raw=2.149 aln=1.496 | relnorm raw=0.867 aln=0.624 | boundF1_err raw=0.865 aln=0.626
175
+ nature__arctic-long_focal-2025-09-06-14-10-00___4bc42ce8: sawa_h raw=1.505 aln=1.543 | relnorm raw=0.532 aln=0.553 | boundF1_err raw=0.996 aln=0.996
176
+ nature__desert-long_focal-2025-09-07-11-43-28___cbe875d: sawa_h raw=1.538 aln=1.560 | relnorm raw=0.485 aln=0.497 | boundF1_err raw=0.968 aln=0.974
177
+
178
+ Mean RAW : {'wkdr_no_align': 0.08756059408187866, 'delta0125_disparity_affine_err': 0.950529617164284, 'delta0125_depth_affine_err': 0.6146524578332901, 'boundary_f1_err': 0.9310581919470687, 'rel_normal': 0.7692545256509766, 'sawa_h': 2.1754034422190696}
179
+ Mean ALIGNED: {'wkdr_no_align': 0.08780105113983154, 'delta0125_disparity_affine_err': 0.639212078601122, 'delta0125_depth_affine_err': 0.6143273778259755, 'boundary_f1_err': 0.8466643323124072, 'rel_normal': 0.6228359304030748, 'sawa_h': 1.8193098560312937}
180
+ Saved → /home/ywan0794/EvalMDE/output/infinigen5/ppd_metrics.json
181
+ [METRIC-OK] ppd
182
+ --- metric: da3_mono ---
183
+ Found 5 scenes for da3_mono
184
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
185
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
186
+ indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: sawa_h raw=0.632 aln=0.587 | relnorm raw=0.131 aln=0.115 | boundF1_err raw=0.962 aln=0.982
187
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___142efe82: sawa_h raw=0.884 aln=0.793 | relnorm raw=0.247 aln=0.223 | boundF1_err raw=0.836 aln=0.873
188
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___27c2f1fd: sawa_h raw=0.691 aln=0.643 | relnorm raw=0.192 aln=0.165 | boundF1_err raw=0.870 aln=0.934
189
+ nature__arctic-long_focal-2025-09-06-14-10-00___4bc42ce8: sawa_h raw=0.877 aln=0.898 | relnorm raw=0.194 aln=0.200 | boundF1_err raw=0.958 aln=0.950
190
+ nature__desert-long_focal-2025-09-07-11-43-28___cbe875d: sawa_h raw=0.605 aln=0.607 | relnorm raw=0.099 aln=0.099 | boundF1_err raw=0.936 aln=0.935
191
+
192
+ Mean RAW : {'wkdr_no_align': 0.033317041397094724, 'delta0125_disparity_affine_err': 0.5275852054357528, 'delta0125_depth_affine_err': 0.40731881856918334, 'boundary_f1_err': 0.9124339414097011, 'rel_normal': 0.17258852279331505, 'sawa_h': 0.7379542487644946}
193
+ Mean ALIGNED: {'wkdr_no_align': 0.03334666490554809, 'delta0125_disparity_affine_err': 0.4545227389782667, 'delta0125_depth_affine_err': 0.40731837004423144, 'boundary_f1_err': 0.9345920700164015, 'rel_normal': 0.16045701111115004, 'sawa_h': 0.7058076191806922}
194
+ Saved → /home/ywan0794/EvalMDE/output/infinigen5/da3_mono_metrics.json
195
+ [METRIC-OK] da3_mono
196
+ --- metric: fe2e ---
197
+ Found 5 scenes for fe2e
198
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
199
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
200
+ indoor__slow_solve-long_focal-2025-05-16-09-19-19___209591a0: sawa_h raw=1.291 aln=0.960 | relnorm raw=0.239 aln=0.139 | boundF1_err raw=0.910 aln=0.883
201
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___142efe82: sawa_h raw=1.080 aln=0.869 | relnorm raw=0.341 aln=0.286 | boundF1_err raw=0.826 aln=0.793
202
+ indoor__slow_solve-long_focal-2025-09-06-19-53-32___27c2f1fd: sawa_h raw=0.994 aln=0.625 | relnorm raw=0.285 aln=0.175 | boundF1_err raw=0.852 aln=0.804
203
+ nature__arctic-long_focal-2025-09-06-14-10-00___4bc42ce8: sawa_h raw=0.902 aln=1.025 | relnorm raw=0.224 aln=0.288 | boundF1_err raw=0.986 aln=0.967
204
+ nature__desert-long_focal-2025-09-07-11-43-28___cbe875d: sawa_h raw=0.731 aln=0.837 | relnorm raw=0.119 aln=0.171 | boundF1_err raw=0.950 aln=0.970
205
+
206
+ Mean RAW : {'wkdr_no_align': 0.047560691833496094, 'delta0125_disparity_affine_err': 0.952283101901412, 'delta0125_depth_affine_err': 0.5054923050105572, 'boundary_f1_err': 0.9046317621819318, 'rel_normal': 0.24158862658068156, 'sawa_h': 0.9996706945875289}
207
+ Mean ALIGNED: {'wkdr_no_align': 0.04835277795791626, 'delta0125_disparity_affine_err': 0.5249058477580547, 'delta0125_depth_affine_err': 0.5062363661825657, 'boundary_f1_err': 0.8833090678944918, 'rel_normal': 0.21159110840747117, 'sawa_h': 0.8631816196940623}
208
+ Saved → /home/ywan0794/EvalMDE/output/infinigen5/fe2e_metrics.json
209
+ [METRIC-OK] fe2e
210
+
211
+ ============================================
212
+ infinigen5 finished at Thu May 14 06:32:37 PM AEST 2026
213
+ === Summary ===
214
+ [INF-OK] depth_pro
215
+ [INF-OK] marigold
216
+ [INF-OK] lotus
217
+ [INF-OK] depthmaster
218
+ [INF-OK] ppd
219
+ [INF-OK] da3_mono
220
+ [INF-OK] fe2e
221
+ [METRIC-OK] depth_pro
222
+ [METRIC-OK] marigold
223
+ [METRIC-OK] lotus
224
+ [METRIC-OK] depthmaster
225
+ [METRIC-OK] ppd
226
+ [METRIC-OK] da3_mono
227
+ [METRIC-OK] fe2e
228
+ === Per-model means ===
229
+ Traceback (most recent call last):
230
+ File "<string>", line 1, in <module>
231
+ KeyError: 'mean'
232
+ depth_pro:
233
+ Traceback (most recent call last):
234
+ File "<string>", line 1, in <module>
235
+ KeyError: 'mean'
236
+ marigold:
237
+ Traceback (most recent call last):
238
+ File "<string>", line 1, in <module>
239
+ KeyError: 'mean'
240
+ lotus:
241
+ Traceback (most recent call last):
242
+ File "<string>", line 1, in <module>
243
+ KeyError: 'mean'
244
+ depthmaster:
245
+ Traceback (most recent call last):
246
+ File "<string>", line 1, in <module>
247
+ KeyError: 'mean'
248
+ ppd:
249
+ Traceback (most recent call last):
250
+ File "<string>", line 1, in <module>
251
+ KeyError: 'mean'
252
+ da3_mono:
253
+ Traceback (most recent call last):
254
+ File "<string>", line 1, in <module>
255
+ KeyError: 'mean'
256
+ fe2e:
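The repeated `KeyError: 'mean'` tracebacks in the "Per-model means" step suggest that the summary one-liner indexes each `*_metrics.json` with a `'mean'` key that this run's files do not contain, while the metric stage of the same run prints both a "Mean RAW" and a "Mean ALIGNED" dictionary. The sketch below shows one tolerant way to read such a file. The key names `mean_raw` and `mean_aligned` are assumptions made for illustration, since the JSON schema is not shown in the log, and `load_means` and `format_summary` are hypothetical helpers rather than part of the repository.

```python
import json
import tempfile
from pathlib import Path

# Assumed key names: the real schema of *_metrics.json is not visible in the log.
CANDIDATE_KEYS = ("mean", "mean_raw", "mean_aligned")


def load_means(path, candidate_keys=CANDIDATE_KEYS):
    """Return {key: value} for every candidate key present in the JSON file."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    return {k: data[k] for k in candidate_keys if k in data}


def format_summary(model, means):
    """Build one summary line per model without raising on a missing key."""
    if not means:
        return f"{model}: <no mean entry found>"
    parts = [f"{k}={v}" for k, v in means.items()]
    return f"{model}: " + " | ".join(parts)


def demo():
    """Write two toy metric files (old and new layout) and summarise both."""
    lines = []
    with tempfile.TemporaryDirectory() as tmp:
        old_style = Path(tmp) / "old_metrics.json"
        old_style.write_text(json.dumps({"mean": {"sawa_h": 1.5}}))
        new_style = Path(tmp) / "new_metrics.json"
        new_style.write_text(
            json.dumps({"mean_raw": {"sawa_h": 2.0}, "mean_aligned": {"sawa_h": 1.0}})
        )
        for name, path in (("old", old_style), ("new", new_style)):
            lines.append(format_summary(name, load_means(path)))
    return lines


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Looking keys up with a membership test instead of direct indexing keeps the summary step from failing when the metric stage changes its output layout. The cost is that a renamed key is reported as missing rather than raising an error.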
infinigen_all_12900.log ADDED
The diff for this file is too large to render. See raw diff
 
setup.py ADDED
@@ -0,0 +1,39 @@
+ import os.path as osp
+ from setuptools import setup, find_packages
+
+ ROOT = osp.dirname(osp.abspath(__file__))
+
+ setup(
+     name='evalmde',
+     packages=find_packages(),
+     install_requires=[
+         "numpy==2.0.0",
+         "opencv-python==4.12.0.88",
+         "open3d==0.19.0",
+         "pyglet==1.5.28",
+         "imageio==2.33.1",
+         "hydra-core==1.3.0",
+         "pyrender==0.1.45",
+         "evo==1.26.0",
+         "loguru==0.7.2",
+         "shortuuid==1.0.13",
+         "DateTime==5.5",
+         "plyfile==1.1",
+         "HTML4Vision==0.4.3",
+         "timm==1.0.9",
+         "imgaug==0.4.0",
+         "iopath==0.1.10",
+         "imagecorruptions==1.1.2",
+         "gitpython==3.1.44",
+         "pomegranate==1.1.1",
+         "matplotlib==3.9.0",
+         "wandb==0.22.2",
+         "cvxpy==1.6.5",
+         "mathutils==3.3.0",
+         "OpenEXR==3.3.3",
+         "Imath==0.0.2",
+         "pywavelets==1.8.0",
+         "h5py==3.14.0",
+     ],
+ )
+
smoke_all_12114.log ADDED
@@ -0,0 +1,218 @@
1
+ ============================================
2
+ smoke-all started at Thu May 14 10:45:07 AM AEST 2026
3
+ Data: /home/ywan0794/EvalMDE/data/smoke Output: /home/ywan0794/EvalMDE/output/smoke_all
4
+ ============================================
5
+ Thu May 14 10:45:07 2026
6
+ +-----------------------------------------------------------------------------------------+
7
+ | NVIDIA-SMI 550.163.01 Driver Version: 550.163.01 CUDA Version: 12.4 |
8
+ |-----------------------------------------+------------------------+----------------------+
9
+ | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
10
+ | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
11
+ | | | MIG M. |
12
+ |=========================================+========================+======================|
13
+ | 0 NVIDIA H100 NVL Off | 00000000:61:00.0 Off | 0 |
14
+ | N/A 38C P0 61W / 400W | 14MiB / 95830MiB | 0% Default |
15
+ | | | Disabled |
16
+ +-----------------------------------------+------------------------+----------------------+
17
+
18
+ +-----------------------------------------------------------------------------------------+
19
+ | Processes: |
20
+ | GPU GI CI PID Type Process name GPU Memory |
21
+ | ID ID Usage |
22
+ |=========================================================================================|
23
+ | 0 N/A N/A 4274 G /usr/lib/xorg/Xorg 4MiB |
24
+ +-----------------------------------------------------------------------------------------+
25
+
26
+ ============================================
27
+ [depth_pro inference] Thu May 14 10:45:07 AM AEST 2026 env=depth-pro
28
+ ============================================
29
+ Found 2 scenes
30
+ [1/2] sample_data: shape=(720, 1280)
31
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/depth_pro
32
+ [INF-OK] depth_pro
33
+
34
+ ============================================
35
+ [marigold inference] Thu May 14 10:45:25 AM AEST 2026 env=marigold
36
+ ============================================
37
+ The config attributes {'prediction_type': 'depth'} were passed to MarigoldDepthPipeline, but are not expected and will be ignored. Please verify your model_index.json configuration file.
38
+ Keyword arguments {'prediction_type': 'depth'} are not expected by MarigoldDepthPipeline and will be ignored.
39
+
40
+
41
+
42
+ Found 2 scenes
43
+ [1/2] sample_data: shape=(720, 1280)
44
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/marigold
45
+ [INF-OK] marigold
46
+
47
+ ============================================
48
+ [lotus inference] Thu May 14 10:45:49 AM AEST 2026 env=lotus
49
+ ============================================
50
+
51
+ Found 2 scenes
52
+ [1/2] sample_data: shape=(720, 1280)
53
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/lotus
54
+ [INF-OK] lotus
55
+
56
+ ============================================
57
+ [depthmaster inference] Thu May 14 10:46:09 AM AEST 2026 env=depthmaster
58
+ ============================================
59
+ The config attributes {'default_denoising_steps': 10, 'scheduler': ['diffusers', 'DDIMScheduler']} were passed to DepthMasterPipeline, but are not expected and will be ignored. Please verify your model_index.json configuration file.
60
+ Keyword arguments {'default_denoising_steps': 10, 'scheduler': ['diffusers', 'DDIMScheduler']} are not expected by DepthMasterPipeline and will be ignored.
61
+
62
+
63
+
64
+ Defaulting to unsafe serialization. Pass `allow_pickle=False` to raise an error instead.
65
+ Some weights of the model checkpoint at /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet were not used when initializing UNet2DConditionModel:
66
+ ['fftblock.conv_s1.weight, fftblock.conv_f4.weight, fftblock.conv_f3.weight, fftblock.fuse.weight, fftblock.conv_s2.bias, fftblock.conv_s2.weight, fftblock.conv_s1.bias, fftblock.conv_f2.bias, fftblock.fuse.bias, fftblock.conv_f1.weight, fftblock.norm.bias, fftblock.conv_f1.bias, fftblock.norm.weight, fftblock.conv_f3.bias, fftblock.conv_f2.weight, fftblock.conv_f4.bias']
67
+
68
+ Expected types for unet: (<class 'depthmaster.modules.unet_2d_condition_s2.UNet2DConditionModel'>,), got <class 'diffusers.models.unets.unet_2d_condition.UNet2DConditionModel'>.
69
+ An error occurred while trying to fetch /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet: Error no file named diffusion_pytorch_model.safetensors found in directory /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet.
70
+ Defaulting to unsafe serialization. Pass `allow_pickle=False` to raise an error instead.
71
+ Found 2 scenes
72
+ [1/2] sample_data: shape=(720, 1280)
73
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/depthmaster
74
+ [INF-OK] depthmaster
75
+
76
+ ============================================
77
+ [ppd inference] Thu May 14 10:46:51 AM AEST 2026 env=ppd
78
+ ============================================
79
+ xFormers not available
80
+ xFormers not available
81
+ Found 2 scenes
82
+ [1/2] sample_data: shape=(720, 1280)
83
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/ppd
84
+ [INF-OK] ppd
85
+
86
+ ============================================
87
+ [da3_mono inference] Thu May 14 10:47:30 AM AEST 2026 env=da3
88
+ ============================================
89
+ [WARN ] Dependency `gsplat` is required for rendering 3DGS. Install via: pip install git+https://github.com/nerfstudio-project/gsplat.git@0b4dddf04cb687367602c01196913cde6a743d70
90
+ Found 2 scenes
91
+ Traceback (most recent call last):
92
+ File "/home/ywan0794/EvalMDE/scripts/run_inference.py", line 94, in <module>
93
+ main()
94
+ File "/home/ywan0794/miniconda3/envs/da3/lib/python3.10/site-packages/click/core.py", line 1485, in __call__
95
+ return self.main(*args, **kwargs)
96
+ File "/home/ywan0794/miniconda3/envs/da3/lib/python3.10/site-packages/click/core.py", line 1406, in main
97
+ rv = self.invoke(ctx)
98
+ File "/home/ywan0794/miniconda3/envs/da3/lib/python3.10/site-packages/click/core.py", line 1269, in invoke
99
+ return ctx.invoke(self.callback, **ctx.params)
100
+ File "/home/ywan0794/miniconda3/envs/da3/lib/python3.10/site-packages/click/core.py", line 824, in invoke
101
+ return callback(*args, **kwargs)
102
+ File "/home/ywan0794/miniconda3/envs/da3/lib/python3.10/site-packages/click/decorators.py", line 34, in new_func
103
+ return f(get_current_context(), *args, **kwargs)
104
+ File "/home/ywan0794/EvalMDE/scripts/run_inference.py", line 59, in main
105
+ pred = baseline.infer_for_evaluation(img, K_norm)
106
+ File "/home/ywan0794/MoGe/moge/test/baseline.py", line 43, in infer_for_evaluation
107
+ return self.infer(image, intrinsics)
108
+ File "/home/ywan0794/miniconda3/envs/da3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
109
+ return func(*args, **kwargs)
110
+ File "/home/ywan0794/EvalMDE/baselines/da3_mono.py", line 69, in infer
111
+ assert intrinsics is None, "DA3-Mono does not consume intrinsics."
112
+ AssertionError: DA3-Mono does not consume intrinsics.
113
+ [INF-FAIL rc=1] da3_mono
114
+
115
+ ============================================
116
+ [fe2e inference] Thu May 14 10:48:02 AM AEST 2026 env=fe2e
117
+ ============================================
118
+ [INFO] prompt_type=empty, skipping Qwen model loading
119
+ create LoRA network from weights
120
+ train all blocks only
121
+ create LoRA for DIT all blocks: 304 modules.
122
+ enable LoRA for U-Net
123
+ weights are merged
124
+ Found 2 scenes
125
+ [1/2] sample_data: shape=(720, 1280)
126
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/fe2e
127
+ [INF-OK] fe2e
128
+
129
+ ============================================
130
+ Stage 2: metric aggregation (evalmde env)
131
+ ============================================
132
+ --- metric: depth_pro ---
133
+ Found 2 scenes for depth_pro
134
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
135
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
136
+ sample_data: sawa_h=0.5913, rel_normal=0.1919
137
+ sample_data_2: sawa_h=1.2677, rel_normal=0.3900
138
+
139
+ Mean: {'sawa_h': 0.9295024154567082, 'rel_normal': 0.2909630531817561}
140
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/depth_pro_metrics.json
141
+ [METRIC-OK] depth_pro
142
+ --- metric: marigold ---
143
+ Found 2 scenes for marigold
144
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
145
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
146
+ sample_data: sawa_h=1.1787, rel_normal=0.3820
147
+ sample_data_2: sawa_h=2.1863, rel_normal=0.7493
148
+
149
+ Mean: {'sawa_h': 1.682470514493703, 'rel_normal': 0.5656452301519006}
150
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/marigold_metrics.json
151
+ [METRIC-OK] marigold
152
+ --- metric: lotus ---
153
+ Found 2 scenes for lotus
154
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
155
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
156
+ sample_data: sawa_h=1.0975, rel_normal=0.3560
157
+ sample_data_2: sawa_h=1.9437, rel_normal=0.5927
158
+
159
+ Mean: {'sawa_h': 1.520615413262182, 'rel_normal': 0.47437434867840383}
160
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/lotus_metrics.json
161
+ [METRIC-OK] lotus
162
+ --- metric: depthmaster ---
163
+ Found 2 scenes for depthmaster
164
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
165
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
166
+ sample_data: sawa_h=4.4748, rel_normal=0.3334
167
+ sample_data_2: sawa_h=4.7746, rel_normal=0.5856
168
+
169
+ Mean: {'sawa_h': 4.624733318991572, 'rel_normal': 0.4595408906976621}
170
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/depthmaster_metrics.json
171
+ [METRIC-OK] depthmaster
172
+ --- metric: ppd ---
173
+ Found 2 scenes for ppd
174
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
175
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
176
+ sample_data: sawa_h=2.1980, rel_normal=0.8372
177
+ sample_data_2: sawa_h=2.5355, rel_normal=0.9053
178
+
179
+ Mean: {'sawa_h': 2.3667450420848906, 'rel_normal': 0.8712330618410348}
180
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/ppd_metrics.json
181
+ [METRIC-OK] ppd
182
+ --- metric: da3_mono ---
183
+ [METRIC-SKIP no inference output] da3_mono
184
+ --- metric: fe2e ---
185
+ Found 2 scenes for fe2e
186
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
187
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
188
+ sample_data: sawa_h=1.1134, rel_normal=0.3317
189
+ sample_data_2: sawa_h=1.9205, rel_normal=0.6153
190
+
191
+ Mean: {'sawa_h': 1.5169872681088794, 'rel_normal': 0.47354231407887987}
192
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/fe2e_metrics.json
193
+ [METRIC-OK] fe2e
194
+
195
+ ============================================
196
+ smoke-all finished at Thu May 14 10:48:46 AM AEST 2026
197
+ === Summary ===
198
+ [INF-OK] depth_pro
199
+ [INF-OK] marigold
200
+ [INF-OK] lotus
201
+ [INF-OK] depthmaster
202
+ [INF-OK] ppd
203
+ [INF-FAIL rc=1] da3_mono
204
+ [INF-OK] fe2e
205
+ [METRIC-OK] depth_pro
206
+ [METRIC-OK] marigold
207
+ [METRIC-OK] lotus
208
+ [METRIC-OK] depthmaster
209
+ [METRIC-OK] ppd
210
+ [METRIC-SKIP no inference output] da3_mono
211
+ [METRIC-OK] fe2e
212
+ === Per-model means ===
213
+ depth_pro: {'sawa_h': 0.9295024154567082, 'rel_normal': 0.2909630531817561}
214
+ marigold: {'sawa_h': 1.682470514493703, 'rel_normal': 0.5656452301519006}
215
+ lotus: {'sawa_h': 1.520615413262182, 'rel_normal': 0.47437434867840383}
216
+ depthmaster: {'sawa_h': 4.624733318991572, 'rel_normal': 0.4595408906976621}
217
+ ppd: {'sawa_h': 2.3667450420848906, 'rel_normal': 0.8712330618410348}
218
+ fe2e: {'sawa_h': 1.5169872681088794, 'rel_normal': 0.47354231407887987}
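The `da3_mono` failure in this log comes from an assertion that the baseline receives no intrinsics, while the inference script passes `K_norm` to every model. One possible way to reconcile the two is to let the caller forward intrinsics only to baselines that declare they use them. This is a minimal sketch under that assumption and not necessarily the fix applied in the repository. The `consumes_intrinsics` flag, both stand-in classes, and `run_baseline` are hypothetical names introduced for illustration.

```python
class IntrinsicsFreeBaseline:
    """Stand-in for a model that, like da3_mono in the log, rejects intrinsics."""

    consumes_intrinsics = False  # hypothetical flag, not part of any real API

    def infer_for_evaluation(self, image, intrinsics=None):
        # Mirrors the kind of assertion seen in the traceback.
        assert intrinsics is None, "this baseline does not consume intrinsics"
        return {"depth": image}


class IntrinsicsAwareBaseline:
    """Stand-in for a model that accepts camera intrinsics."""

    consumes_intrinsics = True

    def infer_for_evaluation(self, image, intrinsics=None):
        return {"depth": image, "used_intrinsics": intrinsics is not None}


def run_baseline(baseline, image, intrinsics):
    """Forward intrinsics only to baselines that declare they use them."""
    # Baselines without the flag are treated as intrinsics-aware by default.
    if getattr(baseline, "consumes_intrinsics", True):
        return baseline.infer_for_evaluation(image, intrinsics)
    return baseline.infer_for_evaluation(image, None)


if __name__ == "__main__":
    image = [[1.0, 2.0], [3.0, 4.0]]  # toy stand-in for an image array
    k_norm = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]]
    for model in (IntrinsicsFreeBaseline(), IntrinsicsAwareBaseline()):
        print(type(model).__name__, run_baseline(model, image, k_norm))
```

Keeping the decision in the caller leaves the assertion inside the baseline intact, so a direct call that wrongly passes intrinsics would still fail loudly.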
smoke_all_12115.log ADDED
@@ -0,0 +1,207 @@
1
+ ============================================
2
+ smoke-all started at Thu May 14 10:56:25 AM AEST 2026
3
+ Data: /home/ywan0794/EvalMDE/data/smoke Output: /home/ywan0794/EvalMDE/output/smoke_all
4
+ ============================================
5
+ Thu May 14 10:56:25 2026
6
+ +-----------------------------------------------------------------------------------------+
7
+ | NVIDIA-SMI 550.163.01 Driver Version: 550.163.01 CUDA Version: 12.4 |
8
+ |-----------------------------------------+------------------------+----------------------+
9
+ | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
10
+ | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
11
+ | | | MIG M. |
12
+ |=========================================+========================+======================|
13
+ | 0 NVIDIA H100 NVL Off | 00000000:61:00.0 Off | 0 |
14
+ | N/A 38C P0 61W / 400W | 14MiB / 95830MiB | 0% Default |
15
+ | | | Disabled |
16
+ +-----------------------------------------+------------------------+----------------------+
17
+
18
+ +-----------------------------------------------------------------------------------------+
19
+ | Processes: |
20
+ | GPU GI CI PID Type Process name GPU Memory |
21
+ | ID ID Usage |
22
+ |=========================================================================================|
23
+ | 0 N/A N/A 4274 G /usr/lib/xorg/Xorg 4MiB |
24
+ +-----------------------------------------------------------------------------------------+
25
+
26
+ ============================================
27
+ [depth_pro inference] Thu May 14 10:56:25 AM AEST 2026 env=depth-pro
28
+ ============================================
29
+ Found 2 scenes
30
+ [1/2] sample_data: shape=(720, 1280)
31
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/depth_pro
32
+ [INF-OK] depth_pro
33
+
34
+ ============================================
35
+ [marigold inference] Thu May 14 10:56:44 AM AEST 2026 env=marigold
36
+ ============================================
37
+ The config attributes {'prediction_type': 'depth'} were passed to MarigoldDepthPipeline, but are not expected and will be ignored. Please verify your model_index.json configuration file.
38
+ Keyword arguments {'prediction_type': 'depth'} are not expected by MarigoldDepthPipeline and will be ignored.
39
+
40
+
41
+
42
+ Found 2 scenes
43
+ [1/2] sample_data: shape=(720, 1280)
44
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/marigold
45
+ [INF-OK] marigold
46
+
47
+ ============================================
48
+ [lotus inference] Thu May 14 10:57:01 AM AEST 2026 env=lotus
49
+ ============================================
50
+
51
+ Found 2 scenes
52
+ [1/2] sample_data: shape=(720, 1280)
53
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/lotus
54
+ [INF-OK] lotus
55
+
56
+ ============================================
57
+ [depthmaster inference] Thu May 14 10:57:14 AM AEST 2026 env=depthmaster
58
+ ============================================
59
+ The config attributes {'default_denoising_steps': 10, 'scheduler': ['diffusers', 'DDIMScheduler']} were passed to DepthMasterPipeline, but are not expected and will be ignored. Please verify your model_index.json configuration file.
60
+ Keyword arguments {'default_denoising_steps': 10, 'scheduler': ['diffusers', 'DDIMScheduler']} are not expected by DepthMasterPipeline and will be ignored.
61
+
62
+ Defaulting to unsafe serialization. Pass `allow_pickle=False` to raise an error instead.
63
+ Some weights of the model checkpoint at /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet were not used when initializing UNet2DConditionModel:
64
+ ['fftblock.norm.weight, fftblock.norm.bias, fftblock.conv_f4.bias, fftblock.fuse.bias, fftblock.conv_s1.bias, fftblock.conv_f4.weight, fftblock.conv_f2.weight, fftblock.conv_f3.bias, fftblock.conv_f1.weight, fftblock.fuse.weight, fftblock.conv_s1.weight, fftblock.conv_s2.weight, fftblock.conv_f2.bias, fftblock.conv_f3.weight, fftblock.conv_f1.bias, fftblock.conv_s2.bias']
65
+
66
+
67
+
68
+ Expected types for unet: (<class 'depthmaster.modules.unet_2d_condition_s2.UNet2DConditionModel'>,), got <class 'diffusers.models.unets.unet_2d_condition.UNet2DConditionModel'>.
69
+ An error occurred while trying to fetch /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet: Error no file named diffusion_pytorch_model.safetensors found in directory /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet.
70
+ Defaulting to unsafe serialization. Pass `allow_pickle=False` to raise an error instead.
71
+ Found 2 scenes
72
+ [1/2] sample_data: shape=(720, 1280)
73
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/depthmaster
74
+ [INF-OK] depthmaster
75
+
76
+ ============================================
77
+ [ppd inference] Thu May 14 10:57:32 AM AEST 2026 env=ppd
78
+ ============================================
79
+ xFormers not available
80
+ xFormers not available
81
+ Found 2 scenes
82
+ [1/2] sample_data: shape=(720, 1280)
83
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/ppd
84
+ [INF-OK] ppd
85
+
86
+ ============================================
87
+ [da3_mono inference] Thu May 14 10:57:50 AM AEST 2026 env=da3
88
+ ============================================
89
+ [WARN ] Dependency `gsplat` is required for rendering 3DGS. Install via: pip install git+https://github.com/nerfstudio-project/gsplat.git@0b4dddf04cb687367602c01196913cde6a743d70
90
+ Found 2 scenes
91
+ [1/2] sample_data: shape=(720, 1280)
92
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/da3_mono
93
+ [INF-OK] da3_mono
94
+
95
+ ============================================
96
+ [fe2e inference] Thu May 14 10:58:06 AM AEST 2026 env=fe2e
97
+ ============================================
98
+ [INFO] prompt_type=empty, skipping Qwen model loading
99
+ create LoRA network from weights
100
+ train all blocks only
101
+ create LoRA for DIT all blocks: 304 modules.
102
+ enable LoRA for U-Net
103
+ weights are merged
104
+ Found 2 scenes
105
+ [1/2] sample_data: shape=(720, 1280)
106
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/fe2e
107
+ [INF-OK] fe2e
108
+
109
+ ============================================
110
+ Stage 2: metric aggregation (evalmde env)
111
+ ============================================
112
+ --- metric: depth_pro ---
113
+ Found 2 scenes for depth_pro
114
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
115
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
116
+ sample_data: sawa_h=0.5911, rel_normal=0.1919
117
+ sample_data_2: sawa_h=1.2681, rel_normal=0.3900
118
+
119
+ Mean: {'sawa_h': 0.9295960737251597, 'rel_normal': 0.2909630531817561}
120
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/depth_pro_metrics.json
121
+ [METRIC-OK] depth_pro
122
+ --- metric: marigold ---
123
+ Found 2 scenes for marigold
124
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
125
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
126
+ sample_data: sawa_h=1.1825, rel_normal=0.3738
127
+ sample_data_2: sawa_h=2.3192, rel_normal=0.7576
128
+
129
+ Mean: {'sawa_h': 1.7508343150610857, 'rel_normal': 0.5656810754863019}
130
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/marigold_metrics.json
131
+ [METRIC-OK] marigold
132
+ --- metric: lotus ---
133
+ Found 2 scenes for lotus
134
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
135
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
136
+ sample_data: sawa_h=1.0973, rel_normal=0.3560
137
+ sample_data_2: sawa_h=1.9433, rel_normal=0.5927
138
+
139
+ Mean: {'sawa_h': 1.5202771508195192, 'rel_normal': 0.4743600947637731}
140
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/lotus_metrics.json
141
+ [METRIC-OK] lotus
142
+ --- metric: depthmaster ---
143
+ Found 2 scenes for depthmaster
144
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
145
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
146
+ sample_data: sawa_h=4.4750, rel_normal=0.3334
147
+ sample_data_2: sawa_h=4.7744, rel_normal=0.5856
148
+
149
+ Mean: {'sawa_h': 4.624715064603448, 'rel_normal': 0.4595408906976621}
150
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/depthmaster_metrics.json
151
+ [METRIC-OK] depthmaster
152
+ --- metric: ppd ---
153
+ Found 2 scenes for ppd
154
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
155
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
156
+ sample_data: sawa_h=2.1984, rel_normal=0.8372
157
+ sample_data_2: sawa_h=2.5360, rel_normal=0.9053
158
+
159
+ Mean: {'sawa_h': 2.3671746082894383, 'rel_normal': 0.8712330618410348}
160
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/ppd_metrics.json
161
+ [METRIC-OK] ppd
162
+ --- metric: da3_mono ---
163
+ Found 2 scenes for da3_mono
164
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
165
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
166
+ sample_data: sawa_h=0.8455, rel_normal=0.2068
167
+ sample_data_2: sawa_h=1.4472, rel_normal=0.4535
168
+
169
+ Mean: {'sawa_h': 1.1463719382024506, 'rel_normal': 0.3301750209283423}
170
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/da3_mono_metrics.json
171
+ [METRIC-OK] da3_mono
172
+ --- metric: fe2e ---
173
+ Found 2 scenes for fe2e
174
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
175
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
176
+ sample_data: sawa_h=1.1133, rel_normal=0.3317
177
+ sample_data_2: sawa_h=1.9202, rel_normal=0.6153
178
+
179
+ Mean: {'sawa_h': 1.5167008543796885, 'rel_normal': 0.47354231407887987}
180
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/fe2e_metrics.json
181
+ [METRIC-OK] fe2e
182
+
183
+ ============================================
184
+ smoke-all finished at Thu May 14 10:58:51 AM AEST 2026
185
+ === Summary ===
186
+ [INF-OK] depth_pro
187
+ [INF-OK] marigold
188
+ [INF-OK] lotus
189
+ [INF-OK] depthmaster
190
+ [INF-OK] ppd
191
+ [INF-OK] da3_mono
192
+ [INF-OK] fe2e
193
+ [METRIC-OK] depth_pro
194
+ [METRIC-OK] marigold
195
+ [METRIC-OK] lotus
196
+ [METRIC-OK] depthmaster
197
+ [METRIC-OK] ppd
198
+ [METRIC-OK] da3_mono
199
+ [METRIC-OK] fe2e
200
+ === Per-model means ===
201
+ depth_pro: {'sawa_h': 0.9295960737251597, 'rel_normal': 0.2909630531817561}
202
+ marigold: {'sawa_h': 1.7508343150610857, 'rel_normal': 0.5656810754863019}
203
+ lotus: {'sawa_h': 1.5202771508195192, 'rel_normal': 0.4743600947637731}
204
+ depthmaster: {'sawa_h': 4.624715064603448, 'rel_normal': 0.4595408906976621}
205
+ ppd: {'sawa_h': 2.3671746082894383, 'rel_normal': 0.8712330618410348}
206
+ da3_mono: {'sawa_h': 1.1463719382024506, 'rel_normal': 0.3301750209283423}
207
+ fe2e: {'sawa_h': 1.5167008543796885, 'rel_normal': 0.47354231407887987}
smoke_all_12351.log ADDED
@@ -0,0 +1,235 @@
1
+ ============================================
2
+ smoke-all started at Thu May 14 11:35:01 AM AEST 2026
3
+ Data: /home/ywan0794/EvalMDE/data/smoke Output: /home/ywan0794/EvalMDE/output/smoke_all
4
+ ============================================
5
+ Thu May 14 11:35:01 2026
6
+ +-----------------------------------------------------------------------------------------+
7
+ | NVIDIA-SMI 550.163.01 Driver Version: 550.163.01 CUDA Version: 12.4 |
8
+ |-----------------------------------------+------------------------+----------------------+
9
+ | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
10
+ | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
11
+ | | | MIG M. |
12
+ |=========================================+========================+======================|
13
+ | 0 NVIDIA H100 NVL Off | 00000000:E1:00.0 Off | 0 |
14
+ | N/A 37C P0 93W / 400W | 14MiB / 95830MiB | 2% Default |
15
+ | | | Disabled |
16
+ +-----------------------------------------+------------------------+----------------------+
17
+
18
+ +-----------------------------------------------------------------------------------------+
19
+ | Processes: |
20
+ | GPU GI CI PID Type Process name GPU Memory |
21
+ | ID ID Usage |
22
+ |=========================================================================================|
23
+ | 0 N/A N/A 4274 G /usr/lib/xorg/Xorg 4MiB |
24
+ +-----------------------------------------------------------------------------------------+
25
+
26
+ ============================================
27
+ [depth_pro inference] Thu May 14 11:35:01 AM AEST 2026 env=depth-pro
28
+ ============================================
29
+ Found 2 scenes
30
+ [1/2] sample_data: shape=(720, 1280)
31
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/depth_pro
32
+ [INF-OK] depth_pro
33
+
34
+ ============================================
35
+ [marigold inference] Thu May 14 11:35:20 AM AEST 2026 env=marigold
36
+ ============================================
37
+ The config attributes {'prediction_type': 'depth'} were passed to MarigoldDepthPipeline, but are not expected and will be ignored. Please verify your model_index.json configuration file.
38
+ Keyword arguments {'prediction_type': 'depth'} are not expected by MarigoldDepthPipeline and will be ignored.
39
+
40
+
41
+
42
+ Found 2 scenes
43
+ [1/2] sample_data: shape=(720, 1280)
44
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/marigold
45
+ [INF-OK] marigold
46
+
47
+ ============================================
48
+ [lotus inference] Thu May 14 11:35:37 AM AEST 2026 env=lotus
49
+ ============================================
50
+
51
+ Found 2 scenes
52
+ [1/2] sample_data: shape=(720, 1280)
53
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/lotus
54
+ [INF-OK] lotus
55
+
56
+ ============================================
57
+ [depthmaster inference] Thu May 14 11:35:52 AM AEST 2026 env=depthmaster
58
+ ============================================
59
+ The config attributes {'default_denoising_steps': 10, 'scheduler': ['diffusers', 'DDIMScheduler']} were passed to DepthMasterPipeline, but are not expected and will be ignored. Please verify your model_index.json configuration file.
60
+ Keyword arguments {'default_denoising_steps': 10, 'scheduler': ['diffusers', 'DDIMScheduler']} are not expected by DepthMasterPipeline and will be ignored.
61
+
62
+
63
+
64
+ Defaulting to unsafe serialization. Pass `allow_pickle=False` to raise an error instead.
65
+ Some weights of the model checkpoint at /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet were not used when initializing UNet2DConditionModel:
66
+ ['fftblock.fuse.bias, fftblock.conv_f1.weight, fftblock.conv_f2.weight, fftblock.conv_f4.weight, fftblock.conv_s2.bias, fftblock.fuse.weight, fftblock.conv_f2.bias, fftblock.conv_f3.bias, fftblock.conv_s1.weight, fftblock.conv_f3.weight, fftblock.conv_f4.bias, fftblock.norm.weight, fftblock.conv_f1.bias, fftblock.conv_s2.weight, fftblock.conv_s1.bias, fftblock.norm.bias']
67
+
68
+ Expected types for unet: (<class 'depthmaster.modules.unet_2d_condition_s2.UNet2DConditionModel'>,), got <class 'diffusers.models.unets.unet_2d_condition.UNet2DConditionModel'>.
69
+ An error occurred while trying to fetch /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet: Error no file named diffusion_pytorch_model.safetensors found in directory /home/ywan0794/EvalMDE/DepthMaster/ckpt/eval/unet.
70
+ Defaulting to unsafe serialization. Pass `allow_pickle=False` to raise an error instead.
71
+ Found 2 scenes
72
+ [1/2] sample_data: shape=(720, 1280)
73
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/depthmaster
74
+ [INF-OK] depthmaster
75
+
76
+ ============================================
77
+ [ppd inference] Thu May 14 11:36:09 AM AEST 2026 env=ppd
78
+ ============================================
79
+ xFormers not available
80
+ xFormers not available
81
+ Found 2 scenes
82
+ [1/2] sample_data: shape=(720, 1280)
83
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/ppd
84
+ [INF-OK] ppd
85
+
86
+ ============================================
87
+ [da3_mono inference] Thu May 14 11:36:29 AM AEST 2026 env=da3
88
+ ============================================
89
+ [WARN ] Dependency `gsplat` is required for rendering 3DGS. Install via: pip install git+https://github.com/nerfstudio-project/gsplat.git@0b4dddf04cb687367602c01196913cde6a743d70
90
+ Found 2 scenes
91
+ [1/2] sample_data: shape=(720, 1280)
92
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/da3_mono
93
+ [INF-OK] da3_mono
94
+
95
+ ============================================
96
+ [fe2e inference] Thu May 14 11:36:45 AM AEST 2026 env=fe2e
97
+ ============================================
98
+ [INFO] prompt_type=empty, skipping Qwen model loading
99
+ create LoRA network from weights
100
+ train all blocks only
101
+ create LoRA for DIT all blocks: 304 modules.
102
+ enable LoRA for U-Net
103
+ weights are merged
104
+ Found 2 scenes
105
+ [1/2] sample_data: shape=(720, 1280)
106
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_all/fe2e
107
+ [INF-OK] fe2e
108
+
109
+ ============================================
110
+ Stage 2: metric aggregation (evalmde env)
111
+ ============================================
112
+ --- metric: depth_pro ---
113
+ Found 2 scenes for depth_pro
114
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
115
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
116
+ sample_data: sawa_h raw=0.591 aln=0.586 | relnorm raw=0.192 aln=0.184 | boundF1_err raw=0.736 aln=0.796
117
+ sample_data_2: sawa_h raw=1.268 aln=1.572 | relnorm raw=0.390 aln=0.510 | boundF1_err raw=0.513 aln=0.730
118
+
119
+ Mean RAW : {'wkdr_no_align': 0.040659576654434204, 'delta0125_disparity_affine_err': 0.4819862172007561, 'delta0125_depth_affine_err': 0.5122604453936219, 'boundary_f1_err': 0.6248274250948582, 'rel_normal': 0.2909630531817561, 'sawa_h': 0.9297213865303355}
120
+ Mean ALIGNED: {'wkdr_no_align': 0.04262185096740723, 'delta0125_disparity_affine_err': 0.5153419096022844, 'delta0125_depth_affine_err': 0.5080458391457796, 'boundary_f1_err': 0.76309088091753, 'rel_normal': 0.3469443633141419, 'sawa_h': 1.0791019991638469}
121
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/depth_pro_metrics.json
122
+ [METRIC-OK] depth_pro
123
+ --- metric: marigold ---
124
+ Found 2 scenes for marigold
125
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
126
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
127
+ sample_data: sawa_h raw=1.157 aln=0.930 | relnorm raw=0.364 aln=0.301 | boundF1_err raw=0.823 aln=0.929
128
+ sample_data_2: sawa_h raw=2.127 aln=2.129 | relnorm raw=0.700 aln=0.703 | boundF1_err raw=0.927 aln=0.923
129
+
130
+ Mean RAW : {'wkdr_no_align': 0.06999608874320984, 'delta0125_disparity_affine_err': 0.9599380735307932, 'delta0125_depth_affine_err': 0.6116695962846279, 'boundary_f1_err': 0.8753077063320032, 'rel_normal': 0.5323382569438899, 'sawa_h': 1.6421890328486521}
131
+ Mean ALIGNED: {'wkdr_no_align': 0.07159394025802612, 'delta0125_disparity_affine_err': 0.5699970349669456, 'delta0125_depth_affine_err': 0.6097928117960691, 'boundary_f1_err': 0.9258331507737796, 'rel_normal': 0.5021106104687787, 'sawa_h': 1.5292764908179928}
132
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/marigold_metrics.json
133
+ [METRIC-OK] marigold
134
+ --- metric: lotus ---
135
+ Found 2 scenes for lotus
136
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
137
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
138
+ sample_data: sawa_h raw=1.295 aln=1.095 | relnorm raw=0.385 aln=0.339 | boundF1_err raw=0.917 aln=0.865
139
+ sample_data_2: sawa_h raw=2.195 aln=2.157 | relnorm raw=0.710 aln=0.690 | boundF1_err raw=0.948 aln=0.937
140
+
141
+ Mean RAW : {'wkdr_no_align': 0.0865098237991333, 'delta0125_disparity_affine_err': 0.9658380672335625, 'delta0125_depth_affine_err': 0.6983374059200287, 'boundary_f1_err': 0.9324993093468176, 'rel_normal': 0.5473170225478743, 'sawa_h': 1.7448899686403179}
142
+ Mean ALIGNED: {'wkdr_no_align': 0.08659347891807556, 'delta0125_disparity_affine_err': 0.6961807310581207, 'delta0125_depth_affine_err': 0.6983374059200287, 'boundary_f1_err': 0.900953527425987, 'rel_normal': 0.5143106302233235, 'sawa_h': 1.6263154318190827}
143
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/lotus_metrics.json
144
+ [METRIC-OK] lotus
145
+ --- metric: depthmaster ---
146
+ Found 2 scenes for depthmaster
147
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
148
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
149
+ sample_data: sawa_h raw=4.475 aln=1.124 | relnorm raw=0.333 aln=0.293 | boundF1_err raw=0.998 aln=0.925
150
+ sample_data_2: sawa_h raw=4.774 aln=2.044 | relnorm raw=0.586 aln=0.654 | boundF1_err raw=0.991 aln=0.933
151
+
152
+ Mean RAW : {'wkdr_no_align': 0.9196415990591049, 'delta0125_disparity_affine_err': 0.9356035124510527, 'delta0125_depth_affine_err': 0.9198634652420878, 'boundary_f1_err': 0.9947631304516369, 'rel_normal': 0.4595408906976621, 'sawa_h': 4.624761057503135}
153
+ Mean ALIGNED: {'wkdr_no_align': 0.08632892370223999, 'delta0125_disparity_affine_err': 0.8615778312087059, 'delta0125_depth_affine_err': 0.9183715572580695, 'boundary_f1_err': 0.9292771149464188, 'rel_normal': 0.4737114471098748, 'sawa_h': 1.5842239270857643}
154
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/depthmaster_metrics.json
155
+ [METRIC-OK] depthmaster
156
+ --- metric: ppd ---
157
+ Found 2 scenes for ppd
158
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
159
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
160
+ sample_data: sawa_h raw=2.198 aln=1.901 | relnorm raw=0.837 aln=0.732 | boundF1_err raw=0.857 aln=0.813
161
+ sample_data_2: sawa_h raw=2.535 aln=2.444 | relnorm raw=0.905 aln=0.852 | boundF1_err raw=0.922 aln=0.860
162
+
163
+ Mean RAW : {'wkdr_no_align': 0.0871766209602356, 'delta0125_disparity_affine_err': 0.9600839260965586, 'delta0125_depth_affine_err': 0.7343441895209253, 'boundary_f1_err': 0.8895310134282803, 'rel_normal': 0.8712330618410348, 'sawa_h': 2.366451557754713}
164
+ Mean ALIGNED: {'wkdr_no_align': 0.09249627590179443, 'delta0125_disparity_affine_err': 0.6880950853228569, 'delta0125_depth_affine_err': 0.7330712396651506, 'boundary_f1_err': 0.8366181924177966, 'rel_normal': 0.7918025637799675, 'sawa_h': 2.172219847013012}
165
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/ppd_metrics.json
166
+ [METRIC-OK] ppd
167
+ --- metric: da3_mono ---
168
+ Found 2 scenes for da3_mono
169
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
170
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
171
+ sample_data: sawa_h raw=0.845 aln=0.707 | relnorm raw=0.207 aln=0.177 | boundF1_err raw=0.931 aln=0.979
172
+ sample_data_2: sawa_h raw=1.447 aln=1.688 | relnorm raw=0.454 aln=0.561 | boundF1_err raw=0.821 aln=0.832
173
+
174
+ Mean RAW : {'wkdr_no_align': 0.047021448612213135, 'delta0125_disparity_affine_err': 0.8515407452359796, 'delta0125_depth_affine_err': 0.5784242674708366, 'boundary_f1_err': 0.8758562543377832, 'rel_normal': 0.3301750209283423, 'sawa_h': 1.1464006557203033}
175
+ Mean ALIGNED: {'wkdr_no_align': 0.051027655601501465, 'delta0125_disparity_affine_err': 0.6030512889847159, 'delta0125_depth_affine_err': 0.5845414977520704, 'boundary_f1_err': 0.905371028814778, 'rel_normal': 0.36910983958422094, 'sawa_h': 1.1977928844965942}
176
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/da3_mono_metrics.json
177
+ [METRIC-OK] da3_mono
178
+ --- metric: fe2e ---
179
+ Found 2 scenes for fe2e
180
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
181
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
182
+ sample_data: sawa_h raw=1.113 aln=0.967 | relnorm raw=0.332 aln=0.282 | boundF1_err raw=0.879 aln=0.837
183
+ sample_data_2: sawa_h raw=1.921 aln=2.007 | relnorm raw=0.615 aln=0.642 | boundF1_err raw=0.792 aln=0.843
184
+
185
+ Mean RAW : {'wkdr_no_align': 0.06889474391937256, 'delta0125_disparity_affine_err': 0.9601075295358896, 'delta0125_depth_affine_err': 0.6917768018320203, 'boundary_f1_err': 0.8355541301758247, 'rel_normal': 0.47354231407887987, 'sawa_h': 1.516985853988682}
186
+ Mean ALIGNED: {'wkdr_no_align': 0.07494029402732849, 'delta0125_disparity_affine_err': 0.7892066687345505, 'delta0125_depth_affine_err': 0.6908030491322279, 'boundary_f1_err': 0.8398841726062642, 'rel_normal': 0.46222504542092735, 'sawa_h': 1.4871907267011424}
187
+ Saved → /home/ywan0794/EvalMDE/output/smoke_all/fe2e_metrics.json
188
+ [METRIC-OK] fe2e
189
+
190
+ ============================================
191
+ smoke-all finished at Thu May 14 11:37:36 AM AEST 2026
192
+ === Summary ===
193
+ [INF-OK] depth_pro
194
+ [INF-OK] marigold
195
+ [INF-OK] lotus
196
+ [INF-OK] depthmaster
197
+ [INF-OK] ppd
198
+ [INF-OK] da3_mono
199
+ [INF-OK] fe2e
200
+ [METRIC-OK] depth_pro
201
+ [METRIC-OK] marigold
202
+ [METRIC-OK] lotus
203
+ [METRIC-OK] depthmaster
204
+ [METRIC-OK] ppd
205
+ [METRIC-OK] da3_mono
206
+ [METRIC-OK] fe2e
207
+ === Per-model means ===
208
+ Traceback (most recent call last):
209
+ File "<string>", line 1, in <module>
210
+ KeyError: 'mean'
211
+ depth_pro:
212
+ Traceback (most recent call last):
213
+ File "<string>", line 1, in <module>
214
+ KeyError: 'mean'
215
+ marigold:
216
+ Traceback (most recent call last):
217
+ File "<string>", line 1, in <module>
218
+ KeyError: 'mean'
219
+ lotus:
220
+ Traceback (most recent call last):
221
+ File "<string>", line 1, in <module>
222
+ KeyError: 'mean'
223
+ depthmaster:
224
+ Traceback (most recent call last):
225
+ File "<string>", line 1, in <module>
226
+ KeyError: 'mean'
227
+ ppd:
228
+ Traceback (most recent call last):
229
+ File "<string>", line 1, in <module>
230
+ KeyError: 'mean'
231
+ da3_mono:
232
+ Traceback (most recent call last):
233
+ File "<string>", line 1, in <module>
234
+ KeyError: 'mean'
235
+ fe2e:
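Note on the `KeyError: 'mean'` tracebacks above: the metric stage in this run prints `Mean RAW` and `Mean ALIGNED` instead of a single `Mean`, so the summary one-liner that indexes `['mean']` probably no longer matches the saved JSON. The sketch below shows one tolerant way to read the per-model means. The key names `mean_raw` and `mean_aligned` are assumptions for illustration only; the log does not show the actual schema of the `*_metrics.json` files, and `summarize` / `summarize_file` are hypothetical helpers, not part of EvalMDE.

```python
import json
from pathlib import Path

# Hypothetical key names: the log only shows that 'mean' is missing.
# 'mean_raw' and 'mean_aligned' are assumed here, not confirmed by the source.
CANDIDATE_KEYS = ("mean", "mean_raw", "mean_aligned")


def summarize(metrics: dict) -> dict:
    """Return whichever mean blocks are present instead of indexing ['mean'] directly."""
    return {key: metrics[key] for key in CANDIDATE_KEYS if key in metrics}


def summarize_file(path) -> dict:
    """Load one <model>_metrics.json file and return its available mean blocks."""
    return summarize(json.loads(Path(path).read_text()))
```

With this shape, an older single-track file still yields its `mean` block, a dual-track file yields whichever mean blocks it carries, and a file with neither yields an empty dict rather than a traceback.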
smoke_evalmde_12112.log ADDED
@@ -0,0 +1,2 @@
1
+ === Smoke step 1: depth_pro inference ===
2
+ /var/spool/slurmd/job12112/slurm_script: line 26: PYTHONPATH: unbound variable
smoke_evalmde_12113.log ADDED
@@ -0,0 +1,34 @@
1
+ === Smoke step 1: depth_pro inference ===
2
+ Found 2 scenes
3
+ [1/2] sample_data: shape=(720, 1280)
4
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke/depth_pro
5
+ === Smoke step 2: compute_metrics in evalmde env ===
6
+ Found 2 scenes for depth_pro
7
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
8
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
9
+ sample_data: sawa_h=0.5910, rel_normal=0.1919
10
+ sample_data_2: sawa_h=1.2678, rel_normal=0.3900
11
+
12
+ Mean: {'sawa_h': 0.9293891770624477, 'rel_normal': 0.2909630531817561}
13
+ Saved → /home/ywan0794/EvalMDE/output/smoke/depth_pro_metrics.json
14
+ === Smoke summary ===
15
+ {
16
+ "model": "depth_pro",
17
+ "n_scenes": 2,
18
+ "per_scene": [
19
+ {
20
+ "scene": "sample_data",
21
+ "sawa_h": 0.5910246923517875,
22
+ "rel_normal": 0.19190798903378145
23
+ },
24
+ {
25
+ "scene": "sample_data_2",
26
+ "sawa_h": 1.267753661773108,
27
+ "rel_normal": 0.3900181173297307
28
+ }
29
+ ],
30
+ "mean": {
31
+ "sawa_h": 0.9293891770624477,
32
+ "rel_normal": 0.2909630531817561
33
+ }
34
+ }
smoke_lotus_v1_12348.log ADDED
@@ -0,0 +1,20 @@
1
+ === Stage 1: lotus v1-0 inference ===
2
+
3
+ Found 2 scenes
4
+ [1/2] sample_data: shape=(720, 1280)
5
+ Saved 2 predictions to /home/ywan0794/EvalMDE/output/smoke_lotus_v1/lotus
6
+
7
+ === Stage 2: metric (evalmde env, dual-track) ===
8
+ Found 2 scenes for lotus
9
+ /home/ywan0794/miniconda3/envs/evalmde/lib/python3.10/site-packages/torch/functional.py:554: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /pytorch/aten/src/ATen/native/TensorShape.cpp:4314.)
10
+ return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
11
+ sample_data: sawa_h raw=1.296 aln=1.098 | relnorm raw=0.385 aln=0.339 | boundF1_err raw=0.917 aln=0.866
12
+ sample_data_2: sawa_h raw=2.194 aln=2.158 | relnorm raw=0.710 aln=0.690 | boundF1_err raw=0.948 aln=0.937
13
+
14
+ Mean RAW : {'wkdr_no_align': 0.08665573596954346, 'delta0125_disparity_affine_err': 0.9616885241121054, 'delta0125_depth_affine_err': 0.6993474271148443, 'boundary_f1_err': 0.9321780361920082, 'rel_normal': 0.5475669290628759, 'sawa_h': 1.7451062945205418}
15
+ Mean ALIGNED: {'wkdr_no_align': 0.08664768934249878, 'delta0125_disparity_affine_err': 0.6976255280897021, 'delta0125_depth_affine_err': 0.6993474271148443, 'boundary_f1_err': 0.9015399104142445, 'rel_normal': 0.5147056572154382, 'sawa_h': 1.6276670925082142}
16
+ Saved → /home/ywan0794/EvalMDE/output/smoke_lotus_v1/lotus_metrics.json
17
+
18
+ === Summary ===
19
+ RAW mean: {'wkdr_no_align': 0.08665573596954346, 'delta0125_disparity_affine_err': 0.9616885241121054, 'delta0125_depth_affine_err': 0.6993474271148443, 'boundary_f1_err': 0.9321780361920082, 'rel_normal': 0.5475669290628759, 'sawa_h': 1.7451062945205418}
20
+ ALIGNED mean: {'wkdr_no_align': 0.08664768934249878, 'delta0125_disparity_affine_err': 0.6976255280897021, 'delta0125_depth_affine_err': 0.6993474271148443, 'boundary_f1_err': 0.9015399104142445, 'rel_normal': 0.5147056572154382, 'sawa_h': 1.6276670925082142}