Bavantha11 committed on
Commit 580770b · verified · 1 Parent(s): 36f0fba

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +61 -0
README.md ADDED
@@ -0,0 +1,61 @@
+ ---
+ license: other
+ pipeline_tag: image-segmentation
+ tags:
+ - monocular-depth-estimation
+ - semantic-segmentation
+ - multi-task-learning
+ - robotics
+ - scene-graph
+ - dinov3
+ ---
+
+ # M2H-MX Weights
+
+ This repository hosts model-only weights for **M2H-MX: Multi-Task Dense Visual Perception for Real-Time Monocular Spatial Understanding**.
+
+ Code and instructions: https://github.com/BavanthaU/m2h_mx
+
+ ## Artifacts
+
+ | Dataset | Variant | File | Paper result |
+ | --- | --- | --- | --- |
+ | NYUDv2 | M2H-MX-L | `weights/nyudv2/m2h_mx_l_nyudv2_weights.pt` | mIoU 65.60, depth RMSE 0.3800 |
+ | NYUDv2 | M2H-MX-B | `weights/nyudv2/m2h_mx_b_nyudv2_weights.pt` | mIoU 61.80, depth RMSE 0.4170 |
+ | ScanNet | M2H-MX-L | `weights/scannet/m2h_mx_l_scannet_weights.pt` | ScanNet25k mIoU 76.10, depth RMSE 0.2210; Mono-Hydra++ ATE 6.91 cm |
+ | ScanNet | M2H-MX-B | `weights/scannet/m2h_mx_b_scannet_weights.pt` | Base variant artifact |
+ | Cityscapes | M2H-MX-L | `weights/cityscapes/m2h_mx_l_cityscapes_weights.pt` | mIoU 82.28, disparity RMSE 3.89 |
+
+ These are model-only state dictionaries. They do not include optimizer, scheduler, gradient scaler, or EMA state.
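Note on loading: because the files are described as model-only state dictionaries, they can presumably be read with `torch.load` and passed to `load_state_dict` on a model built from the code repository. The following is a minimal sketch under that assumption, not the repository's own loader. The key names in `TRAINING_STATE_KEYS` and both helper functions are hypothetical and exist only to illustrate a sanity check that a file carries no training state.

```python
# Hypothetical top-level key names that a full training checkpoint might use.
# The README states that the published files contain none of this state.
TRAINING_STATE_KEYS = {"optimizer", "scheduler", "scaler", "ema"}


def looks_model_only(state: dict) -> bool:
    """Return True when no top-level key names training-only state."""
    return not (TRAINING_STATE_KEYS & set(state.keys()))


def load_weights(path: str) -> dict:
    """Load a weights file on CPU and check that it looks model-only."""
    # Imported lazily so looks_model_only() can be used without PyTorch.
    import torch

    # weights_only=True restricts unpickling to tensors and plain containers.
    state = torch.load(path, map_location="cpu", weights_only=True)
    if not looks_model_only(state):
        raise ValueError(f"{path} appears to contain training state")
    return state
```

A caller would then run something like `model.load_state_dict(load_weights(path))`, where `model` is constructed as described in the code repository; the model class itself is not shown here.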
+
+ ## Download
+
+ From the code repository:
+
+ ```bash
+ python3 scripts/download_weights.py --repo-id Bavantha11/m2h-mx
+ ```
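Note on fetching a single file: as an alternative to the script, one artifact could be fetched directly with `huggingface_hub`. This is a hedged sketch, not a description of what `scripts/download_weights.py` does. The file layout is copied from the Artifacts table, the repository id is taken from the `--repo-id` argument above, and the helper names `weight_filename` and `download` are hypothetical. It also assumes the weights live in a model repository, which is the default repository type for `hf_hub_download`.

```python
# Dataset -> available variants, copied from the Artifacts table.
VARIANTS = {
    "nyudv2": ("l", "b"),
    "scannet": ("l", "b"),
    "cityscapes": ("l",),
}


def weight_filename(dataset: str, variant: str) -> str:
    """Build the in-repo path for a dataset/variant pair listed in the table."""
    dataset, variant = dataset.lower(), variant.lower()
    if variant not in VARIANTS.get(dataset, ()):
        raise KeyError(f"no weights listed for {dataset!r} variant {variant!r}")
    return f"weights/{dataset}/m2h_mx_{variant}_{dataset}_weights.pt"


def download(dataset: str, variant: str, repo_id: str = "Bavantha11/m2h-mx") -> str:
    """Download one weights file and return its local cache path."""
    # Imported lazily: requires the huggingface_hub package and network access.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=weight_filename(dataset, variant))
```

For example, `download("nyudv2", "l")` would request `weights/nyudv2/m2h_mx_l_nyudv2_weights.pt`, while `weight_filename("cityscapes", "b")` raises `KeyError` because the table lists no base variant for Cityscapes.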
+
+ ## Citation
+
+ ```bibtex
+ @misc{udugama2026m2hmxmultitaskdensevisual,
+ title={M2H-MX: Multi-Task Dense Visual Perception for Real-Time Monocular Spatial Understanding},
+ author={U. V. B. L. Udugama and George Vosselman and Francesco Nex},
+ year={2026},
+ eprint={2603.29236},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV},
+ url={https://arxiv.org/abs/2603.29236},
+ }
+
+ @misc{udugama2026monohydrarealtimemonocularscene,
+ title={Mono-Hydra++: Real-Time Monocular Scene Graph Construction with Multi-Task Learning for 3D Indoor Mapping},
+ author={U. V. B. L. Udugama and George Vosselman and Francesco Nex},
+ year={2026},
+ eprint={2605.17661},
+ archivePrefix={arXiv},
+ primaryClass={cs.RO},
+ url={https://arxiv.org/abs/2605.17661},
+ }
+ ```