Bavantha11 committed on
Commit a3f1cc6 · verified · 1 Parent(s): a97af77

Clarify M2H-MX multi-task outputs

Files changed (1)
  1. README.md +17 -4
README.md CHANGED
@@ -1,19 +1,32 @@
  ---
  license: other
- pipeline_tag: image-segmentation
+ library_name: pytorch
  tags:
+ - multi-task-learning
+ - dense-prediction
  - monocular-depth-estimation
  - semantic-segmentation
- - multi-task-learning
+ - surface-normal-estimation
+ - edge-detection
+ - geometric-perception
  - robotics
  - scene-graph
  - dinov3
  ---

- # M2H-MX Weights
+ # M2H-MX Multi-Task Weights

  This repository hosts model-only weights for **M2H-MX: Multi-Task Semantic and Geometric Perception for Real-Time Monocular 3D Scene Graph Construction**.

+ M2H-MX is a **multi-task dense visual perception model**, not a semantic-segmentation-only model. Given a monocular RGB image, the network can predict:
+
+ - metric depth or disparity, depending on the dataset configuration;
+ - semantic segmentation logits;
+ - surface normals;
+ - edge maps.
+
+ Depth and semantics are the primary deployment outputs used by Mono-Hydra++ or a compatible mapping backend for metric-semantic mapping and downstream 3D scene graph construction. Surface normals and edges are auxiliary training heads used to improve geometric and semantic consistency. The network improves the dense evidence used by the mapping backend; it does not directly predict the 3D scene graph.
+
  Code and instructions: https://github.com/BavanthaU/m2h_mx

  ## Artifacts
@@ -33,7 +46,7 @@ These are model-only state dictionaries. They do not include optimizer, schedule
  From the code repository:

  ```bash
- python3 scripts/download_weights.py --repo-id Bavantha11/m2h-mx
+ python3 scripts/download_weights.py --repo-id Bavantha11/m2h-mx --verify
  ```

  ## Citation
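
As a reading aid for the outputs paragraph this commit adds, the split between primary deployment outputs (depth, semantics) and auxiliary training heads (surface normals, edges) could be sketched as below. This is a minimal illustration only: the head names, the dictionary layout, and the `select_deployment_outputs` helper are assumptions for this sketch and are not taken from the M2H-MX code.

```python
# Hypothetical head names; the real keys in the M2H-MX code may differ.
PRIMARY_HEADS = ("depth", "semantics")    # described as primary deployment outputs
AUXILIARY_HEADS = ("normals", "edges")    # described as auxiliary training heads


def select_deployment_outputs(outputs):
    """Return only the primary heads from a dict of per-head predictions."""
    missing = [name for name in PRIMARY_HEADS if name not in outputs]
    if missing:
        raise KeyError(f"missing primary heads: {missing}")
    return {name: outputs[name] for name in PRIMARY_HEADS}


# Placeholder strings stand in for the per-pixel prediction arrays.
example_outputs = {
    "depth": "depth_map",
    "semantics": "segmentation_logits",
    "normals": "normal_map",
    "edges": "edge_map",
}
deployed = select_deployment_outputs(example_outputs)
```

Under these assumptions, a mapping backend would receive only the depth and semantics entries, while the normal and edge predictions stay available for training-time losses.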