Update model card
README.md
CHANGED
@@ -13,7 +13,7 @@ tags:
 
 **MOMO** is the first multi-sensor foundation model for Mars remote sensing, accepted at **CVPR 2026**.
 
-It integrates representations learned independently from three Martian orbital sensors
+It integrates representations learned independently from three Martian orbital sensors (HiRISE, CTX, and THEMIS) spanning resolutions from 0.25 m/pixel to 100 m/pixel, using task arithmetic model merging with a novel **Equal Validation Loss (EVL)** checkpoint selection strategy.
 
 [](https://arxiv.org/abs/2604.02719) [](https://github.com/kerner-lab/MOMO)
 
@@ -29,7 +29,7 @@ Each model size includes 5 checkpoints:
 | `hirise.pth` | Pre-trained on HiRISE (High Resolution Imaging Science Experiment) |
 | `themis.pth` | Pre-trained on THEMIS (THermal EMission Imaging System) |
 | `hirise_ctx_themis.pth` | Pre-trained jointly on all three sensors |
-| `momo.pth` | **MOMO**
+| `momo.pth` | **MOMO** merged model via task arithmetic + EVL (main contribution) |
 
 Each checkpoint is available for three ViT architectures (all with patch size 16):
 
@@ -61,9 +61,9 @@ For full training and fine-tuning code, see the [MOMO GitHub repository](https:/
 ## Training Data
 
 MOMO is pre-trained on ~12 million samples (~4M per sensor) from Mars orbital imagery:
-- **HiRISE**
-- **CTX**
-- **THEMIS**
+- **HiRISE**: 0.25 m/pixel high-resolution visible spectrum images
+- **CTX**: 5 m/pixel context camera images
+- **THEMIS**: 100 m/pixel thermal infrared images
 
 ---
 
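
The updated README describes `momo.pth` as a task arithmetic merge of the per-sensor models. As a rough illustration of what task arithmetic generally means, the sketch below adds scaled weight differences ("task vectors") from several models onto a shared starting point. This is an assumption-laden toy, not MOMO's actual recipe: the diff does not say what the base weights are or which scaling is used, so `base`, `scale`, and both function names are hypothetical, and the EVL checkpoint selection step is not shown at all. Weights are plain Python lists here so the example runs without any framework.

```python
from typing import Dict, List

# Hypothetical stand-in for a model state dict: parameter name -> flat values.
Weights = Dict[str, List[float]]


def task_vector(model: Weights, base: Weights) -> Weights:
    """Return model - base, parameter by parameter."""
    return {
        name: [m - b for m, b in zip(model[name], base[name])]
        for name in base
    }


def merge_task_arithmetic(base: Weights, models: List[Weights], scale: float) -> Weights:
    """Add the scaled sum of all task vectors to the base weights."""
    merged = {name: list(values) for name, values in base.items()}
    for model in models:
        delta = task_vector(model, base)
        for name in merged:
            merged[name] = [m + scale * d for m, d in zip(merged[name], delta[name])]
    return merged


# Toy values only; real checkpoints would hold ViT parameters.
base = {"w": [0.0, 1.0]}      # assumed shared starting point
hirise = {"w": [1.0, 1.0]}
ctx = {"w": [0.0, 3.0]}
themis = {"w": [2.0, 1.0]}

merged = merge_task_arithmetic(base, [hirise, ctx, themis], scale=0.5)
print(merged)  # {'w': [1.5, 2.0]}
```

With `scale=0.0` the function returns the base weights unchanged, which is a quick sanity check when adapting the idea to real state dicts. For the method actually used to build `momo.pth`, the linked repository and paper are the authoritative sources.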