Mirali33 committed on
Commit 1b795c7 · verified · 1 Parent(s): bb81293

Add model card

Files changed (1)
  1. README.md +87 -0
README.md ADDED
@@ -0,0 +1,87 @@
+ ---
+ license: cc-by-4.0
+ tags:
+ - mars
+ - remote-sensing
+ - vision-transformer
+ - foundation-model
+ - model-merging
+ - planetary-science
+ ---
+
+ # MOMO: Mars Orbital Model
+
+ **MOMO** is the first multi-sensor foundation model for Mars remote sensing, accepted at **CVPR 2026**.
+
+ It integrates representations learned independently from three Martian orbital sensors (HiRISE, CTX, and THEMIS), spanning resolutions from 0.25 m/pixel to 100 m/pixel, using task-arithmetic model merging with a novel **Equal Validation Loss (EVL)** checkpoint selection strategy.
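As a rough illustration of the merging idea, the sketch below shows the generic task-arithmetic recipe: subtract a shared starting point from each sensor-specific model to get a "task vector", then add the scaled sum of those vectors back to the starting point. It is not the MOMO implementation. The shared `base` weights, the scaling coefficient `alpha`, and the toy scalar values are all assumptions made for the example, and EVL checkpoint selection is not shown.

```python
def task_vector(trained, base):
    """Per-parameter difference between a trained model and the shared base."""
    return {name: trained[name] - base[name] for name in base}


def merge_task_arithmetic(base, models, alpha=1.0):
    """Add the scaled sum of task vectors to the base weights.

    `base` and each entry of `models` map parameter names to values that
    support +, - and * (floats here; torch tensors behave the same way).
    `alpha` is a hypothetical scaling coefficient, not a value from the paper.
    """
    merged = dict(base)
    for model in models:
        for name, delta in task_vector(model, base).items():
            merged[name] = merged[name] + alpha * delta
    return merged


# Toy example: scalar "weights" stand in for real parameter tensors.
base = {"w": 1.0, "b": 0.0}
hirise = {"w": 1.5, "b": 0.25}
ctx = {"w": 0.5, "b": 0.5}
themis = {"w": 1.25, "b": -0.25}

merged = merge_task_arithmetic(base, [hirise, ctx, themis], alpha=0.5)
print(merged)
```

Because the functions only rely on `+`, `-`, and `*`, the same code should also work on dictionaries of `torch` tensors that share parameter names and shapes.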
+
+ [![arXiv](https://img.shields.io/badge/arXiv-2604.02719-b31b1b.svg?logo=arxiv&logoColor=white)](https://arxiv.org/abs/2604.02719) [![GitHub](https://img.shields.io/badge/GitHub-kerner--lab%2FMOMO-black?logo=github&logoColor=white)](https://github.com/kerner-lab/MOMO)
+
+ ---
+
+ ## Checkpoints
+
+ Each model size includes five checkpoints:
+
+ | File | Description |
+ |------|-------------|
+ | `ctx.pth` | Pre-trained on CTX (Context Camera) |
+ | `hirise.pth` | Pre-trained on HiRISE (High Resolution Imaging Science Experiment) |
+ | `themis.pth` | Pre-trained on THEMIS (THermal EMission Imaging System) |
+ | `hirise_ctx_themis.pth` | Pre-trained jointly on all three sensors |
+ | `momo.pth` | **MOMO**: merged model via task arithmetic + EVL (main contribution) |
+
+ Available for three ViT architectures:
+
+ ```
+ vit-s-16/    ViT-Small (patch 16)
+ vit-b-16/    ViT-Base  (patch 16)
+ vit-l-16/    ViT-Large (patch 16)
+ ```
+
+ ViT-Base is the primary model reported in the main paper. ViT-Small and ViT-Large results are reported in the supplementary material.
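The directory and file names above combine into repository paths such as `vit-b-16/momo.pth`. A minimal sketch of a helper that builds and validates those paths is shown here; the function name `checkpoint_filename` is hypothetical, and the `<size>/<variant>.pth` layout is assumed to hold for every size and variant listed.

```python
# Names taken from the checkpoint table and the architecture listing.
MODEL_SIZES = ("vit-s-16", "vit-b-16", "vit-l-16")
VARIANTS = ("ctx", "hirise", "themis", "hirise_ctx_themis", "momo")


def checkpoint_filename(size, variant):
    """Return the assumed in-repo path '<size>/<variant>.pth'."""
    if size not in MODEL_SIZES:
        raise ValueError(f"unknown model size: {size!r}")
    if variant not in VARIANTS:
        raise ValueError(f"unknown checkpoint variant: {variant!r}")
    return f"{size}/{variant}.pth"


print(checkpoint_filename("vit-b-16", "momo"))  # vit-b-16/momo.pth
```

The returned string can be passed as the `filename` argument when downloading from the Hub.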
+
+ ---
+
+ ## Usage
+
+ ```python
+ import torch
+ from huggingface_hub import hf_hub_download
+
+ # Download MOMO ViT-Base checkpoint
+ path = hf_hub_download(repo_id="Mirali33/MOMO", filename="vit-b-16/momo.pth")
+ # weights_only=False uses the full pickle loader, so only load checkpoints you trust
+ checkpoint = torch.load(path, map_location="cpu", weights_only=False)
+ ```
+
+ For full training and fine-tuning code, see the [MOMO GitHub repository](https://github.com/kerner-lab/MOMO).
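The internal layout of the loaded `checkpoint` object is not documented here, so it can help to inspect it before building a model. The sketch below assumes the checkpoint is either a plain state dict or a dict that wraps one under a key such as `"state_dict"` or `"model"`; those wrapper keys and both function names are guesses, not confirmed details of the MOMO files.

```python
def unwrap_state_dict(checkpoint, wrapper_keys=("state_dict", "model")):
    """Return the nested state dict if the checkpoint wraps one, else the input."""
    if isinstance(checkpoint, dict):
        for key in wrapper_keys:
            inner = checkpoint.get(key)
            if isinstance(inner, dict):
                return inner
    return checkpoint


def summarize(state_dict, limit=5):
    """List the first `limit` parameter names with their shapes, if any."""
    rows = []
    for name in list(state_dict)[:limit]:
        shape = tuple(getattr(state_dict[name], "shape", ()))
        rows.append((name, shape))
    return rows


# Stand-in for a loaded checkpoint; real values would be torch tensors.
fake_checkpoint = {"epoch": 3, "model": {"patch_embed.weight": 0.0, "cls_token": 0.0}}
for name, shape in summarize(unwrap_state_dict(fake_checkpoint)):
    print(name, shape)
```

With a real checkpoint, `summarize(unwrap_state_dict(checkpoint))` gives a quick look at the parameter names before matching them to a ViT implementation.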
+
+ ---
+
+ ## Training Data
+
+ MOMO is pre-trained on ~12 million samples (~4M per sensor) from Mars orbital imagery:
+ - **HiRISE**: 0.25 m/pixel high-resolution visible-spectrum images
+ - **CTX**: 5 m/pixel context camera images
+ - **THEMIS**: 100 m/pixel thermal infrared images
+
+ ---
+
+ ## Evaluation
+
+ MOMO is evaluated on 9 downstream tasks from [Mars-Bench](https://arxiv.org/abs/2510.24010) (4 classification, 5 segmentation). It outperforms ImageNet pre-training, Earth observation foundation models (SatMAE, CROMA, Prithvi, TerraFM), sensor-specific pre-training, and fully supervised baselines.
+
+ ---
+
+ ## Citation
+
+ ```bibtex
+ @inproceedings{purohit2026momo,
+   title     = {MOMO: Mars Orbital Model --- Foundation Model for Mars Orbital Applications},
+   author    = {Purohit, Mirali and Gajera, Bimal and Mehta, Irish and Tokas, Bhanu and
+                Adler, Jacob and Lu, Steven and Dickenshied, Scott and Diniega, Serina and
+                Bue, Brian and Rebbapragada, Umaa and Kerner, Hannah},
+   booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+   year      = {2026}
+ }
+ ```