---
license: mit
library_name: pytorch
tags:
- medical-image-segmentation
- 3d-medical-imaging
- self-supervised-learning
- in-context-segmentation
- pytorch
- arxiv:2603.13660
pipeline_tag: image-segmentation
---

# MASS Base Checkpoint

This repository hosts `mass_base.pth`, the base checkpoint for **MASS: Learning
Generalizable 3D Medical Image Representations from Mask-Guided
Self-Supervision**.

MASS is a mask-guided self-supervised learning framework for 3D medical images.
The released checkpoint uses the Iris in-context segmentation architecture and
was trained on the data described in our paper. Pretraining relies on
automatically generated class-agnostic masks and does **not** use expert
ground-truth annotations.

## What This Checkpoint Is For

`mass_base.pth` can be used with the official MASS codebase for:

- training-free in-context segmentation with reference image-mask examples;
- initialization for downstream segmentation finetuning;
- frozen-encoder or finetuned-encoder classification experiments.

This is a PyTorch checkpoint for the MASS/Iris architecture, not a standalone
Transformers model. Please use it with the code release:

- GitHub: https://github.com/Stanford-AIMI/MASS
- Project page: https://yhygao.github.io/MASS_page/
- Paper: https://arxiv.org/abs/2603.13660

## Download

Using the Hugging Face CLI:

```bash
hf download StanfordAIMI/MASS mass_base.pth --local-dir checkpoints
```

Using Python:

```python
from huggingface_hub import hf_hub_download

# local_dir mirrors the CLI example: the file lands at checkpoints/mass_base.pth
checkpoint_path = hf_hub_download("StanfordAIMI/MASS", "mass_base.pth", local_dir="checkpoints")
```

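
Before wiring the checkpoint into a pipeline, it can help to look at what the file contains. The sketch below is not part of the MASS codebase: `summarize_checkpoint` and `load_and_summarize` are hypothetical helper names, and the key layout of `mass_base.pth` is not documented here, so the code only reports whatever top-level entries it finds.

```python
from collections.abc import Mapping


def summarize_checkpoint(obj):
    """Return {top-level key: short description} for a loaded checkpoint object."""
    if not isinstance(obj, Mapping):
        return {"<root>": type(obj).__name__}
    summary = {}
    for key, value in obj.items():
        if isinstance(value, Mapping):
            summary[key] = f"mapping with {len(value)} entries"
        elif hasattr(value, "shape"):
            # Tensors and arrays expose .shape; report it without importing torch here.
            summary[key] = f"{type(value).__name__} with shape {tuple(value.shape)}"
        else:
            summary[key] = type(value).__name__
    return summary


def load_and_summarize(path="checkpoints/mass_base.pth"):
    """Load on CPU and summarize. Requires PyTorch; not called at import time."""
    import torch

    # weights_only=True restricts unpickling to tensors and plain containers.
    # If the file stores other Python objects, loading may fail with this flag.
    checkpoint = torch.load(path, map_location="cpu", weights_only=True)
    return summarize_checkpoint(checkpoint)
```

Running `print(load_and_summarize())` after the download step lists the top-level entries. The `--use-ema` flag of the inference script suggests that EMA weights may be stored next to the regular ones, but that is only an inference from the flag name, so check the actual keys.
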
## Raw NIfTI In-Context Inference

```bash
python inference.py \
  --checkpoint checkpoints/mass_base.pth \
  --test-image /path/to/test_image.nii.gz \
  --reference-image /path/to/reference_image.nii.gz \
  --reference-mask /path/to/reference_mask.nii.gz \
  --output outputs/test_image_seg.nii.gz \
  --gpu 0 \
  --use-ema \
  --modality ct \
  --orientation RAS \
  --target-spacing 1.5 1.5 1.5 \
  --window-size 128 128 128 \
  --overlap 0.5
```
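
For intuition about `--window-size` and `--overlap`, the sketch below computes window start positions along one axis under a common sliding-window convention, where the stride is `window * (1 - overlap)` and the last window is clamped to the volume edge. This is an assumption for illustration: `window_starts` is a hypothetical helper, and `inference.py` may tile the volume differently.

```python
import math


def window_starts(length, window, overlap):
    """Start indices of windows covering [0, length) along one axis."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if length <= window:
        # Volume is no larger than the window: a single (padded) window suffices.
        return [0]
    stride = max(1, int(window * (1.0 - overlap)))
    n_windows = math.ceil((length - window) / stride) + 1
    # Clamp the last start so the final window ends exactly at the volume edge.
    return [min(i * stride, length - window) for i in range(n_windows)]


print(window_starts(256, 128, 0.5))  # [0, 64, 128]
print(window_starts(200, 128, 0.5))  # [0, 64, 72]
```

Under this convention, a 128-voxel window with an overlap of 0.5 advances 64 voxels per step, so higher overlap means more windows per volume and longer inference.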

Please make sure the input NIfTI metadata is complete and reliable, especially
orientation and spacing. `mass_base.pth` was trained after standardizing images
to RAS orientation, so using `--orientation RAS` is recommended.

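
A quick way to sanity-check orientation and spacing is to read them off the 4x4 NIfTI affine before running inference. The sketch below uses only the standard library and takes the affine as nested lists; with nibabel installed, the affine would come from `nibabel.load(path).affine`, and `nibabel.aff2axcodes` offers a more robust orientation lookup. The helper names and the example affine are hypothetical, and `resampled_shape` only estimates the grid size that `--target-spacing 1.5 1.5 1.5` would imply; the codebase's own resampling may round differently.

```python
import math


def spacing_from_affine(affine):
    """Voxel spacing in mm: the Euclidean norm of each of the first three columns."""
    return tuple(
        math.sqrt(sum(affine[row][col] ** 2 for row in range(3))) for col in range(3)
    )


def axis_codes_from_affine(affine):
    """Approximate orientation codes, e.g. ('R', 'A', 'S').

    Each voxel axis is labeled by the world axis it is most aligned with.
    This simple argmax can mislabel heavily oblique acquisitions.
    """
    labels = (("L", "R"), ("P", "A"), ("I", "S"))  # (negative, positive) per world axis
    codes = []
    for col in range(3):
        column = [affine[row][col] for row in range(3)]
        world_axis = max(range(3), key=lambda row: abs(column[row]))
        codes.append(labels[world_axis][1 if column[world_axis] > 0 else 0])
    return tuple(codes)


def resampled_shape(shape, spacing, target_spacing):
    """Voxel grid size after resampling to target_spacing, same physical extent."""
    return tuple(
        max(1, round(n * s / t)) for n, s, t in zip(shape, spacing, target_spacing)
    )


# Made-up example: an LPS-oriented CT with 0.75 x 0.75 x 3.0 mm voxels.
affine = [
    [-0.75, 0.0, 0.0, 120.0],
    [0.0, -0.75, 0.0, 120.0],
    [0.0, 0.0, 3.0, -300.0],
    [0.0, 0.0, 0.0, 1.0],
]
spacing = spacing_from_affine(affine)
print(axis_codes_from_affine(affine))  # ('L', 'P', 'S')
print(spacing)  # (0.75, 0.75, 3.0)
print(resampled_shape((512, 512, 100), spacing, (1.5, 1.5, 1.5)))  # (256, 256, 200)
```

If the reported codes or spacing look implausible for the scanner and protocol, such as unit spacing on every axis or an identity affine, the header is likely incomplete and the case deserves manual inspection.
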
## Downstream Segmentation Finetuning

```bash
python train.py \
  --config config/downstream/segmentation_finetune_example.yaml \
  --gpu 0 \
  --name segmentation_finetune_example \
  --override \
    finetuning.pretrained_checkpoint=checkpoints/mass_base.pth \
    data.train.data_root=/path/to/mass_h5 \
    data.val.data_root=/path/to/mass_h5 \
    data.train.datasets='[example_segmentation]' \
    data.val.datasets='[example_segmentation]'
```

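
The `--override` arguments use dotted keys, which presumably address nested fields of the YAML config. The sketch below illustrates that mapping on a plain dictionary. It is an assumption about the general pattern, not the actual parser in `train.py`: `apply_overrides` is a hypothetical helper, and the real code may parse values such as `'[example_segmentation]'` differently.

```python
import ast


def apply_overrides(config, overrides):
    """Apply 'a.b.c=value' strings to a nested dict in place and return it."""
    for item in overrides:
        dotted_key, raw_value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})
        try:
            # Parse numbers and quoted strings; fall back to the raw string.
            value = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            value = raw_value
        node[leaf] = value
    return config


config = {"finetuning": {"pretrained_checkpoint": None}, "data": {"train": {}}}
apply_overrides(
    config,
    [
        "finetuning.pretrained_checkpoint=checkpoints/mass_base.pth",
        "data.train.data_root=/path/to/mass_h5",
    ],
)
print(config["finetuning"]["pretrained_checkpoint"])  # checkpoints/mass_base.pth
```

Read this way, `finetuning.pretrained_checkpoint=...` replaces the `pretrained_checkpoint` entry under `finetuning` in the example YAML, so the same config file can be reused across datasets by changing only the overrides.
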
## Classification Linear Probing

```bash
python train.py \
  --config config/downstream/classification_linear_probe_example.yaml \
  --gpu 0 \
  --name classification_linear_probe_example \
  --override \
    classification.encoder.pretrained_checkpoint=checkpoints/mass_base.pth \
    classification.num_classes=2 \
    data.train.data_root=/path/to/classification_data \
    data.val.data_root=/path/to/classification_data \
    data.train.datasets='[example_classification]' \
    data.val.datasets='[example_classification]'
```

## Training Details

- Architecture: Iris in-context segmentation architecture.
- Pretraining objective: MASS mask-guided self-supervised learning.
- Supervision during pretraining: automatically generated class-agnostic masks.
- Expert annotations during pretraining: none.
- Modalities: 3D CT, MRI, and PET volumes used in the MASS paper.

The MASS objective is compatible with other in-context segmentation
architectures. The official codebase includes preprocessing and pretraining
utilities for training MASS on your own data.

## Limitations

- This checkpoint is intended for research use.
- It is not a medical device and should not be used for clinical decision-making.
- Raw NIfTI inference depends on reliable image metadata and preprocessing
  choices. Cases with missing or incorrect spacing/orientation metadata should be
  inspected carefully.
- Task-specific finetuning or validation is recommended before using the model on
  a new dataset or anatomy.

## Citation

```bibtex
@article{gao2026learning,
  title={Learning Generalizable 3D Medical Image Representations from Mask-Guided Self-Supervision},
  author={Gao, Yunhe and Zhang, Yabin and Wang, Chong and Liu, Jiaming and Varma, Maya and Delbrouck, Jean-Benoit and Chaudhari, Akshay and Langlotz, Curtis},
  journal={arXiv preprint arXiv:2603.13660},
  year={2026}
}
```