---
license: mit
library_name: pytorch
tags:
- medical-image-segmentation
- 3d-medical-imaging
- self-supervised-learning
- in-context-segmentation
- pytorch
- arxiv:2603.13660
pipeline_tag: image-segmentation
---

# MASS Base Checkpoint

This repository hosts `mass_base.pth`, the base checkpoint for **MASS: Learning Generalizable 3D Medical Image Representations from Mask-Guided Self-Supervision**.

MASS is a mask-guided self-supervised learning framework for 3D medical images. The released checkpoint was trained with the data used in our paper and the Iris in-context segmentation architecture. It uses automatically generated class-agnostic masks for pretraining and does **not** use expert ground-truth annotations during pretraining.

## What This Checkpoint Is For

`mass_base.pth` can be used with the official MASS codebase for:

- training-free in-context segmentation with reference image-mask examples;
- initialization for downstream segmentation finetuning;
- frozen-encoder or finetuned encoder classification experiments.

This is a PyTorch checkpoint for the MASS/Iris architecture, not a standalone Transformers model.
Please use it with the code release:

- GitHub: https://github.com/Stanford-AIMI/MASS
- Project page: https://yhygao.github.io/MASS_page/
- Paper: https://arxiv.org/abs/2603.13660

## Download

Using the Hugging Face CLI:

```bash
hf download StanfordAIMI/MASS mass_base.pth --local-dir checkpoints
```

Using Python:

```python
from huggingface_hub import hf_hub_download

checkpoint_path = hf_hub_download("StanfordAIMI/MASS", "mass_base.pth")
```

## Raw NIfTI In-Context Inference

```bash
python inference.py \
  --checkpoint checkpoints/mass_base.pth \
  --test-image /path/to/test_image.nii.gz \
  --reference-image /path/to/reference_image.nii.gz \
  --reference-mask /path/to/reference_mask.nii.gz \
  --output outputs/test_image_seg.nii.gz \
  --gpu 0 \
  --use-ema \
  --modality ct \
  --orientation RAS \
  --target-spacing 1.5 1.5 1.5 \
  --window-size 128 128 128 \
  --overlap 0.5
```

Please make sure the input NIfTI metadata is complete and reliable, especially orientation and spacing. `mass_base.pth` was trained after standardizing images to RAS orientation, so using `--orientation RAS` is recommended.

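
One way to sanity-check orientation and spacing before inference is to inspect the volume's 4x4 voxel-to-world affine. The sketch below is illustrative and not part of the MASS codebase: `describe_affine` is a hypothetical helper, and it assumes a near-axis-aligned affine in the NIfTI RAS+ world convention. With nibabel, the affine would typically come from `nibabel.load(path).affine`, and `nibabel.aff2axcodes` provides a more robust orientation lookup.

```python
import math

# NIfTI world coordinates are RAS+: +x = Right, +y = Anterior, +z = Superior.
POSITIVE_CODES = "RAS"
NEGATIVE_CODES = "LPI"


def describe_affine(affine):
    """Return (orientation codes, voxel spacing) for a 4x4 voxel-to-world affine.

    Simplification: each voxel axis is matched to the world axis with the
    largest absolute component, which is only reliable for volumes that are
    close to axis-aligned.
    """
    codes, spacing, used_axes = [], [], set()
    for col in range(3):
        # Column `col` is the world-space step taken when moving one voxel along axis `col`.
        direction = [affine[row][col] for row in range(3)]
        length = math.sqrt(sum(v * v for v in direction))
        if length == 0:
            raise ValueError(f"voxel axis {col} has zero spacing; metadata looks broken")
        dominant = max(range(3), key=lambda row: abs(direction[row]))
        if dominant in used_axes:
            raise ValueError("two voxel axes map to the same world axis; inspect the affine")
        used_axes.add(dominant)
        positive = direction[dominant] > 0
        codes.append(POSITIVE_CODES[dominant] if positive else NEGATIVE_CODES[dominant])
        spacing.append(length)
    return "".join(codes), tuple(spacing)


# Example: an LPS-oriented volume with 0.8 x 0.8 x 2.5 mm voxels (made-up values).
example_affine = [
    [-0.8, 0.0, 0.0, 100.0],
    [0.0, -0.8, 0.0, 120.0],
    [0.0, 0.0, 2.5, -50.0],
    [0.0, 0.0, 0.0, 1.0],
]
codes, spacing = describe_affine(example_affine)
print("orientation:", codes)
print("spacing (mm):", spacing)
if codes != "RAS":
    print("Volume is not stored in RAS; check that reorientation is applied as intended.")
```

A zero-length axis or two axes mapping to the same world direction usually signals missing or corrupted header information, which is the kind of case worth inspecting by hand.
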
## Downstream Segmentation Finetuning

```bash
python train.py \
  --config config/downstream/segmentation_finetune_example.yaml \
  --gpu 0 \
  --name segmentation_finetune_example \
  --override \
    finetuning.pretrained_checkpoint=checkpoints/mass_base.pth \
    data.train.data_root=/path/to/mass_h5 \
    data.val.data_root=/path/to/mass_h5 \
    data.train.datasets='[example_segmentation]' \
    data.val.datasets='[example_segmentation]'
```

## Classification Linear Probing

```bash
python train.py \
  --config config/downstream/classification_linear_probe_example.yaml \
  --gpu 0 \
  --name classification_linear_probe_example \
  --override \
    classification.encoder.pretrained_checkpoint=checkpoints/mass_base.pth \
    classification.num_classes=2 \
    data.train.data_root=/path/to/classification_data \
    data.val.data_root=/path/to/classification_data \
    data.train.datasets='[example_classification]' \
    data.val.datasets='[example_classification]'
```

## Training Details

- Architecture: Iris in-context segmentation architecture.
- Pretraining objective: MASS mask-guided self-supervised learning.
- Supervision during pretraining: automatically generated class-agnostic masks.
- Expert annotations during pretraining: none.
- Modalities: 3D CT, MRI, and PET volumes used in the MASS paper.

The MASS objective is compatible with other in-context segmentation architectures. The official codebase includes preprocessing and pretraining utilities for training MASS on your own data.

## Limitations

- This checkpoint is intended for research use.
- It is not a medical device and should not be used for clinical decision-making.
- Raw NIfTI inference depends on reliable image metadata and preprocessing choices. Cases with missing or incorrect spacing/orientation metadata should be inspected carefully.
- Task-specific finetuning or validation is recommended before using the model on a new dataset or anatomy.

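
For the validation recommended in the last point, a common first check is the Dice overlap between a predicted mask and a trusted reference mask. The following is a minimal sketch, not the evaluation code from the MASS release: `dice_score` is a hypothetical helper, it assumes binary masks flattened to sequences of 0/1 values, and Dice is used here only as a widely adopted segmentation metric.

```python
def dice_score(pred, ref):
    """Dice overlap between two binary masks given as flat sequences of 0/1 values."""
    if len(pred) != len(ref):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels marked as foreground in both masks.
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    if total == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0
    return 2.0 * intersection / total


# Toy example with six voxels (made-up values).
predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 0, 1, 1]
print(f"Dice: {dice_score(predicted, reference):.3f}")
```

In practice the masks would be 3D arrays loaded from the predicted and reference NIfTI files and flattened after confirming that both share the same shape, orientation, and spacing.
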
## Citation

```bibtex
@article{gao2026learning,
  title={Learning Generalizable 3D Medical Image Representations from Mask-Guided Self-Supervision},
  author={Gao, Yunhe and Zhang, Yabin and Wang, Chong and Liu, Jiaming and Varma, Maya and Delbrouck, Jean-Benoit and Chaudhari, Akshay and Langlotz, Curtis},
  journal={arXiv preprint arXiv:2603.13660},
  year={2026}
}
```