	---
	license: mit
	library_name: pytorch
	tags:
	- medical-image-segmentation
	- 3d-medical-imaging
	- self-supervised-learning
	- in-context-segmentation
	- pytorch
	- arxiv:2603.13660
	pipeline_tag: image-segmentation
	---

	# MASS Base Checkpoint

	This repository hosts `mass_base.pth`, the base checkpoint for **MASS: Learning
	Generalizable 3D Medical Image Representations from Mask-Guided
	Self-Supervision**.

	MASS is a mask-guided self-supervised learning framework for 3D medical images.
	The released checkpoint was trained on the data used in our paper with the Iris
	in-context segmentation architecture. Pretraining relies only on automatically
	generated class-agnostic masks; no expert ground-truth annotations are used.

	## What This Checkpoint Is For

	`mass_base.pth` can be used with the official MASS codebase for:

	- training-free in-context segmentation with reference image-mask examples;
	- initialization for downstream segmentation finetuning;
	- classification experiments with a frozen or finetuned encoder.

	This is a PyTorch checkpoint for the MASS/Iris architecture, not a standalone
	Transformers model. Please use it with the code release:

	- GitHub: https://github.com/Stanford-AIMI/MASS
	- Project page: https://yhygao.github.io/MASS_page/
	- Paper: https://arxiv.org/abs/2603.13660

	## Download

	Using the Hugging Face CLI:

	```bash
	hf download StanfordAIMI/MASS mass_base.pth --local-dir checkpoints
	```

	Using Python:

	```python
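	from huggingface_hub import hf_hub_download

	# Returns the file's path in the local Hugging Face cache;
	# pass local_dir="checkpoints" to mirror the CLI command above.
	checkpoint_path = hf_hub_download("StanfordAIMI/MASS", "mass_base.pth")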
	```

	## Raw NIfTI In-Context Inference

	```bash
	python inference.py \
	  --checkpoint checkpoints/mass_base.pth \
	  --test-image /path/to/test_image.nii.gz \
	  --reference-image /path/to/reference_image.nii.gz \
	  --reference-mask /path/to/reference_mask.nii.gz \
	  --output outputs/test_image_seg.nii.gz \
	  --gpu 0 \
	  --use-ema \
	  --modality ct \
	  --orientation RAS \
	  --target-spacing 1.5 1.5 1.5 \
	  --window-size 128 128 128 \
	  --overlap 0.5
	```
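
To get a feel for how `--target-spacing`, `--window-size`, and `--overlap` interact, the sketch below estimates the resampled volume shape and the number of windows per axis. It assumes a generic sliding-window scheme (stride = window × (1 − overlap), with the last window shifted back to end at the volume border); `inference.py` may tile differently. The helper names `resampled_shape` and `window_starts` and the example volume are hypothetical.

```python
import math

def resampled_shape(shape, spacing, target_spacing):
    """Estimate the voxel grid size after resampling to target_spacing (mm)."""
    return tuple(
        max(1, round(n * s / t)) for n, s, t in zip(shape, spacing, target_spacing)
    )

def window_starts(length, window, overlap):
    """Start indices of sliding windows along one axis.

    Assumption: stride = window * (1 - overlap); the final window is shifted
    back so that it ends exactly at the volume border.
    """
    if length <= window:
        return [0]  # a single (padded) window covers the whole axis
    stride = max(1, int(window * (1 - overlap)))
    starts = list(range(0, length - window + 1, stride))
    if starts[-1] != length - window:
        starts.append(length - window)
    return starts

# Hypothetical example: a 512 x 512 x 300 CT with 0.75 x 0.75 x 1.5 mm voxels.
shape = resampled_shape((512, 512, 300), (0.75, 0.75, 1.5), (1.5, 1.5, 1.5))
counts = [len(window_starts(n, 128, 0.5)) for n in shape]
print(shape, counts, math.prod(counts))
```

Under these assumptions the example volume resamples to 256 × 256 × 300 voxels and needs 3 × 3 × 4 = 36 windows, which gives a rough idea of how inference cost grows with finer target spacing or larger overlap.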

	Please make sure the input NIfTI metadata is complete and reliable, especially
	orientation and spacing. `mass_base.pth` was trained after standardizing images
	to RAS orientation, so using `--orientation RAS` is recommended.
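
A quick way to spot incomplete metadata before running inference is to read the NIfTI header directly. The following is a minimal standard-library sketch, not part of the MASS codebase: it handles NIfTI-1 files only (348-byte header), and the function names `read_nifti1_metadata` and `metadata_problems` are hypothetical. It checks just two things, voxel spacing and whether any orientation transform is declared.

```python
import gzip
import struct

def read_nifti1_metadata(path):
    """Read spacing and orientation codes from a NIfTI-1 header (.nii or .nii.gz)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        header = f.read(348)
    if len(header) < 348:
        raise ValueError("file too short for a NIfTI-1 header")
    # sizeof_hdr (first 4 bytes) must equal 348; try both byte orders.
    for endian in ("<", ">"):
        if struct.unpack(endian + "i", header[:4])[0] == 348:
            break
    else:
        raise ValueError("not a NIfTI-1 file (sizeof_hdr != 348)")
    pixdim = struct.unpack(endian + "8f", header[76:108])       # pixdim[1..3] = voxel size
    qform_code, sform_code = struct.unpack(endian + "2h", header[252:256])
    return {
        "spacing": pixdim[1:4],
        "qform_code": qform_code,
        "sform_code": sform_code,
    }

def metadata_problems(meta):
    """Return a list of human-readable issues; an empty list means none were found."""
    problems = []
    if any(not (s > 0) for s in meta["spacing"]):  # also catches NaN
        problems.append("non-positive or NaN voxel spacing")
    if meta["qform_code"] == 0 and meta["sform_code"] == 0:
        problems.append("neither qform nor sform is set; orientation is undefined")
    return problems

# Usage (path is a placeholder):
# print(metadata_problems(read_nifti1_metadata("/path/to/test_image.nii.gz")))
```

An empty result does not guarantee the metadata is correct, only that spacing and an orientation transform are present; visually inspecting a few cases is still worthwhile.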

	## Downstream Segmentation Finetuning

	```bash
	python train.py \
	  --config config/downstream/segmentation_finetune_example.yaml \
	  --gpu 0 \
	  --name segmentation_finetune_example \
	  --override \
	    finetuning.pretrained_checkpoint=checkpoints/mass_base.pth \
	    data.train.data_root=/path/to/mass_h5 \
	    data.val.data_root=/path/to/mass_h5 \
	    data.train.datasets='[example_segmentation]' \
	    data.val.datasets='[example_segmentation]'
	```
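
The `--override` entries appear to be dotted paths into the YAML config, each followed by `=` and a new value. The sketch below illustrates that idea on a plain nested dictionary. It is not the parser used by `train.py`: `apply_overrides` is a hypothetical name, the starting `config` is made up for the example, and values are kept as raw strings, whereas the real code may parse lists such as `'[example_segmentation]'`.

```python
def apply_overrides(config, overrides):
    """Apply 'a.b.c=value' strings to a nested dict in place (values stay strings)."""
    for item in overrides:
        key, _, value = item.partition("=")
        *parents, leaf = key.split(".")
        node = config
        for name in parents:
            # Walk down the nesting, creating missing levels on the way.
            node = node.setdefault(name, {})
        node[leaf] = value
    return config

# Made-up stand-in for the loaded YAML config.
config = {"finetuning": {"pretrained_checkpoint": None}, "data": {"train": {}, "val": {}}}
apply_overrides(config, [
    "finetuning.pretrained_checkpoint=checkpoints/mass_base.pth",
    "data.train.data_root=/path/to/mass_h5",
])
print(config["finetuning"]["pretrained_checkpoint"])
```

Reading the overrides this way makes it easier to find the matching keys in the example YAML files when adapting the command to your own data.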

	## Classification Linear Probing

	```bash
	python train.py \
	  --config config/downstream/classification_linear_probe_example.yaml \
	  --gpu 0 \
	  --name classification_linear_probe_example \
	  --override \
	    classification.encoder.pretrained_checkpoint=checkpoints/mass_base.pth \
	    classification.num_classes=2 \
	    data.train.data_root=/path/to/classification_data \
	    data.val.data_root=/path/to/classification_data \
	    data.train.datasets='[example_classification]' \
	    data.val.datasets='[example_classification]'
	```

	## Training Details

	- Architecture: Iris in-context segmentation architecture.
	- Pretraining objective: MASS mask-guided self-supervised learning.
	- Supervision during pretraining: automatically generated class-agnostic masks.
	- Expert annotations during pretraining: none.
	- Modalities: 3D CT, MRI, and PET volumes used in the MASS paper.

	The MASS objective is compatible with other in-context segmentation
	architectures. The official codebase includes preprocessing and pretraining
	utilities for training MASS on your own data.

	## Limitations

	- This checkpoint is intended for research use.
	- It is not a medical device and should not be used for clinical decision-making.
	- Raw NIfTI inference depends on reliable image metadata and preprocessing
	choices. Cases with missing or incorrect spacing/orientation metadata should be
	inspected carefully.
	- Task-specific finetuning or validation is recommended before using the model on
	a new dataset or anatomy.

	## Citation

	```bibtex
	@article{gao2026learning,
	title={Learning Generalizable 3D Medical Image Representations from Mask-Guided Self-Supervision},
	author={Gao, Yunhe and Zhang, Yabin and Wang, Chong and Liu, Jiaming and Varma, Maya and Delbrouck, Jean-Benoit and Chaudhari, Akshay and Langlotz, Curtis},
	journal={arXiv preprint arXiv:2603.13660},
	year={2026}
	}
	```