
Official implementation for:

Beyond ViT Tokens: Masked-Diffusion Pretrained Convolutional Pathology Foundation Model for Cell-Level Dense Prediction


ConvNeXt Masked-Diffusion (CMD): inference & downstream

Deps: Python 3.10+, torch, torchvision, Pillow, numpy, timm, PyYAML

Weights: Put files under weights/, e.g. weights/CMD-L/pytorch_model.bin, weights/SegHead/best_model.pth, and weights/H0-mini/pytorch_model.bin (with config.json next to it). H0-mini is not redistributed; download it from bioptimus/H0-mini.
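
Before running anything, it can help to confirm the weights are where the scripts will look. The sketch below is not part of this repo: `missing_weights` and the `EXPECTED` list are hypothetical names, and the file list simply mirrors the examples above. The inference command further down passes weights/TNBC_SegHead/best_model.pth, so adjust the segmentation-head entry to match your own layout.

```python
from pathlib import Path

# Relative paths taken from the Weights note; these are examples,
# so edit the list if your checkpoints live elsewhere.
EXPECTED = [
    "CMD-L/pytorch_model.bin",
    "SegHead/best_model.pth",
    "H0-mini/pytorch_model.bin",
    "H0-mini/config.json",
]


def missing_weights(root="weights", expected=EXPECTED):
    """Return the entries of `expected` that are not files under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]
```

Calling `missing_weights()` from the repo root returns an empty list when every listed file is present, and otherwise the relative paths that still need to be downloaded or moved.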

Inference

python infer.py \
  --image test.png \
  --cmd weights/CMD-L \
  --seg-head weights/TNBC_SegHead/best_model.pth \
  --pathology weights/H0-mini/pytorch_model.bin \
  --output-dir outputs/tnbc

Outputs: outputs/tnbc/test_mask_vis.png, outputs/tnbc/test_overlay.png. --image can also be a folder.

Downstream (fine-tune head)

Edit configs/downstream.yaml (data.json → your manifest), then:

python train_downstream.py --config configs/downstream.yaml

Paths in YAML are relative to configs/ (e.g. ../weights/CMD-L).
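
To illustrate the relative-path rule, here is a minimal sketch of how a value such as ../weights/CMD-L maps to a location when the config sits in configs/. `resolve_from_config` is a hypothetical helper written for this example; the actual resolution code in train_downstream.py is not shown here and may differ in detail.

```python
import os
from pathlib import Path


def resolve_from_config(config_path, value):
    """Interpret `value` relative to the directory that holds the config file.

    Absolute paths are returned unchanged. The result is normalised
    lexically (".." segments collapsed) without touching the filesystem.
    """
    path = Path(value)
    if path.is_absolute():
        return path
    config_dir = Path(config_path).parent
    return Path(os.path.normpath(config_dir / path))
```

With the config at configs/downstream.yaml, the value ../weights/CMD-L points at weights/CMD-L in the repo root, which matches the layout described under Weights.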

Dataset JSON (short): top-level num_classes and data.train / data.val / data.test; each item is { "image_path", "mask_path" }. Paths are absolute or relative to the JSON file’s directory. Masks are single-channel images of class indices, with 255 marking pixels to ignore. If val is empty or omitted, test is used for validation (but test must exist).
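
The manifest layout can be sketched in a few lines of Python. This assumes that train / val / test are nested under a top-level "data" key, which is how the dotted names above read; check downstream/README.md if your copy differs. `build_manifest`, `check_manifest`, and `validation_split` are hypothetical helpers for illustration, not functions from this repo.

```python
import json


def build_manifest(num_classes, train, val=None, test=None):
    """Assemble a manifest dict; each split is a list of (image, mask) pairs."""
    def items(pairs):
        return [{"image_path": str(i), "mask_path": str(m)} for i, m in (pairs or [])]

    return {
        "num_classes": num_classes,
        "data": {"train": items(train), "val": items(val), "test": items(test)},
    }


def check_manifest(manifest):
    """Return a list of layout problems; an empty list means none were found."""
    problems = []
    if not isinstance(manifest.get("num_classes"), int):
        problems.append("num_classes must be an integer")
    data = manifest.get("data", {})
    # The fallback rule: with no val split, test stands in for validation.
    if not data.get("val") and not data.get("test"):
        problems.append("data.val is empty, so data.test must exist")
    for split in ("train", "val", "test"):
        for idx, item in enumerate(data.get(split) or []):
            for key in ("image_path", "mask_path"):
                if key not in item:
                    problems.append(f"data.{split}[{idx}] lacks {key}")
    return problems


def validation_split(manifest):
    """Pick the split used for validation: val if non-empty, else test."""
    data = manifest["data"]
    return data.get("val") or data["test"]
```

A manifest built this way can be written with `json.dump(manifest, f, indent=2)` and referenced from data.json in configs/downstream.yaml. The checks cover only the structure described above; they do not open the images or verify that mask values stay within the class range or 255.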

Details: downstream/README.md.

Citation

@misc{chen2026vittokensmaskeddiffusionpretrained,
      title={Beyond ViT Tokens: Masked-Diffusion Pretrained Convolutional Pathology Foundation Model for Cell-Level Dense Prediction},
      author={Weiming Chen and Xitong Ling and Zhenyang Cai and Xidong Wang and Jiawen Li and Tian Guan and Benyou Wang and Yonghong He},
      year={2026},
      eprint={2605.08276},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2605.08276},
}