**Official implementation** for:
**Beyond ViT Tokens: Masked-Diffusion Pretrained Convolutional Pathology Foundation Model for Cell-Level Dense Prediction**
---
# ConvNeXt Masked-Diffusion (CMD): inference & downstream
**Deps:** Python 3.10+, `torch`, `torchvision`, `Pillow`, `numpy`, `timm`, `PyYAML`
**Weights:** Place the files under `weights/`, for example `weights/CMD-L/pytorch_model.bin`, `weights/SegHead/best_model.pth`, and `weights/H0-mini/pytorch_model.bin` (with `config.json` next to the H0-mini weights). H0-mini is not redistributed here; download it from [bioptimus/H0-mini](https://huggingface.co/bioptimus/H0-mini).
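Before running inference it can help to confirm the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_weights` and the `EXPECTED_FILES` list are hypothetical, and the list simply mirrors the example paths given above, so adjust it to the checkpoints you actually use.

```python
from pathlib import Path

# Example layout copied from the README text above (an assumption about your
# setup); edit this list to match the checkpoints you really downloaded.
EXPECTED_FILES = [
    "CMD-L/pytorch_model.bin",
    "SegHead/best_model.pth",
    "H0-mini/pytorch_model.bin",
    "H0-mini/config.json",
]


def missing_weights(weights_dir="weights", expected=EXPECTED_FILES):
    """Return the expected files that are absent under weights_dir."""
    root = Path(weights_dir)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing weight files:")
        for rel in missing:
            print(f"  weights/{rel}")
    else:
        print("All expected weight files found.")
```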
## Inference
```bash
python infer.py \
--image test.png \
--cmd weights/CMD-L \
--seg-head weights/TNBC_SegHead/best_model.pth \
--pathology weights/H0-mini/pytorch_model.bin \
--output-dir outputs/tnbc
```
Outputs: `outputs/tnbc/test_mask_vis.png` and `outputs/tnbc/test_overlay.png`. `--image` can also point to a folder of images.
## Downstream (fine-tune head)
Edit `configs/downstream.yaml` (set `data.json` to the path of your dataset manifest), then run:
```bash
python train_downstream.py --config configs/downstream.yaml
```
Paths in YAML are relative to `configs/` (e.g. `../weights/CMD-L`).
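To illustrate that rule, here is a minimal sketch of how a relative path from the YAML would map to a location on disk. The helper `resolve_config_path` is hypothetical and not taken from the repository; passing absolute paths through unchanged is an assumption, since the README only describes the relative case.

```python
import os
from pathlib import Path


def resolve_config_path(config_file, value):
    """Resolve a YAML path value against the directory holding the config file.

    Assumption: absolute paths are returned unchanged.
    """
    path = Path(value)
    if path.is_absolute():
        return path
    # Join with the config directory, then collapse ".." segments lexically.
    return Path(os.path.normpath(Path(config_file).parent / path))


if __name__ == "__main__":
    # "../weights/CMD-L" in configs/downstream.yaml points at weights/CMD-L.
    print(resolve_config_path("configs/downstream.yaml", "../weights/CMD-L"))
```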
**Dataset JSON (short):** the file has a top-level `num_classes` plus `data.train` / `data.val` / `data.test`; each item is `{ "image_path", "mask_path" }`. Paths are either absolute or relative to the **JSON file’s directory**. Masks are single-channel images of class indices, with `255` meaning ignore. If `val` is empty or omitted, `test` is used for validation instead (`test` must exist).
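The manifest described above could be generated along these lines. This is a sketch under assumptions: it reads `data.train` / `data.val` / `data.test` as lists nested under a top-level `data` key, the helper names `make_manifest` and `validation_split` are hypothetical, and the file names are placeholders. `validation_split` only mirrors the stated fallback rule; it is not the repository's loader.

```python
import json


def make_manifest(num_classes, train, val=None, test=None):
    """Build a manifest dict; each split is a list of (image_path, mask_path) pairs."""
    def items(pairs):
        return [{"image_path": img, "mask_path": mask} for img, mask in (pairs or [])]

    return {
        "num_classes": num_classes,
        "data": {"train": items(train), "val": items(val), "test": items(test)},
    }


def validation_split(manifest):
    """Pick the validation items: 'val' if non-empty, otherwise 'test'."""
    data = manifest["data"]
    val = data.get("val") or []
    if val:
        return val
    test = data.get("test") or []
    if not test:
        raise ValueError("'val' is empty or omitted, so 'test' must exist")
    return test


if __name__ == "__main__":
    # Placeholder file names; relative paths are resolved against the JSON's directory.
    manifest = make_manifest(
        num_classes=2,
        train=[("images/a.png", "masks/a.png")],
        test=[("images/b.png", "masks/b.png")],
    )
    print(json.dumps(manifest, indent=2))
    print(validation_split(manifest))  # falls back to the 'test' items
```

Writing the result with `json.dump` next to the image and mask folders keeps the relative paths valid, since they are resolved against the JSON file's directory.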
Details: **`downstream/README.md`**.
## Citation
```bibtex
@misc{chen2026vittokensmaskeddiffusionpretrained,
title={Beyond ViT Tokens: Masked-Diffusion Pretrained Convolutional Pathology Foundation Model for Cell-Level Dense Prediction},
author={Weiming Chen and Xitong Ling and Zhenyang Cai and Xidong Wang and Jiawen Li and Tian Guan and Benyou Wang and Yonghong He},
year={2026},
eprint={2605.08276},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2605.08276},
}
```