---
datasets:
- TCGA
library_name: pytorch
license: mit
pipeline_tag: other
tags:
- medical-imaging
- computational-pathology
- survival-analysis
- multimodal
- tcga
- interpretable
---
# ProtoPathway
Pretrained checkpoints, preprocessed cohort data, and the curated pathway graph for **ProtoPathway**, an interpretable-by-design multimodal framework for cancer survival prediction.
- **Paper:** [ProtoPathway: Biologically Structured Prototype-Pathway Fusion for Multimodal Cancer Survival Prediction](https://huggingface.co/papers/2605.21454)
- **Code:** [https://github.com/AmayaGS/ProtoPathway](https://github.com/AmayaGS/ProtoPathway)
## Layout
```
pathways/pathways_base_*.pkl        curated Reactome + Hallmark pathway graph
raw_inputs/                         raw files for re-running preprocessing from scratch
  Reactome/                         Reactome hierarchy files (GMT, relations, names)
  Hallmark/                         MSigDB Hallmark gene sets
  {cohort}/                         rna_clean.csv, clinical CSV, SurvPath splits
cohorts/{cohort}/                   preprocessed cohort data and trained models
  gene_expression.csv               preprocessed expression matrix
  bipartite_graph.pt                cohort-specific gene-pathway graph
  labels.csv                        survival times, events, and bins
  data_splits.pkl                   5-fold CV splits (SurvPath-compatible)
  checkpoints/best_fold_{0..4}.pt   trained model weights
```
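After a download, it can help to confirm that a cohort directory contains everything the listing above describes. The following is a minimal sketch using only the standard library; the helper names `expected_cohort_files` and `missing_cohort_files` are hypothetical and not part of the ProtoPathway code, and the cohort directory name (`TCGA-BLCA`) is taken from the Quick load example.

```python
from pathlib import Path

# File names copied from the layout listing; nothing else is assumed.
COHORT_FILES = [
    "gene_expression.csv",
    "bipartite_graph.pt",
    "labels.csv",
    "data_splits.pkl",
]
N_FOLDS = 5  # checkpoints/best_fold_{0..4}.pt


def expected_cohort_files(root, cohort):
    """Return the paths the layout lists for one cohort directory."""
    cohort_dir = Path(root) / "cohorts" / cohort
    paths = [cohort_dir / name for name in COHORT_FILES]
    paths += [
        cohort_dir / "checkpoints" / f"best_fold_{k}.pt" for k in range(N_FOLDS)
    ]
    return paths


def missing_cohort_files(root, cohort):
    """Hypothetical helper: list expected files that are not on disk."""
    return [p for p in expected_cohort_files(root, cohort) if not p.is_file()]
```

For example, `missing_cohort_files("./assets", "TCGA-BLCA")` returns an empty list when every listed file is present under `./assets/cohorts/TCGA-BLCA/`.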
## Cohorts
Five TCGA cohorts are included: BRCA (N=714), BLCA (N=359), COADREAD (N=227),
HNSC (N=392), and STAD (N=318). Gene expression data come from the preprocessed
SurvPath release. WSI patch features (UNI2-h) are not redistributed
here and should be obtained directly from the
[Mahmood Lab](https://huggingface.co/MahmoodLab/UNI2-h).
## Quick load
```python
from huggingface_hub import snapshot_download

# Everything for one cohort plus the shared pathway file
snapshot_download(
    repo_id="AmayaGS/ProtoPathway",
    local_dir="./assets",
    allow_patterns=["cohorts/TCGA-BLCA/*", "pathways/*"],
)
```
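Once the snapshot is in `./assets`, the tabular files can be opened with the standard library. This is a minimal sketch, not the project's own loader: `load_cohort_tables` is a hypothetical helper, the column names in `labels.csv` and the structure of the object stored in `data_splits.pkl` are not documented here, and unpickling may require whichever packages were used to create the file.

```python
import csv
import pickle
from pathlib import Path


def load_cohort_tables(assets_dir, cohort="TCGA-BLCA"):
    """Hypothetical helper: read labels.csv and data_splits.pkl for one cohort."""
    cohort_dir = Path(assets_dir) / "cohorts" / cohort

    # One dict per row, keyed by whatever header the CSV provides.
    with open(cohort_dir / "labels.csv", newline="") as f:
        labels = list(csv.DictReader(f))

    # pickle can execute arbitrary code on load; only open files you trust.
    with open(cohort_dir / "data_splits.pkl", "rb") as f:
        splits = pickle.load(f)

    return labels, splits
```

Inspecting `labels[0].keys()` and `type(splits)` is a reasonable first step before relying on any particular field. The `.pt` files are presumably PyTorch objects that `torch.load(path, map_location="cpu")` can read, but the model class needed to use the checkpoints lives in the code repository.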
## Citation
```bibtex
@article{protopathway2026,
title = {ProtoPathway: Biologically Structured Prototype-Pathway Fusion for Multimodal Cancer Survival Prediction},
author = {Amaya Gallagher-Syed and Costantino Pitzalis and Myles J. Lewis and Michael R. Barnes and Gregory Slabaugh},
journal = {arXiv preprint arXiv:2605.21454},
year = {2026},
}
```