---
license: mit
tags:
- medical-imaging
- computational-pathology
- survival-analysis
- multimodal
- tcga
datasets:
- TCGA
library_name: pytorch
---

# ProtoPathway

Pretrained checkpoints, preprocessed cohort data, and the curated pathway
graph for **ProtoPathway**, an interpretable-by-design multimodal framework
for cancer survival prediction.

See the [code repository](https://github.com/AmayaGS/ProtoPathway) for usage,
training, and evaluation instructions.

## Layout

```
pathways/                            shared curated Reactome + Hallmark pathways
cohorts/{cohort}/
    gene_expression.csv              preprocessed expression matrix
    bipartite_graph.pt               cohort-specific gene-pathway graph
    labels.csv                       survival times, events, and bins
    data_splits.pkl                  5-fold CV splits (SurvPath-compatible)
    checkpoints/best_fold_{0..4}.pt  trained model weights
```

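After a download, it can help to confirm that a cohort directory is complete. The sketch below assumes that the per-cohort files and `checkpoints/` sit inside `cohorts/{cohort}/` as listed above and that fold indices run from 0 to 4. `expected_files` and `missing_files` are hypothetical helper names, not part of the ProtoPathway codebase.

```python
from pathlib import Path

# File names taken from the layout above; their nesting under
# cohorts/{cohort}/ is an assumption of this sketch.
COHORT_FILES = (
    "gene_expression.csv",
    "bipartite_graph.pt",
    "labels.csv",
    "data_splits.pkl",
)
N_FOLDS = 5  # best_fold_0.pt ... best_fold_4.pt


def expected_files(root, cohort):
    """Return the paths one cohort directory is expected to contain."""
    cohort_dir = Path(root) / "cohorts" / cohort
    paths = [cohort_dir / name for name in COHORT_FILES]
    paths += [
        cohort_dir / "checkpoints" / f"best_fold_{fold}.pt"
        for fold in range(N_FOLDS)
    ]
    return paths


def missing_files(root, cohort):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_files(root, cohort) if not p.is_file()]


if __name__ == "__main__":
    # "./assets" is the local_dir used in the Quick load example.
    for path in missing_files("./assets", "TCGA-BLCA"):
        print("missing:", path)
```

An empty list from `missing_files` means all nine expected files for that cohort are present under the chosen root.
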
## Cohorts

Five TCGA cohorts: BRCA (N=714), BLCA (N=359), COADREAD (N=227),
HNSC (N=392), STAD (N=318). Gene expression is the preprocessed
SurvPath release. WSI patch features (UNI2-h) are not redistributed
here and should be obtained from the
[Mahmood Lab](https://huggingface.co/MahmoodLab/UNI2-h) directly.

## Quick load

```python
from huggingface_hub import snapshot_download

# Everything for one cohort plus the shared pathway file
snapshot_download(
    repo_id="AmayaGS/ProtoPathway",
    local_dir="./assets",
    allow_patterns=["cohorts/TCGA-BLCA/*", "pathways/*"],
)
```

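To fetch more than one cohort in a single call, the pattern list can be generated rather than written by hand. This is a sketch under stated assumptions: the `TCGA-` directory prefix appears in this card only for `TCGA-BLCA` and is assumed for the other cohorts, and `build_patterns` and `download_cohorts` are hypothetical helper names.

```python
def build_patterns(cohorts, include_pathways=True):
    """Build allow_patterns entries for the given cohort directory names."""
    patterns = [f"cohorts/{name}/*" for name in cohorts]
    if include_pathways:
        patterns.append("pathways/*")
    return patterns


def download_cohorts(cohorts, local_dir="./assets"):
    """Download the selected cohorts plus the shared pathways directory."""
    # Imported here so build_patterns works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="AmayaGS/ProtoPathway",
        local_dir=local_dir,
        allow_patterns=build_patterns(cohorts),
    )


if __name__ == "__main__":
    # Assumed directory names; only TCGA-BLCA appears in this card.
    print(build_patterns(["TCGA-BLCA", "TCGA-BRCA"]))
```

`snapshot_download` returns the path of the local folder, so `download_cohorts` can be passed straight to whatever loading code follows. Check the repository file listing for the exact cohort directory names before relying on the assumed prefix.
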
## Citation

```bibtex
@article{protopathway2026,
  title  = {ProtoPathway: Interpretable Multimodal Cancer Survival Prediction
            via Prototype-Pathway Cross-Modal Attention},
  author = {...},
  year   = {2026}
}
```