metadata
license: mit
tags:
- medical-imaging
- computational-pathology
- survival-analysis
- multimodal
- tcga
datasets:
- TCGA
library_name: pytorch
ProtoPathway
This repository hosts the pretrained checkpoints, preprocessed cohort data, and curated pathway graph for ProtoPathway, an interpretable-by-design multimodal framework for cancer survival prediction.
See the code repository for usage, training, and evaluation instructions.
Layout
pathways/                            shared curated Reactome + Hallmark pathways
cohorts/{cohort}/                    one directory per cohort, e.g. cohorts/TCGA-BLCA/
    gene_expression.csv              preprocessed expression matrix
    bipartite_graph.pt               cohort-specific gene-pathway graph
    labels.csv                       survival times, events, and bins
    data_splits.pkl                  5-fold CV splits (SurvPath-compatible)
    checkpoints/best_fold_{0..4}.pt  trained model weights
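As a minimal sketch of how the layout above maps to paths on disk, the helper below builds the expected file locations for one cohort and lists any that are missing. The names `cohort_paths` and `missing_files` are hypothetical and not part of the ProtoPathway code. The sketch also assumes that `checkpoints/` sits inside each cohort directory and that the assets were downloaded to a local root such as `./assets`.

```python
from pathlib import Path


def cohort_paths(root, cohort, n_folds=5):
    """Return the expected asset paths for one cohort (hypothetical helper).

    Assumes the directory layout described above, with checkpoints/
    nested inside cohorts/{cohort}/.
    """
    base = Path(root) / "cohorts" / cohort
    return {
        "gene_expression": base / "gene_expression.csv",
        "bipartite_graph": base / "bipartite_graph.pt",
        "labels": base / "labels.csv",
        "data_splits": base / "data_splits.pkl",
        # One checkpoint per cross-validation fold: best_fold_0.pt .. best_fold_4.pt
        "checkpoints": [
            base / "checkpoints" / f"best_fold_{k}.pt" for k in range(n_folds)
        ],
    }


def missing_files(paths):
    """Flatten the path dictionary and return every path that does not exist."""
    flat = []
    for value in paths.values():
        flat.extend(value if isinstance(value, list) else [value])
    return [p for p in flat if not p.exists()]
```

For example, `missing_files(cohort_paths("./assets", "TCGA-BLCA"))` returns an empty list when a complete snapshot of that cohort is present locally.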
Cohorts
The release covers five TCGA cohorts: BRCA (N=714), BLCA (N=359), COADREAD (N=227), HNSC (N=392), and STAD (N=318). Gene expression matrices come from the preprocessed SurvPath release. Whole-slide image (WSI) patch features (UNI2-h) are not redistributed here; obtain them from the Mahmood Lab directly.
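To give a rough sense of scale, the sketch below tallies the cohort sizes listed above and estimates how many patients fall into each held-out fold under 5-fold cross-validation. It assumes the folds are as equal in size as possible, which is only an approximation: the actual assignments are defined by each cohort's `data_splits.pkl`. The function name `approx_fold_sizes` is hypothetical.

```python
# Cohort sizes as stated in this README.
COHORT_SIZES = {
    "BRCA": 714,
    "BLCA": 359,
    "COADREAD": 227,
    "HNSC": 392,
    "STAD": 318,
}


def approx_fold_sizes(n, k=5):
    """Split n samples into k near-equal fold sizes (assumption: balanced folds)."""
    base, extra = divmod(n, k)
    # The first `extra` folds receive one additional sample.
    return [base + 1] * extra + [base] * (k - extra)


total_patients = sum(COHORT_SIZES.values())
fold_sizes = {name: approx_fold_sizes(n) for name, n in COHORT_SIZES.items()}
```

Under this assumption the smallest cohort, COADREAD, has only about 45 patients per held-out fold, so per-fold metrics for it can be expected to vary more than for BRCA.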
Quick load
from huggingface_hub import snapshot_download

# Everything for one cohort plus the shared pathway file
snapshot_download(
    repo_id="AmayaGS/ProtoPathway",
    local_dir="./assets",
    allow_patterns=["cohorts/TCGA-BLCA/*", "pathways/*"],
)
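To fetch several cohorts in one call, the `allow_patterns` list can be built programmatically. The sketch below assumes every cohort directory follows the `TCGA-{name}` naming seen in the BLCA example above; this has not been confirmed for the other four cohorts. The helpers `build_allow_patterns` and `download_cohorts` are hypothetical names, not part of the ProtoPathway code.

```python
COHORTS = ["BRCA", "BLCA", "COADREAD", "HNSC", "STAD"]


def build_allow_patterns(cohorts, include_pathways=True):
    """Build snapshot_download patterns for the requested cohorts.

    Assumes directories are named cohorts/TCGA-{name}/, as in the
    BLCA example; adjust the prefix if the repository differs.
    """
    unknown = [c for c in cohorts if c not in COHORTS]
    if unknown:
        raise ValueError(f"Unknown cohort(s): {unknown}")
    patterns = [f"cohorts/TCGA-{c}/*" for c in cohorts]
    if include_pathways:
        patterns.append("pathways/*")  # shared pathway files
    return patterns


def download_cohorts(cohorts, local_dir="./assets"):
    """Download the selected cohorts; returns the local snapshot path."""
    # Imported lazily so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="AmayaGS/ProtoPathway",
        local_dir=local_dir,
        allow_patterns=build_allow_patterns(cohorts),
    )
```

Calling `download_cohorts(["BRCA", "STAD"])` would then request only those two cohort directories plus `pathways/`.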
Citation
@article{protopathway2026,
  title  = {ProtoPathway: Interpretable Multimodal Cancer Survival Prediction
            via Prototype-Pathway Cross-Modal Attention},
  author = {...},
  year   = {2026}
}