---
datasets:
- TCGA
library_name: pytorch
license: mit
pipeline_tag: other
tags:
- medical-imaging
- computational-pathology
- survival-analysis
- multimodal
- tcga
- interpretable
---

# ProtoPathway

Pretrained checkpoints, preprocessed cohort data, and the curated pathway graph for **ProtoPathway**, an interpretable-by-design multimodal framework for cancer survival prediction.

- **Paper:** [ProtoPathway: Biologically Structured Prototype-Pathway Fusion for Multimodal Cancer Survival Prediction](https://huggingface.co/papers/2605.21454)
- **Code:** [https://github.com/AmayaGS/ProtoPathway](https://github.com/AmayaGS/ProtoPathway)

## Layout

```
pathways/pathways_base_*.pkl        curated Reactome + Hallmark pathway graph
raw_inputs/                         raw files for re-running preprocessing from scratch
  Reactome/                         Reactome hierarchy files (GMT, relations, names)
  Hallmark/                         MSigDB Hallmark gene sets
  {cohort}/                         rna_clean.csv, clinical CSV, SurvPath splits
cohorts/{cohort}/                   preprocessed cohort data and trained models
  gene_expression.csv               preprocessed expression matrix
  bipartite_graph.pt                cohort-specific gene-pathway graph
  labels.csv                        survival times, events, and bins
  data_splits.pkl                   5-fold CV splits (SurvPath-compatible)
  checkpoints/best_fold_{0..4}.pt   trained model weights
```

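The per-cohort part of the layout can be turned into a small path helper. This is a minimal sketch, not part of the ProtoPathway codebase: `cohort_files` and `missing_files` are hypothetical names, and the snapshot is assumed to live under `./assets` as in the download example in this card.

```python
from pathlib import Path

N_FOLDS = 5  # the layout lists best_fold_{0..4}.pt, i.e. five folds


def cohort_files(root, cohort):
    """Return the expected per-cohort file paths under a local snapshot."""
    base = Path(root) / "cohorts" / cohort
    return {
        "gene_expression": base / "gene_expression.csv",
        "bipartite_graph": base / "bipartite_graph.pt",
        "labels": base / "labels.csv",
        "data_splits": base / "data_splits.pkl",
        "checkpoints": [
            base / "checkpoints" / f"best_fold_{k}.pt" for k in range(N_FOLDS)
        ],
    }


def missing_files(root, cohort):
    """List expected files that are not present on disk."""
    files = cohort_files(root, cohort)
    flat = [p for key, p in files.items() if key != "checkpoints"]
    flat += files["checkpoints"]
    return [p for p in flat if not p.exists()]


# "./assets" is an assumed local directory; TCGA-BLCA is the cohort name
# used in this card's download example.
paths = cohort_files("./assets", "TCGA-BLCA")
print(paths["labels"].as_posix())
```

Calling `missing_files("./assets", "TCGA-BLCA")` after a download gives a quick check that all four data files and five fold checkpoints arrived.
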
## Cohorts

Five TCGA cohorts are included: BRCA (N=714), BLCA (N=359), COADREAD (N=227),
HNSC (N=392), and STAD (N=318). Gene expression data are taken from the
preprocessed SurvPath release. WSI patch features (UNI2-h) are not redistributed
here and should be obtained directly from the
[Mahmood Lab](https://huggingface.co/MahmoodLab/UNI2-h).

## Quick load

```python
from huggingface_hub import snapshot_download

# Everything for one cohort plus the shared pathway file
snapshot_download(
    repo_id="AmayaGS/ProtoPathway",
    local_dir="./assets",
    allow_patterns=["cohorts/TCGA-BLCA/*", "pathways/*"],
)
```

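To fetch several cohorts, or only their checkpoints, the `allow_patterns` list can be built programmatically. The sketch below is illustrative only: `build_allow_patterns` and `download` are hypothetical helpers, and any cohort directory name other than `TCGA-BLCA` is an assumption that should be checked against the repository file listing.

```python
def build_allow_patterns(cohorts, include_pathways=True, checkpoints_only=False):
    """Build glob patterns for snapshot_download's allow_patterns argument."""
    patterns = []
    for cohort in cohorts:
        if checkpoints_only:
            # Only the trained weights for this cohort
            patterns.append(f"cohorts/{cohort}/checkpoints/*")
        else:
            # Everything stored under the cohort directory
            patterns.append(f"cohorts/{cohort}/*")
    if include_pathways:
        patterns.append("pathways/*")
    return patterns


def download(cohorts, local_dir="./assets", **kwargs):
    """Download the selected cohorts; needs network access and huggingface_hub."""
    from huggingface_hub import snapshot_download  # imported lazily

    return snapshot_download(
        repo_id="AmayaGS/ProtoPathway",
        local_dir=local_dir,
        allow_patterns=build_allow_patterns(cohorts, **kwargs),
    )


# "TCGA-BRCA" is an assumed directory name following the TCGA-BLCA scheme.
patterns = build_allow_patterns(["TCGA-BLCA", "TCGA-BRCA"])
print(patterns)
```

Keeping the pattern construction separate from the network call makes it possible to inspect the patterns before downloading anything.
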
## Citation

```bibtex
@article{protopathway2026,
  title   = {ProtoPathway: Biologically Structured Prototype-Pathway Fusion for Multimodal Cancer Survival Prediction},
  author  = {Amaya Gallagher-Syed and Costantino Pitzalis and Myles J. Lewis and Michael R. Barnes and Gregory Slabaugh},
  journal = {arXiv preprint arXiv:2605.21454},
  year    = {2026},
}
```