---
datasets:
- TCGA
library_name: pytorch
license: mit
pipeline_tag: other
tags:
- medical-imaging
- computational-pathology
- survival-analysis
- multimodal
- tcga
- interpretable
---

# ProtoPathway

Pretrained checkpoints, preprocessed cohort data, and the curated pathway graph for **ProtoPathway**, an interpretable-by-design multimodal framework for cancer survival prediction.

- **Paper:** [ProtoPathway: Biologically Structured Prototype-Pathway Fusion for Multimodal Cancer Survival Prediction](https://huggingface.co/papers/2605.21454)
- **Code:** [https://github.com/AmayaGS/ProtoPathway](https://github.com/AmayaGS/ProtoPathway)

## Layout

```
pathways/pathways_base_*.pkl         curated Reactome + Hallmark pathway graph
raw_inputs/                          raw files for re-running preprocessing from scratch
  Reactome/                          Reactome hierarchy files (GMT, relations, names)
  Hallmark/                          MSigDB Hallmark gene sets
  {cohort}/                          rna_clean.csv, clinical CSV, SurvPath splits
cohorts/{cohort}/                    preprocessed cohort data and trained models
  gene_expression.csv                preprocessed expression matrix
  bipartite_graph.pt                 cohort-specific gene-pathway graph
  labels.csv                         survival times, events, and bins
  data_splits.pkl                    5-fold CV splits (SurvPath-compatible)
  checkpoints/best_fold_{0..4}.pt    trained model weights
```

## Cohorts

Five TCGA cohorts: BRCA (N=714), BLCA (N=359), COADREAD (N=227), HNSC (N=392), STAD (N=318). Gene expression is the preprocessed SurvPath release. WSI patch features (UNI2-h) are not redistributed here and should be obtained from the [Mahmood Lab](https://huggingface.co/MahmoodLab/UNI2-h) directly.
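
As a rough illustration of the layout above, the sketch below builds the expected file paths for one cohort and reports which of them are absent locally. The helper names `cohort_files` and `missing_files` are hypothetical and not part of the ProtoPathway codebase. The `TCGA-BLCA` directory name is taken from the download pattern in this card; the same `TCGA-` prefix for the other cohorts is an assumption.

```python
from pathlib import Path

# Cohort sizes as listed in this card.
COHORT_SIZES = {"BRCA": 714, "BLCA": 359, "COADREAD": 227, "HNSC": 392, "STAD": 318}


def cohort_files(root, cohort, n_folds=5):
    """Return the file paths the layout describes for one cohort (hypothetical helper)."""
    base = Path(root) / "cohorts" / cohort
    return {
        "gene_expression": base / "gene_expression.csv",
        "bipartite_graph": base / "bipartite_graph.pt",
        "labels": base / "labels.csv",
        "data_splits": base / "data_splits.pkl",
        # One checkpoint per cross-validation fold: best_fold_0.pt .. best_fold_4.pt
        "checkpoints": [base / "checkpoints" / f"best_fold_{k}.pt" for k in range(n_folds)],
    }


def missing_files(files):
    """List every expected path that does not exist on disk (hypothetical helper)."""
    flat = []
    for value in files.values():
        flat.extend(value if isinstance(value, list) else [value])
    return [path for path in flat if not path.exists()]


# Assumed directory name, matching the "cohorts/TCGA-BLCA/*" download pattern.
files = cohort_files("./assets", "TCGA-BLCA")
for path in missing_files(files):
    print(f"missing: {path}")
```

Checking paths before loading gives a clear message when a download was filtered too narrowly, for example when the checkpoint files were left out of the pattern list.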

## Quick load

```python
from huggingface_hub import snapshot_download

# Everything for one cohort plus the shared pathway file
snapshot_download(
    repo_id="AmayaGS/ProtoPathway",
    local_dir="./assets",
    allow_patterns=["cohorts/TCGA-BLCA/*", "pathways/*"],
)
```

## Citation

```bibtex
@article{protopathway2026,
  title   = {ProtoPathway: Biologically Structured Prototype-Pathway Fusion for Multimodal Cancer Survival Prediction},
  author  = {Amaya Gallagher-Syed and Costantino Pitzalis and Myles J. Lewis and Michael R. Barnes and Gregory Slabaugh},
  journal = {arXiv preprint arXiv:2605.21454},
  year    = {2026},
}
```