Commit 245b43f · committed by gprolcastelo · verified · 1 Parent(s): ce69f18

Upload 9 files

BRCA/20251209_VAE_idim8954_md1024_feat512mse_relu.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5d6db7847bbeedcdebeeca48f539dcc4d7cd47ce709fb722e76196d218eb7c06
3
+ size 79694716
BRCA/config.json ADDED
@@ -0,0 +1,9 @@
1
+ {
2
+ "INPUT_DIM": 8954,
3
+ "MID_DIM": 1024,
4
+ "LATENT_DIM": 512,
5
+ "BETA_CYCLES": 3,
6
+ "EPOCHS": 600,
7
+ "BETA_RATIO": 0.5,
8
+ "BATCH_SIZE": 8
9
+ }
BRCA/network_dims.csv ADDED
@@ -0,0 +1,2 @@
1
+ in_dim,layer1_dim,layer2_dim,layer3_dim,out_dim
2
+ 8954,3104,790,4027,8954
BRCA/network_reconstruction.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a700d341da632fa0439795fbb73c183b549cb5b0af0acd4fe6b32ea592750f8b
3
+ size 278008948
KIRC/20250321_VAE_idim8516_md512_feat256mse_relu.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c65ff37ba6302358adc93f20ddf4dab7afe8a48f2da5ad796184d0eef2b0e1e8
3
+ size 36498544
KIRC/config.json ADDED
@@ -0,0 +1,9 @@
1
+ {
2
+ "INPUT_DIM": 8516,
3
+ "MID_DIM": 512,
4
+ "LATENT_DIM": 256,
5
+ "BETA_CYCLES": 3,
6
+ "EPOCHS": 600,
7
+ "BETA_RATIO": 0.5,
8
+ "BATCH_SIZE": 8
9
+ }
KIRC/network_dims.csv ADDED
@@ -0,0 +1,2 @@
1
+ in_dim,layer1_dim,layer2_dim,layer3_dim,out_dim
2
+ 8516,3512,824,3731,8516
KIRC/network_reconstruction.pth ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ac6c67e3f8b9e5829ffddc537920378ad57ceae64ac1a15ffce5bbf27af553cc
3
+ size 270668340
README.md CHANGED
@@ -1,3 +1,204 @@
1
  ---
2
- license: apache-2.0
3
- ---
1
+ # Pretrained Models
2
+
3
+ This directory contains pretrained VAE and reconstruction network models produced in Work Package 3 (WP3) of the EVENFLOW EU project.
4
+
5
+ Each model was trained independently on a pre-processed version of the TCGA bulk RNA-Seq dataset for either KIRC or BRCA (see the data availability link in the respective section).
6
+
7
+ ## Available Models
8
+
9
+ ### KIRC (Kidney Renal Clear Cell Carcinoma)
10
+
11
+ **Location**: `KIRC/`
12
+
13
+ *Data availability:* [Zenodo](https://doi.org/10.5281/zenodo.17987300)
14
+
15
+ **Model Files**:
16
+ - `20250321_VAE_idim8516_md512_feat256mse_relu.pth` - VAE weights
17
+ - `network_reconstruction.pth` - Reconstruction network weights
18
+ - `network_dims.csv` - Network architecture specifications
19
+
20
+ **Model Specifications**:
21
+ - Input dimension: 8,516 genes
22
+ - VAE architecture:
23
+ - Middle dimension: 512
24
+ - Latent dimension: 256
25
+ - Loss function: MSE
26
+ - Activation: ReLU
27
+ - Reconstruction network: [8516, 3512, 824, 3731, 8516]
28
+ - Training: Beta-VAE with 3 cycles, 600 epochs total
29
+
30
+ ### BRCA (Breast Invasive Carcinoma)
31
+
32
+ **Location**: `BRCA/`
33
+
34
+ *Data availability:* [Zenodo](https://doi.org/10.5281/zenodo.17986123)
35
+
36
+ **Model Files**:
37
+ - `20251209_VAE_idim8954_md1024_feat512mse_relu.pth` - VAE weights
38
+ - `network_reconstruction.pth` - Reconstruction network weights
39
+ - `network_dims.csv` - Network architecture specifications
40
+
41
+ **Model Specifications**:
42
+ - Input dimension: 8,954 genes
43
+ - VAE architecture:
44
+ - Middle dimension: 1,024
45
+ - Latent dimension: 512
46
+ - Loss function: MSE
47
+ - Activation: ReLU
48
+ - Reconstruction network: [8954, 3104, 790, 4027, 8954]
49
+ - Training: Beta-VAE with 3 cycles, 600 epochs total
50
+
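The layer sizes in `network_dims.csv` also give a quick way to sanity-check the downloaded reconstruction weights. The sketch below assumes the reconstruction network is a stack of fully connected layers with biases, stored as 32-bit floats; this README does not state that, and `dense_param_count` is a hypothetical helper, not part of renalprog. Under that assumption, the parameter bytes come within a few kilobytes of the `network_reconstruction.pth` sizes recorded in the Git LFS pointers.

```python
def dense_param_count(layer_dims):
    """Weights plus biases for a stack of fully connected layers (assumed layout)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layer_dims, layer_dims[1:]))


# Layer sizes from network_dims.csv and file sizes (bytes) from the LFS pointers.
networks = {
    "BRCA": ([8954, 3104, 790, 4027, 8954], 278008948),
    "KIRC": ([8516, 3512, 824, 3731, 8516], 270668340),
}

for name, (dims, file_size) in networks.items():
    n_params = dense_param_count(dims)
    # 4 bytes per parameter is an assumption (float32 storage).
    overhead = file_size - 4 * n_params
    print(f"{name}: {n_params:,} parameters, {overhead} bytes beyond raw parameters")
```

A large mismatch after downloading would suggest either a truncated file or an architecture that differs from the dense layout assumed here.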
51
+ ## Usage
52
+
53
+ ### Loading Models in Python
54
+
55
+ The `VAE` and `NetworkReconstruction` classes imported in the example come from [renalprog](https://www.github.com/gprolcastelo/renalprog), which needs to be importable in your environment.
56
+
57
+
+ ```python
+ import torch
+ import pandas as pd
+ import json
+ from pathlib import Path
+ import huggingface_hub as hf
+ from renalprog.modeling.train import VAE, NetworkReconstruction
+
+ # Configuration
+ cancer_type = "KIRC"  # or "BRCA"
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+
+ # ============================================================================
+ # Load VAE Model
+ # ============================================================================
+
+ # Download VAE config
+ vae_config_path = hf.hf_hub_download(
+     repo_id="gprolcastelo/evenflow_models",
+     filename=f"{cancer_type}/config.json"
+ )
+
+ # Load configuration
+ with open(vae_config_path, "r") as f:
+     vae_config = json.load(f)
+
+ print(f"VAE Configuration: {vae_config}")
+
+ # Download VAE model weights
+ if cancer_type == "KIRC":
+     vae_filename = "KIRC/20250321_VAE_idim8516_md512_feat256mse_relu.pth"
+ elif cancer_type == "BRCA":
+     vae_filename = "BRCA/20251209_VAE_idim8954_md1024_feat512mse_relu.pth"
+ else:
+     raise ValueError(f"Unknown cancer type: {cancer_type}")
+
+ vae_model_path = hf.hf_hub_download(
+     repo_id="gprolcastelo/evenflow_models",
+     filename=vae_filename
+ )
+
+ # Initialize and load VAE
+ model_vae = VAE(
+     input_dim=vae_config["INPUT_DIM"],
+     mid_dim=vae_config["MID_DIM"],
+     features=vae_config["LATENT_DIM"]
+ ).to(device)
+
+ checkpoint_vae = torch.load(vae_model_path, map_location=device, weights_only=False)
+ model_vae.load_state_dict(checkpoint_vae)
+ model_vae.eval()
+
+ print(f"VAE model loaded successfully from {cancer_type}")
+
+ # ============================================================================
+ # Load Reconstruction Network
+ # ============================================================================
+
+ # Download network dimensions
+ network_dims_path = hf.hf_hub_download(
+     repo_id="gprolcastelo/evenflow_models",
+     filename=f"{cancer_type}/network_dims.csv"
+ )
+
+ # Load network dimensions
+ network_dims = pd.read_csv(network_dims_path)
+ layer_dims = network_dims.values.tolist()[0]
+
+ print(f"Reconstruction Network dimensions: {layer_dims}")
+
+ # Download reconstruction network weights
+ recnet_model_path = hf.hf_hub_download(
+     repo_id="gprolcastelo/evenflow_models",
+     filename=f"{cancer_type}/network_reconstruction.pth"
+ )
+
+ # Initialize and load Reconstruction Network
+ model_recnet = NetworkReconstruction(layer_dims=layer_dims).to(device)
+ checkpoint_recnet = torch.load(recnet_model_path, map_location=device, weights_only=False)
+ model_recnet.load_state_dict(checkpoint_recnet)
+ model_recnet.eval()
+
+ print(f"Reconstruction Network loaded successfully from {cancer_type}")
+
+ # ============================================================================
+ # Use the models
+ # ============================================================================
+
+ # Example: Apply VAE to your data
+ # your_data = torch.tensor(your_data_array).float().to(device)
+ # with torch.no_grad():
+ #     vae_output = model_vae(your_data)
+ #     recnet_output = model_recnet(vae_output)
+
+ ```
153
+
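Each weight file in this repository is tracked with Git LFS, and its pointer records a SHA-256 digest (`oid sha256:...`) and a size in bytes. A downloaded file can be checked against those values before it is passed to `torch.load`. The following is a minimal sketch using only the standard library; `sha256_of_file` and `matches_lfs_pointer` are hypothetical helper names, not part of renalprog or huggingface_hub.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_lfs_pointer(path, expected_oid, expected_size):
    """Compare a local file with the oid and size from its Git LFS pointer."""
    return os.path.getsize(path) == expected_size and sha256_of_file(path) == expected_oid


# Values copied from the KIRC VAE pointer file in this commit.
KIRC_VAE_OID = "c65ff37ba6302358adc93f20ddf4dab7afe8a48f2da5ad796184d0eef2b0e1e8"
KIRC_VAE_SIZE = 36498544

# Example (requires the downloaded file; vae_model_path is the local path
# returned by hf_hub_download for the KIRC VAE weights):
# assert matches_lfs_pointer(vae_model_path, KIRC_VAE_OID, KIRC_VAE_SIZE)
```

Checking the size first is cheap and catches truncated downloads before the full hash is computed.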
154
+ ## Citation
155
+
156
+ **Warning**: This citation is temporary and will be updated when a preprint is released.
158
+
159
+ If you use these pretrained models, please cite:
160
+
161
+ ```bibtex
162
+ @software{renalprog2024,
163
+ title = {RenalProg: A Deep Learning Framework for Kidney Cancer Progression Modeling},
164
+ author = {Guillermo Prol-Castelo and Elina Syrri and Nikolaos Manginas and Vasileos Manginas and Nikos Katzouris and Davide Cirillo and George Paliouras and Alfonso Valencia},
165
+ year = {2025},
166
+ url = {https://github.com/gprolcastelo/renalprog},
167
+ note = {Preprint in preparation}
168
+ }
169
+ ```
170
+
171
+ ## Training Details
172
+
173
+ These models were trained using:
174
+ - Random seed: 2023
175
+ - Train/test split: 80/20
176
+ - Optimizer: Adam
177
+ - Learning rate: 1e-4
178
+ - Batch size: 8
179
+ - Beta annealing (for VAE): 3 cycles with 0.5 ratio
180
+
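The list above gives the number of annealing cycles and the ratio, but not the shape of the schedule. One common reading is a cyclical linear schedule: the 600 epochs are split into 3 equal cycles, beta rises linearly from 0 during the first half of each cycle (ratio 0.5), and then stays at its maximum for the rest of the cycle. The sketch below implements that reading only; the linear ramp, the maximum of 1.0, and the name `cyclical_beta` are assumptions, not details confirmed by this repository.

```python
def cyclical_beta(epoch, total_epochs=600, n_cycles=3, ratio=0.5, beta_max=1.0):
    """KL weight for a given epoch under an assumed cyclical linear schedule."""
    period = total_epochs / n_cycles          # epochs per cycle (200 here)
    position = (epoch % period) / period      # progress within the cycle, in [0, 1)
    # Ramp linearly over the first `ratio` of the cycle, then hold at beta_max.
    return beta_max * min(position / ratio, 1.0)


# Print beta at a few epochs of the first and second cycle.
for epoch in (0, 50, 100, 150, 200, 250):
    print(epoch, cyclical_beta(epoch))
```

With these defaults the ramp lasts 100 epochs per cycle, so the KL term is fully weighted for the second half of every cycle.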
181
+ ## Model Performance
182
+
183
+ **KIRC Model**:
184
+ - Reconstruction loss (test): ~1.1
185
+
186
+ **BRCA Model**:
187
+ - Reconstruction loss (test): ~0.9
188
+
189
+ ## License
190
+
191
+ These pretrained models are provided under the Apache 2.0 license.
192
+
193
+ ## Contact
194
+
195
+ For questions about the pretrained models, please:
196
+ 1. Check the [documentation](https://gprolcastelo.github.io/renalprog/)
197
+ 2. Open an issue on [GitHub](https://github.com/gprolcastelo/renalprog/issues)
198
+ 3. Contact the authors
199
+
200
  ---
201
+
202
+ **Last Updated**: December 2025
203
+ **Version**: 1.0.0-alpha
204
+