hadamelino nielsr HF Staff committed on
Commit 468f6a4 · 1 Parent(s): 076b35e

Update pipeline tag and improve model card documentation (#1)


- Update pipeline tag and improve model card documentation (f5809d8718c80f82abb57f2831c8995ae1d88b6d)


Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +31 -108
README.md CHANGED
@@ -1,9 +1,9 @@
  ---
- license: mit
  language:
  - en
  library_name: pytorch
- pipeline_tag: feature-extraction
  tags:
  - cgm
  - continuous-glucose-monitor
@@ -18,11 +18,13 @@ tags:
  # CGM-JEPA Pretrained Encoders

- Frozen self-supervised encoder weights from the paper *CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining*. The repo contains the **exact checkpoints used to produce Tables 1–8 of the paper** for both the paper's main contributions (CGM-JEPA, X-CGM-JEPA) and the two re-pretrained baselines (GluFormer, TS2Vec).

- > Companion repos: pretraining dataset [`CRUISEResearchGroup/CGM-JEPA-Pretraining`](https://huggingface.co/datasets/CRUISEResearchGroup/CGM-JEPA-Pretraining), labeled splits [`CRUISEResearchGroup/CGM-JEPA-Downstream`](https://huggingface.co/datasets/CRUISEResearchGroup/CGM-JEPA-Downstream), code [github.com/cruiseresearchgroup/CGM-JEPA](https://github.com/cruiseresearchgroup/CGM-JEPA).

- > **MOMENT and Mantis are not redistributed here.** Those baselines are loaded directly from their upstream HF repos (`AutonLab/MOMENT-1-{small,large}`, `paris-noah/Mantis-8M`) by the eval pipeline.

  ## Quick start
@@ -54,64 +56,7 @@ The downstream eval will load all four checkpoints automatically from the subdir
  └── ts2vec.pkl
  ```

- `cgm_jepa/` and `x_cgm_jepa/` use the standard `PyTorchModelHubMixin` layout — `model.safetensors` for weights, `config.json` for architecture hyperparameters — so they load via the standard `from_pretrained` one-liner (see [Loading examples](#loading-examples)).
-
- `baselines/gluformer.pt` is `{"encoder": state_dict}` and `baselines/ts2vec.pkl` is a full pickled `TS2Vec` model object (per the upstream library's convention). Their architectures are documented in the [Architectures](#architectures) section.
-
- ### Important note on the baselines
-
- `gluformer.pt` and `ts2vec.pkl` are **not** vendored from upstream releases of those methods. They were **re-pretrained on the same open CGM corpus and compute budget as CGM-JEPA / X-CGM-JEPA** (Stanford + Colas, 101 epochs, batch 128, lr 1e-4, seed 43) so that the comparison in the paper isolates the pretraining objective rather than mixing in corpus or compute differences. Use these checkpoints when reproducing paper numbers; for other settings, prefer the original authors' releases.
-
- ## Architectures
-
- ### `cgm_jepa/cgm_jepa.pt` and `x_cgm_jepa/x_cgm_jepa.pt`
-
- Both use the same `models.encoder.Encoder` class with identical hyperparameters; only the pretraining objective differs. At downstream / inference time only the temporal encoder is used, so the two checkpoints are drop-in interchangeable.
-
- | Field | Value |
- |---|---|
- | `patch_size` | 12 |
- | `encoder_kernel_size` | 3 |
- | `encoder_embed_dim` | 96 |
- | `encoder_embed_bias` | `True` |
- | `encoder_nhead` | 6 |
- | `encoder_num_layers` | 3 |
- | `encoder_dropout` | 0.0 |
-
- Input: a tensor of shape `(B, num_patches, patch_size)` (raw glucose values, z-scored).
- Output: per-patch embedding of shape `(B, num_patches, embed_dim)`. Pool with `.mean(dim=1)` for a single embedding per sample.
-
- X-CGM-JEPA adds a second pretraining branch that predicts Glucodensity image patches; only the temporal encoder is loaded at inference.
-
- ### `baselines/gluformer.pt`
-
- `models.gluformer.GluFormer`:
-
- | Field | Value |
- |---|---|
- | `vocab_size` | 278 |
- | `embed_dim` | 96 |
- | `nhead` | 6 |
- | `num_layers` | 3 |
- | `dim_feedforward` | 192 |
- | `max_seq_length` | 25000 |
- | `dropout` | 0.0 |
- | `pad_token` | 278 (= `vocab_size`) |
-
- Input: a tensor of integer bin indices in `[0, vocab_size)` (raw glucose discretized into the 40–320 mg/dL range with width `(320 − 40) / vocab_size`). The downstream pipeline detaches GluFormer's output head and uses only the encoder embedding.
-
- ### `baselines/ts2vec.pkl`
-
- `models.ts2vec.TS2Vec` (loaded via `eval/baseline_utils/ts2vec_utils.py:load_pretrained_ts2vec`):
-
- | Field | Value |
- |---|---|
- | `input_dims` | 1 |
- | `output_dims` | 96 |
- | `hidden_dims` | 64 |
- | `depth` | 10 |
-
- Saved as a Python pickle of the full model object, matching the upstream `ts2vec` library convention.

  ## Loading examples
@@ -129,12 +74,6 @@ encoder.eval()
  encoder_x = Encoder.from_pretrained("CRUISEResearchGroup/CGM-JEPA", subfolder="x_cgm_jepa")
  ```

- `config.json` for each subfolder is auto-introspected from `Encoder.__init__`, so no architecture wiring is needed on the user side.
-
- ### From the CGM-JEPA code repository
-
- `config/model_configs.py` looks for these checkpoints under `Output/cgm_jepa/`, `Output/x_cgm_jepa/`, and `Output/baselines/`. The `huggingface-cli download CRUISEResearchGroup/CGM-JEPA --local-dir Output` flow above produces exactly that structure, so the eval pipeline picks them up automatically.
-
  ### Standalone PyTorch — GluFormer

  ```python
@@ -160,55 +99,39 @@ gluformer.output_head = nn.Identity() # discard the LM head for embedding extr
  gluformer.eval()
  ```

- ### Standalone PyTorch — TS2Vec
-
- ```python
- from eval.baseline_utils.ts2vec_utils import load_pretrained_ts2vec
-
- ts2vec = load_pretrained_ts2vec(
- checkpoint_path="Output/baselines/ts2vec.pkl",
- device="cpu",
- input_dims=1,
- output_dims=96,
- hidden_dims=64,
- depth=10,
- )
- ```

- ## Pretraining

- All four encoders were pretrained on the [CGM-JEPA pretraining corpus](https://huggingface.co/datasets/CRUISEResearchGroup/CGM-JEPA-Pretraining) under identical conditions:

- | Setting | Value |
  |---|---|
- | Corpus | 228 subjects (22 Stanford + 206 Colas), 389,365 readings at 5-min sampling |
- | Window length | 288 timesteps (24 hours) |
- | Masking ratio | 0.25 |
- | Epochs | 101 |
- | Batch size | 128 |
- | Learning rate | 1e-4 |
- | Random seed | 43 |

- See [`config/config_pretrain.py`](https://github.com/cruiseresearchgroup/CGM-JEPA/blob/main/config/config_pretrain.py) for the full configuration.

  ## Intended use

  - **Frozen feature extraction** from raw CGM windows (24-hour, 5-min sampled, 288 timesteps).
- - **Linear-probe or shallow-classifier downstream evaluation**, especially the IR / β-cell dysfunction tasks in the paper.
- - **Comparison baseline** for new CGM representation methods, with identical pretraining conditions across all four encoders shipped here.
-
- ## License & attribution
-
- Released under the **MIT license**. When using these weights, please cite:
-
- 1. Our paper (citation TBD; see code repo).
- 2. The two upstream pretraining datasets — Metwally et al. 2025 (*Nature Biomedical Engineering*) and Colas et al. 2019 (*PLOS ONE*).
- 3. The original baseline papers when using `gluformer.pt` or `ts2vec.pkl`.

  ## Citation

- > _Citation block to be filled once the CGM-JEPA paper has a stable venue / arXiv link._
-
- ## Code repository

- [github.com/cruiseresearchgroup/CGM-JEPA](https://github.com/cruiseresearchgroup/CGM-JEPA)
 
  ---
  language:
  - en
  library_name: pytorch
+ license: mit
+ pipeline_tag: time-series-forecasting
  tags:
  - cgm
  - continuous-glucose-monitor
 
  # CGM-JEPA Pretrained Encoders

+ This repository contains frozen self-supervised encoder weights from the paper [CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining](https://huggingface.co/papers/2605.00933).

+ The repo contains the **exact checkpoints used to produce Tables 1–8 of the paper** for both the paper's main contributions (CGM-JEPA, X-CGM-JEPA) and the two re-pretrained baselines (GluFormer, TS2Vec).

+ - **Code:** [github.com/cruiseresearchgroup/CGM-JEPA](https://github.com/cruiseresearchgroup/CGM-JEPA)
+ - **Pretraining Dataset:** [`CRUISEResearchGroup/CGM-JEPA-Pretraining`](https://huggingface.co/datasets/CRUISEResearchGroup/CGM-JEPA-Pretraining)
+ - **Downstream Splits:** [`CRUISEResearchGroup/CGM-JEPA-Downstream`](https://huggingface.co/datasets/CRUISEResearchGroup/CGM-JEPA-Downstream)

  ## Quick start
 
  └── ts2vec.pkl
  ```

+ `cgm_jepa/` and `x_cgm_jepa/` use the standard `PyTorchModelHubMixin` layout — `model.safetensors` for weights, `config.json` for architecture hyperparameters — so they load via the standard `from_pretrained` one-liner.

  ## Loading examples
 
  encoder_x = Encoder.from_pretrained("CRUISEResearchGroup/CGM-JEPA", subfolder="x_cgm_jepa")
  ```

  ### Standalone PyTorch — GluFormer

  ```python
 
  gluformer.eval()
  ```
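
The GluFormer encoder consumes integer bin indices rather than raw glucose values. The previous README text described the scheme as discretizing the 40–320 mg/dL range into `vocab_size` (278) bins of width `(320 − 40) / vocab_size`. The sketch below illustrates that arithmetic in plain Python. The function name `glucose_to_token`, the floor-based binning, and the clipping of out-of-range readings are assumptions made for illustration and may not match the tokenizer in the code repository.

```python
# Values taken from the GluFormer architecture table in the earlier README.
VOCAB_SIZE = 278
G_MIN, G_MAX = 40.0, 320.0                  # mg/dL range covered by the bins
BIN_WIDTH = (G_MAX - G_MIN) / VOCAB_SIZE    # roughly 1.007 mg/dL per bin


def glucose_to_token(mg_dl: float) -> int:
    """Map one glucose reading to a bin index in [0, VOCAB_SIZE).

    Hypothetical helper: clipping and floor division are assumptions.
    """
    clipped = min(max(mg_dl, G_MIN), G_MAX)
    index = int((clipped - G_MIN) / BIN_WIDTH)
    # The upper edge (320 mg/dL) would land on VOCAB_SIZE, so cap it.
    return min(index, VOCAB_SIZE - 1)


readings = [35.0, 40.0, 100.0, 320.0, 400.0]
tokens = [glucose_to_token(g) for g in readings]
print(tokens)  # [0, 0, 59, 277, 277]
```

Index 278 (equal to `vocab_size`) stays free in this scheme, which is consistent with the table listing it as the pad token.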

+ ## Architectures

+ ### `cgm_jepa` and `x_cgm_jepa`

+ Both use the same `models.encoder.Encoder` class with identical hyperparameters; only the pretraining objective differs. At downstream / inference time only the temporal encoder is used.

+ | Field | Value |
  |---|---|
+ | `patch_size` | 12 |
+ | `encoder_kernel_size` | 3 |
+ | `encoder_embed_dim` | 96 |
+ | `encoder_nhead` | 6 |
+ | `encoder_num_layers` | 3 |

+ **Input**: a tensor of shape `(B, num_patches, patch_size)` (raw glucose values, z-scored).
+ **Output**: per-patch embedding of shape `(B, num_patches, embed_dim)`.
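
To make the input shape concrete: a 24-hour window at 5-minute sampling has 288 readings, which splits into 288 / 12 = 24 non-overlapping patches of `patch_size` 12. The NumPy sketch below builds a `(B, num_patches, patch_size)` array from synthetic readings. The helper name `window_to_patches` is hypothetical, and z-scoring each window with its own mean and standard deviation is an assumption; the repository may normalize with corpus-level statistics instead.

```python
import numpy as np

PATCH_SIZE = 12    # from the hyperparameter table
WINDOW_LEN = 288   # 24 hours at 5-minute sampling


def window_to_patches(window):
    """Z-score one CGM window and split it into non-overlapping patches.

    Hypothetical helper; per-window normalization is an assumption.
    """
    x = np.asarray(window, dtype=np.float32)
    if x.shape != (WINDOW_LEN,):
        raise ValueError(f"expected {WINDOW_LEN} readings, got shape {x.shape}")
    z = (x - x.mean()) / (x.std() + 1e-8)
    # Consecutive readings stay together: patch 0 holds the first hour.
    return z.reshape(WINDOW_LEN // PATCH_SIZE, PATCH_SIZE)


rng = np.random.default_rng(0)
raw = rng.normal(loc=120.0, scale=25.0, size=(4, WINDOW_LEN))  # synthetic mg/dL
batch = np.stack([window_to_patches(w) for w in raw])
print(batch.shape)  # (4, 24, 12)
```

Converting `batch` with `torch.from_numpy` would give a tensor in the layout the encoder's input description calls for.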

  ## Intended use

  - **Frozen feature extraction** from raw CGM windows (24-hour, 5-min sampled, 288 timesteps).
+ - **Linear-probe evaluation**, especially for the metabolic subphenotyping tasks (IR / β-cell dysfunction) described in the paper.
+ - **Comparison baseline** for new CGM representation methods.
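
For frozen feature extraction and linear probing, the per-patch output usually has to be reduced to one vector per window. The earlier README suggested mean-pooling over the patch axis (`.mean(dim=1)` in PyTorch). The NumPy sketch below shows the same reduction on a random stand-in array; `per_patch` is placeholder data, not real encoder output, and `mean_pool` is a hypothetical helper name.

```python
import numpy as np

NUM_PATCHES = 24   # 288 timesteps / patch_size 12
EMBED_DIM = 96     # encoder_embed_dim from the hyperparameter table

rng = np.random.default_rng(0)
# Placeholder for encoder output of shape (B, num_patches, embed_dim).
per_patch = rng.standard_normal((8, NUM_PATCHES, EMBED_DIM)).astype(np.float32)


def mean_pool(embeddings):
    """Average over the patch axis to get one embedding per sample."""
    return embeddings.mean(axis=1)


features = mean_pool(per_patch)
print(features.shape)  # (8, 96)
```

The resulting `(B, 96)` matrix is the kind of input a linear probe or other shallow classifier would take.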
 
 
 
 
 
 
 
 

  ## Citation

+ ```bibtex
+ @article{muhammad2026cgm,
+ title = {CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining},
+ author = {Muhammad, Hada Melino and Li, Zechen and Salim, Flora and Metwally, Ahmed A},
+ journal = {arXiv preprint arXiv:2605.00933},
+ year = {2026}
+ }
+ ```

+ ## License
+ Released under the **MIT license**.