Commit 468f6a4
Parent(s): 076b35e
Update pipeline tag and improve model card documentation (#1)
- Update pipeline tag and improve model card documentation (f5809d8718c80f82abb57f2831c8995ae1d88b6d)
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>
README.md
CHANGED
|
@@ -1,9 +1,9 @@
|
|
| 1 |
---
|
| 2 |
-
license: mit
|
| 3 |
language:
|
| 4 |
- en
|
| 5 |
library_name: pytorch
|
| 6 |
-
|
|
|
|
| 7 |
tags:
|
| 8 |
- cgm
|
| 9 |
- continuous-glucose-monitor
|
|
@@ -18,11 +18,13 @@ tags:
|
|
| 18 |
|
| 19 |
# CGM-JEPA Pretrained Encoders
|
| 20 |
|
| 21 |
-
|
| 22 |
|
| 23 |
-
|
| 24 |
|
| 25 |
-
|
|
|
|
|
|
|
| 26 |
|
| 27 |
## Quick start
|
| 28 |
|
|
@@ -54,64 +56,7 @@ The downstream eval will load all four checkpoints automatically from the subdir
|
|
| 54 |
└── ts2vec.pkl
|
| 55 |
```
|
| 56 |
|
| 57 |
-
`cgm_jepa/` and `x_cgm_jepa/` use the standard `PyTorchModelHubMixin` layout — `model.safetensors` for weights, `config.json` for architecture hyperparameters — so they load via the standard `from_pretrained` one-liner
|
| 58 |
-
|
| 59 |
-
`baselines/gluformer.pt` is `{"encoder": state_dict}` and `baselines/ts2vec.pkl` is a full pickled `TS2Vec` model object (per the upstream library's convention). Their architectures are documented in the [Architectures](#architectures) section.
|
| 60 |
-
|
| 61 |
-
### Important note on the baselines
|
| 62 |
-
|
| 63 |
-
`gluformer.pt` and `ts2vec.pkl` are **not** vendored from upstream releases of those methods. They were **re-pretrained on the same open CGM corpus and compute budget as CGM-JEPA / X-CGM-JEPA** (Stanford + Colas, 101 epochs, batch 128, lr 1e-4, seed 43) so that the comparison in the paper isolates the pretraining objective rather than mixing in corpus or compute differences. Use these checkpoints when reproducing paper numbers; for other settings, prefer the original authors' releases.
|
| 64 |
-
|
| 65 |
-
## Architectures
|
| 66 |
-
|
| 67 |
-
### `cgm_jepa/cgm_jepa.pt` and `x_cgm_jepa/x_cgm_jepa.pt`
|
| 68 |
-
|
| 69 |
-
Both use the same `models.encoder.Encoder` class with identical hyperparameters; only the pretraining objective differs. At downstream / inference time only the temporal encoder is used, so the two checkpoints are drop-in interchangeable.
|
| 70 |
-
|
| 71 |
-
| Field | Value |
|
| 72 |
-
|---|---|
|
| 73 |
-
| `patch_size` | 12 |
|
| 74 |
-
| `encoder_kernel_size` | 3 |
|
| 75 |
-
| `encoder_embed_dim` | 96 |
|
| 76 |
-
| `encoder_embed_bias` | `True` |
|
| 77 |
-
| `encoder_nhead` | 6 |
|
| 78 |
-
| `encoder_num_layers` | 3 |
|
| 79 |
-
| `encoder_dropout` | 0.0 |
|
| 80 |
-
|
| 81 |
-
Input: a tensor of shape `(B, num_patches, patch_size)` (raw glucose values, z-scored).
|
| 82 |
-
Output: per-patch embedding of shape `(B, num_patches, embed_dim)`. Pool with `.mean(dim=1)` for a single embedding per sample.
|
| 83 |
-
|
| 84 |
-
X-CGM-JEPA adds a second pretraining branch that predicts Glucodensity image patches; only the temporal encoder is loaded at inference.
|
| 85 |
-
|
| 86 |
-
### `baselines/gluformer.pt`
|
| 87 |
-
|
| 88 |
-
`models.gluformer.GluFormer`:
|
| 89 |
-
|
| 90 |
-
| Field | Value |
|
| 91 |
-
|---|---|
|
| 92 |
-
| `vocab_size` | 278 |
|
| 93 |
-
| `embed_dim` | 96 |
|
| 94 |
-
| `nhead` | 6 |
|
| 95 |
-
| `num_layers` | 3 |
|
| 96 |
-
| `dim_feedforward` | 192 |
|
| 97 |
-
| `max_seq_length` | 25000 |
|
| 98 |
-
| `dropout` | 0.0 |
|
| 99 |
-
| `pad_token` | 278 (= `vocab_size`) |
|
| 100 |
-
|
| 101 |
-
Input: a tensor of integer bin indices in `[0, vocab_size)` (raw glucose discretized into the 40–320 mg/dL range with width `(320 − 40) / vocab_size`). The downstream pipeline detaches GluFormer's output head and uses only the encoder embedding.
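The binning rule described above can be sketched in a few lines. This is a minimal illustration, not the repository's tokenizer: the function names, the clamping of out-of-range readings, and the right-padding are assumptions; only `vocab_size = 278`, the 40–320 mg/dL range, and `pad_token = 278` come from the table.

```python
import math

VOCAB_SIZE = 278                      # from the GluFormer hyperparameter table
PAD_TOKEN = VOCAB_SIZE                # pad_token = vocab_size, per the table
LOW_MG_DL, HIGH_MG_DL = 40.0, 320.0   # discretization range stated in the text


def glucose_to_bin(value_mg_dl):
    """Map one glucose reading (mg/dL) to a bin index in [0, VOCAB_SIZE)."""
    # Equivalent to (value - low) / width with width = (high - low) / vocab_size.
    scaled = (value_mg_dl - LOW_MG_DL) * VOCAB_SIZE / (HIGH_MG_DL - LOW_MG_DL)
    # Assumption: readings outside the range are clamped to the edge bins.
    return min(max(math.floor(scaled), 0), VOCAB_SIZE - 1)


def tokenize(readings, length):
    """Bin a sequence and right-pad it with PAD_TOKEN (padding side is an assumption)."""
    tokens = [glucose_to_bin(v) for v in readings[:length]]
    return tokens + [PAD_TOKEN] * (length - len(tokens))
```

Because the pad index equals `vocab_size`, it can never collide with a real bin, which is presumably why the table sets it that way.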
|
| 102 |
-
|
| 103 |
-
### `baselines/ts2vec.pkl`
|
| 104 |
-
|
| 105 |
-
`models.ts2vec.TS2Vec` (loaded via `eval/baseline_utils/ts2vec_utils.py:load_pretrained_ts2vec`):
|
| 106 |
-
|
| 107 |
-
| Field | Value |
|
| 108 |
-
|---|---|
|
| 109 |
-
| `input_dims` | 1 |
|
| 110 |
-
| `output_dims` | 96 |
|
| 111 |
-
| `hidden_dims` | 64 |
|
| 112 |
-
| `depth` | 10 |
|
| 113 |
-
|
| 114 |
-
Saved as a Python pickle of the full model object, matching the upstream `ts2vec` library convention.
|
| 115 |
|
| 116 |
## Loading examples
|
| 117 |
|
|
@@ -129,12 +74,6 @@ encoder.eval()
|
|
| 129 |
encoder_x = Encoder.from_pretrained("CRUISEResearchGroup/CGM-JEPA", subfolder="x_cgm_jepa")
|
| 130 |
```
|
| 131 |
|
| 132 |
-
`config.json` for each subfolder is auto-introspected from `Encoder.__init__`, so no architecture wiring is needed on the user side.
|
| 133 |
-
|
| 134 |
-
### From the CGM-JEPA code repository
|
| 135 |
-
|
| 136 |
-
`config/model_configs.py` looks for these checkpoints under `Output/cgm_jepa/`, `Output/x_cgm_jepa/`, and `Output/baselines/`. The `huggingface-cli download CRUISEResearchGroup/CGM-JEPA --local-dir Output` flow above produces exactly that structure, so the eval pipeline picks them up automatically.
|
| 137 |
-
|
| 138 |
### Standalone PyTorch — GluFormer
|
| 139 |
|
| 140 |
```python
|
|
@@ -160,55 +99,39 @@ gluformer.output_head = nn.Identity() # discard the LM head for embedding extr
|
|
| 160 |
gluformer.eval()
|
| 161 |
```
|
| 162 |
|
| 163 |
-
##
|
| 164 |
-
|
| 165 |
-
```python
|
| 166 |
-
from eval.baseline_utils.ts2vec_utils import load_pretrained_ts2vec
|
| 167 |
-
|
| 168 |
-
ts2vec = load_pretrained_ts2vec(
|
| 169 |
-
checkpoint_path="Output/baselines/ts2vec.pkl",
|
| 170 |
-
device="cpu",
|
| 171 |
-
input_dims=1,
|
| 172 |
-
output_dims=96,
|
| 173 |
-
hidden_dims=64,
|
| 174 |
-
depth=10,
|
| 175 |
-
)
|
| 176 |
-
```
|
| 177 |
|
| 178 |
-
##
|
| 179 |
|
| 180 |
-
|
| 181 |
|
| 182 |
-
|
|
| 183 |
|---|---|
|
| 184 |
-
|
|
| 185 |
-
|
|
| 186 |
-
|
|
| 187 |
-
|
|
| 188 |
-
|
|
| 189 |
-
| Learning rate | 1e-4 |
|
| 190 |
-
| Random seed | 43 |
|
| 191 |
|
| 192 |
-
|
|
|
|
| 193 |
|
| 194 |
## Intended use
|
| 195 |
|
| 196 |
- **Frozen feature extraction** from raw CGM windows (24-hour, 5-min sampled, 288 timesteps).
|
| 197 |
-
- **Linear-probe
|
| 198 |
-
- **Comparison baseline** for new CGM representation methods
|
| 199 |
-
|
| 200 |
-
## License & attribution
|
| 201 |
-
|
| 202 |
-
Released under the **MIT license**. When using these weights, please cite:
|
| 203 |
-
|
| 204 |
-
1. Our paper (citation TBD; see code repo).
|
| 205 |
-
2. The two upstream pretraining datasets — Metwally et al. 2025 (*Nature Biomedical Engineering*) and Colas et al. 2019 (*PLOS ONE*).
|
| 206 |
-
3. The original baseline papers when using `gluformer.pt` or `ts2vec.pkl`.
|
| 207 |
|
| 208 |
## Citation
|
| 209 |
|
| 210 |
-
|
| 211 |
-
|
| 212 |
-
|
| 213 |
|
| 214 |
-
|
|
|
|
|
|
| 1 |
---
|
|
|
|
| 2 |
language:
|
| 3 |
- en
|
| 4 |
library_name: pytorch
|
| 5 |
+
license: mit
|
| 6 |
+
pipeline_tag: time-series-forecasting
|
| 7 |
tags:
|
| 8 |
- cgm
|
| 9 |
- continuous-glucose-monitor
|
|
|
|
| 18 |
|
| 19 |
# CGM-JEPA Pretrained Encoders
|
| 20 |
|
| 21 |
+
This repository contains frozen self-supervised encoder weights from the paper [CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining](https://huggingface.co/papers/2605.00933).
|
| 22 |
|
| 23 |
+
The repo contains the **exact checkpoints used to produce Tables 1–8 of the paper** for both the paper's main contributions (CGM-JEPA, X-CGM-JEPA) and the two re-pretrained baselines (GluFormer, TS2Vec).
|
| 24 |
|
| 25 |
+
- **Code:** [github.com/cruiseresearchgroup/CGM-JEPA](https://github.com/cruiseresearchgroup/CGM-JEPA)
|
| 26 |
+
- **Pretraining Dataset:** [`CRUISEResearchGroup/CGM-JEPA-Pretraining`](https://huggingface.co/datasets/CRUISEResearchGroup/CGM-JEPA-Pretraining)
|
| 27 |
+
- **Downstream Splits:** [`CRUISEResearchGroup/CGM-JEPA-Downstream`](https://huggingface.co/datasets/CRUISEResearchGroup/CGM-JEPA-Downstream)
|
| 28 |
|
| 29 |
## Quick start
|
| 30 |
|
|
|
|
| 56 |
└── ts2vec.pkl
|
| 57 |
```
|
| 58 |
|
| 59 |
+
`cgm_jepa/` and `x_cgm_jepa/` use the standard `PyTorchModelHubMixin` layout — `model.safetensors` for weights, `config.json` for architecture hyperparameters — so they load via the standard `from_pretrained` one-liner.
|
| 60 |
|
| 61 |
## Loading examples
|
| 62 |
|
|
|
|
| 74 |
encoder_x = Encoder.from_pretrained("CRUISEResearchGroup/CGM-JEPA", subfolder="x_cgm_jepa")
|
| 75 |
```
|
| 76 |
|
| 77 |
### Standalone PyTorch — GluFormer
|
| 78 |
|
| 79 |
```python
|
|
|
|
| 99 |
gluformer.eval()
|
| 100 |
```
|
| 101 |
|
| 102 |
+
## Architectures
|
| 103 |
|
| 104 |
+
### `cgm_jepa` and `x_cgm_jepa`
|
| 105 |
|
| 106 |
+
Both use the same `models.encoder.Encoder` class with identical hyperparameters; only the pretraining objective differs. At downstream / inference time only the temporal encoder is used.
|
| 107 |
|
| 108 |
+
| Field | Value |
|
| 109 |
|---|---|
|
| 110 |
+
| `patch_size` | 12 |
|
| 111 |
+
| `encoder_kernel_size` | 3 |
|
| 112 |
+
| `encoder_embed_dim` | 96 |
|
| 113 |
+
| `encoder_nhead` | 6 |
|
| 114 |
+
| `encoder_num_layers` | 3 |
|
| 115 |
|
| 116 |
+
**Input**: a tensor of shape `(B, num_patches, patch_size)` (raw glucose values, z-scored).
|
| 117 |
+
**Output**: per-patch embedding of shape `(B, num_patches, embed_dim)`.
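As a rough illustration of these shapes, a 288-step window with `patch_size = 12` yields 24 patches. The sketch below uses NumPy rather than the repository's own preprocessing; the helper names and the choice of per-window z-score statistics are assumptions (the actual pipeline may normalize with corpus-level statistics). Mean-pooling over the patch axis, as in `.mean(dim=1)`, gives one embedding per sample.

```python
import numpy as np


def window_to_patches(window, patch_size=12, eps=1e-8):
    """Z-score a 1-D glucose window and split it into non-overlapping patches."""
    window = np.asarray(window, dtype=np.float32)
    if window.ndim != 1 or window.size % patch_size != 0:
        raise ValueError("expected a 1-D window whose length is a multiple of patch_size")
    # Assumption: statistics are computed per window, not over the whole corpus.
    normalized = (window - window.mean()) / (window.std() + eps)
    return normalized.reshape(-1, patch_size)   # (num_patches, patch_size)


def mean_pool(patch_embeddings):
    """Collapse (B, num_patches, embed_dim) to (B, embed_dim)."""
    return np.asarray(patch_embeddings).mean(axis=1)


# A synthetic 24-hour window sampled every 5 minutes (288 readings).
rng = np.random.default_rng(0)
window = rng.normal(loc=120.0, scale=25.0, size=288)
batch = window_to_patches(window)[None, ...]    # add batch axis: (1, 24, 12)
```

The resulting array can be converted with `torch.from_numpy(batch)` before being passed to the encoder.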
|
| 118 |
|
| 119 |
## Intended use
|
| 120 |
|
| 121 |
- **Frozen feature extraction** from raw CGM windows (24-hour, 5-min sampled, 288 timesteps).
|
| 122 |
+
- **Linear-probe evaluation**, especially for the metabolic subphenotyping tasks (IR / β-cell dysfunction) described in the paper.
|
| 123 |
+
- **Comparison baseline** for new CGM representation methods.
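
A linear probe on frozen embeddings amounts to fitting a linear classifier while the encoder stays fixed. The sketch below is a minimal NumPy logistic regression, not the paper's evaluation code: the function names, optimizer settings, and the synthetic stand-in features are all assumptions; only the 96-dimensional embedding size comes from the model card.

```python
import numpy as np


def fit_linear_probe(features, labels, lr=0.1, steps=300, l2=1e-3):
    """Fit binary logistic regression on frozen embeddings with gradient descent."""
    x = np.asarray(features, dtype=np.float64)
    y = np.asarray(labels, dtype=np.float64)
    weights = np.zeros(x.shape[1])
    bias = 0.0
    for _ in range(steps):
        logits = np.clip(x @ weights + bias, -30.0, 30.0)   # avoid overflow in exp
        probs = 1.0 / (1.0 + np.exp(-logits))
        error = probs - y                                    # gradient of the log loss
        weights -= lr * (x.T @ error / len(y) + l2 * weights)
        bias -= lr * error.mean()
    return weights, bias


def predict(features, weights, bias):
    """Return hard 0/1 predictions from the fitted probe."""
    return (np.asarray(features) @ weights + bias > 0).astype(int)


# Synthetic stand-in for pooled encoder outputs of shape (N, 96).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
features = rng.normal(size=(200, 96)) + (2 * labels[:, None] - 1)
weights, bias = fit_linear_probe(features, labels)
```

In practice the features would be the mean-pooled encoder outputs, and the probe would be scored on a held-out split rather than on its training data.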
|
| 124 |
|
| 125 |
## Citation
|
| 126 |
|
| 127 |
+
```bibtex
|
| 128 |
+
@article{muhammad2026cgm,
|
| 129 |
+
title = {CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining},
|
| 130 |
+
author = {Muhammad, Hada Melino and Li, Zechen and Salim, Flora and Metwally, Ahmed A},
|
| 131 |
+
journal = {arXiv preprint arXiv:2605.00933},
|
| 132 |
+
year = {2026}
|
| 133 |
+
}
|
| 134 |
+
```
|
| 135 |
|
| 136 |
+
## License
|
| 137 |
+
Released under the **MIT license**.
|