CRUISEResearchGroup/CGM-JEPA

Tags: Time Series Forecasting, PyTorch, Safetensors, English, cgm, continuous-glucose-monitor, self-supervised-learning

hadamelino

nielsr (HF Staff) committed 14 days ago

Commit

468f6a4

1 Parent(s): 076b35e

Update pipeline tag and improve model card documentation (#1)

- Update pipeline tag and improve model card documentation (f5809d8718c80f82abb57f2831c8995ae1d88b6d)

Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1)

README.md +31 -108

README.md CHANGED

@@ -1,9 +1,9 @@
 ---
-license: mit
 language:
 - en
 library_name: pytorch
-pipeline_tag: feature-extraction
 tags:
 - cgm
 - continuous-glucose-monitor
@@ -18,11 +18,13 @@ tags:
 # CGM-JEPA Pretrained Encoders
-Frozen self-supervised encoder weights from the paper *CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining*. The repo contains the **exact checkpoints used to produce Tables 1–8 of the paper** for both the paper's main contributions (CGM-JEPA, X-CGM-JEPA) and the two re-pretrained baselines (GluFormer, TS2Vec).
-> Companion repos: pretraining dataset [`CRUISEResearchGroup/CGM-JEPA-Pretraining`](https://huggingface.co/datasets/CRUISEResearchGroup/CGM-JEPA-Pretraining), labeled splits [`CRUISEResearchGroup/CGM-JEPA-Downstream`](https://huggingface.co/datasets/CRUISEResearchGroup/CGM-JEPA-Downstream), code [github.com/cruiseresearchgroup/CGM-JEPA](https://github.com/cruiseresearchgroup/CGM-JEPA).
-> **MOMENT and Mantis are not redistributed here.** Those baselines are loaded directly from their upstream HF repos (`AutonLab/MOMENT-1-{small,large}`, `paris-noah/Mantis-8M`) by the eval pipeline.
 ## Quick start
@@ -54,64 +56,7 @@ The downstream eval will load all four checkpoints automatically from the subdir
     └── ts2vec.pkl
 ```
-`cgm_jepa/` and `x_cgm_jepa/` use the standard `PyTorchModelHubMixin` layout — `model.safetensors` for weights, `config.json` for architecture hyperparameters — so they load via the standard `from_pretrained` one-liner (see [Loading examples](#loading-examples)).
-`baselines/gluformer.pt` is `{"encoder": state_dict}` and `baselines/ts2vec.pkl` is a full pickled `TS2Vec` model object (per the upstream library's convention). Their architectures are documented in the [Architectures](#architectures) section.
-### Important note on the baselines
-`gluformer.pt` and `ts2vec.pkl` are **not** vendored from upstream releases of those methods. They were **re-pretrained on the same open CGM corpus and compute budget as CGM-JEPA / X-CGM-JEPA** (Stanford + Colas, 101 epochs, batch 128, lr 1e-4, seed 43) so that the comparison in the paper isolates the pretraining objective rather than mixing in corpus or compute differences. Use these checkpoints when reproducing paper numbers; for other settings, prefer the original authors' releases.
-## Architectures
-### `cgm_jepa/cgm_jepa.pt` and `x_cgm_jepa/x_cgm_jepa.pt`
-Both use the same `models.encoder.Encoder` class with identical hyperparameters; only the pretraining objective differs. At downstream / inference time only the temporal encoder is used, so the two checkpoints are drop-in interchangeable.
-| Field | Value |
-|---|---|
-| `patch_size` | 12 |
-| `encoder_kernel_size` | 3 |
-| `encoder_embed_dim` | 96 |
-| `encoder_embed_bias` | `True` |
-| `encoder_nhead` | 6 |
-| `encoder_num_layers` | 3 |
-| `encoder_dropout` | 0.0 |
-Input: a tensor of shape `(B, num_patches, patch_size)` (raw glucose values, z-scored).
-Output: per-patch embedding of shape `(B, num_patches, embed_dim)`. Pool with `.mean(dim=1)` for a single embedding per sample.
-X-CGM-JEPA adds a second pretraining branch that predicts Glucodensity image patches; only the temporal encoder is loaded at inference.
-### `baselines/gluformer.pt`
-`models.gluformer.GluFormer`:
-| Field | Value |
-|---|---|
-| `vocab_size` | 278 |
-| `embed_dim` | 96 |
-| `nhead` | 6 |
-| `num_layers` | 3 |
-| `dim_feedforward` | 192 |
-| `max_seq_length` | 25000 |
-| `dropout` | 0.0 |
-| `pad_token` | 278 (= `vocab_size`) |
-Input: a tensor of integer bin indices in `[0, vocab_size)` (raw glucose discretized into the 40–320 mg/dL range with width `(320 − 40) / vocab_size`). The downstream pipeline detaches GluFormer's output head and uses only the encoder embedding.
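The binning rule described for GluFormer can be sketched in plain Python. This is an illustration only: the helper name `glucose_to_bin`, flooring to the lower bin edge, and clamping out-of-range readings to the first and last bins are assumptions, not confirmed behaviour of the repository's tokenizer.

```python
import math

VOCAB_SIZE = 278                        # from the GluFormer table
LOW, HIGH = 40.0, 320.0                 # mg/dL range stated above
BIN_WIDTH = (HIGH - LOW) / VOCAB_SIZE   # roughly 1.007 mg/dL per bin
PAD_TOKEN = VOCAB_SIZE                  # 278, outside the valid index range


def glucose_to_bin(mg_dl: float) -> int:
    """Map one glucose reading to an integer bin index in [0, VOCAB_SIZE)."""
    index = math.floor((mg_dl - LOW) / BIN_WIDTH)
    # Assumption: readings outside 40-320 mg/dL are clamped, not dropped.
    return max(0, min(VOCAB_SIZE - 1, index))


readings = [35.0, 40.0, 100.0, 320.0, 400.0]
tokens = [glucose_to_bin(g) for g in readings]
```

Because the pad token equals `vocab_size`, it can never collide with a bin index produced by this mapping.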
-### `baselines/ts2vec.pkl`
-`models.ts2vec.TS2Vec` (loaded via `eval/baseline_utils/ts2vec_utils.py:load_pretrained_ts2vec`):
-| Field | Value |
-|---|---|
-| `input_dims` | 1 |
-| `output_dims` | 96 |
-| `hidden_dims` | 64 |
-| `depth` | 10 |
-Saved as a Python pickle of the full model object, matching the upstream `ts2vec` library convention.
 ## Loading examples
@@ -129,12 +74,6 @@ encoder.eval()
 encoder_x = Encoder.from_pretrained("CRUISEResearchGroup/CGM-JEPA", subfolder="x_cgm_jepa")
 ```
-`config.json` for each subfolder is auto-introspected from `Encoder.__init__`, so no architecture wiring is needed on the user side.
-### From the CGM-JEPA code repository
-`config/model_configs.py` looks for these checkpoints under `Output/cgm_jepa/`, `Output/x_cgm_jepa/`, and `Output/baselines/`. The `huggingface-cli download CRUISEResearchGroup/CGM-JEPA --local-dir Output` flow above produces exactly that structure, so the eval pipeline picks them up automatically.
 ### Standalone PyTorch — GluFormer
 ```python
@@ -160,55 +99,39 @@ gluformer.output_head = nn.Identity()   # discard the LM head for embedding extr
 gluformer.eval()
 ```
-### Standalone PyTorch — TS2Vec
-```python
-from eval.baseline_utils.ts2vec_utils import load_pretrained_ts2vec
-ts2vec = load_pretrained_ts2vec(
-    checkpoint_path="Output/baselines/ts2vec.pkl",
-    device="cpu",
-    input_dims=1,
-    output_dims=96,
-    hidden_dims=64,
-    depth=10,
-)
-```
-## Pretraining
-All four encoders were pretrained on the [CGM-JEPA pretraining corpus](https://huggingface.co/datasets/CRUISEResearchGroup/CGM-JEPA-Pretraining) under identical conditions:
-| Setting | Value |
 |---|---|
-| Corpus | 228 subjects (22 Stanford + 206 Colas), 389,365 readings at 5-min sampling |
-| Window length | 288 timesteps (24 hours) |
-| Masking ratio | 0.25 |
-| Epochs | 101 |
-| Batch size | 128 |
-| Learning rate | 1e-4 |
-| Random seed | 43 |
-See [`config/config_pretrain.py`](https://github.com/cruiseresearchgroup/CGM-JEPA/blob/main/config/config_pretrain.py) for the full configuration.
 ## Intended use
 - **Frozen feature extraction** from raw CGM windows (24-hour, 5-min sampled, 288 timesteps).
-- **Linear-probe or shallow-classifier downstream evaluation**, especially the IR / β-cell dysfunction tasks in the paper.
-- **Comparison baseline** for new CGM representation methods, with identical pretraining conditions across all four encoders shipped here.
-## License & attribution
-Released under the **MIT license**. When using these weights, please cite:
-1. Our paper (citation TBD; see code repo).
-2. The two upstream pretraining datasets — Metwally et al. 2025 (*Nature Biomedical Engineering*) and Colas et al. 2019 (*PLOS ONE*).
-3. The original baseline papers when using `gluformer.pt` or `ts2vec.pkl`.
 ## Citation
-> _Citation block to be filled once the CGM-JEPA paper has a stable venue / arXiv link._
-## Code repository
-[github.com/cruiseresearchgroup/CGM-JEPA](https://github.com/cruiseresearchgroup/CGM-JEPA)

 ---
 language:
 - en
 library_name: pytorch
+license: mit
+pipeline_tag: time-series-forecasting
 tags:
 - cgm
 - continuous-glucose-monitor
 # CGM-JEPA Pretrained Encoders
+This repository contains frozen self-supervised encoder weights from the paper [CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining](https://huggingface.co/papers/2605.00933).
+The repo contains the **exact checkpoints used to produce Tables 1–8 of the paper** for both the paper's main contributions (CGM-JEPA, X-CGM-JEPA) and the two re-pretrained baselines (GluFormer, TS2Vec).
+- **Code:** [github.com/cruiseresearchgroup/CGM-JEPA](https://github.com/cruiseresearchgroup/CGM-JEPA)
+- **Pretraining Dataset:** [`CRUISEResearchGroup/CGM-JEPA-Pretraining`](https://huggingface.co/datasets/CRUISEResearchGroup/CGM-JEPA-Pretraining)
+- **Downstream Splits:** [`CRUISEResearchGroup/CGM-JEPA-Downstream`](https://huggingface.co/datasets/CRUISEResearchGroup/CGM-JEPA-Downstream)
 ## Quick start
     └── ts2vec.pkl
 ```
+`cgm_jepa/` and `x_cgm_jepa/` use the standard `PyTorchModelHubMixin` layout — `model.safetensors` for weights, `config.json` for architecture hyperparameters — so they load via the standard `from_pretrained` one-liner.
 ## Loading examples
 encoder_x = Encoder.from_pretrained("CRUISEResearchGroup/CGM-JEPA", subfolder="x_cgm_jepa")
 ```
 ### Standalone PyTorch — GluFormer
 ```python
 gluformer.eval()
 ```
+## Architectures
+### `cgm_jepa` and `x_cgm_jepa`
+Both use the same `models.encoder.Encoder` class with identical hyperparameters; only the pretraining objective differs. At downstream / inference time only the temporal encoder is used.
+| Field | Value |
 |---|---|
+| `patch_size` | 12 |
+| `encoder_kernel_size` | 3 |
+| `encoder_embed_dim` | 96 |
+| `encoder_nhead` | 6 |
+| `encoder_num_layers` | 3 |
+**Input**: a tensor of shape `(B, num_patches, patch_size)` (raw glucose values, z-scored).
+**Output**: per-patch embedding of shape `(B, num_patches, embed_dim)`.
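To make the input shape concrete: a 288-step window with `patch_size` 12 splits into 288 / 12 = 24 patches. The sketch below does this in plain Python. The per-window z-scoring and the helper name `window_to_patches` are assumptions (the card does not say which statistics the normalisation uses), so follow the code repository's preprocessing when reproducing paper numbers.

```python
import statistics

PATCH_SIZE = 12     # from the table above
WINDOW_LEN = 288    # 24 hours at 5-minute sampling


def window_to_patches(window, patch_size=PATCH_SIZE):
    """Z-score one window and split it into non-overlapping patches."""
    if len(window) % patch_size != 0:
        raise ValueError("window length must be a multiple of patch_size")
    mean = statistics.fmean(window)
    std = statistics.pstdev(window) or 1.0   # guard against a flat window
    z = [(g - mean) / std for g in window]
    return [z[i:i + patch_size] for i in range(0, len(z), patch_size)]


# Synthetic glucose trace in mg/dL, for shape checking only.
window = [100.0 + (i % 50) for i in range(WINDOW_LEN)]
patches = window_to_patches(window)
num_patches = len(patches)   # 24
```

Stacking `B` such windows and wrapping them with `torch.tensor(...)` gives the `(B, num_patches, patch_size)` input. Calling `.mean(dim=1)` on the encoder output then reduces the per-patch embeddings to one vector per window.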
 ## Intended use
 - **Frozen feature extraction** from raw CGM windows (24-hour, 5-min sampled, 288 timesteps).
+- **Linear-probe evaluation**, especially for the metabolic subphenotyping tasks (IR / β-cell dysfunction) described in the paper.
+- **Comparison baseline** for new CGM representation methods.
 ## Citation
+```bibtex
+@article{muhammad2026cgm,
+  title   = {CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining},
+  author  = {Muhammad, Hada Melino and Li, Zechen and Salim, Flora and Metwally, Ahmed A},
+  journal = {arXiv preprint arXiv:2605.00933},
+  year    = {2026}
+}
+```
+## License
+Released under the **MIT license**.