---
license: bsd-3-clause
library_name: braindecode
pipeline_tag: feature-extraction
tags:
  - eeg
  - biosignal
  - pytorch
  - neuroscience
  - braindecode
  - foundation-model
  - transformer
---

# CodeBrain

CodeBrain: Scalable Code EEG Pre-Training for Unified Downstream BCI Tasks.

Architecture-only repository. This repo documents the braindecode.models.CodeBrain class. No pretrained weights are distributed here: instantiate the model and train it on your own data, or fine-tune separately from a published foundation-model checkpoint.

## Quick start

```bash
pip install braindecode
```

```python
from braindecode.models import CodeBrain

model = CodeBrain(
    n_chans=22,
    sfreq=200,
    input_window_seconds=4.0,
    n_outputs=2,
)
```

The signal-shape arguments above are example defaults — adjust them to match your recording.
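The window arithmetic implied by these example arguments can be sketched as follows. This is plain arithmetic, not a call into the library; patch_size=200 is taken from the parameter list further down:

```python
# Input geometry implied by the quick-start example arguments
sfreq = 200                  # sampling rate in Hz
input_window_seconds = 4.0   # window length in seconds
patch_size = 200             # samples per patch (documented default)

n_times = int(sfreq * input_window_seconds)              # samples per window
n_times_trimmed = (n_times // patch_size) * patch_size   # trimmed to a multiple of patch_size
seq_len = n_times_trimmed // patch_size                  # patches per channel

print(n_times, seq_len)  # 800 4
```

With a 4 s window at 200 Hz the input is already an exact multiple of the patch size, so nothing is trimmed and each channel yields four patches.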

## Documentation

### Architecture description

The block below is the rendered class docstring (parameters, references, architecture figure where available).

CodeBrain: Scalable Code EEG Pre-Training for Unified Downstream BCI Tasks.

Foundation Model · Attention/Transformer

.. figure:: https://raw.githubusercontent.com/jingyingma01/CodeBrain/refs/heads/main/assets/intro.png
   :align: center
   :alt: CodeBrain pre-training overview
   :width: 1000px

CodeBrain is a foundation model for EEG that pre-trains on large unlabelled corpora using a two-stage vector-quantised masking strategy, then fine-tunes on downstream BCI tasks. It segments EEG signals into fixed-size patches, embeds them with convolutional and spectral projections, and processes them through stacked residual blocks that combine a multi-scale convolutional structured state-space model (_GConv) with sliding-window self-attention.

.. rubric:: Stage 2: EEGSSM Backbone (this implementation)

This class implements Stage 2 of CodeBrain: the EEGSSM backbone described in Section 3.3 of [codebrain]_. Following :class:`Labram`, CodeBrain discretises EEG patches into codebook tokens via VQ-VAE (Stage 1, not implemented here), then trains the backbone to predict masked token indices via cross-entropy. CodeBrain extends this with a dual tokenizer that decouples temporal and frequency representations, as stated in the paper: "the TFDual-Tokenizer, which decouples heterogeneous temporal and frequency EEG signals into discrete tokens to enhance discriminative power."
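As a rough illustration of the masked-token objective described above, the toy sketch below computes cross-entropy only at masked positions. All values here are fabricated for illustration; the real pipeline uses VQ-VAE codebook tokens from Stage 1 and logits from the EEGSSM backbone:

```python
import math
import random

random.seed(0)
vocab_size, seq_len = 8, 6
# Toy "ground-truth" codebook token per patch position
tokens = [random.randrange(vocab_size) for _ in range(seq_len)]
masked = {1, 4}  # positions hidden from the backbone

def cross_entropy(logits, target):
    # softmax cross-entropy, written out naively for clarity
    z = [math.exp(l) for l in logits]
    return -math.log(z[target] / sum(z))

# Pretend backbone output: uniform logits at every position
logits = [[0.0] * vocab_size for _ in range(seq_len)]
loss = sum(cross_entropy(logits[i], tokens[i]) for i in masked) / len(masked)
print(round(loss, 4))  # uniform logits -> loss == ln(vocab_size) ~= 2.0794
```

Only the masked positions contribute to the loss; the unmasked patches serve as context for the backbone.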

.. rubric:: Macro Components

  • PatchEmbedding: Splits (batch, n_chans, n_times) into (batch, n_chans, seq_len, patch_size) patches, projects each patch with a 2-D convolutional stack, adds FFT-based spectral embeddings, and applies depth-wise convolutional positional encoding.
  • Residual blocks (ResidualGroup): Each block applies RMSNorm, a _GConv SSM layer, and sliding-window multi-head attention, with gated activation and separate residual/skip paths.
  • Classification head (final_layer): Flattens the output and maps to n_outputs classes.
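To make the sliding-window attention pattern mentioned above concrete, here is a minimal sketch of the attention mask it implies: position i may attend to position j only when |i - j| is within the window. The window size and sequence length here are illustrative, not the model's actual settings:

```python
# Sliding-window attention mask: True where attention is allowed
seq_len, window = 6, 1
mask = [[abs(i - j) <= window for j in range(seq_len)] for i in range(seq_len)]

for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# xx....
# xxx...
# .xxx..
# ..xxx.
# ...xxx
# ....xx
```

Compared with full self-attention, this banded mask keeps the cost per layer linear in sequence length for a fixed window.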

.. note:: Loading pre-trained weights

No checkpoint is distributed in this repository. If a pre-trained CodeBrain checkpoint is published on the Hugging Face Hub, it can be loaded with:

.. code:: python

    from braindecode.models import CodeBrain

    # Hypothetical repo id; substitute an actual published checkpoint
    model = CodeBrain.from_pretrained("braindecode/codebrain-pretrained")

To push your own trained model to the Hub:

.. code:: python

    model.push_to_hub("my-username/my-codebrain")

Parameters

patch_size : int, default=200
    Number of time samples per patch. Input length is trimmed to the nearest multiple of patch_size.
res_channels : int, default=200
    Width of the residual stream inside each ResidualBlock.
skip_channels : int, default=200
    Width of the skip-connection stream aggregated across blocks.
out_channels : int, default=200
    Output channels of final_conv before the classification head.
num_res_layers : int, default=8
    Number of stacked ResidualBlock modules.
drop_prob : float, default=0.1
    Dropout rate used inside the _GConv SSM and attention layers.
s4_bidirectional : bool, default=True
    Whether the _GConv SSM processes the sequence bidirectionally.
s4_layernorm : bool, default=False
    Whether to apply layer normalisation inside the _GConv SSM. Set to False to match the released pretrained checkpoint.
s4_lmax : int, default=570
    Maximum sequence length for the _GConv SSM kernel. Also determines the patch embedding dimension as s4_lmax // n_chans.
s4_d_state : int, default=64
    State dimension of the _GConv SSM.
conv_out_chans : int, default=25
    Number of output channels in the patch projection convolutions.
conv_groups : int, default=5
    Number of groups for GroupNorm in the patch projection.
activation : type[nn.Module], default=nn.ReLU
    Non-linear activation class used in init_conv and final_conv.
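A quick arithmetic check of the defaults above, assuming the 22-channel quick-start configuration (n_chans=22 is an example value, not a model default): the documented patch embedding dimension s4_lmax // n_chans happens to equal conv_out_chans for this channel count:

```python
# Patch embedding dimension derived from the documented defaults
s4_lmax, n_chans, conv_out_chans = 570, 22, 25  # n_chans from the quick-start example
emb_dim = s4_lmax // n_chans  # integer division, per the s4_lmax description
print(emb_dim)  # 25
```

For other channel counts the derived embedding dimension changes, so s4_lmax may need adjusting to match your montage.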

References

.. [codebrain] Yi Ding, Xuyang Chen, Yong Li, Rui Yan, Tao Wang, Le Wu (2025). CodeBrain: Scalable Code EEG Pre-Training for Unified Downstream BCI Tasks. https://arxiv.org/abs/2506.09110

.. rubric:: Hugging Face Hub integration

When the optional huggingface_hub package is installed, all models automatically gain the ability to be pushed to and loaded from the Hugging Face Hub. Install with::

 pip install braindecode[hub]

Pushing a model to the Hub:

.. code:: python

    from braindecode.models import CodeBrain

    # Train your model
    model = CodeBrain(n_chans=22, n_outputs=4, n_times=1000)
    # ... training code ...

    # Push to the Hub
    model.push_to_hub(
        repo_id="username/my-codebrain-model",
        commit_message="Initial model upload",
    )

Loading a model from the Hub:

.. code:: python

    from braindecode.models import CodeBrain

    # Load pretrained model
    model = CodeBrain.from_pretrained("username/my-codebrain-model")

    # Load with a different number of outputs (head is rebuilt automatically)
    model = CodeBrain.from_pretrained("username/my-codebrain-model", n_outputs=4)

Extracting features and replacing the head:

.. code:: python

    import torch

    x = torch.randn(1, model.n_chans, model.n_times)
    # Extract encoder features (consistent dict across all models)
    out = model(x, return_features=True)
    features = out["features"]

    # Replace the classification head
    model.reset_head(n_outputs=10)

Saving and restoring full configuration:

.. code:: python

    import json

    config = model.get_config()            # all __init__ params
    with open("config.json", "w") as f:
        json.dump(config, f)

    model2 = CodeBrain.from_config(config)    # reconstruct (no weights)

All model parameters (both EEG-specific and model-specific such as dropout rates, activation functions, number of filters) are automatically saved to the Hub and restored when loading.

See :ref:`load-pretrained-models` for a complete tutorial.

## Citation

Please cite both the original paper for this architecture (see the References section above) and braindecode:

```bibtex
@article{aristimunha2025braindecode,
  title   = {Braindecode: a deep learning library for raw electrophysiological data},
  author  = {Aristimunha, Bruno and others},
  journal = {Zenodo},
  year    = {2025},
  doi     = {10.5281/zenodo.17699192},
}
```

## License

BSD-3-Clause for the model code (matching braindecode). If you fine-tune from a published checkpoint, the resulting weights inherit the licence of that checkpoint and its training corpus.