Deploying the EEGDash Space and datasets
One-time setup, per-push workflow, and how the dataset mirrors are kept in sync.
1. Create the org (one-time)
- Sign in at https://huggingface.co.
- Create org → handle EEGDash, display name EEG-DaSh, link https://eegdash.org and https://github.com/eegdash/EEGDash, upload docs/source/_static/eegdash_image_only.svg as the logo.
- Add maintainers.
- Generate a write access token (Settings → Access Tokens) and export it as HF_TOKEN locally and in CI.
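A push script or CI job can fail early when the token is missing, rather than partway through an upload. The following is a minimal sketch of such a check; `require_hf_token` is a hypothetical helper name, not part of the repo or of any library.

```python
import os


def require_hf_token(env=os.environ):
    """Return the token stored in HF_TOKEN, or raise with a hint if it is missing.

    `env` is injectable so the check can be exercised without touching
    the real process environment.
    """
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; export a write token before pushing."
        )
    return token
```

Calling `require_hf_token()` at the top of a script gives a clear error message locally and in CI alike.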
2. Create the Space
huggingface-cli login # paste the write token
huggingface-cli repo create \
--type space --space_sdk gradio EEGDash/catalog
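The same step can also be scripted from Python, which is convenient in CI. This sketch assumes the `huggingface_hub` package is installed and relies on its `HfApi.create_repo` method with the `repo_type`, `space_sdk`, and `exist_ok` parameters; `ensure_space` is a hypothetical wrapper name introduced here for illustration.

```python
def ensure_space(api, repo_id="EEGDash/catalog"):
    """Create the Gradio Space if it does not exist yet.

    `api` is expected to behave like huggingface_hub.HfApi; passing it in
    keeps the function free of network access until it is actually called.
    """
    return api.create_repo(
        repo_id=repo_id,
        repo_type="space",
        space_sdk="gradio",
        exist_ok=True,  # do not fail when the Space is already there
    )


# Real usage (needs network access and a write token):
#   from huggingface_hub import HfApi
#   ensure_space(HfApi())
```

Because `exist_ok=True` is set, re-running the call after the one-time setup should be harmless.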
3. Push the Space
From the repo root:
cd huggingface-space
git init -b main
git remote add origin https://huggingface.co/spaces/EEGDash/catalog
git add README.md app.py requirements.txt dataset_summary.csv
git commit -m "Initial Space: searchable EEGDash catalog"
git push origin main
Once the push completes, the Space builds and is served at https://huggingface.co/spaces/EEGDash/catalog.
Keeping the catalog fresh
dataset_summary.csv in this folder is a snapshot of
eegdash/dataset/dataset_summary.csv. Refresh it whenever the source changes:
cp ../eegdash/dataset/dataset_summary.csv dataset_summary.csv
git add dataset_summary.csv
git commit -m "Refresh catalog snapshot"
git push
A GitHub Action that runs on pushes to develop can automate this; the stub
for it belongs in .github/workflows/sync-hf-space.yml (add when ready).
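If the refresh is automated, it helps to copy and commit only when the snapshot has actually drifted from the source. The sketch below shows one way to do that by comparing SHA-256 digests; `refresh_snapshot` and `file_digest` are hypothetical helper names, and the paths in the usage comment mirror the `cp` command above.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def refresh_snapshot(source, snapshot):
    """Copy `source` over `snapshot` when their contents differ.

    Returns True if a copy happened (a commit is needed), False otherwise.
    """
    source, snapshot = Path(source), Path(snapshot)
    if snapshot.exists() and file_digest(source) == file_digest(snapshot):
        return False
    shutil.copyfile(source, snapshot)
    return True


# Example, run from the huggingface-space folder:
#   refresh_snapshot("../eegdash/dataset/dataset_summary.csv", "dataset_summary.csv")
```

A workflow could use the return value to decide whether to run `git commit` and `git push` at all.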
4. Mirror datasets to EEGDash/<slug>
This is what powers the on 🤗 column. Push one or more datasets with the helper
script at scripts/push_to_hf.py:
# Single dataset
python scripts/push_to_hf.py --dataset ds002718
# Batch, skipping anything already on the Hub, capped at 5 GB
python scripts/push_to_hf.py \
--from-csv eegdash/dataset/dataset_summary.csv \
--max-size-gb 5 \
--skip-existing
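To make the batch flags concrete, here is a rough sketch of the selection logic they imply. It is not the code in scripts/push_to_hf.py: the column names `dataset` and `size_bytes` are assumptions about the CSV, the size cap is assumed to apply per dataset, and `read_rows` and `select_datasets` are hypothetical names.

```python
import csv

GB = 1024 ** 3  # bytes per gibibyte, used for the size cap


def read_rows(path):
    """Load the summary CSV as a list of dicts (column names are assumed)."""
    with open(path, newline="") as fh:
        return list(csv.DictReader(fh))


def select_datasets(rows, existing, max_size_gb=None, skip_existing=False):
    """Return the dataset slugs that should be pushed.

    rows          -- dicts with assumed keys "dataset" and "size_bytes"
    existing      -- set of slugs already mirrored on the Hub
    max_size_gb   -- per-dataset size cap in GB, or None for no cap
    skip_existing -- drop slugs that are already in `existing`
    """
    selected = []
    for row in rows:
        slug = row["dataset"]
        if skip_existing and slug in existing:
            continue
        if max_size_gb is not None and float(row["size_bytes"]) > max_size_gb * GB:
            continue
        selected.append(slug)
    return selected
```

Keeping the selection in a pure function like this makes it possible to preview what a batch would push before any upload starts.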
Under the hood this calls EEGDashDataset(...).push_to_hub("EEGDash/<slug>"),
a method inherited from braindecode's HubDatasetMixin. The resulting repo is
laid out as follows:
EEGDash/<slug>/
├── README.md                      # Dataset card with load snippets
├── format_info.json               # Version + compression metadata
└── sourcedata/braindecode/
    ├── dataset_description.json   # BIDS-compliant
    ├── participants.tsv           # BIDS-compliant
    ├── dataset.zarr/              # blosc-compressed windowed data
    └── sub-<label>/eeg/
        ├── *_events.tsv
        ├── *_channels.tsv
        └── *_eeg.json
Users then load it with:
from braindecode.datasets import BaseConcatDataset
ds = BaseConcatDataset.pull_from_hub("EEGDash/ds002718")
5. Verify
- Space renders: https://huggingface.co/spaces/EEGDash/catalog.
- Org page shows the Space card + dataset repos: https://huggingface.co/EEGDash.
- At least one dataset loads end-to-end via pull_from_hub.
Troubleshooting
| Symptom | Likely cause / fix |
|---|---|
| on 🤗 column empty for everything | Space has no outbound network, or is rate-limited; the Space caches once per process, so redeploy to retry. |
| push_to_hub fails with ImportError | The hub extras are missing; run pip install braindecode[hub] (pulls in zarr + huggingface_hub). |
| Repo exists but Space doesn't flag it | HfApi().list_datasets(author="EEGDash", limit=500) caps at 500; raise the limit in app.py::_hf_repos if the org grows beyond that. |
| dataset_summary.csv out of sync | Re-run step 3's refresh or add the workflow stub. |
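For the "Repo exists but Space doesn't flag it" case, one option is to drop the fixed cap altogether. The sketch below is not the actual app.py::_hf_repos; it assumes `huggingface_hub.HfApi.list_datasets` accepts `limit=None` to iterate over every result and that each returned item exposes an `id` attribute. `hub_dataset_ids` is a hypothetical name.

```python
def hub_dataset_ids(api, author="EEGDash"):
    """Return the set of dataset repo ids owned by `author`.

    `api` is expected to behave like huggingface_hub.HfApi. Passing
    limit=None asks the client for all results instead of the first 500.
    """
    return {info.id for info in api.list_datasets(author=author, limit=None)}


# Real usage (needs network access):
#   from huggingface_hub import HfApi
#   "EEGDash/ds002718" in hub_dataset_ids(HfApi())
```

Listing everything costs more requests as the org grows, so caching the result once per process, as the Space already does, remains a sensible pairing.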