# Deploying the EEGDash Space and datasets

One-time setup, per-push workflow, and how the dataset mirrors are kept in sync.

## 1. Create the org (one-time)

1. Sign in at .
2. Create org → handle **`EEGDash`**, display name *EEG-DaSh*, link `https://eegdash.org` and `https://github.com/eegdash/EEGDash`, upload `docs/source/_static/eegdash_image_only.svg` as the logo.
3. Add maintainers.
4. Generate a **write** access token (Settings → Access Tokens) and export it as `HF_TOKEN` locally and in CI.

## 2. Create the Space

```bash
huggingface-cli login                           # paste the write token
huggingface-cli repo create \
    --type space --space_sdk gradio EEGDash/catalog
```

## 3. Push the Space

From the repo root:

```bash
cd huggingface-space
git init -b main
git remote add origin https://huggingface.co/spaces/EEGDash/catalog
git add README.md app.py requirements.txt dataset_summary.csv
git commit -m "Initial Space: searchable EEGDash catalog"
git push origin main
```

The Space will build and be exposed at .

### Keeping the catalog fresh

`dataset_summary.csv` in this folder is a snapshot of `eegdash/dataset/dataset_summary.csv`. Refresh it whenever the source changes:

```bash
cp ../eegdash/dataset/dataset_summary.csv dataset_summary.csv
git add dataset_summary.csv
git commit -m "Refresh catalog snapshot"
git push
```

A GitHub Action that runs on pushes to `develop` can automate this; see the stub in `.github/workflows/sync-hf-space.yml` (add when ready).

## 4. Mirror datasets to `EEGDash/`

This is what powers the `on 🤗` column. Push one or more datasets with the helper script at `scripts/push_to_hf.py`:

```bash
# Single dataset
python scripts/push_to_hf.py --dataset ds002718

# Batch, skipping anything already on the Hub, capped at 5 GB
python scripts/push_to_hf.py \
    --from-csv eegdash/dataset/dataset_summary.csv \
    --max-size-gb 5 \
    --skip-existing
```

Under the hood this calls `EEGDashDataset(...).push_to_hub("EEGDash/")`, a method provided by the `HubDatasetMixin` that the braindecode dataset classes inherit from.
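As a rough illustration of what the batch flags amount to, here is a minimal sketch of the selection step. It is not the actual logic of `scripts/push_to_hf.py`: the function name `select_for_push`, the column names `dataset` and `size_gb`, and the sample rows are all hypothetical, and the existence check is passed in as a plain callable so the example runs without network access.

```python
import csv
import io
from typing import Callable, Iterable, Iterator


def select_for_push(
    rows: Iterable[dict],
    max_size_gb: float,
    already_on_hub: Callable[[str], bool],
) -> Iterator[str]:
    """Yield dataset IDs that fit under the size cap and are not mirrored yet."""
    for row in rows:
        dataset_id = row["dataset"]      # hypothetical column name
        size_gb = float(row["size_gb"])  # hypothetical column name, value in GB
        if size_gb > max_size_gb:
            continue  # plays the role of --max-size-gb
        if already_on_hub(dataset_id):
            continue  # plays the role of --skip-existing
        yield dataset_id


# Tiny in-memory stand-in for dataset_summary.csv (IDs and sizes are made up).
sample_csv = io.StringIO(
    "dataset,size_gb\n"
    "ds000001,1.5\n"
    "ds000002,12.0\n"
    "ds000003,0.3\n"
)
mirrored = {"ds000003"}  # pretend this one is already on the Hub

selected = list(
    select_for_push(csv.DictReader(sample_csv), 5, mirrored.__contains__)
)
print(selected)  # ['ds000001']
```

In a real run, the `already_on_hub` callable could plausibly be backed by a Hub lookup such as `huggingface_hub.HfApi().repo_exists(...)` with `repo_type="dataset"`; whether the helper script does it that way is an assumption.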

The resulting repo is laid out as follows:

```
EEGDash//
├── README.md                     # Dataset card with load snippets
├── format_info.json              # Version + compression metadata
└── sourcedata/braindecode/
    ├── dataset_description.json  # BIDS-compliant
    ├── participants.tsv          # BIDS-compliant
    ├── dataset.zarr/             # blosc-compressed windowed data
    └── sub-