# Dataset Preparation
This guide provides commands to process raw EMG data into HDF5 format using sliding windows.
## Usage
- **Dependencies:** Install requirements specific to these scripts via `pip install -r scripts/requirements.txt`. Framework requirements for TinyMyo are in the [BioFoundation repository](https://github.com/pulp-bio/BioFoundation).
- Add `--download_data` to a command if the raw data is not yet present locally.
- Replace `$DATA_PATH` with your local storage path.
- `--seq_len`: Window length, in samples.
- `--stride`: Step between consecutive window starts, in samples.
- Pretraining scripts use **2 kHz** sampling. Downstream scripts use **200 Hz** or **2 kHz**.
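
The relationship between `--seq_len`, `--stride`, and the sampling rate can be illustrated with a minimal sketch. The helper names below (`window_starts`, `window_seconds`, `overlap_fraction`) are hypothetical and are not taken from the scripts. The sketch also assumes that a trailing partial window is dropped, which the actual scripts may handle differently.

```python
def window_starts(n_samples, seq_len, stride):
    """Start index of every full window of seq_len samples, stepping by stride.

    Assumption: a trailing partial window is dropped rather than padded.
    """
    if seq_len <= 0 or stride <= 0:
        raise ValueError("seq_len and stride must be positive")
    if n_samples < seq_len:
        return []
    return list(range(0, n_samples - seq_len + 1, stride))


def window_seconds(seq_len, fs_hz):
    """Window duration in seconds for a given sampling rate in Hz."""
    return seq_len / fs_hz


def overlap_fraction(seq_len, stride):
    """Fraction of each window shared with the next one (0.0 if stride >= seq_len)."""
    return max(0.0, 1.0 - stride / seq_len)


# Pretraining setting: 1000-sample windows, stride 500, sampled at 2 kHz.
print(window_starts(n_samples=4000, seq_len=1000, stride=500))
# [0, 500, 1000, 1500, 2000, 2500, 3000]
print(window_seconds(1000, 2000))   # 0.5
print(overlap_fraction(1000, 500))  # 0.5

# The same seq_len of 200 spans 1 s at 200 Hz but only 0.1 s at 2 kHz.
print(window_seconds(200, 200))     # 1.0
print(window_seconds(200, 2000))    # 0.1
```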
---
## Pretraining Datasets
(0.5s windows, 50% overlap @ 2 kHz)
| Dataset | Size (GB) | Seq Len | Stride | Command |
| :--- | :--- | :--- | :--- | :--- |
| **EMG2Pose** | 431 | 1000 (0.5s) | 500 | `python scripts/emg2pose.py --data_dir $DATA_PATH/emg2pose_data/ --save_dir $DATA_PATH/emg2pose_data/h5/ --seq_len 1000 --stride 500` |
| **NinaPro DB6** | ~20 | 1000 (0.5s) | 500 | `python scripts/db6.py --data_dir $DATA_PATH/ninapro/DB6/ --save_dir $DATA_PATH/ninapro/DB6/h5/ --seq_len 1000 --stride 500` |
| **NinaPro DB7** | ~10 | 1000 (0.5s) | 500 | `python scripts/db7.py --data_dir $DATA_PATH/ninapro/DB7/ --save_dir $DATA_PATH/ninapro/DB7/h5/ --seq_len 1000 --stride 500` |
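
To run the three pretraining conversions in one go, a small driver along these lines may be convenient. It is only a sketch: the script paths and flags are copied from the table above, while `PRETRAIN_JOBS`, `build_commands`, `run_all`, and the `/data` example path are hypothetical and introduced here for illustration. The top-level loop only prints the commands; calling `run_all` would execute them from the repository root.

```python
import subprocess

# (script, sub-directory under the data root), as listed in the table above.
PRETRAIN_JOBS = [
    ("scripts/emg2pose.py", "emg2pose_data"),
    ("scripts/db6.py", "ninapro/DB6"),
    ("scripts/db7.py", "ninapro/DB7"),
]


def build_commands(data_path, seq_len=1000, stride=500):
    """Build one argument list per pretraining dataset."""
    root = data_path.rstrip("/")
    commands = []
    for script, subdir in PRETRAIN_JOBS:
        data_dir = f"{root}/{subdir}/"
        commands.append([
            "python", script,
            "--data_dir", data_dir,
            "--save_dir", f"{data_dir}h5/",
            "--seq_len", str(seq_len),
            "--stride", str(stride),
        ])
    return commands


def run_all(data_path):
    """Run each conversion in sequence, stopping at the first failure."""
    for cmd in build_commands(data_path):
        subprocess.run(cmd, check=True)


# Dry run: print the commands for a hypothetical data root without executing them.
for cmd in build_commands("/data"):
    print(" ".join(cmd))
```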
---
## Downstream Datasets
| Dataset | Task | Seq Len | Stride | Command |
| :--- | :--- | :--- | :--- | :--- |
| **NinaPro DB5** | Gesture | 200 (1s) | 50 | `python scripts/db5.py --data_dir $DATA_PATH/ninapro/DB5/ --save_dir $DATA_PATH/ninapro/DB5/h5_1sec/ --seq_len 200 --stride 50` |
| **NinaPro DB5** | Gesture | 1000 (5s) | 250 | `python scripts/db5.py --data_dir $DATA_PATH/ninapro/DB5/ --save_dir $DATA_PATH/ninapro/DB5/h5_5sec/ --seq_len 1000 --stride 250` |
| **EMG-EPN612** | Gesture | 200 (1s) | N/A | `python scripts/epn.py --data_dir $DATA_PATH/EPN612/ --source_training $DATA_PATH/EPN612/trainingJSON/ --source_testing $DATA_PATH/EPN612/testingJSON/ --dest_dir $DATA_PATH/EPN612/h5_1sec/ --seq_len 200` |
| **EMG-EPN612** | Gesture | 1000 (5s) | N/A | `python scripts/epn.py --data_dir $DATA_PATH/EPN612/ --source_training $DATA_PATH/EPN612/trainingJSON/ --source_testing $DATA_PATH/EPN612/testingJSON/ --dest_dir $DATA_PATH/EPN612/h5_5sec/ --seq_len 1000` |
| **UCI EMG** | Gesture | 200 (1s) | 50 | `python scripts/uci.py --data_dir $DATA_PATH/UCI_EMG/EMG_data_for_gestures-master/ --save_dir $DATA_PATH/UCI_EMG/EMG_data_for_gestures-master/h5_1sec/ --seq_len 200 --stride 50` |
| **UCI EMG** | Gesture | 1000 (5s) | 250 | `python scripts/uci.py --data_dir $DATA_PATH/UCI_EMG/EMG_data_for_gestures-master/ --save_dir $DATA_PATH/UCI_EMG/EMG_data_for_gestures-master/h5_5sec/ --seq_len 1000 --stride 250` |
| **NinaPro DB8** | Regression | 200 (0.1s) | 200 | `python scripts/db8.py --data_dir $DATA_PATH/ninapro/DB8/ --save_dir $DATA_PATH/ninapro/DB8/h5_100/ --seq_len 200 --stride 200` |
| **NinaPro DB8** | Regression | 1000 (0.5s) | 1000 | `python scripts/db8.py --data_dir $DATA_PATH/ninapro/DB8/ --save_dir $DATA_PATH/ninapro/DB8/h5_500/ --seq_len 1000 --stride 1000` |
| **AVE-Speech** | Speech | 2000 (2s) | N/A | `python scripts/avespeech.py --data_dir $DATA_PATH/AVE-Speech/ --save_dir $DATA_PATH/AVE-Speech/h5/` |