🧠 Key Insights

  • Tree-based models (RF, XGBoost) fail under extreme imbalance, predicting only the majority class.
  • Linear models achieve high recall but suffer from extremely low precision.
  • Threshold tuning significantly improves performance:
    • F1 improved from 0.0085 → 0.0769 (LogReg)
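
The threshold-tuning step above can be sketched as a sweep over the precision-recall curve, picking the cutoff that maximizes F1. This is a minimal illustration on synthetic imbalanced data standing in for the EEG features, not the repository's actual tuning code:

```python
# Sketch: tune the decision threshold to maximize F1 instead of
# using the default 0.5 cutoff. Synthetic data is a stand-in only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Heavily imbalanced synthetic problem (~99:1).
X, y = make_classification(n_samples=5000, weights=[0.99], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_tr, y_tr)
probs = clf.predict_proba(X_val)[:, 1]

prec, rec, thr = precision_recall_curve(y_val, probs)
f1 = 2 * prec * rec / (prec + rec + 1e-12)
best = np.argmax(f1[:-1])  # last PR point has no associated threshold
print(f"best threshold={thr[best]:.3f}, F1={f1[best]:.3f}")
```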

βš™οΈ Usage

```python
import joblib

# Load the saved logistic regression model and score a feature matrix X
model = joblib.load("models/logistic_regression.joblib")
preds = model.predict(X)
```
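
Since the key insight above is that the default 0.5 cutoff performs poorly under this imbalance, scoring with a tuned threshold may look like the sketch below. A freshly fitted `LogisticRegression` on synthetic data stands in for the saved model, and the threshold value is illustrative:

```python
# Sketch: predict with a custom decision threshold via predict_proba,
# rather than model.predict() (which uses 0.5 implicitly).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Stand-in for the persisted model and the real feature matrix.
X, y = make_classification(n_samples=1000, weights=[0.95], random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

threshold = 0.3  # illustrative; choose via validation F1
probs = model.predict_proba(X)[:, 1]
preds = (probs >= threshold).astype(int)
```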

⚠️ Limitations

  • Models struggle with extreme class imbalance (~1600:1)
  • Poor generalization across subjects (LOSO results)
  • Classical ML is insufficient for robust seizure detection in this setting
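
The LOSO (Leave-One-Subject-Out) evaluation mentioned above can be sketched with scikit-learn's `LeaveOneGroupOut`; the subject IDs and data here are synthetic placeholders, not the benchmark's actual splits:

```python
# Sketch of Leave-One-Subject-Out evaluation: each fold holds out
# every window from one pseudo-subject.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import LeaveOneGroupOut

X, y = make_classification(n_samples=600, weights=[0.9], random_state=0)
groups = np.repeat(np.arange(6), 100)  # 6 hypothetical subjects

scores = []
for tr, te in LeaveOneGroupOut().split(X, y, groups):
    clf = LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
    scores.append(f1_score(y[te], clf.predict(X[te]), zero_division=0))
print(f"per-subject F1: {np.round(scores, 3)}")
```

Per-subject scores typically vary widely under this protocol, which is what the limitation above refers to.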

📚 Citation

If you use this model, please cite:

```bibtex
@dataset{eegparquet_benchmark_2026,
  title={EEGParquet-Benchmark: Windowed and Feature-Enriched EEG Dataset for Seizure Detection},
  author={Daffa Tarigan},
  year={2026},
  publisher={Hugging Face}
}
```

🚀 Notes

This repository is intended for:

  • Benchmarking classical ML under imbalance
  • Demonstrating the limitations of accuracy-based evaluation
  • Supporting research in biomedical signal classification


1. Folder structure (important)

```
/models
├── logistic_regression.joblib
├── random_forest.joblib
├── svm_rbf_cuml_gpu.joblib
└── xgboost_gpu_optuna.joblib
```
