🧠 Key Insights

  • Tree-based models (RF, XGBoost) fail under extreme imbalance, predicting only the majority class.
  • Linear models achieve high recall but suffer from extremely low precision.
  • Threshold tuning significantly improves performance:
    • F1 improved from 0.0085 → 0.0769 (LogReg)
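
The threshold-tuning step above can be sketched as a sweep over the precision-recall curve, picking the cutoff that maximizes F1. This is a minimal illustration on synthetic imbalanced data standing in for the EEG features, not the repository's actual tuning code:

```python
# Sketch: tune the decision threshold to maximize F1 instead of
# using the default 0.5 cutoff. Synthetic data is a stand-in only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Heavily imbalanced synthetic problem (~99:1).
X, y = make_classification(n_samples=5000, weights=[0.99], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_tr, y_tr)
probs = clf.predict_proba(X_val)[:, 1]

prec, rec, thr = precision_recall_curve(y_val, probs)
f1 = 2 * prec * rec / (prec + rec + 1e-12)
best = np.argmax(f1[:-1])  # last PR point has no associated threshold
print(f"best threshold={thr[best]:.3f}, F1={f1[best]:.3f}")
```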

βš™οΈ Usage

```python
import joblib

# Load the saved logistic regression model and score a feature matrix X
model = joblib.load("models/logistic_regression.joblib")
preds = model.predict(X)
```
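
Since the key insight above is that the default 0.5 cutoff performs poorly under this imbalance, scoring with a tuned threshold may look like the sketch below. A freshly fitted `LogisticRegression` on synthetic data stands in for the saved model, and the threshold value is illustrative:

```python
# Sketch: predict with a custom decision threshold via predict_proba,
# rather than model.predict() (which uses 0.5 implicitly).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Stand-in for the persisted model and the real feature matrix.
X, y = make_classification(n_samples=1000, weights=[0.95], random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

threshold = 0.3  # illustrative; choose via validation F1
probs = model.predict_proba(X)[:, 1]
preds = (probs >= threshold).astype(int)
```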

⚠️ Limitations

  • Models struggle with extreme class imbalance (~1600:1)
  • Poor generalization across subjects (LOSO results)
  • Classical ML is insufficient for robust seizure detection in this setting
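
The LOSO (Leave-One-Subject-Out) evaluation mentioned above can be sketched with scikit-learn's `LeaveOneGroupOut`; the subject IDs and data here are synthetic placeholders, not the benchmark's actual splits:

```python
# Sketch of Leave-One-Subject-Out evaluation: each fold holds out
# every window from one pseudo-subject.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import LeaveOneGroupOut

X, y = make_classification(n_samples=600, weights=[0.9], random_state=0)
groups = np.repeat(np.arange(6), 100)  # 6 hypothetical subjects

scores = []
for tr, te in LeaveOneGroupOut().split(X, y, groups):
    clf = LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
    scores.append(f1_score(y[te], clf.predict(X[te]), zero_division=0))
print(f"per-subject F1: {np.round(scores, 3)}")
```

Per-subject scores typically vary widely under this protocol, which is what the limitation above refers to.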

📚 Citation

If you use this model, please cite:

```bibtex
@dataset{eegparquet_benchmark_2026,
  title={EEGParquet-Benchmark: Windowed and Feature-Enriched EEG Dataset for Seizure Detection},
  author={Daffa Tarigan},
  year={2026},
  publisher={Hugging Face}
}
```

🚀 Notes

This repository is intended for:

  • Benchmarking classical ML under imbalance
  • Demonstrating the limitations of accuracy-based evaluation
  • Supporting research in biomedical signal classification


1. Folder structure (important)

```
/models
├── logistic_regression.joblib
├── random_forest.joblib
├── svm_rbf_cuml_gpu.joblib
└── xgboost_gpu_optuna.joblib
```
