🧠 Key Insights
- Tree-based models (RF, XGBoost) fail under extreme imbalance, predicting only the majority class.
- Linear models achieve high recall but suffer from extremely low precision.
- Threshold tuning significantly improves performance:
  - F1 improved from 0.0085 → 0.0769 (LogReg)
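The F1 gain above comes from moving the decision cutoff away from the default 0.5. A minimal sketch of such a threshold sweep, using a synthetic imbalanced dataset as a stand-in for the EEG features (all data and variable names here are illustrative, not from this repository):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve, f1_score

# Synthetic stand-in for the EEG features: heavily imbalanced binary task.
X, y = make_classification(n_samples=20000, weights=[0.995], random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Sweep decision thresholds over predicted probabilities and pick the one
# that maximizes F1 (shown on the training set for brevity only; use a
# held-out validation split in practice).
proba = clf.predict_proba(X)[:, 1]
precision, recall, thresholds = precision_recall_curve(y, proba)
f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
best = thresholds[np.argmax(f1[:-1])]  # last P/R point has no threshold
print(f"best threshold: {best:.3f}, F1 at 0.5: {f1_score(y, proba >= 0.5):.4f}")
```

Sweeping the full precision-recall curve is cheaper than refitting the model per candidate threshold, since the probabilities are computed once.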
⚙️ Usage
```python
import joblib

# Load the serialized scikit-learn estimator.
model = joblib.load("models/logistic_regression.joblib")

# X must be a feature matrix matching the training schema.
preds = model.predict(X)
```
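Note that `model.predict` applies the default 0.5 cutoff, so it will not reproduce the threshold-tuned F1 reported above. A hedged sketch of probability-based prediction; the fitted stand-in model and the `THRESHOLD` value are illustrative (in practice, load the `.joblib` file and substitute the cutoff tuned on your own validation split):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Stand-in model; in practice:
#   model = joblib.load("models/logistic_regression.joblib")
X, y = make_classification(n_samples=1000, weights=[0.95], random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

THRESHOLD = 0.5  # placeholder: substitute your validation-tuned value
proba = model.predict_proba(X)[:, 1]
preds = (proba >= THRESHOLD).astype(int)
```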
⚠️ Limitations
- Models struggle with extreme imbalance (~1600:1)
- Poor generalization across subjects (LOSO results)
- Classical ML is insufficient for robust seizure detection in this setting
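The LOSO (leave-one-subject-out) protocol referenced above can be sketched with scikit-learn's `LeaveOneGroupOut`; the toy data, subject count, and classifier settings below are illustrative assumptions, not this repository's actual pipeline:

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Toy stand-in: 4 "subjects", each contributing 100 windows.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))
y = (rng.random(400) < 0.05).astype(int)  # rare positive class
groups = np.repeat(np.arange(4), 100)     # subject id per window

# Each fold trains on 3 subjects and evaluates on the held-out one,
# so the test subject's data never leaks into training.
logo = LeaveOneGroupOut()
for train_idx, test_idx in logo.split(X, y, groups):
    clf = LogisticRegression(max_iter=1000, class_weight="balanced")
    clf.fit(X[train_idx], y[train_idx])
    held_out = groups[test_idx][0]
    score = f1_score(y[test_idx], clf.predict(X[test_idx]), zero_division=0)
    print(f"subject {held_out}: F1 = {score:.3f}")
```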
Citation
If you use this model, please cite:
```bibtex
@dataset{eegparquet_benchmark_2026,
  title     = {EEGParquet-Benchmark: Windowed and Feature-Enriched EEG Dataset for Seizure Detection},
  author    = {Daffa Tarigan},
  year      = {2026},
  publisher = {Hugging Face}
}
```
Notes
This repository is intended for:
- Benchmarking classical ML under imbalance
- Demonstrating limitations of accuracy-based evaluation
- Supporting research in biomedical signal classification
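To illustrate the second point: at this degree of imbalance, plain accuracy rewards a degenerate classifier. A minimal demonstration, using the ~1600:1 ratio stated in the limitations above:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# At ~1600:1 imbalance, a model that never predicts "seizure" looks
# near-perfect under accuracy but is useless under F1.
n_neg, n_pos = 1600, 1
y_true = np.array([0] * n_neg + [1] * n_pos)
y_majority = np.zeros_like(y_true)  # always predict the majority class

acc = accuracy_score(y_true, y_majority)
f1 = f1_score(y_true, y_majority, zero_division=0)
print(f"accuracy = {acc:.4f}, F1 = {f1:.4f}")  # accuracy ~ 0.9994, F1 = 0.0000
```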
1. Folder structure (important)
```
/models
├── logistic_regression.joblib
├── random_forest.joblib
├── svm_rbf_cuml_gpu.joblib
└── xgboost_gpu_optuna.joblib
```