# Model Card for keerthikoganti/sleep-hours-predictor-automl-ag
A tabular regression model that predicts nightly sleep hours from self-reported technology-use patterns and routines; built as a class exercise.
## Model Details

### Model Description

This model predicts nightly sleep_hours from self-reported technology-use patterns and routines, plus sleep timing/quality attributes. The underlying dataset was created as a class exercise in supervised learning on tabular data. The primary task is tabular regression on the continuous target sleep_hours; the same features can also support simple classification tasks if desired.
- Developed by: Keerthi Koganti
- Model type: AutoML (AutoGluon Tabular ensemble; model family chosen via search)
- Language(s) (NLP): English
- Task: Tabular Regression
- Target column: sleep_hours (continuous, hours slept per night)
- License: Carnegie Mellon educational use
- Framework: autogluon.tabular
- Repo artifacts: autogluon_sleep_model.zip (zipped native AutoGluon predictor directory); metrics.json (test-set metrics for reproducibility)
### Model Sources
- Repository: Iris314/Students_sleep_tabular
## Uses

### Direct Use

- Classroom demos of AutoML on tabular regression tasks
- Baseline experiments for feature engineering and evaluation
- Comparing different AutoML presets and model search spaces

### Out-of-Scope Use

- Production deployment or any sleep/health recommendation system
- Generalization beyond the course context and small cohort data
- Clinical or safety-critical applications

## Bias, Risks, and Limitations

- Small sample size and potential sampling bias.
- Self-report bias in device use and sleep estimates.
- Domain shift likely for other age groups, locations, or lifestyles.
### Recommendations
Use primarily for teaching and demonstration of tabular ML workflows. If you publish results, disclose the split strategy, preprocessing, and any imputations/encodings performed by AutoGluon.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import pathlib
import shutil
import zipfile

import huggingface_hub as hf
from autogluon.tabular import TabularPredictor

REPO = "keerthikoganti/sleep-hours-predictor-automl-ag"
ZIPNAME = "autogluon_sleep_model.zip"

dest = pathlib.Path("hf_download")
dest.mkdir(exist_ok=True)

# Download the zipped predictor directory
zip_path = hf.hf_hub_download(
    repo_id=REPO,
    filename=ZIPNAME,
    repo_type="model",
    local_dir=str(dest),
    local_dir_use_symlinks=False,
)

# Extract to a clean folder
extract_dir = dest / "predictor_dir"
if extract_dir.exists():
    shutil.rmtree(extract_dir)
extract_dir.mkdir(parents=True, exist_ok=True)

with zipfile.ZipFile(zip_path, "r") as zf:
    zf.extractall(str(extract_dir))

# Load the predictor from the native AutoGluon directory
predictor = TabularPredictor.load(str(extract_dir))

# Example: predictions on a new DataFrame X with the training feature columns
preds = predictor.predict(X)
```
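The predictor expects `X` to be a pandas DataFrame carrying the same feature columns used in training (retrievable via `predictor.features()`). A minimal sketch of constructing such a frame, using hypothetical column names that are placeholders rather than the dataset's actual schema:

```python
import pandas as pd

# Hypothetical feature columns; the real names come from predictor.features()
X = pd.DataFrame(
    {
        "screen_time_hours": [4.5, 7.0],
        "bedtime_hour": [23, 1],
        "caffeine_drinks": [1, 3],
    }
)

# predictor.predict(X) would return a pandas Series of predicted sleep_hours
```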
## Training Details

### Training Data

- Dataset: Iris314/Students_sleep_tabular
- Splits: 80/20 random split (random_state=42) on the augmented data (≈300 rows)
- Target column: sleep_hours
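The 80/20 split described above can be reproduced with scikit-learn's `train_test_split`; this sketch uses a synthetic stand-in frame with hypothetical column names rather than the actual dataset:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for the ~300-row augmented dataset; column names are hypothetical
df = pd.DataFrame(
    {
        "screen_time_hours": rng.uniform(0, 10, 300),
        "sleep_hours": rng.uniform(4, 10, 300),  # target column
    }
)

# 80/20 random split with the seed reported above
train_df, test_df = train_test_split(df, test_size=0.20, random_state=42)
# len(train_df) == 240, len(test_df) == 60
```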
### Training Procedure

- Library: AutoGluon Tabular
- Presets: "best_quality" (ensembles across boosted trees, RF, kNN, neural nets, etc.)
- Training time limit: 300 seconds
- Evaluation metric (internal): AutoGluon default for regression (root_mean_squared_error)
### Training Hyperparameters

- Training regime: time_limit: 300s
- presets: "best_quality"
- random_state: 42
- problem_type: inferred automatically by AutoGluon
- eval_metric: AutoGluon default for regression
## Evaluation

### Testing Data

- Held-out 20% of the augmented split (≈60 rows)

### Metrics (replace with actuals from metrics.json)

- R² = 0.78
- MAE = 0.62 hours
- RMSE = 0.95 hours
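The three metrics above follow the standard regression definitions; a self-contained sketch computing them on illustrative numbers (not the model's actual predictions):

```python
import numpy as np

y_true = np.array([7.0, 6.5, 8.0, 5.5])  # illustrative true sleep_hours
y_pred = np.array([6.8, 6.9, 7.6, 5.9])  # illustrative predictions

mae = np.mean(np.abs(y_true - y_pred))           # mean absolute error: 0.35
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))  # root mean squared error
ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2 = 1.0 - ss_res / ss_tot                       # coefficient of determination: 0.84
```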
## Environmental Impact

Training was lightweight and classroom-focused:

- Hardware: CPU runtime in Google Colab (no GPU)
- Training wall-time: ≤ 5 minutes
- Estimated emissions: negligible
- Cloud provider: Colab (Google)
## Model Card Contact
Keerthi Koganti — kkoganti@andrew.cmu.edu