title: BodyFatPredictor
emoji: 🧍‍♂️
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
  - streamlit
pinned: false
short_description: Streamlit app that predicts body density
license: mit

🧍‍♂️ Body Density Predictor – Ridge Regression

This project presents a machine learning regression application that predicts human body density from anthropometric measurements using a Ridge Regression model.
The implementation follows the original ML Olympiad – Predicting Wellness competition setup, where body density is the primary target variable.


📌 Project Overview

The main objectives of this project are:

  • Predict body density (Density) from physical body measurements
  • Build a stable and interpretable regression model for a small dataset
  • Provide an interactive Streamlit demo for single and batch predictions
  • Clearly communicate model assumptions and limitations

This project is intended for educational and portfolio purposes.


📊 Dataset & Target

  • Input features:
    Anthropometric measurements such as age, height, weight, and body circumferences (neck, chest, abdomen, hip, thigh, etc.)

  • Target variable:
    Density (human body density)

The model is trained directly on Density, without predicting body fat percentage.


🧠 Model Description

  • Model type: Ridge Regression (L2-regularized linear regression)
  • Reason for model choice:
    • Handles strong multicollinearity between body measurements
    • Performs well on small datasets
    • Produces stable and smooth predictions
    • Easy to interpret and deploy
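The effect of L2 regularization on correlated measurements can be illustrated with a small sketch. This is not the project's training code: it uses a closed-form ridge solution in NumPy on synthetic data, and the helper name `fit_ridge`, the synthetic values, and the `alpha` settings are all assumptions made for illustration. scikit-learn's `Ridge(alpha=...)` fits the same kind of model.

```python
import numpy as np

def fit_ridge(X, y, alpha):
    """Closed-form ridge fit on centered data; returns (coefficients, intercept)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc; the intercept is not penalized.
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Synthetic data (assumption): two nearly collinear circumferences in inches.
rng = np.random.default_rng(0)
abdomen = rng.normal(36.0, 4.0, size=50)
hip = abdomen + rng.normal(3.0, 0.1, size=50)
X = np.column_stack([abdomen, hip])
y = 1.10 - 0.0015 * abdomen + rng.normal(0.0, 0.005, size=50)

coef_ols, _ = fit_ridge(X, y, alpha=0.0)    # no regularization
coef_ridge, _ = fit_ridge(X, y, alpha=1.0)  # L2 penalty shrinks the coefficients
print("alpha=0:", coef_ols)
print("alpha=1:", coef_ridge)
```

With correlated inputs, the penalized coefficient vector is never longer than the unpenalized one, which is the source of the stability mentioned in the list above.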

🧮 Feature Engineering

Only features that were used during training are applied in the app:

  • Waist-to-Hip Ratio:
    Waist_hip = Abdomen / Hip


  • Body Index (BMI, imperial units):
    Body_Index = 703 × Weight / Height²


Engineered features are calculated only if required by the trained model.
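The two formulas above can be sketched as a small helper. The column names follow the formulas (`Abdomen`, `Hip`, `Weight`, `Height`); the function name `add_engineered_features` and its `required` parameter are hypothetical and only illustrate computing a feature when the model asks for it.

```python
def add_engineered_features(row, required=("Waist_hip", "Body_Index")):
    """Return a copy of row with the engineered features listed in `required`."""
    out = dict(row)
    if "Waist_hip" in required:
        out["Waist_hip"] = row["Abdomen"] / row["Hip"]
    if "Body_Index" in required:
        # BMI in imperial units: weight in pounds, height in inches.
        out["Body_Index"] = 703 * row["Weight"] / row["Height"] ** 2
    return out

# Example values are made up; height is already in inches here.
sample = {"Weight": 180.0, "Height": 70.0, "Abdomen": 36.0, "Hip": 40.0}
features = add_engineered_features(sample)
print(features["Waist_hip"], features["Body_Index"])
```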


📏 Units & Input Conventions

| Measurement | Unit |
| --- | --- |
| Height | Centimeters (cm), converted internally to inches |
| Weight | Pounds (lbs) |
| Body circumferences | Inches |
| Model output | Density (unitless) |

⚠️ Correct units are essential for meaningful predictions.
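A short sketch shows why the height conversion matters. It assumes the imperial BMI formula used for `Body_Index`; the helper name `cm_to_inches` and the example values are illustrative, not taken from the app.

```python
CM_PER_INCH = 2.54

def cm_to_inches(height_cm):
    """Convert a height in centimeters to inches."""
    return height_cm / CM_PER_INCH

weight_lbs = 180.0
height_cm = 177.8
height_in = cm_to_inches(height_cm)  # 70 inches

# Imperial BMI expects inches; passing centimeters distorts the feature.
bmi_correct = 703 * weight_lbs / height_in ** 2  # about 25.8
bmi_wrong = 703 * weight_lbs / height_cm ** 2    # about 4.0
print(round(bmi_correct, 1), round(bmi_wrong, 1))
```

A feature that is off by a factor of six lies far outside anything the model saw during training, so the resulting density prediction would not be meaningful.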


🖥️ Application Features

🔹 Single Prediction

  • Interactive form with realistic input ranges
  • Automatic unit conversion (cm → inches)
  • Feature engineering and alignment
  • Density prediction with plausibility warnings
  • Display of final model inputs

🔹 Batch Prediction (CSV)

  • Upload CSV files with required feature columns
  • Automatic feature alignment
  • Download predictions as a CSV file
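Feature alignment for an uploaded CSV could look roughly like the sketch below. It uses only the standard library to stay dependency-free; the column list `FEATURE_ORDER` and the helper `align_rows` are hypothetical, and the real app may handle this differently (for example with pandas). If the model was fitted on a pandas DataFrame with a recent scikit-learn, the expected column order may be available from its `feature_names_in_` attribute.

```python
import csv
import io

# Hypothetical column order; the real one comes from the trained model.
FEATURE_ORDER = ["Age", "Weight", "Height", "Abdomen", "Hip"]

def align_rows(csv_text, feature_order):
    """Parse CSV text and return rows as float lists in the model's column order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in feature_order if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    # Extra columns are ignored; required ones are reordered.
    return [[float(row[c]) for c in feature_order] for row in reader]

uploaded = "Hip,Age,Weight,Height,Abdomen,Notes\n40,35,180,70,36,ok\n"
rows = align_rows(uploaded, FEATURE_ORDER)
print(rows)
```

Failing early on missing columns gives the user a clear message instead of a shape error from the model.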

⚠️ Important Notes on Predictions

  • Human body density typically lies within a very narrow range (≈ 0.95 – 1.10)
  • Predictions outside this range may indicate:
    • Unrealistic input values
    • Inputs outside the training distribution

The app provides warnings when predicted density values fall outside the typical physiological range.
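A plausibility check of this kind can be sketched in a few lines. The bounds come from the range stated above; the function name `plausibility_warning` and the message wording are assumptions, not the app's actual code.

```python
DENSITY_MIN, DENSITY_MAX = 0.95, 1.10  # typical range stated above

def plausibility_warning(density):
    """Return a warning string if density is outside the typical range, else None."""
    if DENSITY_MIN <= density <= DENSITY_MAX:
        return None
    return (
        f"Predicted density {density:.4f} is outside the typical range "
        f"({DENSITY_MIN:.2f} to {DENSITY_MAX:.2f}); check units and input values."
    )

for value in (1.05, 1.25):
    message = plausibility_warning(value)
    print(value, "OK" if message is None else message)
```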


🧪 Limitations

  • Small dataset size limits generalization
  • Linear regression may oversimplify complex physiological relationships
  • Predictions may be unreliable for extreme or unrealistic input values
  • The model is not intended for medical or clinical use

These limitations are explicitly communicated in the app interface.


🚫 Disclaimer

This application is not a medical device.
All predictions are provided for educational and demonstration purposes only and must not be used for health diagnosis or treatment decisions.


🧰 Tech Stack

  • Python
  • pandas, NumPy
  • scikit-learn
  • Streamlit
  • joblib

📎 Repository Structure

├── app.py
├── ridge_model.pkl
├── requirements.txt
└── README.md



✅ Conclusion

This project demonstrates a complete and transparent machine learning pipeline for predicting body density using Ridge Regression.
By predicting the competition target variable directly, without converting density to body fat percentage, the application keeps its predictions stable and interpretable, which makes it well suited as a portfolio and learning project.