EnYa32 committed on
Commit
d7e3212
·
verified ·
1 Parent(s): 5182d91

Update README.md

Files changed (1)
  1. README.md +64 -67
README.md CHANGED
@@ -12,81 +12,66 @@ short_description: Streamlit app with BodyFat (%) predictor
12
  license: mit
13
  ---
14
 
15
- # 🧍‍♂️ BodyFat Predictor – Ridge Regression
16
 
17
- This project demonstrates a **machine learning regression pipeline** to predict **body density** from anthropometric measurements and convert it into **body fat percentage** using the **Siri equation**.
18
- The implementation follows the original **ML Olympiad – Predicting Wellness** competition setup.
19
 
20
  ---
21
 
22
  ## 📌 Project Overview
23
 
24
- The goal of this project is to:
25
 
26
- - Predict **body density** using a trained **Ridge Regression** model
27
- - Convert predicted density into **body fat percentage** using a physiological formula
28
- - Provide a **transparent and interactive demo** via Streamlit
29
- - Highlight model limitations caused by small datasets and narrow target ranges
30
 
31
- This Space is intended as an **educational and portfolio project**, not a medical tool.
32
 
33
  ---
34
 
35
  ## 📊 Dataset & Target
36
 
37
  - **Input features:**
38
- Anthropometric measurements such as age, weight, height, and body circumferences (neck, abdomen, hip, thigh, etc.)
39
 
40
- - **Target variable (model output):**
41
- `Density` (body density)
42
 
43
- - **Derived output (post-processing):**
44
- `BodyFat (%)` calculated using the **Siri equation**
45
 
46
  ---
47
 
48
- ## 🧮 Body Fat Calculation (Siri Formula)
49
 
50
- After predicting body density, body fat percentage is calculated as:
51
-
52
- BodyFat (%) = 495 / Density − 450
53
-
56
-
57
- This formula is widely used in body composition analysis and was applied **exactly as defined in the competition**.
58
 
59
  ---
60
 
61
- ## ⚠️ Important Note on Model Behavior
62
-
63
- - Human body density typically lies in a **very narrow range (≈ 0.95–1.10)**
64
- - Small prediction errors in density can lead to **large changes in body fat percentage**
65
- - In some cases, valid inputs may result in **non-physiological values (e.g. negative body fat)**
66
 
67
- ### How this demo handles it:
68
- - Raw body fat values are **calculated transparently**
69
- - A **warning** is shown if predicted density is outside the typical human range
70
- - For display purposes only, body fat is **clipped to 0–60%**
71
- - Raw values remain visible for transparency
72
 
73
- This behavior reflects a **known limitation** of regression models combined with physical formulas.
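
The sensitivity described in this removed section can be illustrated with a minimal sketch of the Siri conversion and the 0–60% display clipping. The function names are hypothetical and are not taken from `app.py`; only the formula and the clipping range come from the README text.

```python
def siri_body_fat(density: float) -> float:
    """Convert body density to body fat percentage with the Siri equation."""
    return 495.0 / density - 450.0


def clip_for_display(body_fat: float, low: float = 0.0, high: float = 60.0) -> float:
    """Clamp a raw body fat value to the display range; the raw value is kept separately."""
    return max(low, min(high, body_fat))


if __name__ == "__main__":
    # A shift of 0.01 in density moves the result by roughly 4 to 5 percentage points,
    # and densities above 1.10 give negative raw values.
    for density in (1.04, 1.05, 1.06, 1.11):
        raw = siri_body_fat(density)
        print(f"density={density:.2f}  raw={raw:6.2f}%  shown={clip_for_display(raw):5.2f}%")
```
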
 
74
 
75
- ---
 
76
 
77
- ## 🧠 Model Details
 
78
 
79
- - **Model type:** Ridge Regression (L2-regularized linear regression)
80
- - **Why Ridge?**
81
- - Handles multicollinearity between highly correlated body measurements
82
- - Performs well on small datasets
83
- - Produces stable and interpretable results
84
-
85
- - **Feature Engineering (if used by the model):**
86
- - **Waist-to-Hip Ratio:** `Abdomen / Hip`
87
- - **Body Index (BMI, imperial):** `703 × Weight / Height²`
88
 
89
- Only engineered features that were present during training are computed in the app.
90
 
91
  ---
92
 
@@ -94,13 +79,12 @@ Only engineered features that were present during training are computed in the a
94
 
95
  | Measurement | Unit |
96
  |------------|------|
97
- | Height | **Centimeters (cm)** → converted to inches internally |
98
  | Weight | **Pounds (lbs)** |
99
- | Circumferences | **Inches** |
100
- | Output Density | Unitless |
101
- | Output BodyFat | Percentage (%) |
102
 
103
- ⚠️ Correct units are critical for meaningful predictions.
104
 
105
  ---
106
 
@@ -108,33 +92,44 @@ Only engineered features that were present during training are computed in the a
108
 
109
  ### 🔹 Single Prediction
110
  - Interactive form with realistic input ranges
111
- - Automatic unit conversion and feature engineering
112
- - Density prediction + body fat calculation
113
- - Warnings for implausible outputs
114
- - Display of aligned model inputs
115
 
116
  ### 🔹 Batch Prediction (CSV)
117
- - Upload a CSV file with the required features
118
  - Automatic feature alignment
119
- - Download predictions as CSV
120
 
121
  ---
122
 
123
- ## 🚫 Disclaimer
 
 
 
 
 
124
 
125
- This application is **not a medical device** and must not be used for health diagnosis or clinical decision-making.
126
- Predictions are for **educational and demonstration purposes only**.
127
 
128
  ---
129
 
130
  ## 🧪 Limitations
131
 
132
  - Small dataset size limits generalization
133
- - Density prediction errors propagate through the Siri equation
134
- - Linear models may oversimplify physiological relationships
135
- - Predictions outside the training distribution may be unreliable
136
 
137
- These limitations are **explicitly shown and explained** in the app.
 
 
 
 
 
 
 
138
 
139
  ---
140
 
@@ -148,7 +143,7 @@ These limitations are **explicitly shown and explained** in the app.
148
 
149
  ---
150
 
151
- ## 📎 Files
152
 
153
  ├── app.py
154
  ├── ridge_model.pkl
@@ -162,5 +157,7 @@ Code kopieren
162
 
163
  ## ✅ Conclusion
164
 
165
- This project faithfully follows the original competition objective by predicting **body density first** and then converting it into **body fat percentage** using a known physiological formula.
166
- By combining transparency, proper unit handling, and clear warnings, the application demonstrates both the **power and limitations of machine learning in real-world regression problems**.
 
 
 
12
  license: mit
13
  ---
14
 
15
+ # 🧍‍♂️ Body Density Predictor – Ridge Regression
16
 
17
+ This project presents a **machine learning regression application** that predicts **human body density** from anthropometric measurements using a **Ridge Regression** model.
18
+ The implementation follows the original **ML Olympiad – Predicting Wellness** competition setup, where **body density** is the primary target variable.
19
 
20
  ---
21
 
22
  ## 📌 Project Overview
23
 
24
+ The main objectives of this project are:
25
 
26
+ - Predict **body density (Density)** from physical body measurements
27
+ - Build a **stable and interpretable regression model** for a small dataset
28
+ - Provide an **interactive Streamlit demo** for single and batch predictions
29
+ - Clearly communicate model assumptions and limitations
30
 
31
+ This project is intended for **educational and portfolio purposes**.
32
 
33
  ---
34
 
35
  ## 📊 Dataset & Target
36
 
37
  - **Input features:**
38
+ Anthropometric measurements such as age, height, weight, and body circumferences (neck, chest, abdomen, hip, thigh, etc.)
39
 
40
+ - **Target variable:**
41
+ `Density` (human body density)
42
 
43
+ The model is trained **directly on Density**, without predicting body fat percentage.
 
44
 
45
  ---
46
 
47
+ ## 🧠 Model Description
48
 
49
+ - **Model type:** Ridge Regression (L2-regularized linear regression)
50
+ - **Reason for model choice:**
51
+ - Handles strong multicollinearity between body measurements
52
+ - Performs well on small datasets
53
+ - Produces stable and smooth predictions
54
+ - Easy to interpret and deploy
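
To illustrate the multicollinearity point, here is a minimal pure-Python sketch of closed-form ridge regression for exactly two features and no intercept. It is not the project's training code, and the data are made up to mimic two strongly correlated measurements.

```python
import math


def ridge_two_features(rows, targets, alpha):
    """Closed-form ridge fit (no intercept) for exactly two features.

    Solves (X^T X + alpha * I) w = X^T y with an explicit 2x2 inverse.
    """
    a = sum(x1 * x1 for x1, _ in rows) + alpha
    b = sum(x1 * x2 for x1, x2 in rows)
    d = sum(x2 * x2 for _, x2 in rows) + alpha
    g1 = sum(x1 * y for (x1, _), y in zip(rows, targets))
    g2 = sum(x2 * y for (_, x2), y in zip(rows, targets))
    det = a * d - b * b
    return (d * g1 - b * g2) / det, (a * g2 - b * g1) / det


# Two almost identical columns, standing in for correlated circumferences (made-up numbers).
rows = [(1.0, 1.01), (2.0, 1.99), (3.0, 3.02), (4.0, 3.98)]
targets = [2.1, 3.9, 6.2, 7.9]

w_ols = ridge_two_features(rows, targets, alpha=0.0)    # alpha = 0 is ordinary least squares
w_ridge = ridge_two_features(rows, targets, alpha=1.0)  # the L2 penalty shrinks the weights
print("alpha=0:", w_ols, "norm:", math.hypot(*w_ols))
print("alpha=1:", w_ridge, "norm:", math.hypot(*w_ridge))
```

With nearly collinear columns, the unregularized weights tend to take large values of opposite sign, while the penalized fit spreads the effect more evenly across both features.
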
 
 
55
 
56
  ---
57
 
58
+ ## 🧮 Feature Engineering
 
 
 
 
59
 
60
+ Only features that were used during training are applied in the app:
 
 
 
 
61
 
62
+ - **Waist-to-Hip Ratio:**
63
+ Waist_hip = Abdomen / Hip
64
 
67
 
68
+ - **Body Index (BMI, imperial units):**
69
+ Body_Index = 703 × Weight / Height²
70
 
 
 
 
 
 
 
 
73
 
74
+ Engineered features are calculated **only if required by the trained model**.
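
The conditional feature engineering could look roughly like the sketch below. The column names `Waist_hip`, `Body_Index`, `Abdomen`, `Hip`, `Weight` and `Height` come from this README. The helper name and the example feature list are assumptions, and the actual logic in `app.py` may differ.

```python
def add_engineered_features(row: dict, model_features: list) -> dict:
    """Return a copy of row with the engineered features the model expects.

    Assumes Weight is in pounds and Height, Abdomen and Hip are in inches.
    """
    out = dict(row)
    if "Waist_hip" in model_features:
        out["Waist_hip"] = row["Abdomen"] / row["Hip"]
    if "Body_Index" in model_features:
        out["Body_Index"] = 703.0 * row["Weight"] / row["Height"] ** 2
    return out


# Hypothetical feature list; in practice it would come from the trained model.
features = ["Age", "Weight", "Height", "Abdomen", "Hip", "Body_Index"]
sample = {"Age": 35, "Weight": 180.0, "Height": 70.0, "Abdomen": 36.0, "Hip": 40.0}
print(add_engineered_features(sample, features))
```
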
75
 
76
  ---
77
 
 
79
 
80
  | Measurement | Unit |
81
  |------------|------|
82
+ | Height | **Centimeters (cm)** (converted internally to inches) |
83
  | Weight | **Pounds (lbs)** |
84
+ | Body circumferences | **Inches** |
85
+ | Model output | **Density** (unitless) |
 
86
 
87
+ ⚠️ Correct units are essential for meaningful predictions.
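
A minimal sketch of the internal height conversion, assuming the app simply divides by 2.54. The function name is hypothetical.

```python
CM_PER_INCH = 2.54  # exact by definition of the international inch


def cm_to_inches(height_cm: float) -> float:
    """Convert a height entered in centimeters to inches."""
    return height_cm / CM_PER_INCH


print(round(cm_to_inches(177.8), 2))
```
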
88
 
89
  ---
90
 
 
92
 
93
  ### 🔹 Single Prediction
94
  - Interactive form with realistic input ranges
95
+ - Automatic unit conversion (cm → inches)
96
+ - Feature engineering and alignment
97
+ - Density prediction with plausibility warnings
98
+ - Display of final model inputs
99
 
100
  ### 🔹 Batch Prediction (CSV)
101
+ - Upload CSV files with required feature columns
102
  - Automatic feature alignment
103
+ - Download predictions as a CSV file
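
One way to picture the feature alignment step for batch uploads is the standard-library sketch below. The function name, the example columns and the error message are assumptions; the app itself may use a different mechanism.

```python
import csv
import io


def align_rows(csv_text: str, feature_order: list) -> list:
    """Parse CSV text and return rows as float lists in the model's feature order.

    Extra columns are ignored; a missing required column raises ValueError.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [name for name in feature_order if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    return [[float(record[name]) for name in feature_order] for record in reader]


# Hypothetical upload: columns in a different order, plus an unused "Id" column.
uploaded = "Id,Hip,Abdomen,Weight\n1,40.0,36.0,180.0\n2,38.5,33.0,165.0\n"
print(align_rows(uploaded, ["Weight", "Abdomen", "Hip"]))
```

Failing early on missing columns keeps a malformed upload from silently producing predictions on misordered inputs.
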
104
 
105
  ---
106
 
107
+ ## ⚠️ Important Notes on Predictions
108
+
109
+ - Human body density typically lies within a **very narrow range (≈ 0.95–1.10)**
110
+ - Predictions outside this range may indicate:
111
+ - Unrealistic input values
112
+ - Inputs outside the training distribution
113
 
114
+ The app provides **warnings** when predicted density values fall outside the typical physiological range.
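
The plausibility check can be sketched as a simple range test. The bounds are the approximate values quoted above; the function name and the message wording are hypothetical.

```python
from typing import Optional

TYPICAL_DENSITY_RANGE = (0.95, 1.10)  # approximate range quoted in this README


def density_warning(density: float) -> Optional[str]:
    """Return a warning message if the density is outside the typical range, else None."""
    low, high = TYPICAL_DENSITY_RANGE
    if low <= density <= high:
        return None
    return (
        f"Predicted density {density:.3f} is outside the typical range "
        f"{low:.2f}-{high:.2f}; check the inputs and units."
    )


for value in (1.05, 1.18):
    print(value, "->", density_warning(value))
```
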
 
115
 
116
  ---
117
 
118
  ## 🧪 Limitations
119
 
120
  - Small dataset size limits generalization
121
+ - Linear regression may oversimplify complex physiological relationships
122
+ - Predictions may be unreliable for extreme or unrealistic input values
123
+ - The model is not intended for medical or clinical use
124
 
125
+ These limitations are **explicitly communicated in the app interface**.
126
+
127
+ ---
128
+
129
+ ## 🚫 Disclaimer
130
+
131
+ This application is **not a medical device**.
132
+ All predictions are provided for **educational and demonstration purposes only** and must not be used for health diagnosis or treatment decisions.
133
 
134
  ---
135
 
 
143
 
144
  ---
145
 
146
+ ## 📎 Repository Structure
147
 
148
  ├── app.py
149
  ├── ridge_model.pkl
 
157
 
158
  ## ✅ Conclusion
159
 
160
+ This project demonstrates a complete and transparent machine learning pipeline for predicting **body density** using Ridge Regression.
161
+ By focusing directly on the competition target variable and avoiding unnecessary post-processing, the application provides **stable, interpretable, and scientifically consistent predictions**, making it well suited as a portfolio and learning project.
162
+
163
+ ---