Spaces:

W1nd5pac
/

microclimate-x

Paused

App Files Files Community

microclimate-x / models /MODEL_CARD.md

W1nd5pac

Deploy 2026-05-20T06:52:08Z — 11e81c5

4eefabb verified 1 day ago

preview code

raw

history blame contribute delete

7.17 kB

	# Model Card — MicroClimate-X Rain Predictor (Random Forest v1.0)

	> Following the Model Card methodology of Mitchell et al. (2019).
	> Authored: 2026-05-11 · UKM Final Year Project · KyoukoLi

	---

	## 1. Model Details

	\| Field \| Value \|
	\|---\|---\|
	\| Model name \| MicroClimate-X RF Rain Predictor \|
	\| Version \| 1.0.0 \|
	\| Architecture \| `sklearn.ensemble.RandomForestClassifier` \|
	\| Hyper-parameters \| `n_estimators=200, max_depth=None, class_weight='balanced', n_jobs=-1, random_state=42` \|
	\| Features (n=18) \| `elevation_m`, `temperature_c`, `humidity_pct`, `wind_speed_kmh`, `wind_direction_deg`, `pressure_hpa`, `dew_point_c`, `cloud_cover_pct`, `cape_jkg`, `visibility_m`, `wind_u`, `wind_v`, `hour_sin`, `hour_cos`, `month_sin`, `month_cos`, `dew_point_depression`, `pressure_change_3h`, `precipitation_lag_1h` \|
	\| Target \| `is_rain_event` ∈ {0, 1} — defined as `precipitation(t+1h) > 0.1 mm` \|
	\| Output \| `predict_proba(...)[:, 1]` — calibrated probability of rain in the next hour \|
	\| Author / Contact \| Li Zhenyue (`KyoukoLi`), Faculty of Information Science & Technology, UKM \|
	\| Licence \| MIT (see `LICENSE`) \|

	---

	## 2. Intended Use

	* Primary use case: terrain-aware rain-risk decision support inside the MicroClimate-X hybrid pipeline. The RF probability is one input among many — the topographic Rule Engine has final authority (Veto cascade + R1-R4 decision table).
	* Intended users: hikers, drivers, construction crews, and other outdoor decision makers in complex terrain (initially Malaysian mountain regions).
	* Out-of-scope uses:
	* Lightning forecasting (CAPE → thunderstorm risk is handled by the rule engine sub-scorer, not by this model).
	* Multi-hour quantitative precipitation forecasting.
	* Aviation, marine, or any life-critical use without the Rule Engine veto layer in the loop.

	---

	## 3. Training Data

	\| Field \| Value \|
	\|---\|---\|
	\| Source \| ECMWF ERA5 Reanalysis (via Open-Meteo Historical Archive API) \|
	\| Spatial coverage \| 5 mountain sites in West Malaysia (Genting, Cameron, Brinchang, Korbu, Kinabalu) \|
	\| Temporal coverage \| 2019-01-01 → 2024-12-31 (5 years, hourly) \|
	\| Total rows \| 175 315 \|
	\| Class balance \| 29.2 % positive (rain-event), 70.8 % negative \|
	\| Train / test split \| Time-based; 80 % oldest → train; 20 % newest → test. No random shuffling — would leak temporal autocorrelation. \|
	\| Synthetic fallback \| `scripts/1b_synth_dataset.py` generates a physically-plausible synthetic replacement when the Open-Meteo API is unreachable. The synthetic data set has the same schema and is sufficient for end-to-end pipeline verification but should not be used to ship a production model. \|

	---

	## 4. Evaluation — Held-out 20 % temporal test set (n = 35 063)

	Numbers below come from `figures/evaluation_summary.json`, reproducible via `make evaluate`.

	### 4.1 Discrimination

	\| Metric \| Value \|
	\|---\|---\|
	\| ROC AUC \| 0.871 \|
	\| PR Average Precision \| 0.750 \|
	\| Test-set base rate \| 0.292 \|

	### 4.2 Calibration

	\| Metric \| Value \|
	\|---\|---\|
	\| Brier score \| 0.138 (lower is better; 0 is perfect, 0.25 is random) \|

	The reliability diagram (`figures/03_calibration_curve.png`) shows the predicted probability tracks the empirical frequency closely; no post-hoc calibration (Platt / isotonic) was deemed necessary.

	### 4.3 Operating point — safety-critical threshold

	\| Threshold τ \| F1 \| F2 \| Precision \| Recall \|
	\|---\|---\|---\|---\|---\|
	\| 0.50 (default) \| 0.696 \| 0.694 \| 0.700 \| 0.692 \|
	\| 0.20 (chosen) \| 0.621 \| 0.778 \| 0.466 \| 0.934 \|

	We adopt τ = 0.20 because the application is safety-critical: a missed rain event (false negative) on a windward slope can cascade into orographic flash flooding. F2 weights recall 4× higher than precision and is the appropriate metric for this regime (Sasaki, 2007).

	### 4.4 Confusion matrix at τ = 0.20

	\| \| Pred = 0 \| Pred = 1 \|
	\|---\|---\|---\|
	\| True = 0 \| 13 877 (TN) \| 10 950 (FP) \|
	\| True = 1 \| 679 (FN) \| 9 557 (TP) \|

	Recall = 9 557 / (9 557 + 679) = 93.4 % — the operationally important metric for "do not let people walk into a storm".

	### 4.5 Top feature importances

	1. `precipitation_lag_1h` — recent rain is by far the strongest signal (rain begets rain).
	2. `hour_cos` / `hour_sin` — diurnal cycle (afternoon convective storms in tropical climates).
	3. `pressure_change_3h` — falling pressure is a classical storm precursor.
	4. `wind_v` — meridional wind component, relevant for monsoon-driven precipitation.
	5. `dew_point_c` / `dew_point_depression` / `temperature_c` — moisture saturation indicators.

	---

	## 5. Quantitative Limitations

	* Geographic generalisation — the model has only seen West Malaysian mountains. Hindcast validation in other tropical mountainous regions is a planned thesis Chapter 5 contribution; until then, the Rule Engine Veto cascade is the only safety net for out-of-distribution coordinates (e.g. Himalayas).
	* Convective forecasting — the model uses current-hour features to predict next-hour rain. Forecasting horizon > 1 h would degrade accuracy substantially.
	* Class imbalance — addressed via `class_weight='balanced'` and the F2-optimal threshold, but precision at τ = 0.20 is moderate (47 %). False positives are tolerable because they only inflate the rainfall sub-score; the composite-score formula combines this with three other hazards.
	* Calibration drift — Brier = 0.138 in 2024 hold-out. Calibration should be re-checked annually as climate signals shift.

	---

	## 6. Ethical / Safety Considerations

	* Decision-support only. The system is explicitly not a substitute for official meteorological forecasts; the disclaimer is shown in every UI footer.
	* Hidden risk surfaced, not hidden. The R1 decision-table rule deliberately raises an alarm when macro model probability is low but local terrain inputs suggest hidden orographic rain — this is the OPPOSITE of the harmful failure mode where ML over-confidently says "safe".
	* Mt-Everest test (worst-case OOD). When fed coordinates the model has never seen, the RF returns ~0 % rain probability — and the Rule Engine then immediately vetoes on `altitude_hypoxia + extreme_cold + gale_wind`. See `tests/test_rule_engine.py::test_mt_everest_veto_hypoxia`.

	---

	## 7. Reproducibility

	```bash
	# Full pipeline from scratch — works offline via the synthetic dataset.
	make install-dev
	make synth # OR: download real data via scripts/1_download_dataset.py
	make preprocess
	make train
	make evaluate # writes figures/*.png + figures/evaluation_summary.json
	```

	The seed is fixed (`random_state=42`) and figures are written to `figures/` so the thesis can pull them in directly.

	---

	## 8. Citation

	If you reference this model in academic work, please cite:

	> Li Zhenyue (KyoukoLi). MicroClimate-X: A Hybrid Microclimate Risk Engine for Complex Terrain. Bachelor's Thesis, Universiti Kebangsaan Malaysia, Faculty of Information Science & Technology, 2026. GitHub: <https://github.com/KyoukoLi/microclimate-x>

	# Model Card — MicroClimate-X Rain Predictor (Random Forest v1.0)

	> Following the Model Card methodology of Mitchell et al. (2019).
	> Authored: 2026-05-11 · UKM Final Year Project · KyoukoLi

	---

	## 1. Model Details

	\| Field \| Value \|
	\|---\|---\|
	\| Model name \| MicroClimate-X RF Rain Predictor \|
	\| Version \| 1.0.0 \|
	\| Architecture \| `sklearn.ensemble.RandomForestClassifier` \|
	\| Hyper-parameters \| `n_estimators=200, max_depth=None, class_weight='balanced', n_jobs=-1, random_state=42` \|
	\| Features (n=18) \| `elevation_m`, `temperature_c`, `humidity_pct`, `wind_speed_kmh`, `wind_direction_deg`, `pressure_hpa`, `dew_point_c`, `cloud_cover_pct`, `cape_jkg`, `visibility_m`, `wind_u`, `wind_v`, `hour_sin`, `hour_cos`, `month_sin`, `month_cos`, `dew_point_depression`, `pressure_change_3h`, `precipitation_lag_1h` \|
	\| Target \| `is_rain_event` ∈ {0, 1} — defined as `precipitation(t+1h) > 0.1 mm` \|
	\| Output \| `predict_proba(...)[:, 1]` — calibrated probability of rain in the next hour \|
	\| Author / Contact \| Li Zhenyue (`KyoukoLi`), Faculty of Information Science & Technology, UKM \|
	\| Licence \| MIT (see `LICENSE`) \|

	---

	## 2. Intended Use

	* Primary use case: terrain-aware rain-risk decision support inside the MicroClimate-X hybrid pipeline. The RF probability is one input among many — the topographic Rule Engine has final authority (Veto cascade + R1-R4 decision table).
	* Intended users: hikers, drivers, construction crews, and other outdoor decision makers in complex terrain (initially Malaysian mountain regions).
	* Out-of-scope uses:
	* Lightning forecasting (CAPE → thunderstorm risk is handled by the rule engine sub-scorer, not by this model).
	* Multi-hour quantitative precipitation forecasting.
	* Aviation, marine, or any life-critical use without the Rule Engine veto layer in the loop.

	---

	## 3. Training Data

	\| Field \| Value \|
	\|---\|---\|
	\| Source \| ECMWF ERA5 Reanalysis (via Open-Meteo Historical Archive API) \|
	\| Spatial coverage \| 5 mountain sites in West Malaysia (Genting, Cameron, Brinchang, Korbu, Kinabalu) \|
	\| Temporal coverage \| 2019-01-01 → 2024-12-31 (5 years, hourly) \|
	\| Total rows \| 175 315 \|
	\| Class balance \| 29.2 % positive (rain-event), 70.8 % negative \|
	\| Train / test split \| Time-based; 80 % oldest → train; 20 % newest → test. No random shuffling — would leak temporal autocorrelation. \|
	\| Synthetic fallback \| `scripts/1b_synth_dataset.py` generates a physically-plausible synthetic replacement when the Open-Meteo API is unreachable. The synthetic data set has the same schema and is sufficient for end-to-end pipeline verification but should not be used to ship a production model. \|

	---

	## 4. Evaluation — Held-out 20 % temporal test set (n = 35 063)

	Numbers below come from `figures/evaluation_summary.json`, reproducible via `make evaluate`.

	### 4.1 Discrimination

	\| Metric \| Value \|
	\|---\|---\|
	\| ROC AUC \| 0.871 \|
	\| PR Average Precision \| 0.750 \|
	\| Test-set base rate \| 0.292 \|

	### 4.2 Calibration

	\| Metric \| Value \|
	\|---\|---\|
	\| Brier score \| 0.138 (lower is better; 0 is perfect, 0.25 is random) \|

	The reliability diagram (`figures/03_calibration_curve.png`) shows the predicted probability tracks the empirical frequency closely; no post-hoc calibration (Platt / isotonic) was deemed necessary.

	### 4.3 Operating point — safety-critical threshold

	\| Threshold τ \| F1 \| F2 \| Precision \| Recall \|
	\|---\|---\|---\|---\|---\|
	\| 0.50 (default) \| 0.696 \| 0.694 \| 0.700 \| 0.692 \|
	\| 0.20 (chosen) \| 0.621 \| 0.778 \| 0.466 \| 0.934 \|

	We adopt τ = 0.20 because the application is safety-critical: a missed rain event (false negative) on a windward slope can cascade into orographic flash flooding. F2 weights recall 4× higher than precision and is the appropriate metric for this regime (Sasaki, 2007).

	### 4.4 Confusion matrix at τ = 0.20

	\| \| Pred = 0 \| Pred = 1 \|
	\|---\|---\|---\|
	\| True = 0 \| 13 877 (TN) \| 10 950 (FP) \|
	\| True = 1 \| 679 (FN) \| 9 557 (TP) \|

	Recall = 9 557 / (9 557 + 679) = 93.4 % — the operationally important metric for "do not let people walk into a storm".

	### 4.5 Top feature importances

	1. `precipitation_lag_1h` — recent rain is by far the strongest signal (rain begets rain).
	2. `hour_cos` / `hour_sin` — diurnal cycle (afternoon convective storms in tropical climates).
	3. `pressure_change_3h` — falling pressure is a classical storm precursor.
	4. `wind_v` — meridional wind component, relevant for monsoon-driven precipitation.
	5. `dew_point_c` / `dew_point_depression` / `temperature_c` — moisture saturation indicators.

	---

	## 5. Quantitative Limitations

	* Geographic generalisation — the model has only seen West Malaysian mountains. Hindcast validation in other tropical mountainous regions is a planned thesis Chapter 5 contribution; until then, the Rule Engine Veto cascade is the only safety net for out-of-distribution coordinates (e.g. Himalayas).
	* Convective forecasting — the model uses current-hour features to predict next-hour rain. Forecasting horizon > 1 h would degrade accuracy substantially.
	* Class imbalance — addressed via `class_weight='balanced'` and the F2-optimal threshold, but precision at τ = 0.20 is moderate (47 %). False positives are tolerable because they only inflate the rainfall sub-score; the composite-score formula combines this with three other hazards.
	* Calibration drift — Brier = 0.138 in 2024 hold-out. Calibration should be re-checked annually as climate signals shift.

	---

	## 6. Ethical / Safety Considerations

	* Decision-support only. The system is explicitly not a substitute for official meteorological forecasts; the disclaimer is shown in every UI footer.
	* Hidden risk surfaced, not hidden. The R1 decision-table rule deliberately raises an alarm when macro model probability is low but local terrain inputs suggest hidden orographic rain — this is the OPPOSITE of the harmful failure mode where ML over-confidently says "safe".
	* Mt-Everest test (worst-case OOD). When fed coordinates the model has never seen, the RF returns ~0 % rain probability — and the Rule Engine then immediately vetoes on `altitude_hypoxia + extreme_cold + gale_wind`. See `tests/test_rule_engine.py::test_mt_everest_veto_hypoxia`.

	---

	## 7. Reproducibility

	```bash
	# Full pipeline from scratch — works offline via the synthetic dataset.
	make install-dev
	make synth # OR: download real data via scripts/1_download_dataset.py
	make preprocess
	make train
	make evaluate # writes figures/*.png + figures/evaluation_summary.json
	```

	The seed is fixed (`random_state=42`) and figures are written to `figures/` so the thesis can pull them in directly.

	---

	## 8. Citation

	If you reference this model in academic work, please cite:

	> Li Zhenyue (KyoukoLi). MicroClimate-X: A Hybrid Microclimate Risk Engine for Complex Terrain. Bachelor's Thesis, Universiti Kebangsaan Malaysia, Faculty of Information Science & Technology, 2026. GitHub: <https://github.com/KyoukoLi/microclimate-x>