Model Card — MicroClimate-X Rain Predictor (Random Forest v1.0)

Following the Model Card methodology of Mitchell et al. (2019). Authored: 2026-05-11 · UKM Final Year Project · KyoukoLi

1. Model Details

Field	Value
Model name	MicroClimate-X RF Rain Predictor
Version	1.0.0
Architecture	`sklearn.ensemble.RandomForestClassifier`
Hyper-parameters	`n_estimators=200, max_depth=None, class_weight='balanced', n_jobs=-1, random_state=42`
Features (n=19)	`elevation_m`, `temperature_c`, `humidity_pct`, `wind_speed_kmh`, `wind_direction_deg`, `pressure_hpa`, `dew_point_c`, `cloud_cover_pct`, `cape_jkg`, `visibility_m`, `wind_u`, `wind_v`, `hour_sin`, `hour_cos`, `month_sin`, `month_cos`, `dew_point_depression`, `pressure_change_3h`, `precipitation_lag_1h`
Target	`is_rain_event` ∈ {0, 1} — defined as `precipitation(t+1h) > 0.1 mm`
Output	`predict_proba(...)[:, 1]` — calibrated probability of rain in the next hour
Author / Contact	Li Zhenyue (`KyoukoLi`), Faculty of Information Science & Technology, UKM
Licence	MIT (see `LICENSE`)
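
Several of the listed features are derived rather than raw. The sketch below shows one plausible way to compute the cyclical time encodings, the wind components, and the dew-point depression. The helper name `derive_features` and the exact formulas are assumptions based on common meteorological conventions; they are not taken from the project's preprocessing code.

```python
import math
from datetime import datetime


def derive_features(ts: datetime, temperature_c: float, dew_point_c: float,
                    wind_speed_kmh: float, wind_direction_deg: float) -> dict:
    """Hypothetical helper: derive the engineered features named in the table.

    Assumes the meteorological convention that wind direction is the
    direction the wind blows FROM, measured clockwise from north.
    """
    hour_angle = 2 * math.pi * ts.hour / 24
    month_angle = 2 * math.pi * (ts.month - 1) / 12
    direction_rad = math.radians(wind_direction_deg)
    return {
        # Cyclical encodings keep 23:00 close to 00:00 and December close to January.
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        "month_sin": math.sin(month_angle),
        "month_cos": math.cos(month_angle),
        # Zonal (u, eastward) and meridional (v, northward) wind components.
        "wind_u": -wind_speed_kmh * math.sin(direction_rad),
        "wind_v": -wind_speed_kmh * math.cos(direction_rad),
        # Small values indicate air close to saturation.
        "dew_point_depression": temperature_c - dew_point_c,
    }


# A 10 km/h westerly wind at 15:00 on 1 June.
features = derive_features(datetime(2024, 6, 1, 15), 24.0, 21.5, 10.0, 270.0)
print({name: round(value, 3) for name, value in features.items()})
```

Under this convention a wind from the west (270°) yields a positive `wind_u` and a `wind_v` of approximately zero.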

2. Intended Use

Primary use case: terrain-aware rain-risk decision support inside the MicroClimate-X hybrid pipeline. The RF probability is one input among many — the topographic Rule Engine has final authority (Veto cascade + R1-R4 decision table).
Intended users: hikers, drivers, construction crews, and other outdoor decision makers in complex terrain (initially Malaysian mountain regions).
Out-of-scope uses:
- Lightning forecasting (CAPE → thunderstorm risk is handled by the rule engine sub-scorer, not by this model).
- Multi-hour quantitative precipitation forecasting.
- Aviation, marine, or any life-critical use without the Rule Engine veto layer in the loop.

3. Training Data

Field	Value
Source	ECMWF ERA5 Reanalysis (via Open-Meteo Historical Archive API)
Spatial coverage	5 mountain sites in Malaysia: Genting, Cameron, Brinchang and Korbu in Peninsular (West) Malaysia, and Kinabalu in Sabah (East Malaysia)
Temporal coverage	2019-01-01 → 2024-12-31 (6 calendar years, hourly)
Total rows	175 315
Class balance	29.2 % positive (rain-event), 70.8 % negative
Train / test split	Time-based: the oldest 80 % of rows form the training set and the newest 20 % form the test set. Rows are not shuffled, because random shuffling would leak temporally autocorrelated samples across the split.
Synthetic fallback	`scripts/1b_synth_dataset.py` generates a physically plausible synthetic replacement when the Open-Meteo API is unreachable. The synthetic dataset has the same schema and is sufficient for end-to-end pipeline verification, but it should not be used to ship a production model.
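
The time-based split can be sketched as follows. This is a minimal illustration, not the project's training script: the function name `temporal_split` and the `(timestamp, record)` row layout are assumptions made for the example.

```python
from datetime import datetime, timedelta


def temporal_split(rows, train_fraction_pct=80):
    """Split rows chronologically: oldest part to train, newest part to test.

    `rows` is a list of (timestamp, record) pairs; this layout is an
    assumption made for the sketch.
    """
    ordered = sorted(rows, key=lambda row: row[0])
    # Integer arithmetic avoids floating-point rounding at the boundary.
    n_train = len(ordered) * train_fraction_pct // 100
    return ordered[:n_train], ordered[n_train:]


# Toy data: 10 hourly records supplied in scrambled order.
start = datetime(2024, 1, 1)
rows = [(start + timedelta(hours=h), {"hour_index": h}) for h in (3, 0, 9, 5, 1, 7, 2, 8, 4, 6)]
train, test = temporal_split(rows)

# Every training timestamp precedes every test timestamp.
assert max(ts for ts, _ in train) < min(ts for ts, _ in test)
print(len(train), len(test))  # prints: 8 2
```

With 175 315 rows the same integer arithmetic gives 140 252 training rows and 35 063 test rows. For multi-site data, splitting on a cutoff timestamp instead of a row index would keep all sites' rows for a given hour on the same side of the split.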

4. Evaluation — Held-out 20 % temporal test set (n = 35 063)

Numbers below come from `figures/evaluation_summary.json`, reproducible via `make evaluate`.

4.1 Discrimination

Metric	Value
ROC AUC	0.871
PR Average Precision	0.750
Test-set base rate	0.292

4.2 Calibration

Metric	Value
Brier score	0.138 (lower is better; 0 is perfect, a constant 0.5 forecast scores 0.25, and always forecasting the 0.292 base rate scores about 0.207)

The reliability diagram (`figures/03_calibration_curve.png`) shows that the predicted probability tracks the empirical frequency closely, so no post-hoc calibration (Platt / isotonic) was deemed necessary.
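
To put the Brier score in context, the sketch below computes the score of two uninformative constant forecasts at the reported base rate, plus a Brier skill score relative to climatology. The skill score is not reported in the evaluation summary; it is derived here purely from the two numbers quoted in this card, and the function name is hypothetical.

```python
def brier_constant_forecast(base_rate: float, forecast: float) -> float:
    """Expected Brier score when the same probability is issued for every sample."""
    return base_rate * (1.0 - forecast) ** 2 + (1.0 - base_rate) * forecast ** 2


BASE_RATE = 0.292    # test-set base rate from Section 4.1
MODEL_BRIER = 0.138  # reported Brier score

coin_flip = brier_constant_forecast(BASE_RATE, 0.5)          # 0.25 for any base rate
climatology = brier_constant_forecast(BASE_RATE, BASE_RATE)  # base_rate * (1 - base_rate)
skill = 1.0 - MODEL_BRIER / climatology                      # > 0 means better than climatology

print(f"constant 0.5 forecast: {coin_flip:.3f}")
print(f"climatology forecast:  {climatology:.3f}")
print(f"Brier skill score:     {skill:.3f}")
```

Under these inputs the climatology baseline comes out near 0.207 and the skill score near 0.33, so the model improves on always forecasting the base rate by roughly a third.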

4.3 Operating point — safety-critical threshold

Threshold τ	F1	F2	Precision	Recall
0.50 (default)	0.696	0.694	0.700	0.692
0.20 (chosen)	0.621	0.778	0.466	0.934

We adopt τ = 0.20 because the application is safety-critical: a missed rain event (false negative) on a windward slope can cascade into orographic flash flooding. F2 (β = 2) places four times the weight on recall as on precision in the weighted harmonic mean (β² = 4), which corresponds to valuing recall twice as much as precision, and is the appropriate metric for this regime (Sasaki, 2007).
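
The F1 and F2 columns can be recomputed from the precision and recall columns with the standard F-beta definition, F_β = (1 + β²)·P·R / (β²·P + R). The sketch below does this for both thresholds; the function name `f_beta` is chosen for this example.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """F-beta score: weighted harmonic mean of precision and recall."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# (precision, recall) pairs copied from the operating-point table.
operating_points = {0.50: (0.700, 0.692), 0.20: (0.466, 0.934)}

for tau, (precision, recall) in operating_points.items():
    f1 = f_beta(precision, recall, beta=1)
    f2 = f_beta(precision, recall, beta=2)
    print(f"tau={tau:.2f}  F1={f1:.3f}  F2={f2:.3f}")
```

Because the inputs are already rounded to three decimals, the recomputed scores agree with the table to within 0.001 rather than digit for digit.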

4.4 Confusion matrix at τ = 0.20

	Pred = 0	Pred = 1
True = 0	13 877 (TN)	10 950 (FP)
True = 1	679 (FN)	9 557 (TP)

Recall = 9 557 / (9 557 + 679) = 93.4 % — the operationally important metric for "do not let people walk into a storm".
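
The remaining headline numbers follow from the same four counts. The short check below recomputes them; the false alarm rate is not quoted elsewhere in this card and is derived here only from the matrix.

```python
# Counts copied from the confusion matrix at tau = 0.20.
tn, fp, fn, tp = 13_877, 10_950, 679, 9_557

total = tn + fp + fn + tp
recall = tp / (tp + fn)            # share of rain events that were flagged
precision = tp / (tp + fp)         # share of alarms that were followed by rain
false_alarm_rate = fp / (fp + tn)  # share of dry hours that raised an alarm
base_rate = (tp + fn) / total      # prevalence of rain events in the test set

print(f"n={total}  recall={recall:.3f}  precision={precision:.3f}")
print(f"false alarm rate={false_alarm_rate:.3f}  base rate={base_rate:.3f}")
```

The total matches the test-set size (35 063) and the base rate matches Section 4.1 (0.292). The price of the high recall is that roughly 44 % of dry hours also trigger an alert.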

4.5 Top feature importances

precipitation_lag_1h — recent rain is by far the strongest signal (rain begets rain).
hour_cos / hour_sin — diurnal cycle (afternoon convective storms in tropical climates).
pressure_change_3h — falling pressure is a classical storm precursor.
wind_v — meridional wind component, relevant for monsoon-driven precipitation.
dew_point_c / dew_point_depression / temperature_c — moisture saturation indicators.

5. Quantitative Limitations

Geographic generalisation: the model has only seen the five Malaysian training sites. Hindcast validation in other tropical mountainous regions is a planned thesis Chapter 5 contribution; until then, the Rule Engine Veto cascade is the only safety net for out-of-distribution coordinates (e.g. the Himalayas).
Convective forecasting: the model uses current-hour features to predict next-hour rain. It is trained and evaluated for a 1 h horizon only; longer horizons are expected to degrade accuracy substantially.
Class imbalance: addressed via `class_weight='balanced'` and the F2-optimal threshold, but precision at τ = 0.20 is moderate (47 %). False positives are tolerable because they only inflate the rainfall sub-score; the composite-score formula combines this with three other hazards.
Calibration drift: Brier = 0.138 on the 2024 hold-out. Calibration should be re-checked annually as climate signals shift.
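
For reference, scikit-learn documents the `'balanced'` mode as weighting each class by `n_samples / (n_classes * count_of_class)`. The sketch below evaluates that formula by hand. The counts are illustrative only: they are taken from the test-set confusion matrix, whereas the real weights are computed from the training labels, which this card does not list.

```python
def balanced_class_weights(class_counts: dict) -> dict:
    """Per-class weight: n_samples / (n_classes * count_of_class)."""
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: n_samples / (n_classes * count) for label, count in class_counts.items()}


# Illustrative counts only (test-set confusion matrix, not training labels):
# negatives = TN + FP, positives = FN + TP.
counts = {0: 13_877 + 10_950, 1: 679 + 9_557}
weights = balanced_class_weights(counts)
print({label: round(weight, 3) for label, weight in weights.items()})
```

At a rain-event share of about 29 %, each positive sample carries roughly 1.7 times unit weight and each negative roughly 0.7 times, so the two classes contribute equally in total.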

6. Ethical / Safety Considerations

Decision-support only. The system is explicitly not a substitute for official meteorological forecasts; the disclaimer is shown in every UI footer.
Hidden risk surfaced, not hidden. The R1 decision-table rule deliberately raises an alarm when macro model probability is low but local terrain inputs suggest hidden orographic rain — this is the OPPOSITE of the harmful failure mode where ML over-confidently says "safe".
Mt-Everest test (worst-case OOD). When fed coordinates the model has never seen, the RF returns ~0 % rain probability, and the Rule Engine then immediately vetoes on `altitude_hypoxia` + `extreme_cold` + `gale_wind`. See `tests/test_rule_engine.py::test_mt_everest_veto_hypoxia`.
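
The "rules have final authority" idea can be sketched as below. This is an illustration under stated assumptions, not the project's Rule Engine: the veto names come from this card, but the numeric limits, the `Decision` class, and the `decide` function are hypothetical placeholders.

```python
from dataclasses import dataclass

# HYPOTHETICAL limits for illustration only; the project's real thresholds
# live in its Rule Engine and are not stated in this card.
HYPOTHETICAL_LIMITS = {"elevation_m": 5500.0, "temperature_c": -20.0, "wind_speed_kmh": 90.0}


@dataclass
class Decision:
    vetoes: list
    rain_alert: bool

    @property
    def safe(self) -> bool:
        # "Safe" requires both no veto and no rain alert.
        return not self.vetoes and not self.rain_alert


def decide(rain_probability: float, elevation_m: float, temperature_c: float,
           wind_speed_kmh: float, tau: float = 0.20,
           limits: dict = HYPOTHETICAL_LIMITS) -> Decision:
    """Rules run regardless of the ML output, so a low probability cannot clear a veto."""
    vetoes = []
    if elevation_m > limits["elevation_m"]:
        vetoes.append("altitude_hypoxia")
    if temperature_c < limits["temperature_c"]:
        vetoes.append("extreme_cold")
    if wind_speed_kmh > limits["wind_speed_kmh"]:
        vetoes.append("gale_wind")
    return Decision(vetoes=vetoes, rain_alert=rain_probability >= tau)


# Everest-like inputs (illustrative values): the RF says "no rain", the rules still veto.
everest = decide(rain_probability=0.0, elevation_m=8849.0, temperature_c=-30.0, wind_speed_kmh=120.0)
print(everest.vetoes, everest.safe)
```

The design point is that the veto checks never read the model probability, so an over-confident "0 % rain" on out-of-distribution inputs cannot produce a "safe" result on its own.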

7. Reproducibility

# Full pipeline from scratch — works offline via the synthetic dataset.
make install-dev
make synth          # OR: download real data via scripts/1_download_dataset.py
make preprocess
make train
make evaluate       # writes figures/*.png + figures/evaluation_summary.json

The seed is fixed (`random_state=42`) and figures are written to `figures/` so the thesis can pull them in directly.

8. Citation

If you reference this model in academic work, please cite:

Li Zhenyue (KyoukoLi). MicroClimate-X: A Hybrid Microclimate Risk Engine for Complex Terrain. Bachelor's Thesis, Universiti Kebangsaan Malaysia, Faculty of Information Science & Technology, 2026. GitHub: https://github.com/KyoukoLi/microclimate-x