# Model Card: MicroClimate-X Rain Predictor (Random Forest v1.0)
> Following the *Model Card* methodology of Mitchell et al. (2019).
> Authored: 2026-05-11 · UKM Final Year Project · KyoukoLi
---
## 1. Model Details
| Field | Value |
|---|---|
| **Model name** | MicroClimate-X RF Rain Predictor |
| **Version** | 1.0.0 |
| **Architecture** | `sklearn.ensemble.RandomForestClassifier` |
| **Hyper-parameters** | `n_estimators=200, max_depth=None, class_weight='balanced', n_jobs=-1, random_state=42` |
| **Features (n=19)** | `elevation_m`, `temperature_c`, `humidity_pct`, `wind_speed_kmh`, `wind_direction_deg`, `pressure_hpa`, `dew_point_c`, `cloud_cover_pct`, `cape_jkg`, `visibility_m`, `wind_u`, `wind_v`, `hour_sin`, `hour_cos`, `month_sin`, `month_cos`, `dew_point_depression`, `pressure_change_3h`, `precipitation_lag_1h` |
| **Target** | `is_rain_event` ∈ {0, 1}, defined as `precipitation(t+1h) > 0.1 mm` |
| **Output** | `predict_proba(...)[:, 1]`: probability of rain in the next hour, used without post-hoc calibration (see Section 4.2) |
| **Author / Contact** | Li Zhenyue (`KyoukoLi`), Faculty of Information Science & Technology, UKM |
| **Licence** | MIT (see `LICENSE`) |
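
Several of the listed features are derived rather than raw. The sketch below shows one plausible way to compute the cyclical time encodings, the wind components, the dew-point depression, and the next-hour target. The helper names, the meteorological sign convention for `wind_u` / `wind_v`, and the 24-hour and 12-month periods are assumptions made for illustration; the repository's preprocessing script may do this differently.

```python
import math

RAIN_THRESHOLD_MM = 0.1  # from the target definition: precipitation(t+1h) > 0.1 mm


def cyclical(value, period):
    """Map a periodic quantity (hour, month) onto the unit circle."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def wind_components(speed_kmh, direction_deg):
    """Split wind into (u, v), assuming the meteorological convention
    in which the direction is where the wind blows FROM."""
    rad = math.radians(direction_deg)
    return -speed_kmh * math.sin(rad), -speed_kmh * math.cos(rad)


def dew_point_depression(temperature_c, dew_point_c):
    """Air temperature minus dew point; small values mean near-saturated air."""
    return temperature_c - dew_point_c


def next_hour_labels(precip_mm):
    """Label hour t with 1 if precipitation at t+1 exceeds the threshold.
    The final hour has no t+1 observation, so it gets no label."""
    return [int(nxt > RAIN_THRESHOLD_MM) for nxt in precip_mm[1:]]


hour_sin, hour_cos = cyclical(15, 24)        # 15:00
wind_u, wind_v = wind_components(10.0, 0.0)  # 10 km/h wind from the north
depression = dew_point_depression(25.0, 20.0)
labels = next_hour_labels([0.0, 0.3, 0.05, 0.2])
```

Encoding the hour as a sine/cosine pair keeps 23:00 and 00:00 adjacent in feature space, which a raw integer hour would not.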
---
## 2. Intended Use
* **Primary use case**: terrain-aware rain-risk decision support inside the MicroClimate-X *hybrid* pipeline. The RF probability is one input among many; the topographic Rule Engine has *final authority* (Veto cascade + R1-R4 decision table).
* **Intended users**: hikers, drivers, construction crews, and other outdoor decision makers in complex terrain (initially Malaysian mountain regions).
* **Out-of-scope uses**:
  * Lightning forecasting (CAPE → thunderstorm risk is handled by the rule engine sub-scorer, not by this model).
  * Multi-hour quantitative precipitation forecasting.
  * Aviation, marine, or any life-critical use without the Rule Engine veto layer in the loop.
---
## 3. Training Data
| Field | Value |
|---|---|
| **Source** | ECMWF ERA5 Reanalysis (via Open-Meteo Historical Archive API) |
| **Spatial coverage** | 5 mountain sites in Malaysia: Genting, Cameron, Brinchang, and Korbu (Peninsular Malaysia) and Kinabalu (Sabah, East Malaysia) |
| **Temporal coverage** | 2019-01-01 to 2024-12-31 (5 years, hourly) |
| **Total rows** | 175 315 |
| **Class balance** | 29.2 % positive (rain-event), 70.8 % negative |
| **Train / test split** | Time-based: oldest 80 % → train, newest 20 % → test. **No random shuffling**, since shuffling would leak information across the split through temporal autocorrelation. |
| **Synthetic fallback** | `scripts/1b_synth_dataset.py` generates a physically-plausible synthetic replacement when the Open-Meteo API is unreachable. The synthetic data set has the same schema and is sufficient for end-to-end pipeline verification but should **not** be used to ship a production model. |
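
A minimal sketch of the time-based split described in the table, assuming each row carries a `timestamp` field and that a single global cut is applied across all sites (the card does not say whether the cut is global or per site). The function names are hypothetical.

```python
from datetime import datetime, timedelta


def split_index(n_rows, train_pct=80):
    """Number of rows assigned to the training set (integer arithmetic)."""
    return n_rows * train_pct // 100


def temporal_split(rows, train_pct=80):
    """Sort by timestamp, then cut once: oldest rows train, newest rows test."""
    ordered = sorted(rows, key=lambda r: r["timestamp"])
    cut = split_index(len(ordered), train_pct)
    return ordered[:cut], ordered[cut:]


# Ten hourly toy rows, deliberately out of order.
start = datetime(2024, 1, 1)
rows = [{"timestamp": start + timedelta(hours=h), "is_rain_event": h % 2}
        for h in (3, 0, 7, 1, 9, 4, 2, 8, 6, 5)]
train, test = temporal_split(rows)

n_train = split_index(175_315)  # total row count from the table above
n_test = 175_315 - n_train
```

With the 175 315 rows reported above, this cut leaves 35 063 test rows, which matches the test-set size quoted in Section 4.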
---
## 4. Evaluation: held-out 20 % temporal test set (n = 35 063)
Numbers below come from `figures/evaluation_summary.json`, reproducible via `make evaluate`.
### 4.1 Discrimination
| Metric | Value |
|---|---|
| ROC AUC | **0.871** |
| PR Average Precision | **0.750** |
| Test-set base rate | 0.292 |
### 4.2 Calibration
| Metric | Value |
|---|---|
| Brier score | **0.138** (lower is better; 0 is perfect; always predicting 0.5 scores 0.25, and always predicting the 0.292 base rate scores about 0.207) |

The reliability diagram (`figures/03_calibration_curve.png`) shows that the predicted probability tracks the empirical frequency closely; no post-hoc calibration (Platt / isotonic) was deemed necessary.
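
For context on the Brier score, the sketch below computes it from scratch and evaluates two constant-forecast reference points using the 0.292 test-set base rate. The function names are illustrative and are not taken from the repository.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def constant_forecast_brier(base_rate, p):
    """Expected Brier score of always predicting probability p
    when a fraction base_rate of the outcomes are positive."""
    return base_rate * (1.0 - p) ** 2 + (1.0 - base_rate) * p ** 2


perfect = brier_score([1, 0, 1], [1.0, 0.0, 1.0])       # 0.0
coin_flip = constant_forecast_brier(0.292, 0.5)         # 0.25
climatology = constant_forecast_brier(0.292, 0.292)     # about 0.207

print(f"perfect={perfect:.3f} coin_flip={coin_flip:.3f} "
      f"climatology={climatology:.3f}")
```

The reported 0.138 sits below both reference points, so the model is more informative than a forecast that always states the base rate.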
### 4.3 Operating point: safety-critical threshold
| Threshold τ | F1 | F2 | Precision | Recall |
|---|---|---|---|---|
| 0.50 (default) | 0.696 | 0.694 | 0.700 | 0.692 |
| **0.20 (chosen)** | 0.621 | **0.778** | 0.466 | **0.934** |

We adopt **τ = 0.20** because the application is **safety-critical**: a missed rain event (false negative) on a windward slope can leave users exposed to orographic flash flooding. F2 is the F-beta score with β = 2, which treats recall as twice as important as precision (recall errors enter the formula with weight β² = 4), and is the appropriate metric for this regime (Sasaki, 2007).
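
One way such a threshold can be chosen is a grid sweep that keeps the value with the highest F2. The sketch below assumes this approach; the card does not document the actual selection procedure, and the labels and probabilities are toy values invented for illustration.

```python
def fbeta_at(y_true, y_prob, tau, beta=2.0):
    """F-beta when predicting rain wherever probability >= tau."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= tau)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= tau)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < tau)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def best_threshold(y_true, y_prob, grid, beta=2.0):
    """Grid value with the highest F-beta (the first one wins on ties)."""
    return max(grid, key=lambda tau: fbeta_at(y_true, y_prob, tau, beta))


# Toy data, not drawn from the MicroClimate-X dataset.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.04, 0.11, 0.27, 0.33, 0.46, 0.57, 0.72, 0.91]
grid = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95

tau_f2 = best_threshold(y_true, y_prob, grid, beta=2.0)
f2_best = fbeta_at(y_true, y_prob, tau_f2)
f2_default = fbeta_at(y_true, y_prob, 0.5)
```

On this toy data the sweep settles below 0.5, because lowering the threshold recovers the one positive that the default threshold misses at the cost of a single extra false positive.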
### 4.4 Confusion matrix at τ = 0.20
| | Pred = 0 | Pred = 1 |
|---|---|---|
| **True = 0** | 13 877 (TN) | 10 950 (FP) |
| **True = 1** | 679 (FN) | 9 557 (TP) |

Recall = 9 557 / (9 557 + 679) = **93.4 %**, the operationally important metric for "do not let people walk into a storm".
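
The headline metrics can be re-derived from the four confusion-matrix counts alone. The sketch below does that arithmetic; the `fbeta` helper is written for illustration and is not the repository's evaluation code.

```python
TN, FP, FN, TP = 13_877, 10_950, 679, 9_557  # counts from the table above


def fbeta(tp, fp, fn, beta):
    """F-beta from raw counts: (1 + b^2) TP / ((1 + b^2) TP + b^2 FN + FP)."""
    b2 = beta * beta
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)


n_total = TN + FP + FN + TP
base_rate = (TP + FN) / n_total
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f2 = fbeta(TP, FP, FN, beta=2.0)

print(f"n={n_total} base_rate={base_rate:.3f}")
print(f"precision={precision:.3f} recall={recall:.3f} F2={f2:.3f}")
# n=35063 base_rate=0.292
# precision=0.466 recall=0.934 F2=0.778
```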
### 4.5 Top feature importances
1. `precipitation_lag_1h`: recent rain is by far the strongest signal (rain begets rain).
2. `hour_cos` / `hour_sin`: diurnal cycle (afternoon convective storms in tropical climates).
3. `pressure_change_3h`: falling pressure is a classical storm precursor.
4. `wind_v`: meridional wind component, relevant for monsoon-driven precipitation.
5. `dew_point_c` / `dew_point_depression` / `temperature_c`: moisture saturation indicators.
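
A ranking like the one above can be read from a fitted forest. The sketch below assumes scikit-learn's impurity-based `feature_importances_` attribute; the card does not say whether impurity-based or permutation importances were used. The data is synthetic and uses three stand-in columns, so the printed values say nothing about the real model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
feature_names = ["precipitation_lag_1h", "pressure_change_3h", "noise"]

# Synthetic stand-in data: the label depends mostly on the first column.
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.2 * X[:, 1] > 0.5).astype(int)

# Hyper-parameters copied from the Model Details table.
model = RandomForestClassifier(
    n_estimators=200, max_depth=None, class_weight="balanced",
    n_jobs=-1, random_state=42,
)
model.fit(X, y)

rain_prob = model.predict_proba(X)[:, 1]  # probability of class 1
ranking = sorted(zip(model.feature_importances_, feature_names), reverse=True)
for importance, name in ranking:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances sum to 1 and can favour continuous, high-cardinality features, so a permutation-based check on held-out data is a common complement.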
---
## 5. Quantitative Limitations
* **Geographic generalisation**: the model has only seen Malaysian mountain sites. Hindcast validation in other tropical mountainous regions is a planned thesis Chapter 5 contribution; until then, the Rule Engine Veto cascade is the only safety net for out-of-distribution coordinates (e.g. the Himalayas).
* **Convective forecasting**: the model uses *current-hour* features to predict *next-hour* rain. A forecasting horizon beyond 1 h would degrade accuracy substantially.
* **Class imbalance**: addressed via `class_weight='balanced'` and the F2-optimal threshold, but precision at τ = 0.20 is moderate (47 %). False positives are tolerable because they only inflate the *rainfall sub-score*; the composite-score formula combines this with three other hazards.
* **Calibration drift**: Brier = 0.138 on the 2024 hold-out. Calibration should be re-checked annually as climate signals shift.
---
## 6. Ethical / Safety Considerations
* **Decision-support only.** The system is explicitly **not** a substitute for official meteorological forecasts; the disclaimer is shown in every UI footer.
* **Hidden risk surfaced, not hidden.** The R1 decision-table rule deliberately raises an alarm when the *macro* model probability is low but local terrain inputs suggest hidden orographic rain. This is the **opposite** of the harmful failure mode where ML over-confidently says "safe".
* **Mt-Everest test (worst-case OOD).** When fed coordinates the model has never seen, the RF returns ~0 % rain probability, and the Rule Engine then immediately vetoes on `altitude_hypoxia + extreme_cold + gale_wind`. See `tests/test_rule_engine.py::test_mt_everest_veto_hypoxia`.
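
The veto-first ordering can be sketched as follows. Only the three veto names and τ = 0.20 come from this card; the numeric limits, the `decide` function, and the status strings are hypothetical. The real Rule Engine, including the R1-R4 decision table, is not reproduced here.

```python
TAU = 0.20  # operating threshold from Section 4.3

# Hypothetical veto limits, chosen only to make the sketch concrete.
VETO_RULES = {
    "altitude_hypoxia": lambda obs: obs["elevation_m"] > 5000,
    "extreme_cold": lambda obs: obs["temperature_c"] < -20,
    "gale_wind": lambda obs: obs["wind_speed_kmh"] > 62,
}


def decide(rf_rain_prob, obs):
    """Vetoes are checked first; the RF probability only matters if none fire."""
    vetoes = [name for name, rule in VETO_RULES.items() if rule(obs)]
    if vetoes:
        return {"status": "VETO", "reasons": vetoes}
    status = "RAIN_RISK" if rf_rain_prob >= TAU else "NO_RAIN_FLAG"
    return {"status": status, "reasons": []}


everest_like = {"elevation_m": 8849, "temperature_c": -30, "wind_speed_kmh": 120}
highland = {"elevation_m": 1500, "temperature_c": 18, "wind_speed_kmh": 12}

everest_result = decide(0.0, everest_like)
highland_result = decide(0.35, highland)
print(everest_result)
print(highland_result)
```

In the Everest-like case the near-zero RF probability never reaches the threshold check, which is the behaviour the test above is meant to guarantee.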
---
## 7. Reproducibility
```bash
# Full pipeline from scratch; works offline via the synthetic dataset.
make install-dev
make synth # OR: download real data via scripts/1_download_dataset.py
make preprocess
make train
make evaluate # writes figures/*.png + figures/evaluation_summary.json
```
The seed is fixed (`random_state=42`) and figures are written to `figures/` so the thesis can pull them in directly.
---
## 8. Citation
If you reference this model in academic work, please cite:
> Li Zhenyue (KyoukoLi). *MicroClimate-X: A Hybrid Microclimate Risk Engine for Complex Terrain*. Bachelor's Thesis, Universiti Kebangsaan Malaysia, Faculty of Information Science & Technology, 2026. GitHub: <https://github.com/KyoukoLi/microclimate-x>