# Project pipeline order: "App is the last"
> Direct response to supervisor feedback 4/15: "First identify a dataset.
> And then train the model. And then predict it. Once everything is
> finished, you can develop the app. App is the last."
>
> In short: dataset first, then model, then prediction, and the app last.
---
## Current state (May 2026)
```
┌──────────────────────────────────────────────────────────────────────┐
│ STEP 1: DATASET                                              ✅ DONE │
│ ────────────────────────────────────────────                         │
│ Source       : Open-Meteo Historical Archive (ECMWF ERA5)            │
│ Coverage     : 5 Malaysian mountain sites, 5 years hourly            │
│ Rows         : 175 315                                               │
│ Target Y     : is_rain_event ∈ {0, 1} (next-hour rain > 0.1 mm)      │
│ Code         : scripts/{1_download, 1b_synth, 2_preprocess}.py       │
│ Documentation: docs/dataset.md                                       │
└──────────────────────────────────────────────────────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│ STEP 2: MODEL TRAINING                                       ✅ DONE │
│ ────────────────────────────────────────────                         │
│ Algorithm    : Random Forest, class_weight='balanced'                │
│ Split        : Time-based, last 20% chronological holdout            │
│ CV           : 5-fold TimeSeriesSplit on training portion            │
│ Test results : ROC AUC 0.871 · PR AP 0.750 · Brier 0.138             │
│ Operating pt : τ = 0.20 → F2 = 0.778, Recall = 0.934                 │
│ Code         : scripts/3_train_model.py                              │
│ Documentation: models/MODEL_CARD.md                                  │
└──────────────────────────────────────────────────────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│ STEP 3: MODEL EVALUATION                                     ✅ DONE │
│ ────────────────────────────────────────────                         │
│ Figures      : 6 publication-quality PNGs in figures/                │
│   01_roc_curve.png          · ROC + AUC                              │
│   02_pr_curve.png           · Precision-Recall + AP                  │
│   03_calibration_curve.png  · Reliability + Brier                    │
│   04_threshold_sweep.png    · F1/F2/Precision/Recall vs threshold    │
│   05_feature_importance.png · Top-20 features                        │
│   06_confusion_matrix.png   · CM at F2-optimal threshold             │
│ Summary      : figures/evaluation_summary.json                       │
│ Code         : scripts/4_evaluate_model.py                           │
└──────────────────────────────────────────────────────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│ STEP 4: RULE ENGINE (D5 proposal §3.7 P4.1-P4.6)             ✅ DONE │
│ ────────────────────────────────────────────                         │
│ P4.1 Load dynamic risk rules  → backend/config.py                    │
│ P4.2 Fetch user context       → ?activity= query parameter           │
│ P4.3 Evaluate environmental   → 4 score_*_risk() functions           │
│      risks (rainfall, fog, wind gust, thunderstorm)                  │
│ §3.7.2 Decision table R1-R4   → apply_decision_table_3_7_2()         │
│        Veto cascade           → _collect_veto_triggers()             │
│ P4.4 Activity weighting       → apply_activity_weighting()           │
│ P4.5 Composite risk score     → dominant-hazard + secondary          │
│ P4.6 Actionable advice        → _normal_advice / _veto_advice        │
│ Code         : backend/rule_engine.py                                │
│ Documentation: docs/architecture.md, docs/thresholds.md              │
└──────────────────────────────────────────────────────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│ STEP 5: APP (LAST, as instructed)                            ✅ DONE │
│ ────────────────────────────────────────────                         │
│ Backend      : FastAPI + uvicorn; wraps trained model from Step 2    │
│                                   + rule engine from Step 4          │
│ Frontend     : Vue 3 SPA: bilingual EN/ZH, 4 mini-gauges,            │
│                R1-R4 indicators, demo scenarios, error toasts        │
│ Container    : Multi-stage Dockerfile + docker-compose.yml           │
│ Tests        : 70 tests, 97% backend coverage                        │
│ CI           : .github/workflows/ci.yml                              │
└──────────────────────────────────────────────────────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│ STEP 6: EVALUATION FOR THESIS CHAPTER 5                      🔄 PLAN │
│ ────────────────────────────────────────────                         │
│ 6a · Hindcast validation against NaDMA flood / landslide archives    │
│ 6b · Small user study with mountain hikers (1-month panel)           │
│ 6c · Comparative ablation: RF only vs Rule only vs Hybrid            │
│ 6d · Threshold sensitivity analysis (τ ∈ {0.10, 0.15, 0.20, 0.25})   │
└──────────────────────────────────────────────────────────────────────┘
```
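Steps 1 and 2 rest on three small ideas: a next-hour rain label, a chronological (unshuffled) holdout, and an operating threshold chosen by F2. The sketch below illustrates them with the standard library only. It is not the code in `scripts/2_preprocess.py` or `scripts/3_train_model.py`; every function name here is hypothetical, and it assumes a single site's hourly rainfall already sorted by time.

```python
from typing import List, Sequence, Tuple


def next_hour_rain_labels(rain_mm: Sequence[float], threshold_mm: float = 0.1) -> List[int]:
    """Label hour t as 1 when rain in hour t+1 exceeds threshold_mm.

    The last hour has no successor, so the result is one element shorter.
    Assumption: rain_mm is one site's series in chronological order.
    """
    return [int(nxt > threshold_mm) for nxt in rain_mm[1:]]


def chronological_split(n_rows: int, holdout_frac: float = 0.20) -> Tuple[range, range]:
    """Split row indices so the last holdout_frac of rows is the test set (no shuffling)."""
    cut = n_rows - round(n_rows * holdout_frac)
    return range(cut), range(cut, n_rows)


def fbeta(y_true: Sequence[int], y_pred: Sequence[int], beta: float = 2.0) -> float:
    """F-beta from raw counts; beta=2 weights recall more heavily than precision."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def best_threshold(
    y_true: Sequence[int],
    proba: Sequence[float],
    grid: Sequence[float],
    beta: float = 2.0,
) -> float:
    """Return the grid value tau that maximises F-beta for predictions proba >= tau."""
    return max(grid, key=lambda tau: fbeta(y_true, [int(p >= tau) for p in proba], beta))
```

Under these assumptions, calling `best_threshold` with the grid from Step 6d (0.10, 0.15, 0.20, 0.25) would reproduce the kind of sweep shown in `04_threshold_sweep.png`. Choosing F2 rather than F1 reflects the stated operating point, where high recall on rain events is preferred over precision.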
## Reading order for the supervisor
When walking the supervisor through the project, **strictly follow Steps 1 → 5**:
| # | Open this | Spend |
|---|---|---|
| 1 | `docs/dataset.md` §4 schema, §5 Y derivation | 60 s |
| 2 | `figures/01_roc_curve.png` + `figures/03_calibration_curve.png` | 30 s |
| 3 | `figures/04_threshold_sweep.png` + `figures/05_feature_importance.png` | 60 s |
| 4 | `docs/architecture.md` §"Engine B internals": show the P4.1→P4.6 mapping | 60 s |
| 5 | `frontend/index.html` running locally: demo with the Genting & Everest scenarios | 60-90 s |
Total ≈ 5 minutes before any Q&A. The app is opened **last**, exactly as the supervisor asked.