
Project pipeline order: "App is the last"

Direct response to the supervisor's feedback of 4/15: "First identify a dataset. And then train the model. And then predict it. Once everything is finished, you can develop the app. App is the last." In short: dataset first, then model, then prediction, and the app last.


Current state (May 2026)

┌──────────────────────────────────────────────────────────────────────┐
│ STEP 1: DATASET                                             ✅ DONE  │
│ ────────────────────────────────────────────                         │
│ Source       : Open-Meteo Historical Archive (ECMWF ERA5)            │
│ Coverage     : 5 Malaysian mountain sites, 5 years hourly            │
│ Rows         : 175 315                                               │
│ Target Y     : is_rain_event ∈ {0, 1}  (next-hour rain > 0.1 mm)     │
│ Code         : scripts/{1_download, 1b_synth, 2_preprocess}.py       │
│ Documentation: docs/dataset.md                                       │
└──────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────┐
│ STEP 2: MODEL TRAINING                                      ✅ DONE  │
│ ────────────────────────────────────────────                         │
│ Algorithm    : Random Forest, class_weight='balanced'                │
│ Split        : Time-based, last 20% chronological holdout            │
│ CV           : 5-fold TimeSeriesSplit on training portion            │
│ Test results : ROC AUC 0.871 · PR AP 0.750 · Brier 0.138             │
│ Operating pt : τ = 0.20  →  F2 = 0.778, Recall = 0.934               │
│ Code         : scripts/3_train_model.py                              │
│ Documentation: models/MODEL_CARD.md                                  │
└──────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────┐
│ STEP 3: MODEL EVALUATION                                    ✅ DONE  │
│ ────────────────────────────────────────────                         │
│ Figures      : 6 publication-quality PNGs in figures/                │
│   01_roc_curve.png          · ROC + AUC                              │
│   02_pr_curve.png           · Precision-Recall + AP                  │
│   03_calibration_curve.png  · Reliability + Brier                    │
│   04_threshold_sweep.png    · F1/F2/Precision/Recall vs threshold    │
│   05_feature_importance.png · Top-20 features                        │
│   06_confusion_matrix.png   · CM at F2-optimal threshold             │
│ Summary      : figures/evaluation_summary.json                       │
│ Code         : scripts/4_evaluate_model.py                           │
└──────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────┐
│ STEP 4: RULE ENGINE (D5 proposal §3.7 P4.1-P4.6)            ✅ DONE  │
│ ────────────────────────────────────────────                         │
│ P4.1 Load dynamic risk rules   → backend/config.py                   │
│ P4.2 Fetch user context        → ?activity= query parameter          │
│ P4.3 Evaluate environmental    → 4 score_*_risk() functions          │
│         risks (rainfall, fog, wind gust, thunderstorm)               │
│ §3.7.2  Decision table R1-R4   → apply_decision_table_3_7_2()        │
│ Veto cascade                   → _collect_veto_triggers()            │
│ P4.4 Activity weighting        → apply_activity_weighting()          │
│ P4.5 Composite risk score      → dominant-hazard + secondary         │
│ P4.6 Actionable advice         → _normal_advice / _veto_advice       │
│ Code         : backend/rule_engine.py                                │
│ Documentation: docs/architecture.md, docs/thresholds.md              │
└──────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────┐
│ STEP 5: APP (LAST, as instructed)                           ✅ DONE  │
│ ────────────────────────────────────────────                         │
│ Backend     : FastAPI + uvicorn; wraps trained model from Step 2     │
│                + rule engine from Step 4                             │
│ Frontend    : Vue 3 SPA: bilingual EN/ZH, 4 mini-gauges,             │
│                R1-R4 indicators, demo scenarios, error toasts        │
│ Container   : Multi-stage Dockerfile + docker-compose.yml            │
│ Tests       : 70 tests, 97% backend coverage                         │
│ CI          : .github/workflows/ci.yml                               │
└──────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────┐
│ STEP 6: EVALUATION FOR THESIS CHAPTER 5                     🔄 PLAN  │
│ ────────────────────────────────────────────                         │
│ 6a · Hindcast validation against NaDMA flood / landslide archives    │
│ 6b · Small user study with mountain hikers (1-month panel)           │
│ 6c · Comparative ablation: RF only vs Rule only vs Hybrid            │
│ 6d · Threshold sensitivity analysis (τ ∈ {0.10, 0.15, 0.20, 0.25})   │
└──────────────────────────────────────────────────────────────────────┘
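
The Step 1 target (next-hour rain > 0.1 mm) and the Step 2 chronological holdout (last 20%) can be illustrated in plain Python. This is a minimal sketch, not the code in scripts/2_preprocess.py or scripts/3_train_model.py: the function names `label_next_hour_rain` and `chronological_split`, the list-based inputs, the dropping of the final hour, and the toy rainfall values are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

RAIN_THRESHOLD_MM = 0.1  # Step 1 box: next-hour rain > 0.1 mm


def label_next_hour_rain(rain_mm: Sequence[float]) -> List[int]:
    """Label each hour of ONE site's time-ordered series.

    Hour i gets 1 when the rain recorded in hour i + 1 exceeds the
    threshold. The final hour has no successor, so it is dropped
    (an assumption; the project may handle it differently).
    """
    return [int(nxt > RAIN_THRESHOLD_MM) for nxt in rain_mm[1:]]


def chronological_split(
    rows: Sequence[T], holdout_frac: float = 0.2
) -> Tuple[List[T], List[T]]:
    """Split time-ordered rows into (train, test) without shuffling."""
    if not 0.0 < holdout_frac < 1.0:
        raise ValueError("holdout_frac must be strictly between 0 and 1")
    n_test = round(len(rows) * holdout_frac)
    cut = len(rows) - n_test
    return list(rows[:cut]), list(rows[cut:])


# Toy hourly rainfall in mm for a single site; not project data.
hourly_rain = [0.0, 0.0, 0.3, 1.2, 0.1, 0.0, 0.0, 2.5, 0.0, 0.0, 0.4]
labels = label_next_hour_rain(hourly_rain)
train, test = chronological_split(labels)
print(labels)
print(len(train), len(test))
```

With several sites in one table, the labelling would be applied per site so that the last hour of one site never borrows the first hour of the next. Splitting by time rather than at random keeps future weather out of the training portion.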

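The F2-based operating point in Step 2 and the planned sensitivity analysis in 6d both amount to sweeping a threshold τ over predicted probabilities and scoring each setting. The sketch below assumes the standard F-beta definition and a "predict rain when probability >= τ" rule; the function names, the toy labels, and the toy probabilities are hypothetical and are not taken from the project's test set.

```python
from typing import Dict, List, Sequence


def fbeta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score; beta = 2 weights recall more heavily than precision."""
    denom = beta ** 2 * precision + recall
    if denom == 0.0:
        return 0.0
    return (1.0 + beta ** 2) * precision * recall / denom


def sweep_thresholds(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    thresholds: Sequence[float],
    beta: float = 2.0,
) -> List[Dict[str, float]]:
    """Compute precision, recall and F-beta at each candidate threshold."""
    results = []
    for tau in thresholds:
        y_pred = [int(p >= tau) for p in y_prob]
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        results.append(
            {
                "tau": tau,
                "precision": precision,
                "recall": recall,
                "fbeta": fbeta(precision, recall, beta),
            }
        )
    return results


# Toy data for illustration only.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.05, 0.12, 0.18, 0.22, 0.35, 0.60, 0.08, 0.90]

results = sweep_thresholds(y_true, y_prob, [0.10, 0.15, 0.20, 0.25])
best = max(results, key=lambda r: r["fbeta"])
for r in results:
    print(f"tau={r['tau']:.2f}  P={r['precision']:.3f}  "
          f"R={r['recall']:.3f}  F2={r['fbeta']:.3f}")
print("best tau on toy data:", best["tau"])
```

Plugging the reported Recall = 0.934 and F2 = 0.778 into the same formula implies a precision of roughly 0.47 at τ = 0.20. That is the trade-off an F2-driven operating point accepts: more false alarms in exchange for fewer missed rain events.
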
Reading order for the supervisor

When walking the supervisor through the project, follow the five items below strictly in order, 1 → 5:

| # | Open this | Spend |
|---|-----------|-------|
| 1 | docs/dataset.md §4 schema, §5 Y derivation | 60 s |
| 2 | figures/01_roc_curve.png + figures/03_calibration_curve.png | 30 s |
| 3 | figures/04_threshold_sweep.png + figures/05_feature_importance.png | 60 s |
| 4 | docs/architecture.md §"Engine B internals": show P4.1→P4.6 mapping | 60 s |
| 5 | frontend/index.html running locally: demo with the Genting & Everest scenarios | 60-90 s |

Total ≈ 5 minutes (270 to 300 s) before any Q&A. The app is opened last, exactly as the supervisor asked.