| # Supervisor Progress-Update Brief — bilingual script |
| # 导师进度汇报双语逐字稿 |
|
|
| > Follow-up meeting after the v1.0.0 hardening pass on 2026-05-11. |
| > Walk-through order is unchanged: **dataset → model → app → next steps**. |
| > Open this file on screen during the meeting; do not read word-for-word. |
| > |
| > 紧接 2026-05-11 v1.0.0 强化提交之后的**进度汇报**会议。 |
| > 顺序一律不变:**dataset → model → app → 下一步**。 |
| > 开会时屏幕上打开本文档,**不要照念**,当兜底用即可。 |
|
|
| --- |
|
|
| ## 0. What you need to do — three time windows |
| ## 0. 你要做的事 —— 三个时间窗口 |
|
|
| ### 0.1 Before the meeting (T-15 min) / 会前 15 分钟 |
|
|
| | ☐ | English | 中文 | |
| |---|---|---| |
| | ☐ | Charge laptop to ≥ 80 %; charger in bag. | 笔记本充满 ≥ 80%,充电器带上。 | |
| | ☐ | `cd ~/Projects/microclimate-x && git pull && git status` — must print "working tree clean". | 拉最新代码,确认 working tree clean。 | |
| | ☐ | `make run` in **terminal A** (leave it running). | 终端 A 起后端,**不要关**。 | |
| | ☐ | `curl -s http://localhost:8000/api/health \| python3 -m json.tool` in **terminal B** — verify `"ml_loaded": true`. | 终端 B 验证健康检查,`ml_loaded` 必须为 `true`。 | |
| | ☐ | Open the 10 browser tabs in the order from `docs/MEETING_CHEAT_SHEET.md` §0 — **app tab is last**. | 按 cheat-sheet §0 顺序开 10 个标签页,**app 标签放最后**。 | |
| | ☐ | This file (`docs/progress_update_brief.md`) open on a separate screen / phone. | 把本文档单独开在副屏或手机上。 | |
| | ☐ | Phone on silent. Deep breath. | 手机静音,深呼吸。 | |
|
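The terminal B health check can also be scripted so it fails loudly instead of relying on eyeballing the JSON. This is a minimal sketch, assuming the endpoint returns a JSON object with a boolean `ml_loaded` field as the checklist implies; the helper names `ml_is_loaded`, `fetch_health`, and `main` are hypothetical and not part of the project.

```python
import json
import urllib.request

# Endpoint taken from the checklist above.
HEALTH_URL = "http://localhost:8000/api/health"


def ml_is_loaded(payload: dict) -> bool:
    """Return True only when the payload carries a literal JSON true for ml_loaded."""
    return payload.get("ml_loaded") is True


def fetch_health(url: str = HEALTH_URL, timeout: float = 3.0) -> dict:
    """GET the health endpoint and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def main() -> int:
    """Exit code 0 when the model is loaded, 1 otherwise."""
    try:
        payload = fetch_health()
    except OSError as exc:  # urllib's URLError is a subclass of OSError
        print(f"backend unreachable: {exc}")
        return 1
    if not ml_is_loaded(payload):
        print("ml_loaded is not true; check terminal A")
        return 1
    print("health check passed")
    return 0
```

Calling `main()` from terminal B (for example via `python3 -c`) gives a pass/fail line that is harder to misread under time pressure than raw JSON.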
|
| ### 0.2 During the meeting (≈ 8 minutes) / 会中 ≈ 8 分钟 |
|
|
| | Block | EN heading | 中文标题 | Time | |
| |---|---|---|---| |
| | 1 | Opening 30 s | 开场 30 秒 | 0:00 → 0:30 | |
| | 2 | What changed since last meeting | 自上次会以来的进展 | 0:30 → 2:00 | |
| | 3 | Live demo — dataset → model → app | 现场演示(顺序不变) | 2:00 → 5:00 | |
| | 4 | Next steps for Chapter 5 | Chapter 5 下一步 | 5:00 → 6:30 | |
| | 5 | Asks + closing | 请示 + 收尾 | 6:30 → 8:00 | |
|
|
| ### 0.3 After the meeting (T+24 h) / 会后 24 小时内 |
|
|
| | ☐ | English | 中文 | |
| |---|---|---| |
| | ☐ | Write meeting minutes — capture every supervisor decision in `docs/meeting_log_<date>.md`. | 写会议纪要,把老师每条决定记到 `docs/meeting_log_<日期>.md`。 | |
| | ☐ | Open one GitHub issue per agreed action item (label: `chapter-5`). | 每个 action item 在 GitHub 开一个 issue,打 `chapter-5` 标签。 | |
| | ☐ | Email a 3-bullet summary back to the supervisor for written confirmation. | 给老师发 3 条要点的总结邮件,留书面确认。 | |
| | ☐ | Update `README.md` §9 Roadmap — tick boxes that were signed off. | 更新 `README.md` 第 9 节 Roadmap,把通过的项打勾。 | |
| | ☐ | Tag a new release if scope was confirmed (`git tag v1.1.0-rc.1`). | 如果范围确认了,打个新 tag (`v1.1.0-rc.1`)。 | |
|
|
| --- |
|
|
| ## 1. Opening 30 seconds / 开场 30 秒 |
|
|
| | English (say this) | 中文(口头要点) | |
| |---|---| |
| | "Sir, thank you for your time. Following up on our last session, I've completed a production-grade hardening pass — version 1.0.0 — and the full pipeline is now reproducible end-to-end. May I walk you through what's new in the same order as before — dataset, then model, then app — and finish with my proposed plan for Chapter 5?" | "老师感谢您抽时间。接着上次的内容,我做完了**v1.0.0 工程化强化**,整条流水线现在可以**端到端复现**。我按上次的顺序——**dataset、model、app**——给您过一遍新的进展,最后讲我对 Chapter 5 的下一步计划,可以吗?" | |
|
|
| **Why this opening**: it (a) restates the supervisor's preferred process order without him asking, (b) signals that you've made forward progress rather than just polish, and (c) ends with an explicit ask for direction on Chapter 5, which is what *he* wants to talk about. |
|
|
| **为什么这样开场**:(a) 不用他提就主动按他的流程顺序;(b) 强调是**前进了**而不是只在抛光;(c) 用对 Chapter 5 的请示收尾,**这正是他想聊的话题**。 |
|
|
| --- |
|
|
| ## 2. What changed since the last meeting / 自上次会议以来的进展 |
|
|
| > ~ 90 seconds. Stay on the GitHub repo tab — point to the commit history, |
| > the green CI badge, the v1.0.0 release. |
| > |
| > ≈ 90 秒。停在 GitHub repo 标签页,指给老师看 commit 历史、CI 绿勾、v1.0.0 release。 |
|
|
| | Area | English | 中文 | |
| |---|---|---| |
| | **Backend hardening** | "I added a request-ID middleware, a typed `ErrorResponse` contract so no bare HTML 500s leak, structured logging, and an enriched `/api/health` exposing uptime, cache stats, and the loaded ML feature schema." | "后端我加了 **request-ID 中间件**、**类型化错误协议** `ErrorResponse`(不再泄漏裸 HTML 500)、结构化日志、以及**升级版 `/api/health`**(暴露 uptime、缓存统计、ML 特征 schema)。" | |
| | **ML pipeline** | "I shipped `scripts/4_evaluate_model.py` which produces six publication-quality figures plus a machine-readable `evaluation_summary.json`. I also wrote a HuggingFace-style `MODEL_CARD.md` covering intended use, training data, metrics, limitations, and ethical considerations." | "ML 流水线加了 **评估脚本** `scripts/4_evaluate_model.py`,自动出 6 张论文级别图 + 一份 `evaluation_summary.json`。还写了 HuggingFace 风格的 **MODEL_CARD.md**,覆盖用途、训练数据、指标、局限、伦理考量。" | |
| | **Tests + CI** | "The test count went from 19 to **70**, and backend coverage is **97 %**. CI runs on Python 3.9 / 3.11 / 3.12 plus a Docker image-build smoke test." | "测试数从 19 涨到 **70**,**后端覆盖率 97%**。CI 跑 Python 3.9/3.11/3.12 矩阵,外加 Docker 镜像构建烟测。" | |
| | **Dev-ex** | "There is now a multi-stage Dockerfile, docker-compose, single-word Makefile recipes, and pre-commit hooks. On a clean machine, the whole project is one `docker compose up --build` away from running." | "多阶段 Dockerfile + compose + Makefile 单词命令 + pre-commit hooks。**新机器一句 `docker compose up --build` 就能跑起来**。" | |
| | **Documentation** | "Three new docs — `architecture.md`, `thresholds.md` with citations for every Veto threshold, and `pipeline_order.md` which explicitly enforces the dataset → model → app order you asked for." | "三份新文档——`architecture.md`、`thresholds.md`(每个 Veto 阈值都附学术引用)、以及 `pipeline_order.md`(**显式按您要求的 dataset→model→app 顺序写死**)。" | |
| |
| **Artefact to show**: the GitHub commit history page; the green CI badge on the README; `CHANGELOG.md` v1.0.0 entry. |
| |
| **展示物**:GitHub commit 历史页;README 上的 CI 绿勾;`CHANGELOG.md` 中 v1.0.0 那一段。 |
| |
| --- |
| |
| ## 3. Live demo — dataset → model → app / 现场演示(顺序不变) |
| |
| > ~ 3 minutes. Same order as the 5/11 dry-run script — no surprises for the supervisor. |
| > |
| > ≈ 3 分钟。跟 5/11 的脚本完全一样的顺序,**老师不会被打乱节奏**。 |
| |
| ### 3.1 Dataset (Tab `docs/dataset.md`) — 30 s |
| |
| | EN | 中文 | |
| |---|---| |
| | "Same dataset as last time — ERA5 reanalysis, 5 Malaysian mountain sites, 175 315 hourly rows. The Y column `is_rain_event` is derived in one line and documented in §5. No change here, just confirming the foundation is unchanged." | "数据集跟上次一样——ERA5 再分析、马来西亚 5 个山地点位、17.5 万行小时数据。Y 列 `is_rain_event` 一行代码构造,文档在 §5。**这里没有变**,只是确认地基没动。" | |
| |
| ### 3.2 Model (Tabs `01_roc_curve.png` → `03_calibration_curve.png` → `04_threshold_sweep.png` → `05_feature_importance.png`) — 90 s |
| |
| | EN | 中文 | |
| |---|---| |
| | "Same model as last time — Random Forest, time-based split, τ = 0.20. Test ROC AUC **0.871**, PR AP **0.750**, Brier **0.138**, recall **93.4 %**. What's new is the **6 figures plus the model card** — every number you see here is reproducible from `make evaluate`." | "模型跟上次一样——RF、时间序列切分、τ = 0.20。测试 AUC **0.871**、PR AP **0.750**、Brier **0.138**、召回率 **93.4%**。**新东西**是 6 张图 + model card——上面任何一个数字都可以用 `make evaluate` 复现。" | |
| |
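If the supervisor asks what the headline numbers in 3.2 mean, their definitions are short enough to show by hand. The sketch below is a pure-Python illustration on toy data, not the project's `scripts/4_evaluate_model.py`; the function names and the `p >= tau` decision rule are assumptions made for the example.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def recall_at(y_true: Sequence[int], y_prob: Sequence[float], tau: float) -> float:
    """Share of true events that are flagged when predicting positive at p >= tau."""
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    return sum(p >= tau for p in positives) / len(positives)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and probabilities, made up for illustration only.
toy_y = [0, 0, 1, 1]
toy_p = [0.10, 0.40, 0.35, 0.80]
toy_metrics = {
    "roc_auc": roc_auc(toy_y, toy_p),
    "brier": brier_score(toy_y, toy_p),
    "recall_at_0.20": recall_at(toy_y, toy_p, 0.20),
}
```

The pairwise AUC is quadratic in the number of rows, so it only suits small illustrations; it is there to make the definition concrete, not to replace a library implementation.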
| ### 3.3 App (Tab `http://localhost:8000/app/`) — 60-90 s |
| |
| | EN | 中文 | |
| |---|---| |
| | "Step 3, the app — opened **last** as agreed. Two demo scenarios. First, Genting Highlands — a slope at 1865 m inside the training distribution. The model gives a moderate rain probability; the rule engine picks up orographic lift; the four mini-gauges decompose the risk by hazard type." | "第三步 app——按约定**最后才开**。两个 demo 场景。第一个云顶高原——1865 m 的山坡,**在训练分布之内**。模型给中等降雨概率,规则引擎检测到地形抬升,四个 mini-gauge 把风险按灾害类型拆解。" | |
| | "Second, Mt Everest — completely out of distribution. The model alone would say 'safe'. The Veto cascade fires three independent overrides — hypoxia, frostbite, gale — and the composite is forced to Danger. There's a unit test for exactly this: `test_mt_everest_veto_hypoxia`." | "第二个珠峰——**完全分布外**。光看模型会说"安全",但 Veto 级联触发**三个独立否决**——缺氧、冻伤、大风——综合分被强制设为 Danger。**专门有单元测试覆盖这个场景**:`test_mt_everest_veto_hypoxia`。" | |
| |
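The override behaviour in the Mt Everest scenario can be sketched as a small function: any hard limit that is breached forces the composite to Danger, whatever the model says. This is a hedged illustration only. The three numeric limits, the `Conditions` fields, and the `apply_veto` name are placeholders invented for the example; the project's cited thresholds are in `thresholds.md` and are not reproduced here.

```python
from dataclasses import dataclass

# Placeholder limits for illustration only. They are NOT the project's
# values; the cited thresholds live in docs/thresholds.md.
HYPOXIA_ALT_M = 5500.0
FROSTBITE_TEMP_C = -25.0
GALE_WIND_KMH = 75.0


@dataclass
class Conditions:
    altitude_m: float
    temp_c: float
    wind_kmh: float


def apply_veto(model_level: str, c: Conditions) -> tuple[str, list[str]]:
    """Return (composite level, fired vetoes); any fired veto forces 'Danger'."""
    fired = []
    if c.altitude_m >= HYPOXIA_ALT_M:
        fired.append("hypoxia")
    if c.temp_c <= FROSTBITE_TEMP_C:
        fired.append("frostbite")
    if c.wind_kmh >= GALE_WIND_KMH:
        fired.append("gale")
    return ("Danger" if fired else model_level), fired


# Summit-like inputs; the temperature and wind values are made up.
everest = Conditions(altitude_m=8849.0, temp_c=-36.0, wind_kmh=120.0)
# In-distribution site at the altitude quoted above; weather values are made up.
genting = Conditions(altitude_m=1865.0, temp_c=18.0, wind_kmh=15.0)

everest_result = apply_veto("Safe", everest)
genting_result = apply_veto("Warning", genting)
```

Each check is independent, so one out-of-range input is enough to override the model; this matches the "three independent overrides" framing in the script.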
| --- |
| |
| ## 4. Next steps for Chapter 5 / Chapter 5 下一步 |
| |
| > ~ 90 seconds. **This is the section the supervisor will react to most.** |
| > Frame each item as a concrete deliverable + estimated time + dependency. |
| > |
| > ≈ 90 秒。**老师反应最强烈的就是这一节**。每一项都以"**交付物 + 估时 + 依赖**"形式呈现。 |
| |
| ### 4.1 Proposed Chapter 5 work plan / Chapter 5 工作计划 |
| |
| | # | Deliverable | EN one-liner | 中文一句话 | Estimate | |
| |---|---|---|---|---| |
| | 5.1 | **Comparative ablation** | "Train LogReg + XGBoost on the same features and report ROC / PR / F2 side-by-side with RF — answers 'why RF?' empirically." | "在同一特征集上训 LogReg + XGBoost,对比 ROC / PR / F2,**用数据回答"为什么选 RF"**。" | 1 week | |
| | 5.2 | **Hindcast validation** | "Replay 2020-2024 NaDMA-documented Malaysian flood / landslide events and check whether the system would have raised Warning / Danger at the right time. Reports hit-rate, lead-time, false-alarm rate." | "把 2020-2024 NaDMA 公开的马来西亚洪水/滑坡事件**逐一回放**,看系统能否在事发前给出 Warning/Danger。报告命中率、提前量、误报率。" | 2 weeks | |
| | 5.3 | **Threshold sensitivity** | "Sweep τ ∈ {0.10, 0.15, 0.20, 0.25, 0.30}, plot precision-recall trade-off, and justify the operating point with a cost-of-error analysis." | "扫 τ ∈ {0.10, 0.15, 0.20, 0.25, 0.30},画精度-召回权衡曲线,用**误差代价分析**为最终选点辩护。" | 3 days | |
| | 5.4 | **Component ablation** | "Compare three system variants — RF only / Rule only / Hybrid — on the held-out test set and on the OOD Mt Everest case. Quantifies the rule-engine contribution." | "对比三个系统变体——**纯 RF / 纯规则 / 混合**——在测试集和 OOD 珠峰场景上的表现。**量化规则引擎的贡献**。" | 4 days | |
| | 5.5 | **Small user study** *(optional)* | "Recruit 5-8 mountain hikers, run a 4-week panel, log system advice vs. their field judgment. Reports inter-rater agreement (Cohen's κ)." | "招募 5-8 名登山者,4 周面板研究,记录系统建议 vs 他们现场判断,报告 Cohen's κ 一致性。" | 4 weeks | |
| | 5.6 | **Thesis Chapter 5 draft** | "Pull §5.1-5.5 into a single 12-15 page evaluation chapter with all figures, tables, and discussion." | "把 §5.1-5.5 整合成 12-15 页的评估章节,含全部图表和讨论。" | 1 week (after 5.1-5.4) | |
| |
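Tracks 5.1 and 5.3 both rest on precision, recall, and F2 at a given τ. A minimal sketch of the sweep follows, assuming a positive prediction means `p >= tau` and using toy data; the `sweep` function and its output layout are hypothetical and not the project's evaluation code.

```python
from typing import Sequence


def sweep(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    taus: Sequence[float] = (0.10, 0.15, 0.20, 0.25, 0.30),
    beta: float = 2.0,
) -> list[dict]:
    """Precision, recall and F-beta at each threshold (beta=2 weights recall higher)."""
    rows = []
    for tau in taus:
        pred = [p >= tau for p in y_prob]
        tp = sum(1 for y, hit in zip(y_true, pred) if y == 1 and hit)
        fp = sum(1 for y, hit in zip(y_true, pred) if y == 0 and hit)
        fn = sum(1 for y, hit in zip(y_true, pred) if y == 1 and not hit)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = beta**2 * precision + recall
        f_beta = (1 + beta**2) * precision * recall / denom if denom else 0.0
        rows.append(
            {"tau": tau, "precision": precision, "recall": recall, "f_beta": f_beta}
        )
    return rows


# Toy labels and probabilities, made up for illustration only.
toy_y = [0, 0, 0, 1, 1, 1]
toy_p = [0.05, 0.12, 0.28, 0.18, 0.22, 0.60]
toy_rows = sweep(toy_y, toy_p)
```

On the toy data, recall falls and precision rises as τ increases, which is the trade-off the cost-of-error analysis in 5.3 has to price.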
| ### 4.2 Decision tree to ask the supervisor / 请示决策树 |
| |
| | # | EN (ask this) | 中文 |
| |---|---|---| |
| | **Q1** | "Sir, of the five evaluation tracks above, which two should I prioritise for the **next four weeks** before we converge on the Chapter 5 outline?" | "老师,上面 5 条评估方向,**未来四周**您建议我重点做哪两条,然后再收敛到 Chapter 5 大纲?" | |
| | **Q2** | "Do you want me to include the user study (5.5)? It is the longest item and depends on participant recruitment — I want your call before committing." | "**用户研究 (5.5) 您要不要做**?这一条最长、依赖招募——想请您拍板再投入。" | |
| | **Q3** | "For the comparative ablation, do you want the comparison framed as 'why RF wins' (defending current choice) or 'what if XGBoost wins' (open exploration)? The framing affects how I report inconclusive results." | "**对比实验**您希望框成"为什么 RF 胜出"(**捍卫现有选择**)还是"如果 XGBoost 更好怎么办"(**开放探索**)?两种 framing 对**模棱两可结果**的报告方式不同。" | |
| | **Q4** | "Should I treat the Mt Everest OOD test as a thesis-level contribution (a stand-alone subsection on safety) or just an appendix item?" | "**珠峰 OOD 测试**算论文级别的贡献(单独一节讲安全性),还是放附录就够?" | |
| |
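If the supervisor approves the user study in Q2, the agreement statistic named in 5.5 is Cohen's κ: observed agreement corrected for the agreement expected by chance. A minimal sketch, assuming the system's advice and each hiker's field judgment are recorded as one categorical label per outing; the label names and the `cohens_kappa` helper are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """kappa = (p_observed - p_expected) / (1 - p_expected) for two raters."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy records, made up for illustration only.
system_advice = ["go", "go", "no-go", "no-go", "go", "no-go"]
hiker_judgment = ["go", "no-go", "no-go", "no-go", "go", "go"]
toy_kappa = cohens_kappa(system_advice, hiker_judgment)
```

With only 5-8 participants the estimate will be noisy, so reporting it alongside the raw agreement rate keeps the claim modest.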
| --- |
| |
| ## 5. Asks + closing 90 seconds / 请示 + 收尾 90 秒
| |
| | EN (say this) | 中文(口头要点) | |
| |---|---| |
| | "Sir, to summarise: since the last meeting I've shipped v1.0.0 — production-grade hardening, 70 tests at 97 % coverage, six evaluation figures, a published model card, full Docker reproducibility. The pipeline order is unchanged from what you asked: dataset, model, app. For Chapter 5 I have five evaluation tracks scoped; I'd like your guidance on which two to prioritise for the next four weeks." | "老师,总结:自上次会议以来交付了 **v1.0.0**——工程化强化、70 个测试 97% 覆盖率、6 张评估图、model card、Docker 全复现。流水线顺序按您要求**没动**:dataset、model、app。Chapter 5 我列了 5 条评估方向,**接下来四周您建议我先做哪两条**?" | |
| | "I'll send you a 3-bullet email summary by tomorrow morning so we have written agreement on the priorities. Thank you for your time." | "明早之前给您发 3 条要点的邮件总结,**留个书面确认**。谢谢老师。" | |
| |
| --- |
| |
| ## 6. Q&A defensive lines / Q&A 兜底话术 |
| |
| > Follow-up questions anticipated for this update specifically. The classic
| > Q1-Q7 from the 5/11 brief still apply; they are simply not repeated
| > here.
| > |
| > **针对本次进度汇报**可能出现的追问(5/11 那份的经典 Q1-Q7 仍然有效, |
| > 不重复罗列)。 |
| |
| ### Q-N1 — "Why are you spending time on tests and Docker instead of the thesis?" |
| ### Q-N1 ——为什么你在写测试和 Docker 上花时间,不写论文? |
| |
| | EN | 中文 | |
| |---|---| |
| | "Sir, the v1.0.0 hardening was a one-time investment to make every Chapter 5 number reproducible by the examiner with a single command. Without it, every evaluation result would be a black box — the examiner could not verify the AUC of 0.871 herself. With `make evaluate` reproducing all six figures byte-for-byte, the thesis claims become falsifiable. From this point on, all my time goes to evaluation and writing." | "老师,v1.0.0 的强化是**一次性投资**——为了让评审老师**用一行命令就能复现 Chapter 5 的每一个数字**。没有它,AUC = 0.871 就是黑盒,**评审无法独立验证**。现在 `make evaluate` 能把 6 张图按字节复现,论文的每个 claim 都**可证伪**。从今天起所有时间都给评估和写作。" | |
| |
| ### Q-N2 — "Why hasn't the model improved since last time?" |
| ### Q-N2 ——模型为什么自上次以后没提升? |
| |
| | EN | 中文 | |
| |---|---| |
| | "Two reasons. First, your instruction last time was to *consolidate* the dataset and model before adding more capacity, and that is what I did. Second, the bottleneck right now is **not the model** but the **rule engine's coverage of OOD scenarios**, which is a Chapter 5 contribution rather than a hyperparameter tweak. I'd rather report a defensible 0.871 with a calibrated rule engine than chase 0.88 with an unprincipled stack." | "两个理由:(1) 您上次的指示是**先把 dataset 和 model 巩固好**再加复杂度——我严格照做了。(2) **当前瓶颈不是模型本身**,而是**规则引擎对 OOD 场景的覆盖**——这是 Chapter 5 的研究贡献,不是调超参。我宁愿报一个**可辩护的 0.871** 加一个校准好的规则引擎,**也不要不讲原理地堆栈到 0.88**。" |
| |
| ### Q-N3 — "Show me one concrete weakness you have not yet fixed." |
| ### Q-N3 ——给我说一个你目前**还没修**的具体弱点。 |
| |
| | EN | 中文 | |
| |---|---| |
| | "Honestly, Sir, the biggest one is `cape_jkg`: the ERA5 archive returns predominantly zero CAPE for these Malaysian coordinates, which is a known coverage gap, so the Random Forest learns nothing from it (0 % importance). The rule engine still uses live Open-Meteo CAPE at inference time, so the production output is fine, but the *training* signal for thunderstorm risk is weaker than I'd like. I plan to address this in the §5.4 ablation by quantifying how much it matters." | "老实说,老师,最大的弱点是 **`cape_jkg`**——ERA5 在这些马来西亚坐标上的 CAPE 几乎全为零(**已知覆盖缺口**),**RF 完全没学到东西**(特征重要性 0%)。规则引擎在推理时用的是 Open-Meteo 实时 CAPE,所以生产输出没问题,但**雷暴风险的训练信号**比我希望的弱。计划在 §5.4 消融实验里**量化它的影响**。" |
| |
| ### Q-N4 — "When can I see the first draft of Chapter 5?" |
| ### Q-N4 ——Chapter 5 初稿什么时候能给我看? |
| |
| | EN | 中文 | |
| |---|---| |
| | "If you sign off on tracks 5.1 + 5.2 + 5.4 today, the data collection finishes in 3 weeks, writing takes 1 week, so you'd have a draft in **4 weeks from today**. If you also want 5.5 (user study), add 4 weeks. I'll lock the date the moment you confirm the scope." | "如果今天您拍板 **5.1 + 5.2 + 5.4** 三条,**3 周收数据 + 1 周写作 = 4 周后给您初稿**。如果再加 **5.5(用户研究)**,再加 4 周。**您一确认范围,我立刻锁定交稿日**。" | |
|
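The date promised in Q-N4 is plain calendar arithmetic, so it can be locked the moment the scope is confirmed. A small sketch, assuming calendar weeks counted from the confirmation date; the `draft_due` helper and the example date are hypothetical.

```python
from datetime import date, timedelta


def draft_due(scope_confirmed: date, with_user_study: bool = False) -> date:
    """Draft date = 3 weeks of data collection + 1 week of writing (+ 4 for 5.5)."""
    weeks = 3 + 1
    if with_user_study:
        weeks += 4  # track 5.5, the optional user study
    return scope_confirmed + timedelta(weeks=weeks)


# Example confirmation date, chosen only for illustration.
example_confirmed = date(2026, 5, 13)
example_without_study = draft_due(example_confirmed)
example_with_study = draft_due(example_confirmed, with_user_study=True)
```

Quoting a concrete date rather than "four weeks" in the follow-up email makes the commitment easier for both sides to track.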
|
| --- |
|
|
| ## 7. Materials checklist before walking in / 开会前自检清单 |
|
|
| ``` |
| ☐ Laptop ≥ 80 % battery, charger in bag |
| ☐ Terminal A: `make run` is running, do not close |
| ☐ Terminal B: `curl /api/health` returned ml_loaded: true within last 5 min |
| ☐ 10 browser tabs open in cheat-sheet §0 order — app tab is LAST |
| ☐ This file open on a separate screen / phone, NOT to be read aloud |
| ☐ docs/MEETING_CHEAT_SHEET.md open as a fall-back |
| ☐ models/MODEL_CARD.md open in case any number is challenged |
| ☐ figures/evaluation_summary.json downloadable on demand |
| ☐ Phone on silent |
| ☐ One deep breath. You shipped v1.0.0. You're prepared. |
| ``` |
|
|
| ``` |
| ☐ 笔记本电池 ≥ 80%,充电器已带 |
| ☐ 终端 A:`make run` 跑着,不要关 |
| ☐ 终端 B:5 分钟内 `curl /api/health` 返回 ml_loaded: true |
| ☐ 10 个浏览器标签页按 cheat-sheet §0 顺序开好——app 标签放最后 |
| ☐ 本文档开在副屏 / 手机,不要照念 |
| ☐ docs/MEETING_CHEAT_SHEET.md 开着兜底 |
| ☐ models/MODEL_CARD.md 开着,老师质疑任何数字立刻打开 |
| ☐ figures/evaluation_summary.json 随时可发 |
| ☐ 手机静音 |
| ☐ 深呼吸。v1.0.0 已经交付。你准备好了。 |
| ``` |
|
|
| --- |
|
|
| ## 8. Cross-references / 相关文档索引 |
|
|
| | Topic | File | |
| |---|---| |
| | Original 5/11 reply to 4/15 feedback | [`supervisor_meeting_brief.md`](supervisor_meeting_brief.md) | |
| | One-page cheat sheet (tab order, demo script) | [`MEETING_CHEAT_SHEET.md`](MEETING_CHEAT_SHEET.md) | |
| | Pipeline order ASCII chart | [`pipeline_order.md`](pipeline_order.md) | |
| | Dataset spec + Y derivation | [`dataset.md`](dataset.md) | |
| | Architecture deep-dive | [`architecture.md`](architecture.md) | |
| | Threshold citations | [`thresholds.md`](thresholds.md) | |
| | Model card | [`../models/MODEL_CARD.md`](../models/MODEL_CARD.md) | |
| | Evaluation summary JSON | [`../figures/evaluation_summary.json`](../figures/evaluation_summary.json) | |
| | What changed in v1.0.0 | [`../CHANGELOG.md`](../CHANGELOG.md) | |
|
|
| --- |
|
|
| > *Generated 2026-05-13 for the MicroClimate-X progress-update meeting at UKM. |
| > 此页为 2026-05-13 UKM 毕业设计 MicroClimate-X 进度汇报准备文档。* |
|
|