Changelog
All notable changes to this project will be documented in this file. Format follows Keep a Changelog; versioning follows Semantic Versioning.
[1.0.0] - 2026-05-11
The first complete release. Engineering-grade hardening across backend, ML pipeline, frontend, and DevOps; the rule engine is fully aligned with the D5 thesis proposal §3.7 / P4.
Added - Backend
- Request-ID middleware that stamps every response with `X-Request-ID` and `X-Response-Time-ms`. Incoming `X-Request-ID` headers propagate end to end, enabling cross-service tracing.
- Centralised error contract (`backend/errors.py`): every non-2xx response is a typed `ErrorResponse { error, detail, request_id, context }` JSON document; no bare 500-HTML responses leak.
- Structured logging with per-request log records (`request_id` field on every line, ISO-8601 timestamps).
- Enriched `/api/health` reporting uptime, cache row counts (live / expired / total), DB size, and inference-log size.
- `/api/version` endpoint returning version + git short SHA + ML feature schema.
- Cache hygiene: `prune_expired()` runs on startup, sweeps inference-log rows older than 7 days, and `cache_stats()` is exposed via `/api/health`.
- Fire-and-forget cache writes with the task reference retained (`asyncio.create_task` lint compliance).
- Defensive ML engine: `predict_rain_probability` always returns `float ∈ [0, 1]`; NaN/Inf/wrong-type feature values gracefully degrade; model-load failures fall through to the heuristic instead of crashing.
- Improved heuristic fallback: now also responds to `pressure_change_3h` so the "no model yet" demo still behaves sensibly.
- Terrain edge cases: antimeridian wrap, polar clamp, ocean / no-data DEM cells handled instead of raising obscure type errors.
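The "task reference retained" item refers to a documented asyncio pitfall: the event loop holds only weak references to tasks, so a task created with `asyncio.create_task` and then dropped can be garbage-collected before it finishes. A minimal sketch of the usual pattern follows; `fire_and_forget`, `write_cache`, and `handle_request` are hypothetical names for illustration, not the project's actual functions.

```python
import asyncio

# Hypothetical module-level registry; holding tasks here keeps a strong
# reference alive until each one completes.
_background_tasks: set = set()


def fire_and_forget(coro) -> asyncio.Task:
    """Schedule a coroutine without awaiting it, retaining the task reference."""
    task = asyncio.create_task(coro)
    _background_tasks.add(task)
    # Drop the reference once the task is done so the set does not grow.
    task.add_done_callback(_background_tasks.discard)
    return task


async def write_cache(store: dict, key: str, value: float) -> None:
    # Stand-in for a real (slow) cache write.
    await asyncio.sleep(0)
    store[key] = value


async def handle_request(store: dict) -> float:
    score = 42.0
    # The response does not wait for the cache write to finish.
    fire_and_forget(write_cache(store, "demo", score))
    return score


async def demo() -> dict:
    store: dict = {}
    await handle_request(store)
    # For the demo only: wait for outstanding writes before inspecting the store.
    await asyncio.gather(*_background_tasks)
    return store


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Linters such as ruff flag a bare `asyncio.create_task(...)` whose result is discarded, which is presumably what "lint compliance" refers to.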
Added - Rule Engine (already shipped, now fully tested)
- 4 sub-hazard scorers: rainfall / fog / wind gust / thunderstorm.
- D5 §3.7.2 R1-R4 Decision Table.
- Activity-aware weighted composite (`hiker | driver | construction | general`).
- Dominant-hazard composite formula: `0.80·max + 0.20·mean(rest)`.
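As a rough illustration of the dominant-hazard formula, the sketch below applies `0.80·max + 0.20·mean(rest)` to four sub-scores and compares the result with a plain arithmetic mean. The function name and the sample scores are assumptions, and activity weighting is deliberately left out because the changelog does not say how it is applied.

```python
def dominant_hazard_composite(sub_scores):
    """Combine sub-hazard scores as 0.80 * max + 0.20 * mean(rest).

    Illustrative only: activity weights are not modelled here.
    """
    scores = sorted(sub_scores, reverse=True)
    if not scores:
        raise ValueError("at least one sub-score is required")
    top, rest = scores[0], scores[1:]
    if not rest:
        # With a single hazard there is nothing to average.
        return float(top)
    return 0.80 * top + 0.20 * (sum(rest) / len(rest))


# Hypothetical sub-scores: rainfall, fog, wind gust, thunderstorm.
example = [80.0, 20.0, 10.0, 30.0]
composite = dominant_hazard_composite(example)
naive_mean = sum(example) / len(example)
print(composite, naive_mean)
```

With these sample values the naive mean is 35 while the dominant-hazard composite is about 68, so a single severe hazard is no longer diluted by three mild ones.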
Added - ML pipeline
- `scripts/4_evaluate_model.py` generating publication-quality figures (ROC + AUC, PR + AP, calibration / Brier, threshold sweep, top-20 feature importance, confusion matrix at F2-optimal threshold).
- `figures/evaluation_summary.json` machine-readable evaluation blob for the thesis appendix.
- `figures/threshold_sweep.csv` for full reproducibility of the precision-recall trade-off table.
- `models/MODEL_CARD.md`: HuggingFace-style model card with intended use, training data, evaluation, limitations, and ethical considerations.
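For readers unfamiliar with an "F2-optimal threshold", the sketch below sweeps a few decision thresholds and picks the one that maximises the F-beta score with beta = 2, which weights recall more heavily than precision. It is a pure-Python illustration on made-up labels and probabilities; it is not the code in `scripts/4_evaluate_model.py`, and all function names are hypothetical.

```python
def fbeta(precision, recall, beta=2.0):
    """F-beta score; beta > 1 favours recall over precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def sweep(y_true, y_prob, thresholds):
    """Return (threshold, precision, recall, f2) rows for each threshold."""
    rows = []
    for t in thresholds:
        pairs = list(zip(y_true, y_prob))
        tp = sum(1 for y, p in pairs if p >= t and y == 1)
        fp = sum(1 for y, p in pairs if p >= t and y == 0)
        fn = sum(1 for y, p in pairs if p < t and y == 1)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        rows.append((t, precision, recall, fbeta(precision, recall)))
    return rows


def best_threshold(rows):
    """Pick the threshold with the highest F2 score."""
    return max(rows, key=lambda row: row[3])[0]


# Made-up labels and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 0, 1, 1]
y_prob = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]
rows = sweep(y_true, y_prob, [0.2, 0.5, 0.8])
print(best_threshold(rows))
```

On this toy data the lowest threshold wins because it captures every positive, which is the behaviour one would want when a missed rain event is costlier than a false alarm.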
Added - Tests
- HTTP integration tests with `respx`-mocked external APIs (`tests/test_api.py`): happy path, cache hit, distinct cache slot per activity, invalid input → 422, upstream failure → 502, CORS, OpenAPI schema.
- Cache layer tests (`tests/test_cache.py`): TTL, expiry, prune, stats.
- Terrain edge-case tests (`tests/test_terrain_edge.py`): antimeridian, polar clamp, malformed DEM.
- ML engine tests (`tests/test_ml_engine.py`): unloaded behaviour, heuristic monotonicity, NaN/None resilience.
- Session-scoped `conftest.py` sets an isolated `MICROCLIMATEX_DB` for every test run (no clobbering the dev cache).
- Total: 70 tests; backend coverage 97 %.
Added - Frontend
- Activity selector (Hiker / Driver / Construction / General) with `localStorage` persistence and keyboard accessibility (`aria-pressed`, `focus-visible`).
- 4 mini-gauges for the per-hazard sub-scores, each with a tooltip explaining what drives it.
- D5 §3.7.2 R1-R4 indicator badges (highlight when fired).
- Demo scenarios dropdown (Genting · Cameron · Kinabalu · Everest · Singapore).
- Loading spinner during in-flight requests.
- Toast notification for errors and "no model loaded" warnings.
- Map layer switcher: Dark base + Topographic option.
- Bilingual EN/ZH UI persisted across reloads.
Added - DevOps / Reproducibility
- GitHub Actions CI (`.github/workflows/ci.yml`): pytest matrix on Python 3.9 / 3.11 / 3.12, ruff lint, coverage XML artefact, plus a Docker image-build smoke test with Buildx + GHA cache.
- Multi-stage Dockerfile: builder stage for wheels, slim runtime with a non-root `mcx` user, baked-in HEALTHCHECK against `/api/health`.
- `docker-compose.yml` with a named data volume.
- `Makefile`: single-word recipes for `install`, `test`, `lint`, `run`, `synth`, `preprocess`, `train`, `evaluate`, `docker`, `clean`.
- `requirements-dev.txt`: split dev tooling (pytest-cov, ruff, respx, matplotlib) from runtime requirements.
- `pyproject.toml`: ruff configuration + pytest config.
- `.pre-commit-config.yaml`: trailing-whitespace, end-of-file, YAML/JSON/TOML checks, large-file guard, ruff lint + format.
- `.dockerignore` keeping the image lean.
Added - Documentation
- `docs/architecture.md`: P4.1 → P4.6 internal flow + dominant-hazard formula rationale.
- `docs/thresholds.md`: every threshold cited; new §8-§12 for the four hazard categories, R1-R4 table, and activity-weight matrix.
- `docs/dataset.md`: formal target definition (`is_rain_event`) and train/test split rationale.
Changed
- Rainfall sub-scorer calibration: a 45 % macro probability now lands at ~40 (Caution band), matching the proposal's intent.
- Composite-score formula switched from naive arithmetic mean to dominant-hazard + secondary to avoid mean dilution.
- Cache key now incorporates `activity`: different weights produce a different composite, so activities must not share a cache slot.
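To make the cache-key change concrete, here is a minimal sketch of a key that includes the activity. The function name, the coordinate rounding, and the key layout are all assumptions; the changelog only states that `activity` is now part of the key.

```python
def cache_key(lat: float, lon: float, activity: str, precision: int = 3) -> str:
    """Build a cache key from rounded coordinates plus the activity.

    Hypothetical layout: "<lat>:<lon>:<activity>".
    """
    return f"{lat:.{precision}f}:{lon:.{precision}f}:{activity}"


# Same location, different activities: the keys must differ so that a
# hiker's composite score is never served to a driver.
hiker_key = cache_key(3.4231, 101.7932, "hiker")
driver_key = cache_key(3.4231, 101.7932, "driver")
print(hiker_key, driver_key)
```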
Fixed
- `tenacity.RetryError` from the retry decorator was not caught by the `except httpx.HTTPError` clause, producing a misleading 500. Now caught alongside `httpx.HTTPError` and `ValueError`, returning a clean 502.
[0.2.0] - 2026-05-11
Initial D5-alignment pass; see commit 55fd759.
[0.1.0] - 2026-05-11
Project scaffolding and Hybrid Engine v1; see commits b218f5b through 4639890.