bitsofchris Claude Opus 4.7 (1M context) committed on
Commit 40e92a0 · 1 Parent(s): 9e5b157

docs: document the hybrid cadence (5-min display, hourly inference)

Update the Toto inference spec to reflect the current setup: the chart
shows 5-min Ecowitt actuals from the local SQLite archive, but the
model itself sees the same series resampled to hourly. Drop the stale
dropdown references: the cadence and horizon knobs were removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Files changed (1)
  1. docs/toto-inference.md +13 -19
docs/toto-inference.md CHANGED
@@ -26,9 +26,10 @@ weakest model in the Toto-2.0 family doing something useful zero-shot; the
  | Station | Ecowitt GW3000B, Westhampton Beach NY |
  | Channels forecasted | `outdoor.temperature` (°F), `outdoor.humidity` (%), `pressure.relative` (inHg), `rainfall_piezo.rain_rate` (in/hr) |
  | Native storage cadence | **5 min** at `cycle_type=5min` (the device is configured to upload at ~1-min intervals; Ecowitt buckets to 5 min for the 90-day tier). Earlier defaults of 30 min were the device's out-of-box upload schedule. |
- | `cycle_type` requested | depends on the **Display cadence** dropdown: `30min` (default, resampled to 1 h or 30 min) or `4hour` |
- | History window pulled | 7 days for `30min` cycle, 30 days for `4hour` cycle |
- | Resampling | pandas `df.resample(R).mean()` where R matches the dropdown (`1h` / `30min` / `4h`) |
  | Cleaning | `Series.interpolate(limit_direction="both")` fills resample gaps before the tensor goes to Toto |
  | NWS comparison | `https://api.weather.gov/points/{lat},{lon}` → `forecastHourly` (point forecast, no distribution) |
 
@@ -48,9 +49,10 @@ else:
  target_mask = [False]*pad + [True]*n_raw # tell Toto to ignore the padded steps
  ```
 
- With ~10 days of station history and an hourly resample, this gives a
- context of ~160 hourly points (5 patches). On the `4hour` cycle the
- station's short history means we often hit the pad path.
 
  ## Tensor shape
 
@@ -68,19 +70,11 @@ gets noisier and the post hook is easier to read one metric at a time.
 
  `horizon_steps = round(horizon_hours / step_hours)` where:
 
- | Display cadence | `step_hours` |
- |---|---|
- | Hourly | 1.0 |
- | 30-min | 0.5 |
- | 4-hour | 4.0 |
-
- | Horizon dropdown | `horizon_hours` |
- |---|---|
- | 24 h (default) | 24 |
- | 48 h | 48 |
- | 72 h | 72 |
-
- So default: 24 hourly steps. 4-hour cycle × 72 h = 18 steps (smallest). 30-min × 72 h = 144 steps (largest typical).
 
  ## Distribution → quantiles
 
 
  | Station | Ecowitt GW3000B, Westhampton Beach NY |
  | Channels forecasted | `outdoor.temperature` (°F), `outdoor.humidity` (%), `pressure.relative` (inHg), `rainfall_piezo.rain_rate` (in/hr) |
  | Native storage cadence | **5 min** at `cycle_type=5min` (the device is configured to upload at ~1-min intervals; Ecowitt buckets to 5 min for the 90-day tier). Earlier defaults of 30 min were the device's out-of-box upload schedule. |
+ | `cycle_type` requested | `5min`, the finest tier the API exposes. The data lives in `data/ecowitt.db` (synced incrementally every 15 min); each refresh reads from the archive rather than hitting Ecowitt live. |
+ | History window pulled from the archive | 7 days |
+ | Resampling for the chart | `df.resample("5min").mean()` for a fine-grained display |
+ | Resampling for Toto inference | `df.resample("1h").mean()`, a coarser series so the 4M model receives a 168-point context + 48-step horizon, the regime where it forecasts cleanly. Decoupling the chart cadence from the model cadence keeps the visual fine and the model output honest. |
  | Cleaning | `Series.interpolate(limit_direction="both")` fills resample gaps before the tensor goes to Toto |
  | NWS comparison | `https://api.weather.gov/points/{lat},{lon}` → `forecastHourly` (point forecast, no distribution) |
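
The split between chart cadence and model cadence in the rows above can be illustrated without the real archive. The sketch below is stdlib-only and runs on synthetic readings; `hourly_mean` is a hypothetical helper that mimics what `df.resample("1h").mean()` produces for a gap-free series (hourly buckets labelled by the start of the hour). The start time and values are made up, and none of this is the project's own code.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import fmean


def hourly_mean(samples):
    """Average (timestamp, value) pairs into buckets labelled by the start of each hour."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts.replace(minute=0, second=0, microsecond=0)].append(value)
    return [(hour, fmean(vals)) for hour, vals in sorted(buckets.items())]


# Synthetic stand-in for 7 days of 5-min readings (assumed start time and values).
start = datetime(2025, 1, 1)
chart_series = [
    (start + timedelta(minutes=5 * i), 40.0 + (i % 12))
    for i in range(7 * 24 * 12)
]

# The chart keeps every 5-min point; the model only sees the hourly means.
model_series = hourly_mean(chart_series)

print(len(chart_series))  # number of points drawn on the chart
print(len(model_series))  # number of hourly points available to the model
```

One difference from pandas is worth keeping in mind: `resample` emits a NaN row for an hour with no readings, which is what the interpolation step then fills, whereas this helper simply omits empty hours.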
 
 
  target_mask = [False]*pad + [True]*n_raw # tell Toto to ignore the padded steps
  ```
 
+ With ~7 days of archive history and the hourly resample we use for
+ inference, this gives a context of 160 hourly points (5 patches). The
+ chart shows the same 7 days at 5-min cadence (≈2 016 points), but
+ those raw points only feed the chart, not the model.
 
  ## Tensor shape
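
The pad path referenced in this hunk can be sketched as follows. Only the `target_mask` expression comes from the spec; `build_context`, the default context length of 160 (taken from the "5 patches" figure above) and the choice to pad with the earliest observed value are assumptions made for illustration.

```python
def build_context(values, context_len=160):
    """Return (series, target_mask), each exactly context_len long.

    Long histories keep only the most recent context_len points; short
    histories are left-padded and the padded steps are masked out.
    """
    if not values:
        raise ValueError("need at least one observation")
    raw = list(values)[-context_len:]      # most recent points only
    n_raw = len(raw)
    pad = context_len - n_raw              # 0 when history is long enough
    series = [raw[0]] * pad + raw          # assumed padding: repeat earliest value
    target_mask = [False] * pad + [True] * n_raw  # False marks padded steps
    return series, target_mask


# A full week of hourly points is trimmed to the latest 160 under this assumption.
week, week_mask = build_context(list(range(168)))

# A short history of 100 points takes the pad path: 60 masked steps up front.
short, short_mask = build_context([1.0] * 100)
```

Under these assumptions a 168-point week never hits the pad path, while anything shorter than 160 hourly points does.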
 
 
 
  `horizon_steps = round(horizon_hours / step_hours)` where:
 
+ `step_hours` is the **forecast cadence** (1 h), not the chart cadence
+ (5 min). `horizon_hours = 48`, so `horizon_steps = 48`. The 48 hourly
+ predictions get drawn on top of the 5-min historical line; the
+ forecast line is therefore sparser than the actuals, but visually
+ continuous because Plotly connects the anchor points.
 
  ## Distribution → quantiles
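
As a worked example of the horizon arithmetic in this hunk, the sketch below evaluates the spec's formula and lays out the hourly anchor timestamps that the forecast would occupy. The formula is from the spec; `forecast_times` is a hypothetical helper, and it assumes the first forecast step lands one step after the last observation.

```python
from datetime import datetime, timedelta


def horizon_steps(horizon_hours, step_hours):
    """Number of forecast steps, as given in the spec."""
    return round(horizon_hours / step_hours)


def forecast_times(last_observed, horizon_hours=48, step_hours=1.0):
    """Timestamps of the forecast anchors (assumed to start one step after the last observation)."""
    n = horizon_steps(horizon_hours, step_hours)
    return [last_observed + timedelta(hours=step_hours * (i + 1)) for i in range(n)]


# Current setup: 48 h horizon at the 1 h forecast cadence.
anchors = forecast_times(datetime(2025, 1, 8, 0, 0))
print(horizon_steps(48, 1.0), anchors[0], anchors[-1])

# The removed dropdown combinations, for comparison.
print(horizon_steps(72, 4.0), horizon_steps(72, 0.5))
```

Each hour of history holds twelve 5-min points while each forecast hour holds one anchor, which is why the forecast line is sparser than the actuals. Python's `round` rounds exact halves to the nearest even integer, which does not matter for the whole-number ratios used here.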