| --- |
| license: mit |
| tags: |
| - forecast |
| - weather |
| - lstm |
| - classification |
| - regression |
| - weather-forecast |
| - multitask |
| - harley-ml |
| --- |
| |
| # Hweh-6M |
|
|
| ## Summary |
|
|
| - Task: Weather Forecasting |
| - Inputs: 72-hour time series |
| - Outputs: 12-hour multivariate forecast |
| - Params: 6M |
| - Framework: PyTorch |
|
|
| Author: Paul Courneya (Harley-ml) |
|
|
| ## Description |
|
|
| **Hweh-6M** is a **6-million-parameter LSTM model** trained to predict the next **12 hours of weather**, including temperature, humidity, pressure, precipitation, and more, using the previous **72 hours of weather context**. |
| We recommend using it as a fallback for a weather API or for offline forecasting when internet access is unavailable. |
| Its primary purpose, however, is to serve as a teacher model for [Hweh-446k](https://huggingface.co/Harley-ml/Hweh-446k). |
|
|
| We would also like to give a shoutout to [**Open-Meteo**](https://open-meteo.com/) for providing a **free-to-use weather forecasting API**. |
|
|
| ### Why “Hweh”? |
|
|
| In Proto-Indo-European, the root ***h₂weh₁-** means “to blow.” We chose it as the name for a weather forecasting model because of its connection to wind and air. |
| |
| ## Architecture |
| |
| The model uses a multitask LSTM setup: |
| |
| | Parameter | Value | |
| | ----------------------- | ---------------------------------------------- | |
| | `input_dim` | `22` | |
| | `seq_len` | `72` | |
| | `num_predict` | `12` | |
| | `hidden_dim` | `384` | |
| | `num_layers` | `6` | |
| | `dropout` | `0.1` | |
| | `encoder_type` | `lstm` | |
| | `num_locations` | `82` | |
| | `location_emb_dim` | `32` | |
| | `num_weather_classes` | `7` | |
| |
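As a rough sanity check on the 6M figure, the hyperparameters in the table can be plugged into the standard LSTM parameter formula. The sketch below assumes a plain unidirectional stacked LSTM with the PyTorch `nn.LSTM` parameter layout (two bias vectors per layer), and it assumes the location embedding is concatenated to every input step. Neither detail is confirmed by this card, the output heads are not counted, and `lstm_param_count` is a hypothetical helper name.

```python
def lstm_param_count(input_dim: int, hidden_dim: int, num_layers: int) -> int:
    """Parameter count of a unidirectional stacked LSTM (nn.LSTM-style layout)."""
    total = 0
    for layer in range(num_layers):
        in_dim = input_dim if layer == 0 else hidden_dim
        # 4 gates, each with input weights, recurrent weights, and two bias vectors.
        total += 4 * hidden_dim * (in_dim + hidden_dim + 2)
    return total


# Values taken from the architecture table.
INPUT_DIM, HIDDEN_DIM, NUM_LAYERS = 22, 384, 6
NUM_LOCATIONS, LOCATION_EMB_DIM = 82, 32

# Assumption: the location embedding is concatenated to every time step.
encoder_params = lstm_param_count(INPUT_DIM + LOCATION_EMB_DIM, HIDDEN_DIM, NUM_LAYERS)
embedding_params = NUM_LOCATIONS * LOCATION_EMB_DIM

print(f"LSTM encoder:       {encoder_params:,}")
print(f"Location embedding: {embedding_params:,}")
```

Under these assumptions the encoder alone accounts for roughly 6.6 million parameters, which is broadly consistent with the model's name.
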
| ## Training |
| |
| We trained Hweh-6M for one epoch on 4.06 million rows of weather data from 82 locations, using a batch size of 16 with 5 gradient-accumulation steps (an effective batch size of 80). Training ran for 6 hours on an RTX 2060 6GB GPU. |
| |
| ### Input Features |
| |
| 1. `temperature_2m_norm` |
| 2. `relative_humidity_2m_norm` |
| 3. `apparent_temperature_norm` |
| 4. `precipitation_log_norm` |
| 5. `sea_level_pressure_norm` |
| 6. `surface_pressure_norm` |
| 7. `cloud_cover_total_norm` |
| 8. `visibility_norm` |
| 9. `wind_speed_10m_norm` |
| 10. `wind_direction_10m_sin` |
| 11. `wind_direction_10m_cos` |
| 12. `hour_sin` |
| 13. `hour_cos` |
| 14. `day_of_year_sin` |
| 15. `day_of_year_cos` |
| 16. `weather_code_onehot_clear` |
| 17. `weather_code_onehot_cloudy` |
| 18. `weather_code_onehot_fog` |
| 19. `weather_code_onehot_drizzle` |
| 20. `weather_code_onehot_rain` |
| 21. `weather_code_onehot_snow` |
| 22. `weather_code_onehot_thunderstorm` |
| |
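The paired `_sin`/`_cos` features (hour, day of year, wind direction) look like the usual sine/cosine encoding of a periodic quantity. Here is a minimal sketch of that encoding, assuming periods of 24 hours, 365.25 days, and 360 degrees. The exact periods and offsets used in training are not stated in this card, and `cyclical`, `time_features`, and `wind_direction_features` are hypothetical helper names.

```python
import math
from datetime import datetime, timezone


def cyclical(value: float, period: float) -> tuple[float, float]:
    """Map a periodic value onto the unit circle as (sin, cos)."""
    angle = 2.0 * math.pi * (value / period)
    return math.sin(angle), math.cos(angle)


def time_features(ts: datetime) -> dict[str, float]:
    # Assumed periods: 24 h for hour of day, 365.25 d for day of year.
    hour_sin, hour_cos = cyclical(ts.hour, 24.0)
    doy_sin, doy_cos = cyclical(ts.timetuple().tm_yday, 365.25)
    return {
        "hour_sin": hour_sin,
        "hour_cos": hour_cos,
        "day_of_year_sin": doy_sin,
        "day_of_year_cos": doy_cos,
    }


def wind_direction_features(degrees: float) -> dict[str, float]:
    # Wind direction wraps at 360 degrees.
    wd_sin, wd_cos = cyclical(degrees % 360.0, 360.0)
    return {"wind_direction_10m_sin": wd_sin, "wind_direction_10m_cos": wd_cos}


features = time_features(datetime(2026, 5, 8, 6, 0, tzinfo=timezone.utc))
features.update(wind_direction_features(180.0))
for name, value in features.items():
    print(f"{name}: {value:+.4f}")
```

Encoding each quantity as a sine/cosine pair keeps neighbouring values such as 23:00 and 00:00, or 359° and 0°, close together, which a raw number would not.
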
| ### Output Features |
| |
| 1. `y_temp_c`: continuous regression |
| 2. `y_humidity`: continuous regression |
| 3. `y_apparent_temperature`: continuous regression |
| 4. `y_precipitation_mm`: continuous regression |
| 5. `y_sea_level_pressure_hpa`: continuous regression |
| 6. `y_surface_pressure_hpa`: continuous regression |
| 7. `y_cloud_cover_total`: continuous regression |
| 8. `y_wind_speed_10m`: continuous regression |
| 9. `y_wind_direction_sin`: continuous regression |
| 10. `y_wind_direction_cos`: continuous regression |
| 11. `y_rain_prob`: binary classification |
| 12. `y_weather_class`: multiclass classification |
| |
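Because wind direction is predicted as a sine/cosine pair, an angle has to be recovered after inference. One common way to do this is `atan2`, shown here as a sketch rather than the author's actual post-processing; `decode_wind_direction` is a hypothetical helper name.

```python
import math


def decode_wind_direction(sin_value: float, cos_value: float) -> float:
    """Recover a bearing in degrees within [0, 360) from a (sin, cos) pair."""
    # atan2 tolerates pairs that are not exactly unit length, which is
    # typical for raw regression outputs.
    bearing = math.degrees(math.atan2(sin_value, cos_value)) % 360.0
    # Guard against floating-point wrap-around to exactly 360.0.
    return 0.0 if bearing >= 360.0 else bearing


for pair in [(1.0, 0.0), (0.0, -1.0), (-0.9, 0.1)]:
    print(pair, "->", round(decode_wind_direction(*pair), 1))
```

Using `atan2` instead of `asin` or `acos` alone resolves the quadrant, since it uses the signs of both components.
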
| ### Training Results |
| |
| #### Training & Evaluation Metrics |
| |
| | Step | Train Loss | Eval Loss | Weather Acc | Rain Acc | Rain Recall | Weather Recall | |
| | ---: | ---------: | --------: | ----------: | -------: | ----------: | -------------: | |
| | 1k | 2.7015 | 2.8821 | 0.4722 | 0.7193 | 0.7257 | 0.4722 | |
| | 5k | 1.7696 | 2.0623 | 0.6215 | 0.7580 | 0.8089 | 0.6215 | |
| | 10k | 1.6569 | 1.9705 | 0.6129 | 0.7791 | 0.7980 | 0.6129 | |
| | 15k | 1.5844 | 1.9335 | 0.6385 | 0.7679 | 0.8341 | 0.6385 | |
| | 20k | 1.5450 | 1.8958 | 0.6440 | 0.7497 | 0.8515 | 0.6440 | |
| | 25k | 1.4907 | 1.9510 | 0.6474 | 0.7311 | 0.8715 | 0.6474 | |
| | 30k | 1.4338 | 1.9148 | 0.6460 | 0.7490 | 0.8563 | 0.6460 | |
| | 35k | 1.4063 | 1.8880 | 0.6295 | 0.7603 | 0.8439 | 0.6295 | |
| |
| #### Regression Error Metrics (MAE) |
| |
| Step | Apparent (°C) | Cloud (%) | Humidity (%) | Precip (mm) | Sea Level P (hPa) | Surface P (hPa) | Temp (°C) | Wind (km/h) |
| ---: | ------------: | --------: | -----------: | ----------: | ----------------: | --------------: | --------: | ----------: |
| | 1k | 5.4453 | 30.3414 | 15.5881 | 0.1139 | 4.1773 | 41.9909 | 4.3708 | 4.5847 | |
| | 5k | 2.4863 | 26.2320 | 9.4308 | 0.1066 | 3.8299 | 20.8856 | 2.1196 | 3.8560 | |
| | 10k | 2.2052 | 25.7813 | 8.4266 | 0.1041 | 3.4667 | 13.8565 | 1.8603 | 3.4045 | |
| | 15k | 2.1536 | 25.3087 | 8.3589 | 0.1055 | 3.5594 | 11.4830 | 1.8040 | 3.3302 | |
| | 20k | 2.0494 | 25.2389 | 7.8019 | 0.1031 | 3.1997 | 9.6739 | 1.7137 | 3.1650 | |
| | 25k | 1.9981 | 24.6532 | 7.7528 | 0.1112 | 3.2609 | 9.9026 | 1.6572 | 3.2015 | |
| | 30k | 1.9257 | 24.7014 | 7.5697 | 0.1059 | 3.1229 | 8.3426 | 1.5907 | 3.1048 | |
| | 35k | 1.8935 | 24.6961 | 7.4958 | 0.1063 | 3.0854 | 8.6729 | 1.5619 | 3.0681 | |
| |
| ## Generation Examples |
| |
| | ID | Class | |
| | -- | ------------ | |
| | 0 | clear | |
| | 1 | cloudy | |
| | 2 | fog | |
| | 3 | drizzle | |
| | 4 | rain | |
| | 5 | snow | |
| | 6 | thunderstorm | |
| |
| City=Seattle |
| ```json |
| { |
| "city": "Seattle", |
| "location_id": "1", |
| "model_location_id": 0, |
| "data_source": "open-meteo forecast api (past-hours context only)", |
| "requested_at_utc": "2026-05-08T10:50:59.201916+00:00", |
| "context": { |
| "hours": 72, |
| "start_utc": "2026-05-05T10:00:00+00:00", |
| "end_utc": "2026-05-08T09:00:00+00:00", |
| "start_local": "[REDACTED]", |
| "end_local": "[REDACTED]" |
| }, |
| "model": { |
| "encoder_type": "lstm", |
| "seq_len": 72, |
| "input_dim": 22, |
| "num_weather_classes": 7 |
| }, |
| "forecast": [ |
| { |
| "lead_hours": 1, |
| "target_utc": "2026-05-08T10:00:00+00:00", |
| "target_local": "[REDACTED]", |
| "temperature_2m_c": 10.081672668457031, |
| "relative_humidity_2m_pct": 83.57363891601562, |
| "apparent_temperature_c": 8.784963607788086, |
| "precipitation_mm": 0.0014822654193267226, |
| "pressure_msl_hpa": 1019.1531372070312, |
| "surface_pressure_hpa": 1010.2232666015625, |
| "cloud_cover_pct": 23.58745574951172, |
| "wind_speed_10m_kmh": 5.772420883178711, |
| "rain_probability": 0.0017376710893586278, |
| "weather_class": 0, |
| "weather_class_name": "class_0", |
| "weather_class_probabilities": { |
| "class_0": 0.570536732673645, |
| "class_1": 0.40933191776275635, |
| "class_2": 0.019639208912849426, |
| "class_3": 0.00027382391272112727, |
| "class_4": 0.00020224291074555367, |
| "class_5": 1.215106385643594e-05, |
| "class_6": 3.905456196662271e-06 |
| } |
| }, |
| { |
| "lead_hours": 2, |
| "target_utc": "2026-05-08T11:00:00+00:00", |
| "target_local": "[REDACTED]", |
| "temperature_2m_c": 10.004827499389648, |
| "relative_humidity_2m_pct": 84.41477966308594, |
| "apparent_temperature_c": 8.68188762664795, |
| "precipitation_mm": 0.00013178338122088462, |
| "pressure_msl_hpa": 1018.954345703125, |
| "surface_pressure_hpa": 1010.0797729492188, |
| "cloud_cover_pct": 30.078432083129883, |
| "wind_speed_10m_kmh": 5.812187194824219, |
| "rain_probability": 0.004383997060358524, |
| "weather_class": 1, |
| "weather_class_name": "class_1", |
| "weather_class_probabilities": { |
| "class_0": 0.4759916663169861, |
| "class_1": 0.4977237284183502, |
| "class_2": 0.024964570999145508, |
| "class_3": 0.0006970795802772045, |
| "class_4": 0.0005866154097020626, |
| "class_5": 3.2389318221248686e-05, |
| "class_6": 3.920148628822062e-06 |
| } |
| }, |
| { |
| "lead_hours": 3, |
| "target_utc": "2026-05-08T12:00:00+00:00", |
| "target_local": "[REDACTED]" |
| } |
| ], |
| "sanity": { |
| "sequence_shape": [ |
| 72, |
| 22 |
| ], |
| "finite_features": true |
| } |
| } |
| ``` |
| |
| City=Nuuk |
| ```json |
| { |
| "city": "Nuuk", |
| "location_id": "83", |
| "model_location_id": 0, |
| "data_source": "open-meteo forecast api (past-hours context only)", |
| "requested_at_utc": "2026-05-08T10:59:51.127984+00:00", |
| "context": { |
| "hours": 72, |
| "start_utc": "2026-05-05T10:00:00+00:00", |
| "end_utc": "2026-05-08T09:00:00+00:00", |
| "start_local": "2026-05-05T09:00:00-01:00", |
| "end_local": "2026-05-08T08:00:00-01:00" |
| }, |
| "model": { |
| "encoder_type": "lstm", |
| "seq_len": 72, |
| "input_dim": 22, |
| "num_weather_classes": 7 |
| }, |
| "forecast": [ |
| { |
| "lead_hours": 1, |
| "target_utc": "2026-05-08T10:00:00+00:00", |
| "target_local": "2026-05-08T09:00:00-01:00", |
| "temperature_2m_c": 3.745473861694336, |
| "relative_humidity_2m_pct": 87.24557495117188, |
| "apparent_temperature_c": -1.1178970336914062, |
| "precipitation_mm": 0.9192219972610474, |
| "pressure_msl_hpa": 999.2293090820312, |
| "surface_pressure_hpa": 984.041015625, |
| "cloud_cover_pct": 100.0, |
| "wind_speed_10m_kmh": 22.042539596557617, |
| "rain_probability": 0.9964759945869446, |
| "weather_class": 5, |
| "weather_class_name": "class_5", |
| "weather_class_probabilities": { |
| "class_0": 9.087142825592309e-05, |
| "class_1": 0.011110298335552216, |
| "class_2": 0.0009554459829814732, |
| "class_3": 0.28936928510665894, |
| "class_4": 0.19745060801506042, |
| "class_5": 0.5009360313415527, |
| "class_6": 8.744438673602417e-05 |
| } |
| }, |
| { |
| "lead_hours": 2, |
| "target_utc": "2026-05-08T11:00:00+00:00", |
| "target_local": "2026-05-08T10:00:00-01:00", |
| "temperature_2m_c": 3.6379480361938477, |
| "relative_humidity_2m_pct": 87.90388488769531, |
| "apparent_temperature_c": -1.197652816772461, |
| "precipitation_mm": 0.8211548924446106, |
| "pressure_msl_hpa": 998.41796875, |
| "surface_pressure_hpa": 983.3368530273438, |
| "cloud_cover_pct": 100.0, |
| "wind_speed_10m_kmh": 21.754901885986328, |
| "rain_probability": 0.9918462634086609, |
| "weather_class": 5, |
| "weather_class_name": "class_5", |
| "weather_class_probabilities": { |
| "class_0": 0.0002011256874538958, |
| "class_1": 0.021235737949609756, |
| "class_2": 0.0012973389821127057, |
| "class_3": 0.20620499551296234, |
| "class_4": 0.25035062432289124, |
| "class_5": 0.5206562280654907, |
| "class_6": 5.390366641222499e-05 |
| } |
| }, |
| { |
| "lead_hours": 3, |
| "target_utc": "2026-05-08T12:00:00+00:00", |
| "target_local": "2026-05-08T11:00:00-01:00", |
| "temperature_2m_c": 3.482311248779297, |
| "relative_humidity_2m_pct": 88.61299896240234, |
| "apparent_temperature_c": -1.3543472290039062, |
| "precipitation_mm": 0.7267112731933594, |
| "pressure_msl_hpa": 997.7637939453125, |
| "surface_pressure_hpa": 982.8118286132812, |
| "cloud_cover_pct": 100.0, |
| "wind_speed_10m_kmh": 21.31927490234375, |
| "rain_probability": 0.9851851463317871, |
| "weather_class": 5, |
| "weather_class_name": "class_5", |
| "weather_class_probabilities": { |
| "class_0": 0.0003230531292501837, |
| "class_1": 0.030711255967617035, |
| "class_2": 0.0014986724127084017, |
| "class_3": 0.17889709770679474, |
| "class_4": 0.2378082126379013, |
| "class_5": 0.5507404208183289, |
| "class_6": 2.121348188666161e-05 |
| } |
| }, |
| { |
| "lead_hours": 4, |
| "target_utc": "2026-05-08T13:00:00+00:00", |
| "target_local": "2026-05-08T12:00:00-01:00", |
| "temperature_2m_c": 3.324540138244629, |
| "relative_humidity_2m_pct": 89.35970306396484, |
| "apparent_temperature_c": -1.5628299713134766, |
| "precipitation_mm": 0.6503503322601318, |
| "pressure_msl_hpa": 997.3221435546875, |
| "surface_pressure_hpa": 982.2531127929688, |
| "cloud_cover_pct": 100.0, |
| "wind_speed_10m_kmh": 20.908214569091797, |
| "rain_probability": 0.9797365069389343, |
| "weather_class": 5, |
| "weather_class_name": "class_5", |
| "weather_class_probabilities": { |
| "class_0": 0.0005368430283851922, |
| "class_1": 0.036228444427251816, |
| "class_2": 0.00266513810493052, |
| "class_3": 0.1584056168794632, |
| "class_4": 0.2387750893831253, |
| "class_5": 0.5633592009544373, |
| "class_6": 2.9700731829507276e-05 |
| } |
| }, |
| { |
| "lead_hours": 5, |
| "target_utc": "2026-05-08T14:00:00+00:00", |
| "target_local": "2026-05-08T13:00:00-01:00", |
| "temperature_2m_c": 3.088955879211426, |
| "relative_humidity_2m_pct": 90.08441162109375, |
| "apparent_temperature_c": -1.7932510375976562, |
| "precipitation_mm": 0.5726789832115173, |
| "pressure_msl_hpa": 997.1259155273438, |
| "surface_pressure_hpa": 982.1145629882812, |
| "cloud_cover_pct": 100.0, |
| "wind_speed_10m_kmh": 20.37297821044922, |
| "rain_probability": 0.9752851724624634, |
| "weather_class": 5, |
| "weather_class_name": "class_5", |
| "weather_class_probabilities": { |
| "class_0": 0.0007767326897010207, |
| "class_1": 0.04325678199529648, |
| "class_2": 0.0034333993680775166, |
| "class_3": 0.15728847682476044, |
| "class_4": 0.24588856101036072, |
| "class_5": 0.5493185520172119, |
| "class_6": 3.7467838410520926e-05 |
| } |
| }, |
| { |
| "lead_hours": 6, |
| "target_utc": "2026-05-08T15:00:00+00:00", |
| "target_local": "2026-05-08T14:00:00-01:00", |
| "temperature_2m_c": 2.8550186157226562, |
| "relative_humidity_2m_pct": 90.83024597167969, |
| "apparent_temperature_c": -1.9607181549072266, |
| "precipitation_mm": 0.4950953722000122, |
| "pressure_msl_hpa": 997.0792236328125, |
| "surface_pressure_hpa": 981.837646484375, |
| "cloud_cover_pct": 100.0, |
| "wind_speed_10m_kmh": 19.884090423583984, |
| "rain_probability": 0.9711479544639587, |
| "weather_class": 5, |
| "weather_class_name": "class_5", |
| "weather_class_probabilities": { |
| "class_0": 0.001066686469130218, |
| "class_1": 0.05448324978351593, |
| "class_2": 0.003781423671171069, |
| "class_3": 0.15267883241176605, |
| "class_4": 0.23838046193122864, |
| "class_5": 0.5495800971984863, |
| "class_6": 2.9315853680600412e-05 |
| } |
| }, |
| { |
| "lead_hours": 7, |
| "target_utc": "2026-05-08T16:00:00+00:00", |
| "target_local": "2026-05-08T15:00:00-01:00", |
| "temperature_2m_c": 2.6384010314941406, |
| "relative_humidity_2m_pct": 91.38716888427734, |
| "apparent_temperature_c": -2.114431381225586, |
| "precipitation_mm": 0.43851515650749207, |
| "pressure_msl_hpa": 997.214111328125, |
| "surface_pressure_hpa": 981.5133666992188, |
| "cloud_cover_pct": 100.0, |
| "wind_speed_10m_kmh": 19.454288482666016, |
| "rain_probability": 0.9665488600730896, |
| "weather_class": 5, |
| "weather_class_name": "class_5", |
| "weather_class_probabilities": { |
| "class_0": 0.0014152604853734374, |
| "class_1": 0.059757016599178314, |
| "class_2": 0.0037699665408581495, |
| "class_3": 0.1557641476392746, |
| "class_4": 0.22861963510513306, |
| "class_5": 0.550651490688324, |
| "class_6": 2.2469554096460342e-05 |
| } |
| }, |
| { |
| "lead_hours": 8, |
| "target_utc": "2026-05-08T17:00:00+00:00", |
| "target_local": "2026-05-08T16:00:00-01:00", |
| "temperature_2m_c": 2.4830856323242188, |
| "relative_humidity_2m_pct": 91.72871398925781, |
| "apparent_temperature_c": -2.212369918823242, |
| "precipitation_mm": 0.38016656041145325, |
| "pressure_msl_hpa": 997.3843994140625, |
| "surface_pressure_hpa": 981.6067504882812, |
| "cloud_cover_pct": 100.0, |
| "wind_speed_10m_kmh": 19.01665496826172, |
| "rain_probability": 0.9600462913513184, |
| "weather_class": 5, |
| "weather_class_name": "class_5", |
| "weather_class_probabilities": { |
| "class_0": 0.001761714811436832, |
| "class_1": 0.06388058513402939, |
| "class_2": 0.005221065599471331, |
| "class_3": 0.13114923238754272, |
| "class_4": 0.21121768653392792, |
| "class_5": 0.5867400765419006, |
| "class_6": 2.95877835014835e-05 |
| } |
| }, |
| { |
| "lead_hours": 9, |
| "target_utc": "2026-05-08T18:00:00+00:00", |
| "target_local": "2026-05-08T17:00:00-01:00", |
| "temperature_2m_c": 2.3713502883911133, |
| "relative_humidity_2m_pct": 91.92076110839844, |
| "apparent_temperature_c": -2.2490768432617188, |
| "precipitation_mm": 0.3401757478713989, |
| "pressure_msl_hpa": 997.632568359375, |
| "surface_pressure_hpa": 981.5086059570312, |
| "cloud_cover_pct": 100.0, |
| "wind_speed_10m_kmh": 18.69092559814453, |
| "rain_probability": 0.9514879584312439, |
| "weather_class": 5, |
| "weather_class_name": "class_5", |
| "weather_class_probabilities": { |
| "class_0": 0.002726545324549079, |
| "class_1": 0.08241794258356094, |
| "class_2": 0.007858957163989544, |
| "class_3": 0.14714762568473816, |
| "class_4": 0.2103891372680664, |
| "class_5": 0.5494091510772705, |
| "class_6": 5.06224823766388e-05 |
| } |
| }, |
| { |
| "lead_hours": 10, |
| "target_utc": "2026-05-08T19:00:00+00:00", |
| "target_local": "2026-05-08T18:00:00-01:00", |
| "temperature_2m_c": 2.334117889404297, |
| "relative_humidity_2m_pct": 91.97515869140625, |
| "apparent_temperature_c": -2.216022491455078, |
| "precipitation_mm": 0.29920822381973267, |
| "pressure_msl_hpa": 997.88671875, |
| "surface_pressure_hpa": 981.637451171875, |
| "cloud_cover_pct": 100.0, |
| "wind_speed_10m_kmh": 18.297332763671875, |
| "rain_probability": 0.9422094821929932, |
| "weather_class": 5, |
| "weather_class_name": "class_5", |
| "weather_class_probabilities": { |
| "class_0": 0.0030131975654512644, |
| "class_1": 0.0824747234582901, |
| "class_2": 0.00828808918595314, |
| "class_3": 0.1445200890302658, |
| "class_4": 0.1672952026128769, |
| "class_5": 0.5943665504455566, |
| "class_6": 4.211435225442983e-05 |
| } |
| }, |
| { |
| "lead_hours": 11, |
| "target_utc": "2026-05-08T20:00:00+00:00", |
| "target_local": "2026-05-08T19:00:00-01:00", |
| "temperature_2m_c": 2.399325370788574, |
| "relative_humidity_2m_pct": 91.36509704589844, |
| "apparent_temperature_c": -2.106843948364258, |
| "precipitation_mm": 0.2678143382072449, |
| "pressure_msl_hpa": 998.099853515625, |
| "surface_pressure_hpa": 981.798583984375, |
| "cloud_cover_pct": 100.0, |
| "wind_speed_10m_kmh": 17.996307373046875, |
| "rain_probability": 0.9368607401847839, |
| "weather_class": 5, |
| "weather_class_name": "class_5", |
| "weather_class_probabilities": { |
| "class_0": 0.004000477027148008, |
| "class_1": 0.09458598494529724, |
| "class_2": 0.01128436904400587, |
| "class_3": 0.14569725096225739, |
| "class_4": 0.17799843847751617, |
| "class_5": 0.5663700699806213, |
| "class_6": 6.342450069496408e-05 |
| } |
| }, |
| { |
| "lead_hours": 12, |
| "target_utc": "2026-05-08T21:00:00+00:00", |
| "target_local": "2026-05-08T20:00:00-01:00", |
| "temperature_2m_c": 2.473114013671875, |
| "relative_humidity_2m_pct": 90.60014343261719, |
| "apparent_temperature_c": -1.94183349609375, |
| "precipitation_mm": 0.23492039740085602, |
| "pressure_msl_hpa": 998.2453002929688, |
| "surface_pressure_hpa": 981.8583374023438, |
| "cloud_cover_pct": 100.0, |
| "wind_speed_10m_kmh": 17.61905860900879, |
| "rain_probability": 0.9265281558036804, |
| "weather_class": 5, |
| "weather_class_name": "class_5", |
| "weather_class_probabilities": { |
| "class_0": 0.004572723060846329, |
| "class_1": 0.10134521126747131, |
| "class_2": 0.016071945428848267, |
| "class_3": 0.12629590928554535, |
| "class_4": 0.11316182464361191, |
| "class_5": 0.6384702920913696, |
| "class_6": 8.211767271859571e-05 |
| } |
| } |
| ], |
| "sanity": { |
| "sequence_shape": [ |
| 72, |
| 22 |
| ], |
| "finite_features": true |
| } |
| } |
| ``` |
| |
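The example outputs label classes generically (`class_0` to `class_6`). To attach the names from the ID table at the top of this section, a small lookup is enough. The sketch below is a hypothetical post-processing helper, not part of the published inference script; the sample probabilities are rounded from the first Nuuk forecast step.

```python
WEATHER_CLASS_NAMES = {
    0: "clear",
    1: "cloudy",
    2: "fog",
    3: "drizzle",
    4: "rain",
    5: "snow",
    6: "thunderstorm",
}


def name_weather_classes(step: dict) -> dict:
    """Return a copy of one forecast step with readable class names."""
    probs = {
        WEATHER_CLASS_NAMES[int(key.removeprefix("class_"))]: value
        for key, value in step["weather_class_probabilities"].items()
    }
    named = dict(step)  # shallow copy so the original step is left untouched
    named["weather_class_name"] = WEATHER_CLASS_NAMES[step["weather_class"]]
    named["weather_class_probabilities"] = probs
    return named


step = {
    "lead_hours": 1,
    "weather_class": 5,
    "weather_class_name": "class_5",
    "weather_class_probabilities": {
        "class_0": 0.0001, "class_1": 0.0111, "class_2": 0.0010,
        "class_3": 0.2894, "class_4": 0.1975, "class_5": 0.5009,
        "class_6": 0.0001,
    },
}
named = name_weather_classes(step)
print(named["weather_class_name"], named["weather_class_probabilities"])
```
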
| ### Note |
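| In spot checks, the predicted temperature is often within **0.3°C** of the observed value. This is anecdotal: the temperature MAE reported under Training Results is about 1.56 at step 35k. |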
|
|
| You can also pass locations that are not present in the model’s location embedding table. We’ve observed that the model generalizes to out-of-distribution (OOD) cities with an accuracy drop of roughly 2–5%. This figure is an informal estimate, not a measurement against ground truth. |
|
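In both generation examples `model_location_id` is 0: Seattle has `location_id` "1", and Nuuk's "83" falls outside the 82-entry embedding table. One way to get that behaviour is a lookup with a default index, sketched below. Both the zero-based mapping and the fallback to index 0 are inferred from those two examples only, so treat them as assumptions; `to_model_location_id` is a hypothetical helper name.

```python
NUM_LOCATIONS = 82  # size of the location embedding table (from the card)
FALLBACK_INDEX = 0  # assumption: index used for cities outside the table


def to_model_location_id(location_id: str) -> int:
    """Map a 1-based string location_id to a 0-based embedding index."""
    try:
        index = int(location_id) - 1
    except ValueError:
        return FALLBACK_INDEX
    return index if 0 <= index < NUM_LOCATIONS else FALLBACK_INDEX


for location_id in ["1", "9", "82", "83", "not-a-number"]:
    print(location_id, "->", to_model_location_id(location_id))
```

Under this scheme an unseen city reuses an existing embedding row, so any location-specific signal has to come from the 72-hour weather context.
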
|
| ## Use Cases |
|
|
| Intended for: |
|
|
| 1. Backup for a weather API |
| 2. Offline forecasting, provided you already have the 72-hour input data |
| 3. Research |
| 4. Or more simply, for fun |
|
|
| Not intended for: |
|
|
| 1. Safety-critical forecasting (aviation, emergency response) |
| 2. Replacing meteorological or API services |
|
|
| ## Limitations |
|
|
| 1. The model is not perfectly accurate and will produce approximate forecasts rather than exact real-world weather conditions. |
| 2. Prediction accuracy decreases as the forecast horizon grows toward the 12-hour maximum. |
| 3. Performance may degrade on unseen or underrepresented geographic regions and climate types. |
| 4. The model does not enforce physical laws of atmospheric dynamics and may produce physically inconsistent outputs. |
| 5. Forecast quality is sensitive to the quality and completeness of input weather data. |
| 6. Rare or extreme weather events are underrepresented in training data and may be poorly predicted. |
| 7. Weather class outputs are simplified and do not capture fine-grained meteorological distinctions. |
|
|
| ## Inference |
|
|
| ```python |
| #!/usr/bin/env python3 |
| from __future__ import annotations |
| |
| import json |
| import time |
| from pathlib import Path |
| from typing import Any |
| |
| import numpy as np |
| import pandas as pd |
| import requests |
| import torch |
| from transformers import AutoConfig, AutoModel |
| from zoneinfo import ZoneInfo |
| |
| # ---------------------------- |
| # Change these values here |
| # ---------------------------- |
| MODEL_ID = r"Harley-ml/Hweh-6M" # HF repo id or local path |
| CITY = "New York" |
| SEQUENCE_META_PATH = "Harley-ml/Hweh-6M/weather_sequences.metadata.json" |
| CONTEXT_HOURS = 72 |
| FORECAST_HOURS = 12 |
| DEVICE = None # "cpu", "cuda", "cuda:0", or None for auto |
| |
| API_BASE_URL = "https://api.open-meteo.com/v1/forecast" |
| MAX_RETRIES = 6 |
| REQUEST_TIMEOUT_S = 60 |
| |
| HOURLY_VARS = [ |
| "temperature_2m", |
| "relative_humidity_2m", |
| "apparent_temperature", |
| "precipitation", |
| "weather_code", |
| "pressure_msl", |
| "surface_pressure", |
| "cloud_cover", |
| "visibility", |
| "wind_speed_10m", |
| "wind_direction_10m", |
| ] |
| |
| WEATHER_CODE_BUCKETS = 7 |
| TEMP_SCALE = 50.0 |
| HUMIDITY_SCALE = 100.0 |
| WIND_SCALE = 100.0 |
| |
| # ---------------------------- |
| # City metadata (82 locations) |
| # ---------------------------- |
| CITY_SPECS: dict[str, dict[str, Any]] = { |
| "Seattle": {"location_id": "1", "latitude": 47.6062, "longitude": -122.3321, "continent": "North America", "climate_tag": "temperate_oceanic", "elevation": 56}, |
| "Portland": {"location_id": "2", "latitude": 45.5152, "longitude": -122.6784, "continent": "North America", "climate_tag": "temperate_oceanic", "elevation": 15}, |
| "San Francisco": {"location_id": "3", "latitude": 37.7749, "longitude": -122.4194, "continent": "North America", "climate_tag": "foggy_mediterranean", "elevation": 16}, |
| "Los Angeles": {"location_id": "4", "latitude": 34.0522, "longitude": -118.2437, "continent": "North America", "climate_tag": "sunny_mediterranean", "elevation": 71}, |
| "Denver": {"location_id": "5", "latitude": 39.7392, "longitude": -104.9903, "continent": "North America", "climate_tag": "semi_arid_highland", "elevation": 1609}, |
| "Chicago": {"location_id": "6", "latitude": 41.8781, "longitude": -87.6298, "continent": "North America", "climate_tag": "humid_continental", "elevation": 181}, |
| "Dallas": {"location_id": "7", "latitude": 32.7767, "longitude": -96.7970, "continent": "North America", "climate_tag": "hot_subhumid", "elevation": 131}, |
| "Atlanta": {"location_id": "8", "latitude": 33.7490, "longitude": -84.3880, "continent": "North America", "climate_tag": "humid_subtropical", "elevation": 320}, |
| "New York": {"location_id": "9", "latitude": 40.7128, "longitude": -74.0060, "continent": "North America", "climate_tag": "humid_subtropical", "elevation": 10}, |
| "Miami": {"location_id": "10", "latitude": 25.7617, "longitude": -80.1918, "continent": "North America", "climate_tag": "tropical_humid", "elevation": 2}, |
| "Phoenix": {"location_id": "11", "latitude": 33.4484, "longitude": -112.0740, "continent": "North America", "climate_tag": "hot_arid", "elevation": 331}, |
| "Salt Lake City": {"location_id": "12", "latitude": 40.7608, "longitude": -111.8910, "continent": "North America", "climate_tag": "semi_arid", "elevation": 1288}, |
| "Anchorage": {"location_id": "13", "latitude": 61.2181, "longitude": -149.9003, "continent": "North America", "climate_tag": "subarctic_snowy", "elevation": 31}, |
| "Minneapolis": {"location_id": "14", "latitude": 44.9778, "longitude": -93.2650, "continent": "North America", "climate_tag": "cold_snowy", "elevation": 264}, |
| "Toronto": {"location_id": "15", "latitude": 43.6532, "longitude": -79.3832, "continent": "North America", "climate_tag": "humid_continental", "elevation": 76}, |
| "Montreal": {"location_id": "16", "latitude": 45.5017, "longitude": -73.5673, "continent": "North America", "climate_tag": "cold_snowy", "elevation": 233}, |
| "Vancouver": {"location_id": "17", "latitude": 49.2827, "longitude": -123.1207, "continent": "North America", "climate_tag": "temperate_oceanic", "elevation": 70}, |
| "Mexico City": {"location_id": "18", "latitude": 19.4326, "longitude": -99.1332, "continent": "North America", "climate_tag": "highland_subtropical", "elevation": 2240}, |
| "Havana": {"location_id": "19", "latitude": 23.1136, "longitude": -82.3666, "continent": "North America", "climate_tag": "tropical_humid", "elevation": 59}, |
| "San Juan": {"location_id": "20", "latitude": 18.4655, "longitude": -66.1057, "continent": "North America", "climate_tag": "tropical_humid", "elevation": 8}, |
| |
| "Lima": {"location_id": "21", "latitude": -12.0464, "longitude": -77.0428, "continent": "South America", "climate_tag": "coastal_arid", "elevation": 154}, |
| "Santiago": {"location_id": "22", "latitude": -33.4489, "longitude": -70.6693, "continent": "South America", "climate_tag": "mediterranean", "elevation": 520}, |
| "Buenos Aires": {"location_id": "23", "latitude": -34.6037, "longitude": -58.3816, "continent": "South America", "climate_tag": "humid_subtropical", "elevation": 25}, |
| "Bogotá": {"location_id": "24", "latitude": 4.7110, "longitude": -74.0721, "continent": "South America", "climate_tag": "highland_cool", "elevation": 2640}, |
| "Quito": {"location_id": "25", "latitude": -0.1807, "longitude": -78.4678, "continent": "South America", "climate_tag": "highland_equatorial", "elevation": 2850}, |
| "Caracas": {"location_id": "26", "latitude": 10.4806, "longitude": -66.9036, "continent": "South America", "climate_tag": "tropical_humid", "elevation": 900}, |
| "Rio de Janeiro": {"location_id": "27", "latitude": -22.9068, "longitude": -43.1729, "continent": "South America", "climate_tag": "tropical_humid", "elevation": 5}, |
| "São Paulo": {"location_id": "28", "latitude": -23.5505, "longitude": -46.6333, "continent": "South America", "climate_tag": "humid_subtropical", "elevation": 760}, |
| "La Paz": {"location_id": "29", "latitude": -16.4897, "longitude": -68.1193, "continent": "South America", "climate_tag": "highland_cold", "elevation": 3640}, |
| "Cusco": {"location_id": "30", "latitude": -13.5319, "longitude": -71.9675, "continent": "South America", "climate_tag": "highland_cool", "elevation": 3399}, |
| "Montevideo": {"location_id": "31", "latitude": -34.9011, "longitude": -56.1645, "continent": "South America", "climate_tag": "temperate_oceanic", "elevation": 43}, |
| "Asunción": {"location_id": "32", "latitude": -25.2637, "longitude": -57.5759, "continent": "South America", "climate_tag": "humid_subtropical", "elevation": 43}, |
| "Manaus": {"location_id": "33", "latitude": -3.1190, "longitude": -60.0217, "continent": "South America", "climate_tag": "tropical_humid", "elevation": 92}, |
| "Recife": {"location_id": "34", "latitude": -8.0476, "longitude": -34.8770, "continent": "South America", "climate_tag": "tropical_coastal", "elevation": 4}, |
| "Punta Arenas": {"location_id": "35", "latitude": -53.1638, "longitude": -70.9171, "continent": "South America", "climate_tag": "cold_windy", "elevation": 34}, |
| |
| "London": {"location_id": "36", "latitude": 51.5074, "longitude": -0.1278, "continent": "Europe", "climate_tag": "temperate_oceanic", "elevation": 11}, |
| "Paris": {"location_id": "37", "latitude": 48.8566, "longitude": 2.3522, "continent": "Europe", "climate_tag": "temperate_oceanic", "elevation": 35}, |
| "Madrid": {"location_id": "38", "latitude": 40.4168, "longitude": -3.7038, "continent": "Europe", "climate_tag": "hot_summer_mediterranean", "elevation": 667}, |
| "Rome": {"location_id": "39", "latitude": 41.9028, "longitude": 12.4964, "continent": "Europe", "climate_tag": "hot_summer_mediterranean", "elevation": 21}, |
| "Berlin": {"location_id": "40", "latitude": 52.52, "longitude": 13.4050, "continent": "Europe", "climate_tag": "temperate_continental", "elevation": 34}, |
| "Stockholm": {"location_id": "41", "latitude": 59.3293, "longitude": 18.0686, "continent": "Europe", "climate_tag": "cold_marine", "elevation": 28}, |
| "Oslo": {"location_id": "42", "latitude": 59.9139, "longitude": 10.7522, "continent": "Europe", "climate_tag": "cold_snowy", "elevation": 23}, |
| "Helsinki": {"location_id": "43", "latitude": 60.1699, "longitude": 24.9384, "continent": "Europe", "climate_tag": "cold_snowy", "elevation": 25}, |
| "Reykjavik": {"location_id": "44", "latitude": 64.1466, "longitude": -21.9426, "continent": "Europe", "climate_tag": "cold_windy", "elevation": 12}, |
| "Kyiv": {"location_id": "45", "latitude": 50.4501, "longitude": 30.5234, "continent": "Europe", "climate_tag": "humid_continental", "elevation": 179}, |
| "Lisbon": {"location_id": "46", "latitude": 38.7223, "longitude": -9.1393, "continent": "Europe", "climate_tag": "sunny_mediterranean", "elevation": 7}, |
| "Athens": {"location_id": "47", "latitude": 37.9838, "longitude": 23.7275, "continent": "Europe", "climate_tag": "sunny_mediterranean", "elevation": 70}, |
| "Zurich": {"location_id": "48", "latitude": 47.3769, "longitude": 8.5417, "continent": "Europe", "climate_tag": "temperate_continental", "elevation": 408}, |
| "Dublin": {"location_id": "49", "latitude": 53.3498, "longitude": -6.2603, "continent": "Europe", "climate_tag": "temperate_oceanic", "elevation": 20}, |
| "Vienna": {"location_id": "50", "latitude": 48.2082, "longitude": 16.3738, "continent": "Europe", "climate_tag": "temperate_continental", "elevation": 171}, |
| |
| "Dubai": {"location_id": "51", "latitude": 25.2048, "longitude": 55.2708, "continent": "Asia", "climate_tag": "hot_arid", "elevation": 16}, |
| "Riyadh": {"location_id": "52", "latitude": 24.7136, "longitude": 46.6753, "continent": "Asia", "climate_tag": "hot_arid", "elevation": 612}, |
| "Delhi": {"location_id": "53", "latitude": 28.7041, "longitude": 77.1025, "continent": "Asia", "climate_tag": "hot_semi_arid", "elevation": 216}, |
| "Mumbai": {"location_id": "54", "latitude": 19.0760, "longitude": 72.8777, "continent": "Asia", "climate_tag": "tropical_humid", "elevation": 14}, |
| "Bangkok": {"location_id": "55", "latitude": 13.7563, "longitude": 100.5018, "continent": "Asia", "climate_tag": "tropical_monsoon", "elevation": 2}, |
| "Singapore": {"location_id": "56", "latitude": 1.3521, "longitude": 103.8198, "continent": "Asia", "climate_tag": "tropical_humid", "elevation": 15}, |
| "Tokyo": {"location_id": "57", "latitude": 35.6762, "longitude": 139.6503, "continent": "Asia", "climate_tag": "humid_subtropical", "elevation": 40}, |
| "Seoul": {"location_id": "58", "latitude": 37.5665, "longitude": 126.9780, "continent": "Asia", "climate_tag": "humid_continental", "elevation": 38}, |
| "Ulaanbaatar": {"location_id": "59", "latitude": 47.8864, "longitude": 106.9057, "continent": "Asia", "climate_tag": "cold_steppe", "elevation": 1350}, |
| "Kathmandu": {"location_id": "60", "latitude": 27.7172, "longitude": 85.3240, "continent": "Asia", "climate_tag": "highland_subtropical", "elevation": 1400}, |
| "Chiang Mai": {"location_id": "61", "latitude": 18.7883, "longitude": 98.9853, "continent": "Asia", "climate_tag": "tropical_seasonal", "elevation": 300}, |
| "Lhasa": {"location_id": "62", "latitude": 29.6520, "longitude": 91.1721, "continent": "Asia", "climate_tag": "high_altitude_cold", "elevation": 3656}, |
| "Jakarta": {"location_id": "63", "latitude": -6.2088, "longitude": 106.8456, "continent": "Asia", "climate_tag": "tropical_humid", "elevation": 8}, |
| "Manila": {"location_id": "64", "latitude": 14.5995, "longitude": 120.9842, "continent": "Asia", "climate_tag": "tropical_humid", "elevation": 16}, |
| "Karachi": {"location_id": "65", "latitude": 24.8607, "longitude": 67.0011, "continent": "Asia", "climate_tag": "hot_arid", "elevation": 10}, |
| |
| "Cairo": {"location_id": "66", "latitude": 30.0444, "longitude": 31.2357, "continent": "Africa", "climate_tag": "hot_arid", "elevation": 23}, |
| "Alexandria": {"location_id": "67", "latitude": 31.2001, "longitude": 29.9187, "continent": "Africa", "climate_tag": "coastal_mediterranean", "elevation": 5}, |
| "Casablanca": {"location_id": "68", "latitude": 33.5731, "longitude": -7.5898, "continent": "Africa", "climate_tag": "coastal_mediterranean", "elevation": 56}, |
| "Marrakech": {"location_id": "69", "latitude": 31.6295, "longitude": -7.9811, "continent": "Africa", "climate_tag": "hot_semi_arid", "elevation": 466}, |
| "Lagos": {"location_id": "70", "latitude": 6.5244, "longitude": 3.3792, "continent": "Africa", "climate_tag": "tropical_humid", "elevation": 41}, |
| "Nairobi": {"location_id": "71", "latitude": -1.2921, "longitude": 36.8219, "continent": "Africa", "climate_tag": "temperate_highland", "elevation": 1795}, |
| "Addis Ababa": {"location_id": "72", "latitude": 8.9806, "longitude": 38.7578, "continent": "Africa", "climate_tag": "temperate_highland", "elevation": 2355}, |
| "Cape Town": {"location_id": "73", "latitude": -33.9249, "longitude": 18.4241, "continent": "Africa", "climate_tag": "mediterranean", "elevation": 25}, |
| "Johannesburg": {"location_id": "74", "latitude": -26.2041, "longitude": 28.0473, "continent": "Africa", "climate_tag": "subtropical_highland", "elevation": 1753}, |
| "Windhoek": {"location_id": "75", "latitude": -22.5609, "longitude": 17.0658, "continent": "Africa", "climate_tag": "semi_arid", "elevation": 1650}, |
| "Accra": {"location_id": "76", "latitude": 5.6037, "longitude": -0.1870, "continent": "Africa", "climate_tag": "tropical_humid", "elevation": 61}, |
| "Kigali": {"location_id": "77", "latitude": -1.9441, "longitude": 30.0619, "continent": "Africa", "climate_tag": "highland_tropical", "elevation": 1567}, |
| "Tunis": {"location_id": "78", "latitude": 36.8065, "longitude": 10.1815, "continent": "Africa", "climate_tag": "mediterranean", "elevation": 4}, |
| "Dakar": {"location_id": "79", "latitude": 14.7167, "longitude": -17.4677, "continent": "Africa", "climate_tag": "hot_coastal", "elevation": 25}, |
| "Mombasa": {"location_id": "80", "latitude": -4.0435, "longitude": 39.6682, "continent": "Africa", "climate_tag": "tropical_coastal", "elevation": 17}, |
| |
| "Sydney": {"location_id": "81", "latitude": -33.8688, "longitude": 151.2093, "continent": "Oceania", "climate_tag": "humid_subtropical", "elevation": 58}, |
| "Melbourne": {"location_id": "82", "latitude": -37.8136, "longitude": 144.9631, "continent": "Oceania", "climate_tag": "temperate_oceanic", "elevation": 31}, |
| } |
| |
| CITY_TIMEZONES: dict[str, str] = { |
| "Seattle": "America/Los_Angeles", |
| "Portland": "America/Los_Angeles", |
| "San Francisco": "America/Los_Angeles", |
| "Los Angeles": "America/Los_Angeles", |
| "Denver": "America/Denver", |
| "Chicago": "America/Chicago", |
| "Dallas": "America/Chicago", |
| "Atlanta": "America/New_York", |
| "New York": "America/New_York", |
| "Miami": "America/New_York", |
| "Phoenix": "America/Phoenix", |
| "Salt Lake City": "America/Denver", |
| "Anchorage": "America/Anchorage", |
| "Minneapolis": "America/Chicago", |
| "Toronto": "America/Toronto", |
| "Montreal": "America/Toronto", |
| "Vancouver": "America/Vancouver", |
| "Mexico City": "America/Mexico_City", |
| "Havana": "America/Havana", |
| "San Juan": "America/Puerto_Rico", |
| "Lima": "America/Lima", |
| "Santiago": "America/Santiago", |
| "Buenos Aires": "America/Argentina/Buenos_Aires", |
| "Bogotá": "America/Bogota", |
| "Quito": "America/Guayaquil", |
| "Caracas": "America/Caracas", |
| "Rio de Janeiro": "America/Sao_Paulo", |
| "São Paulo": "America/Sao_Paulo", |
| "La Paz": "America/La_Paz", |
| "Cusco": "America/Lima", |
| "Montevideo": "America/Montevideo", |
| "Asunción": "America/Asuncion", |
| "Manaus": "America/Manaus", |
| "Recife": "America/Recife", |
| "Punta Arenas": "America/Punta_Arenas", |
| "London": "Europe/London", |
| "Paris": "Europe/Paris", |
| "Madrid": "Europe/Madrid", |
| "Rome": "Europe/Rome", |
| "Berlin": "Europe/Berlin", |
| "Stockholm": "Europe/Stockholm", |
| "Oslo": "Europe/Oslo", |
| "Helsinki": "Europe/Helsinki", |
| "Reykjavik": "Atlantic/Reykjavik", |
| "Kyiv": "Europe/Kyiv", |
| "Lisbon": "Europe/Lisbon", |
| "Athens": "Europe/Athens", |
| "Zurich": "Europe/Zurich", |
| "Dublin": "Europe/Dublin", |
| "Vienna": "Europe/Vienna", |
| "Dubai": "Asia/Dubai", |
| "Riyadh": "Asia/Riyadh", |
| "Delhi": "Asia/Kolkata", |
| "Mumbai": "Asia/Kolkata", |
| "Bangkok": "Asia/Bangkok", |
| "Singapore": "Asia/Singapore", |
| "Tokyo": "Asia/Tokyo", |
| "Seoul": "Asia/Seoul", |
| "Ulaanbaatar": "Asia/Ulaanbaatar", |
| "Kathmandu": "Asia/Kathmandu", |
| "Chiang Mai": "Asia/Bangkok", |
| "Lhasa": "Asia/Shanghai", |
| "Jakarta": "Asia/Jakarta", |
| "Manila": "Asia/Manila", |
| "Karachi": "Asia/Karachi", |
| "Cairo": "Africa/Cairo", |
| "Alexandria": "Africa/Cairo", |
| "Casablanca": "Africa/Casablanca", |
| "Marrakech": "Africa/Casablanca", |
| "Lagos": "Africa/Lagos", |
| "Nairobi": "Africa/Nairobi", |
| "Addis Ababa": "Africa/Addis_Ababa", |
| "Cape Town": "Africa/Johannesburg", |
| "Johannesburg": "Africa/Johannesburg", |
| "Windhoek": "Africa/Windhoek", |
| "Accra": "Africa/Accra", |
| "Kigali": "Africa/Kigali", |
| "Tunis": "Africa/Tunis", |
| "Dakar": "Africa/Dakar", |
| "Mombasa": "Africa/Nairobi", |
| "Sydney": "Australia/Sydney", |
| "Melbourne": "Australia/Melbourne", |
| } |
| |
| # ---------------------------- |
| # Helpers |
| # ---------------------------- |
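| def weather_code_to_bucket(code) -> int: |
|     """Map a WMO weather code (as reported by Open-Meteo) to one of 7 coarse buckets. |
| |
|     0: clear sky, 1: mainly clear to overcast (also the fallback for missing or |
|     unknown codes), 2: fog, 3: drizzle, 4: rain and rain showers, 5: snow, |
|     6: thunderstorm. |
|     """ |
|     if code is None: |
|         return 1 |
|     try: |
|         if pd.isna(code): |
|             return 1 |
|     except Exception: |
|         pass |
| |
|     code = int(code) |
|     if code == 0: |
|         return 0 |
|     if code in (1, 2, 3): |
|         return 1 |
|     if code in (45, 48): |
|         return 2 |
|     if code in (51, 53, 55, 56, 57): |
|         return 3 |
|     if code in (61, 63, 65, 66, 67, 80, 81, 82): |
|         return 4 |
|     if code in (71, 73, 75, 77, 85, 86): |
|         return 5 |
|     if code in (95, 96, 99): |
|         return 6 |
|     return 1 |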
| |
| |
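| def cyc(x: np.ndarray, period: float) -> tuple[np.ndarray, np.ndarray]: |
|     """Encode a periodic quantity as (sin, cos) so both ends of the period map to nearby values.""" |
|     angle = 2.0 * np.pi * (x / period) |
|     return np.sin(angle), np.cos(angle) |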
| |
| |
| def clamp_array(x: np.ndarray, lo: float | None = None, hi: float | None = None) -> np.ndarray: |
|     return np.clip(x, lo, hi) |
| |
| |
| def request_with_backoff(session: requests.Session, url: str, params: dict[str, Any]) -> dict[str, Any]: |
|     last_exc: Exception | None = None |
|     for attempt in range(MAX_RETRIES): |
|         try: |
|             resp = session.get(url, params=params, timeout=REQUEST_TIMEOUT_S) |
|             if resp.status_code == 429: |
|                 retry_after = resp.headers.get("Retry-After") |
|                 sleep_s = float(retry_after) if retry_after else min(60.0, 2**attempt) |
|                 print(f"Rate limited. Sleeping {sleep_s:.1f}s and retrying.", flush=True) |
|                 time.sleep(sleep_s) |
|                 continue |
|             resp.raise_for_status() |
|             return resp.json() |
|         except Exception as e: |
|             last_exc = e |
|             sleep_s = min(60.0, 2**attempt) |
|             print(f"Request failed: {e}. Sleeping {sleep_s:.1f}s and retrying.", flush=True) |
|             time.sleep(sleep_s) |
|     raise RuntimeError(f"Failed after {MAX_RETRIES} retries: {params}") from last_exc |
| |
| |
| def load_sequence_meta(path: str) -> dict[str, Any]: |
|     p = Path(path) |
|     if not p.exists(): |
|         return {"location_to_id": {}} |
|     with open(p, "r", encoding="utf-8") as f: |
|         meta = json.load(f) |
|     meta.setdefault("location_to_id", {}) |
|     return meta |
| |
| |
| def load_model(): |
|     config = AutoConfig.from_pretrained(MODEL_ID, trust_remote_code=True) |
|     model = AutoModel.from_pretrained(MODEL_ID, config=config, trust_remote_code=True) |
|     model.eval() |
|     return model, config |
| |
| |
| def fetch_recent_history(city: str, context_hours: int) -> pd.DataFrame: |
|     """Fetch the most recent `context_hours` hourly rows for `city` from Open-Meteo, cleaned and sorted by UTC time.""" |
|     if city not in CITY_SPECS: |
|         raise ValueError(f"Unknown city: {city}") |
| |
|     spec = CITY_SPECS[city] |
|     session = requests.Session() |
|     session.headers.update({"User-Agent": "Mozilla/5.0"}) |
| |
|     params = { |
|         "latitude": spec["latitude"], |
|         "longitude": spec["longitude"], |
|         "hourly": ",".join(HOURLY_VARS), |
|         "timezone": "UTC", |
|         "temperature_unit": "celsius", |
|         "wind_speed_unit": "kmh", |
|         "precipitation_unit": "mm", |
|         "past_hours": int(context_hours) + 2, |
|         "forecast_hours": 0, |
|     } |
| |
|     data = request_with_backoff(session, API_BASE_URL, params=params) |
|     hourly = data.get("hourly", {}) |
|     if "time" not in hourly: |
|         raise ValueError(f"No hourly data returned for {city}: {data}") |
| |
|     df = pd.DataFrame(hourly) |
|     if df.empty: |
|         raise ValueError(f"Empty hourly response for {city}.") |
| |
|     df["time"] = pd.to_datetime(df["time"], errors="coerce", utc=True) |
|     df = df.dropna(subset=["time"]).sort_values("time").drop_duplicates(subset=["time"]).reset_index(drop=True) |
| |
|     needed = HOURLY_VARS |
|     missing = [c for c in needed if c not in df.columns] |
|     if missing: |
|         raise ValueError(f"Missing hourly columns in API response: {missing}") |
| |
|     for c in needed: |
|         df[c] = pd.to_numeric(df[c], errors="coerce") |
| |
|     df["weather_code"] = df["weather_code"].fillna(1) |
|     df["precipitation"] = df["precipitation"].fillna(0.0) |
| |
|     for c in [ |
|         "temperature_2m", |
|         "relative_humidity_2m", |
|         "apparent_temperature", |
|         "precipitation", |
|         "pressure_msl", |
|         "surface_pressure", |
|         "cloud_cover", |
|         "visibility", |
|         "wind_speed_10m", |
|         "wind_direction_10m", |
|     ]: |
|         df[c] = df[c].interpolate(limit_direction="both").ffill().bfill() |
| |
|     now_utc = pd.Timestamp.now(tz="UTC") |
|     df = df[df["time"] <= now_utc].copy() |
| |
|     if len(df) < context_hours: |
|         raise ValueError(f"Not enough observed rows: got {len(df)}, need {context_hours}") |
| |
|     return df.tail(context_hours).reset_index(drop=True) |
| |
| |
| def build_single_sequence(df: pd.DataFrame) -> np.ndarray: |
|     """Convert the cleaned hourly frame into a float32 feature matrix of shape (len(df), n_features).""" |
|     hour = df["time"].dt.hour.to_numpy() |
|     doy = df["time"].dt.dayofyear.to_numpy() |
| |
|     hour_sin, hour_cos = cyc(hour.astype(float), 24.0) |
|     doy_sin, doy_cos = cyc(doy.astype(float), 365.25) |
| |
|     temp = np.nan_to_num(df["temperature_2m"].astype(float).to_numpy(), nan=0.0) |
|     humidity = np.nan_to_num(df["relative_humidity_2m"].astype(float).to_numpy(), nan=0.0) |
|     apparent = np.nan_to_num(df["apparent_temperature"].astype(float).to_numpy(), nan=0.0) |
|     precip = np.nan_to_num(df["precipitation"].astype(float).to_numpy(), nan=0.0) |
|     pressure = np.nan_to_num(df["pressure_msl"].astype(float).to_numpy(), nan=0.0) |
|     surface_pressure = np.nan_to_num(df["surface_pressure"].astype(float).to_numpy(), nan=0.0) |
|     cloud_cover = np.nan_to_num(df["cloud_cover"].astype(float).to_numpy(), nan=0.0) |
|     visibility = np.nan_to_num(df["visibility"].astype(float).to_numpy(), nan=0.0) |
|     wind = np.nan_to_num(df["wind_speed_10m"].astype(float).to_numpy(), nan=0.0) |
|     wind_dir = np.nan_to_num(df["wind_direction_10m"].astype(float).to_numpy(), nan=0.0) |
| |
|     humidity = clamp_array(humidity, 0.0, 100.0) |
|     cloud_cover = clamp_array(cloud_cover, 0.0, 100.0) |
|     precip = clamp_array(precip, 0.0, None) |
|     wind = clamp_array(wind, 0.0, None) |
|     visibility = clamp_array(visibility, 0.0, None) |
| |
|     wind_dir_sin, wind_dir_cos = cyc(wind_dir, 360.0) |
|     weather_bucket = df["weather_code"].fillna(1).apply(weather_code_to_bucket).to_numpy(dtype=np.int64) |
| |
|     # Each row holds 15 continuous features followed by the one-hot weather bucket |
|     # (22 columns when WEATHER_CODE_BUCKETS is 7, matching the model's `input_dim`). |
|     rows = [] |
|     for i in range(len(df)): |
|         wc_oh = np.zeros(WEATHER_CODE_BUCKETS, dtype=np.float32) |
|         wc_oh[weather_bucket[i]] = 1.0 |
| |
|         row = np.concatenate( |
|             [ |
|                 np.array( |
|                     [ |
|                         temp[i] / TEMP_SCALE, |
|                         humidity[i] / HUMIDITY_SCALE, |
|                         apparent[i] / TEMP_SCALE, |
|                         np.log1p(max(precip[i], 0.0)) / 3.0, |
|                         pressure[i] / 1100.0, |
|                         surface_pressure[i] / 1100.0, |
|                         cloud_cover[i] / 100.0, |
|                         visibility[i] / 50000.0, |
|                         wind[i] / WIND_SCALE, |
|                         wind_dir_sin[i], |
|                         wind_dir_cos[i], |
|                         hour_sin[i], |
|                         hour_cos[i], |
|                         doy_sin[i], |
|                         doy_cos[i], |
|                     ], |
|                     dtype=np.float32, |
|                 ), |
|                 wc_oh, |
|             ] |
|         ) |
|         rows.append(row) |
| |
|     seq = np.asarray(rows, dtype=np.float32) |
| |
|     if not np.isfinite(seq).all(): |
|         bad = np.argwhere(~np.isfinite(seq)) |
|         raise ValueError(f"Non-finite values remain in sequence at positions like: {bad[:10].tolist()}") |
| |
|     return seq |
| |
| |
| def to_iso(ts: pd.Timestamp, tz_name: str | None = None) -> str: |
|     if tz_name: |
|         try: |
|             return ts.tz_convert(ZoneInfo(tz_name)).isoformat() |
|         except Exception: |
|             pass |
|     return ts.isoformat() |
| |
| |
| def get_logits(out): |
|     if isinstance(out, dict) and "logits" in out: |
|         return out["logits"] |
|     if hasattr(out, "logits"): |
|         return out.logits |
|     return out |
| |
| |
| def resolve_location_index(seq_meta: dict[str, Any], city_location_id: str) -> int: |
|     """Map a city's `location_id` to the model's location index. |
| |
|     Falls back to an unknown-location entry if the mapping has one, otherwise to |
|     index 0. With a missing metadata file the mapping is empty, so every city |
|     resolves to 0. |
|     """ |
|     location_to_id = seq_meta.get("location_to_id", {}) |
| |
|     if city_location_id in location_to_id: |
|         return int(location_to_id[city_location_id]) |
| |
|     try: |
|         as_int = int(city_location_id) |
|         if as_int in location_to_id: |
|             return int(location_to_id[as_int]) |
|         if str(as_int) in location_to_id: |
|             return int(location_to_id[str(as_int)]) |
|     except Exception: |
|         pass |
| |
|     for unk_key in ("UNK", "<UNK>", "unknown", "UNKNOWN"): |
|         if unk_key in location_to_id: |
|             return int(location_to_id[unk_key]) |
| |
|     return 0 |
| |
| |
| def predict(): |
|     seq_meta = load_sequence_meta(SEQUENCE_META_PATH) |
|     model, config = load_model() |
| |
|     if CITY not in CITY_SPECS: |
|         raise ValueError(f"Unknown city: {CITY}") |
| |
|     if CONTEXT_HOURS <= 0: |
|         raise ValueError("CONTEXT_HOURS must be > 0") |
| |
|     if hasattr(config, "seq_len") and int(config.seq_len) != CONTEXT_HOURS: |
|         raise ValueError(f"Set CONTEXT_HOURS to {int(config.seq_len)} for this model.") |
| |
|     city_spec = CITY_SPECS[CITY] |
|     city_tz = CITY_TIMEZONES.get(CITY, "UTC") |
|     model_location_id = resolve_location_index(seq_meta, str(city_spec["location_id"])) |
| |
|     df = fetch_recent_history(CITY, CONTEXT_HOURS) |
|     seq = build_single_sequence(df) |
| |
|     X = torch.from_numpy(seq).unsqueeze(0) |
|     loc = torch.tensor([model_location_id], dtype=torch.long) |
| |
|     target_device = torch.device( |
|         DEVICE if DEVICE else ("cuda" if torch.cuda.is_available() else "cpu") |
|     ) |
|     model = model.to(target_device) |
|     X = X.to(target_device) |
|     loc = loc.to(target_device) |
| |
|     weather_class_names = getattr(config, "weather_class_names", None) |
|     if not weather_class_names: |
|         weather_class_names = [f"class_{i}" for i in range(int(getattr(config, "num_weather_classes", 7)))] |
| |
|     with torch.no_grad(): |
|         out = model(X=X, location_id=loc) |
|         logits = get_logits(out) |
| |
|     ( |
|         temp_pred, |
|         humidity_pred, |
|         apparent_pred, |
|         precip_pred, |
|         sea_level_pressure_pred, |
|         surface_pressure_pred, |
|         cloud_cover_pred, |
|         wind_pred, |
|         wind_dir_sin_pred, |
|         wind_dir_cos_pred, |
|         rain_logit, |
|         weather_logits, |
|     ) = logits |
| |
|     temp_pred = temp_pred.squeeze(0).detach().cpu().numpy() |
|     humidity_pred = humidity_pred.squeeze(0).detach().cpu().numpy() |
|     apparent_pred = apparent_pred.squeeze(0).detach().cpu().numpy() |
|     precip_pred = precip_pred.squeeze(0).detach().cpu().numpy() |
|     sea_level_pressure_pred = sea_level_pressure_pred.squeeze(0).detach().cpu().numpy() |
|     surface_pressure_pred = surface_pressure_pred.squeeze(0).detach().cpu().numpy() |
|     cloud_cover_pred = cloud_cover_pred.squeeze(0).detach().cpu().numpy() |
|     wind_pred = wind_pred.squeeze(0).detach().cpu().numpy() |
|     rain_prob = torch.sigmoid(rain_logit).squeeze(0).detach().cpu().numpy() |
|     weather_probs = torch.softmax(weather_logits, dim=-1).squeeze(0).detach().cpu().numpy() |
|     weather_idx = np.argmax(weather_probs, axis=-1).astype(np.int64) |
| |
|     humidity_pred = np.clip(humidity_pred, 0.0, 100.0) |
|     cloud_cover_pred = np.clip(cloud_cover_pred, 0.0, 100.0) |
|     precip_pred = np.clip(precip_pred, 0.0, None) |
|     wind_pred = np.clip(wind_pred, 0.0, None) |
|     rain_prob = np.clip(rain_prob, 0.0, 1.0) |
| |
|     context_start = df["time"].iloc[0] |
|     context_end = df["time"].iloc[-1] |
|     requested_at_utc = pd.Timestamp.now(tz="UTC") |
| |
|     horizon = min( |
|         int(FORECAST_HOURS), |
|         int(temp_pred.shape[0]), |
|         int(humidity_pred.shape[0]), |
|         int(weather_idx.shape[0]), |
|     ) |
| |
|     forecast = [] |
|     for lead in range(1, horizon + 1): |
|         target_time = context_end + pd.Timedelta(hours=lead) |
|         idx = lead - 1 |
|         w_idx = int(weather_idx[idx]) |
| |
|         forecast.append( |
|             { |
|                 "lead_hours": lead, |
|                 "target_utc": target_time.isoformat(), |
|                 "target_local": to_iso(target_time, city_tz), |
|                 "temperature_2m_c": float(temp_pred[idx]), |
|                 "relative_humidity_2m_pct": float(humidity_pred[idx]), |
|                 "apparent_temperature_c": float(apparent_pred[idx]), |
|                 "precipitation_mm": float(precip_pred[idx]), |
|                 "pressure_msl_hpa": float(sea_level_pressure_pred[idx]), |
|                 "surface_pressure_hpa": float(surface_pressure_pred[idx]), |
|                 "cloud_cover_pct": float(cloud_cover_pred[idx]), |
|                 "wind_speed_10m_kmh": float(wind_pred[idx]), |
|                 "rain_probability": float(rain_prob[idx]), |
|                 "weather_class": w_idx, |
|                 "weather_class_name": weather_class_names[w_idx] if w_idx < len(weather_class_names) else f"class_{w_idx}", |
|                 "weather_class_probabilities": { |
|                     name: float(prob) for name, prob in zip(weather_class_names, weather_probs[idx]) |
|                 }, |
|             } |
|         ) |
| |
|     result = { |
|         "city": CITY, |
|         "location_id": str(city_spec["location_id"]), |
|         "model_location_id": int(model_location_id), |
|         "data_source": "open-meteo forecast api (past-hours context only)", |
|         "requested_at_utc": requested_at_utc.isoformat(), |
|         "context": { |
|             "hours": int(len(df)), |
|             "start_utc": context_start.isoformat(), |
|             "end_utc": context_end.isoformat(), |
|             "start_local": to_iso(context_start, city_tz), |
|             "end_local": to_iso(context_end, city_tz), |
|         }, |
|         "model": { |
|             "model_id": MODEL_ID, |
|             "encoder_type": getattr(config, "encoder_type", None), |
|             "seq_len": int(getattr(config, "seq_len", CONTEXT_HOURS)), |
|             "input_dim": int(getattr(config, "input_dim", seq.shape[1])), |
|             "num_weather_classes": int(getattr(config, "num_weather_classes", len(weather_class_names))), |
|         }, |
|         "forecast": forecast, |
|         "sanity": { |
|             "sequence_shape": list(seq.shape), |
|             "finite_features": bool(np.isfinite(seq).all()), |
|         }, |
|     } |
| |
|     print(json.dumps(result, indent=2)) |
| |
| |
| if __name__ == "__main__": |
|     predict() |
| ``` |
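
The script above unpacks `wind_dir_sin_pred` and `wind_dir_cos_pred` but leaves them out of the printed forecast. If a compass bearing is needed, one option is to invert the sine/cosine encoding with `arctan2`. The sketch below assumes the two heads predict the sine and cosine of the wind direction in the same convention as the input encoding `cyc(wind_dir, 360.0)`; the model card does not confirm this, and `wind_direction_from_sin_cos` is a hypothetical helper, not part of the model's API.

```python
import numpy as np


def wind_direction_from_sin_cos(sin_pred: np.ndarray, cos_pred: np.ndarray) -> np.ndarray:
    """Recover a bearing in degrees, in [0, 360), from sine/cosine components.

    Assumption: the inputs follow the convention of cyc(wind_dir, 360.0), i.e.
    sin(2*pi*deg/360) and cos(2*pi*deg/360). The predictions need not lie exactly
    on the unit circle, because arctan2 only depends on their ratio and signs.
    """
    angle = np.arctan2(sin_pred, cos_pred)    # radians in [-pi, pi]
    return np.mod(np.degrees(angle), 360.0)   # wrap negative angles into [0, 360)


# Round-trip check: encode known bearings, then decode them again.
bearings = np.array([0.0, 90.0, 180.0, 270.0, 359.0])
radians = np.deg2rad(bearings)
recovered = wind_direction_from_sin_cos(np.sin(radians), np.cos(radians))
print(np.round(recovered, 3))
```

Inside `predict()`, the two tensors would first need the same `.squeeze(0).detach().cpu().numpy()` conversion as the other heads before being passed to this helper.
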
| ### Related Models |
|
|
| 1. [DistilHweh-446k](https://huggingface.co/Harley-ml/DistilHweh-446k) |
|
|
| ## Citation |
|
|
| ```bibtex |
| @misc{Hweh-6m, |
| title = {Hweh-6M: A 6M-Parameter LSTM for Short-Term Multivariate Weather Forecasting}, |
| author = {Paul Courneya and {Harley-ml}}, |
| year = {2026}, |
| url = {https://huggingface.co/Harley-ml/Hweh-6M} |
| } |
| ``` |