---
license: mit
tags:
  - wildfire
  - geospatial
  - weather
  - earth-observation
  - foundation-models
  - evaluation
  - pytorch
pipeline_tag: image-segmentation
library_name: pytorch
pretty_name: WildFIRE-FM
---

WildFIRE-FM

WildFIRE-FM is a wildfire-specialized regional reference backbone for 12-hour gridded wildfire occupancy prediction on a 5 km California grid. It is released with five seeded PyTorch checkpoints, model code, final-paper artifacts, and data-source notes. The raw data are not redistributed.

The model is intended as a reproducible reference backbone for fixed-contract wildfire evaluation, not as a general-purpose global wildfire forecasting product. Training used regional weather fields, active-fire detections as supervision, and static fuel, canopy, and exposure layers; event-level wildfire resources are used by the supporting tasks in the paper.

Release Contents

Weights. Five seeded checkpoints are available at models/wildfire_fm/checkpoints/seed_*/best_firms_prauc.pt. Each file is listed with SHA-256 and byte size in models/wildfire_fm/checkpoint_manifest.json.

Model code. The compact U-Net definition is provided in models/wildfire_fm/modeling_unet.py, with a short loading example below.

Paper artifacts. The final manuscript PDF and the final paper figures/tables are included under paper/ and paper_outputs/. Compact CSV/JSON summaries are under artifacts/results/.

Data notes. Data sources and access entry points are documented in data_sources/DATA_SOURCES.md; users must obtain source data from the original providers.
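Because the manifest records a SHA-256 digest and byte size for each checkpoint, a download can be checked before it is loaded. The sketch below is a minimal, standard-library-only check. The function names are illustrative, and because the JSON layout of checkpoint_manifest.json is not described in this card, the expected digest and size are passed in as plain arguments rather than read with assumed keys.

```python
import hashlib


def sha256_and_size(path, chunk_size=1 << 20):
    """Return (hex SHA-256 digest, size in bytes) of a file, read in chunks."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large checkpoints are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size


def matches_manifest_entry(path, expected_sha256, expected_bytes):
    """Compare a local file against a digest and size taken from the manifest.

    How those two values are stored in checkpoint_manifest.json is an
    assumption left to the caller; this function only does the comparison.
    """
    actual_sha256, actual_bytes = sha256_and_size(path)
    return (
        actual_sha256 == str(expected_sha256).lower()
        and actual_bytes == int(expected_bytes)
    )
```

Look up the expected values for a given checkpoint in the manifest and pass them in; a False result suggests an incomplete or corrupted download.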

Model Details

Field	Value
Task	12-hour gridded wildfire occupancy prediction
Grid	California regional grid, 5 km, EPSG:5070
Inputs	16 channels: weather fields, validity masks, static fuel/canopy/exposure layers
Architecture	Compact U-Net with occupancy and auxiliary spatial-support heads
Data split	Train: June-August 2024; validation: September 2024; test: October 2024
Released seeds	1, 7, 42, 99, 123
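The temporal split in the table above can be written as a small lookup. This is a sketch that assigns samples by the calendar month of a timestamp. Which timestamp defines a 12-hour sample (window start or end) and which time zone applies are not stated in this card, so both are left to the caller, and the names `SPLIT_BY_MONTH` and `split_for` are illustrative.

```python
from datetime import datetime

# Calendar month of 2024 -> split, as listed in the Model Details table.
SPLIT_BY_MONTH = {
    6: "train",
    7: "train",
    8: "train",
    9: "validation",
    10: "test",
}


def split_for(timestamp: datetime):
    """Return "train", "validation", "test", or None outside June-October 2024."""
    if timestamp.year != 2024:
        return None
    return SPLIT_BY_MONTH.get(timestamp.month)
```

Samples that map to None fall outside the documented study period and should not be mixed into any of the three splits.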

Quick Load

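import torch

# Run from the repository root so that the `models` package is importable.
from models.wildfire_fm.modeling_unet import UNetSmallFlex

# Constructor arguments as given in this card; in_ch=16 matches the 16 input channels.
model = UNetSmallFlex(
    in_ch=16,
    base=32,
    dropout=0.1,
    norm_type="group",
    norm_groups=8,
    use_aux_spatial_head=True,
)

# Load the seed-1 checkpoint onto the CPU.
checkpoint = torch.load(
    "models/wildfire_fm/checkpoints/seed_1/best_firms_prauc.pt",
    map_location="cpu",
)
# Use the "model" entry if the checkpoint dict has one; otherwise treat it as the state dict.
state = checkpoint.get("model", checkpoint)
model.load_state_dict(state)
model.eval()  # switch to inference mode (disables dropout)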

The checkpoint expects the same 16-channel gridded input described in the paper and in data_sources/DATA_SOURCES.md. This repository does not include raw HRRR, FIRMS, LANDFIRE, WRC (Wildfire Risk to Communities), LandScan, WFIGS, or MTBS data, nor the comparator feature caches.
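The loading example uses seed 1 only. A minimal sketch for enumerating all five released checkpoints follows. It assumes the other seed directories follow the same `seed_<n>` naming as `seed_1`, and the helper names are illustrative rather than part of the released code.

```python
from pathlib import Path

# Seeds listed under "Released seeds" in Model Details.
RELEASED_SEEDS = (1, 7, 42, 99, 123)
CHECKPOINT_ROOT = Path("models/wildfire_fm/checkpoints")


def checkpoint_path(seed, root=CHECKPOINT_ROOT):
    """Build the path for one seed, assuming a seed_<n>/ directory per seed."""
    if seed not in RELEASED_SEEDS:
        raise ValueError(f"seed {seed} is not one of {RELEASED_SEEDS}")
    return Path(root) / f"seed_{seed}" / "best_firms_prauc.pt"


def released_checkpoints(root=CHECKPOINT_ROOT):
    """Map each released seed to its expected checkpoint path."""
    return {seed: checkpoint_path(seed, root) for seed in RELEASED_SEEDS}


if __name__ == "__main__":
    # Report which checkpoints are present in the local clone.
    for seed, path in released_checkpoints().items():
        status = "found" if path.is_file() else "missing"
        print(f"seed {seed}: {path} ({status})")
```

Each returned path can be passed to `torch.load` in place of the seed-1 path in the loading example.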

Evaluation Snapshot

The paper evaluates WildFIRE-FM and ten Earth-FM comparators under fixed task contracts. Selected WildFIRE-FM results from the final paper:

Occupancy union F1: 59.0656 ± 2.7372 percent.
Fire-spread AP: 30.0900 ± 1.2500 percent.
Final burned-area log-RMSE: 1.1657 ± 0.0126, where lower is better.
Analog retrieval nDCG@10: 0.5099 ± 0.0336.
Smoke PM2.5 RMSE: 4.4646 ± 0.0060, where lower is better.
Extreme-heat RMSE-C: 0.2179 ± 0.0043, where lower is better.

The full final-paper tables are included as TeX blocks under paper_outputs/tables/.
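This card does not say how the ± values were computed. If they are the mean and sample standard deviation over the five released seeds (an assumption, not something confirmed here), a summary in the same four-decimal format could be produced as sketched below. The function name `summarize` is illustrative.

```python
from statistics import mean, stdev


def summarize(per_seed_scores, digits=4):
    """Format per-seed scores as "mean ± spread".

    Uses the sample standard deviation (n - 1 denominator); whether the
    paper uses this or another spread measure is an assumption.
    """
    scores = list(per_seed_scores)
    if len(scores) < 2:
        raise ValueError("need at least two seeds to estimate a spread")
    return f"{mean(scores):.{digits}f} ± {stdev(scores):.{digits}f}"
```

Pass one score per seed for a single metric. For the metrics marked "lower is better", the formatting is the same and only the interpretation changes.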

Fixed-Contract Checks From The Final Paper

Head-selection regret. This final-paper figure shows that choosing a lightweight head by a ranking metric can lose decision performance under the same frozen features.

Supporting-task rank map. This final-paper figure shows that model ordering changes across burned area, analog retrieval, smoke PM2.5, and extreme heat task contracts.

Primary-task rank changes. This final-paper figure summarizes rank changes across fixed primary-task contracts.

Data Sources

The study uses public or provider-hosted resources, but the processed data are not bundled here:

NOAA HRRR fields for regional weather inputs.
NASA FIRMS active-fire detections for occupancy supervision.
LANDFIRE fuel and canopy layers for static landscape context.
Wildfire Risk to Communities housing density and LandScan population for exposure context.
WFIGS and MTBS event-level resources for burned-area and analog tasks.
External Earth-FM/backbone assets for comparator features.

See data_sources/DATA_SOURCES.md for source roles and access links.

Reproducing Released Paper Outputs

The lightweight check verifies the released final-paper artifacts from compact summaries. It does not require raw data or GPUs.

python3 scripts/reproduce_paper_outputs.py

Full raw-data reruns require separately downloaded source data, local feature caches, and cluster-specific paths. Sanitized reference scripts and a Slurm template are provided under experiments/.

Repository Layout

models/wildfire_fm/        model code, seeded checkpoints, and checkpoint manifest
paper/                     final manuscript PDF and LaTeX source snapshot
paper_outputs/             final paper figures and TeX table blocks
artifacts/results/         compact CSV/JSON summaries for released outputs
experiments/               sanitized raw-rerun references and Slurm template
data_sources/              source-data roles and access notes
scripts/                   artifact verification and figure/table rebuild helpers

Limitations

WildFIRE-FM is a regional reference model trained for the paper's fixed-contract comparisons. Using it outside the California regional grid requires new preprocessing, validation, and contract-specific evaluation. The repository does not provide operational alerts, raw data, or third-party comparator weights.

Citation

@misc{wildfire_fm_evaluation_contracts_2026,
  title = {Does Your Wildfire Prediction Model Actually Work, or Just Score Well?},
  author = {Yangshuang Xu and Yuyang Dai and Liling Chang and Qi Wang and Yushun Dong},
  year = {2026},
  note = {WildFIRE-FM model and fixed-contract wildfire evaluation artifacts}
}

The citation will be updated with arXiv metadata after the preprint is public.