Vayu — AQI Prediction Models

Pretrained ML model artifacts for Vayu, an end-to-end Air Quality Index prediction system for 29 Indian cities.

Models

Forecaster (`/forecaster`)

XGBoost regressors trained on 846,372 hourly pollutant readings (2015–2024) to predict AQI at three horizons.

| File | Horizon | R² | RMSE |
| --- | --- | --- | --- |
| `xgb_6h.pkl` | +6 hours | 0.9691 | 6.94 |
| `xgb_12h.pkl` | +12 hours | 0.9038 | 12.25 |
| `xgb_24h.pkl` | +24 hours | 0.7764 | 18.68 |
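
The three forecasters share a naming pattern, so they can be loaded into a dictionary keyed by horizon. This is a minimal sketch, assuming the repository has been downloaded locally with the `forecaster/` layout shown above. `load_forecasters` and `HORIZONS_HOURS` are hypothetical names, not part of Vayu. Unpickling the real files requires `xgboost` to be installed, and pickle files should only be loaded from sources you trust.

```python
import pickle
from pathlib import Path

HORIZONS_HOURS = (6, 12, 24)  # horizons listed in the forecaster table


def load_forecasters(root="forecaster"):
    """Return {horizon_hours: model} for each xgb_<h>h.pkl found under root."""
    models = {}
    for hours in HORIZONS_HOURS:
        path = Path(root) / f"xgb_{hours}h.pkl"
        if not path.exists():
            continue  # skip horizons that were not downloaded
        with path.open("rb") as f:
            models[hours] = pickle.load(f)
    return models
```

Calling `load_forecasters()["6"]` would fail because the keys are integers; use `load_forecasters()[6]` for the +6 hour model.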

Classifier (`/classifier`)

XGBoost classifier that maps current pollutant levels to AQI categories defined by India's Central Pollution Control Board (CPCB).

| File | Description |
| --- | --- |
| `xgb_classifier.pkl` | XGBoost 4-class CPCB classifier |
| `best_classifier.pkl` | Deployed model (copy of `xgb_classifier.pkl`) |
| `classifier_metadata.json` | Label maps, class names, evaluation metrics |
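
Because `best_classifier.pkl` is described as a copy of `xgb_classifier.pkl`, a downloaded pair can be checked for byte-for-byte equality by comparing digests. This is a minimal sketch using only the standard library; `file_sha256` and `same_content` are hypothetical helper names.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files have identical bytes (compared via SHA-256)."""
    return file_sha256(path_a) == file_sha256(path_b)
```

For example, `same_content("classifier/xgb_classifier.pkl", "classifier/best_classifier.pkl")` should return `True` if the deployed model really is an unmodified copy.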

Encoders (`/encoders`)

| File | Description |
| --- | --- |
| `city_encoder.pkl` | LabelEncoder for 29 city names (0–28) |
| `features.pkl` | Ordered feature list shared across all models |
| `nmf_scaler.pkl` | MinMaxScaler for NMF pollutant preprocessing |
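
scikit-learn's `LabelEncoder` assigns integer codes by sorting the distinct labels, so `city_enc` is the position of a city name in that sorted list. The sketch below reproduces that behaviour in plain Python for illustration only. The helper names are hypothetical, and the three city names are an arbitrary sample rather than the encoder's real 29-city vocabulary, which can be read from `city_encoder.classes_` after loading the pickle.

```python
def fit_label_codes(labels):
    """Map each distinct label to its index in sorted order, as LabelEncoder does."""
    return {label: code for code, label in enumerate(sorted(set(labels)))}


def encode(codes, label):
    """Look up a label, failing loudly on names unseen during fitting."""
    try:
        return codes[label]
    except KeyError:
        raise ValueError(f"unseen label: {label!r}") from None


# Arbitrary sample, not the shipped vocabulary.
codes = fit_label_codes(["Mumbai", "Delhi", "Chennai", "Delhi"])
print(codes)  # {'Chennai': 0, 'Delhi': 1, 'Mumbai': 2}
```

The practical consequence is that a city name outside the fitted vocabulary, or one spelled differently, cannot be encoded and should be rejected before prediction.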

SHAP & NMF (`/shap`)

| File | Description |
| --- | --- |
| `shap_explainer_6h.pkl` | SHAP TreeExplainer for +6h model |
| `shap_explainer_12h.pkl` | SHAP TreeExplainer for +12h model |
| `shap_explainer_24h.pkl` | SHAP TreeExplainer for +24h model |
| `nmf_model.pkl` | NMF model for city-level pollution source attribution |
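
A SHAP `TreeExplainer` produces one additive contribution per input feature for each prediction, and ranking those contributions by magnitude shows which inputs drove a forecast. The sketch below covers only that ranking step, in plain Python. `rank_contributions` is a hypothetical helper, and the contribution values are made-up numbers for illustration, not output from the shipped explainers.

```python
def rank_contributions(feature_names, contributions, top_k=3):
    """Return the top_k (feature, contribution) pairs by absolute value."""
    if len(feature_names) != len(contributions):
        raise ValueError("one contribution is required per feature")
    pairs = zip(feature_names, contributions)
    return sorted(pairs, key=lambda p: abs(p[1]), reverse=True)[:top_k]


# Illustrative values only (not real SHAP output).
names = ["pm2_5_ugm3", "pm10_ugm3", "hour", "AQI_lag_1"]
values = [12.5, -3.0, 0.8, 20.1]
print(rank_contributions(names, values, top_k=2))
# [('AQI_lag_1', 20.1), ('pm2_5_ugm3', 12.5)]
```

With the real artifacts, the contributions would presumably come from calling the loaded explainer on a feature row, for example via `explainer.shap_values(X)`.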

Input Features (14)

| Feature | Description |
| --- | --- |
| `pm2_5_ugm3` | Fine particulate matter (log1p transformed) |
| `pm10_ugm3` | Coarse particulate matter (log1p transformed) |
| `co_ugm3` | Carbon monoxide (log1p transformed) |
| `no2_ugm3` | Nitrogen dioxide (log1p transformed) |
| `so2_ugm3` | Sulfur dioxide |
| `o3_ugm3` | Ground-level ozone (log1p transformed) |
| `hour` | Hour of day (0–23) |
| `month` | Month (1–12) |
| `day_of_week` | Day of week (0 = Monday) |
| `is_weekend` | 1 if Saturday or Sunday, else 0 |
| `city_enc` | Label-encoded city integer (0–28) |
| `AQI_lag_1` | AQI 1 hour prior |
| `AQI_lag_6` | AQI 6 hours prior |
| `AQI_lag_24` | AQI 24 hours prior |
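
The calendar and lag features can all be derived from a timestamp and a short AQI history, which makes it easy to assemble a row in a fixed order. This is a minimal sketch under stated assumptions: the order mirrors the table above (the authoritative list is `features.pkl`), the AQI history excludes the current hour, and the names `build_feature_row`, `FEATURE_ORDER`, `LOG1P_FEATURES`, and `apply_log1p` are hypothetical. The table does not say whether callers are expected to apply `log1p` themselves before prediction, so the transform is exposed as a flag rather than assumed.

```python
import math
from datetime import datetime

# Order follows the feature table; features.pkl is the authoritative list.
FEATURE_ORDER = [
    "pm2_5_ugm3", "pm10_ugm3", "co_ugm3", "no2_ugm3", "so2_ugm3", "o3_ugm3",
    "hour", "month", "day_of_week", "is_weekend", "city_enc",
    "AQI_lag_1", "AQI_lag_6", "AQI_lag_24",
]
# Pollutants marked "log1p transformed" in the table (so2_ugm3 is not).
LOG1P_FEATURES = {"pm2_5_ugm3", "pm10_ugm3", "co_ugm3", "no2_ugm3", "o3_ugm3"}


def build_feature_row(pollutants, when, city_enc, aqi_history, apply_log1p=True):
    """Assemble one 14-value row.

    pollutants: raw concentrations keyed by feature name.
    aqi_history: hourly AQI values before `when`, oldest first (24 or more).
    """
    if len(aqi_history) < 24:
        raise ValueError("need at least 24 hourly AQI values for AQI_lag_24")
    row = dict(pollutants)
    if apply_log1p:
        for name in LOG1P_FEATURES:
            row[name] = math.log1p(row[name])
    weekday = when.weekday()  # Monday is 0, matching day_of_week
    row.update(
        hour=when.hour, month=when.month, day_of_week=weekday,
        is_weekend=int(weekday >= 5), city_enc=city_enc,
        AQI_lag_1=aqi_history[-1], AQI_lag_6=aqi_history[-6],
        AQI_lag_24=aqi_history[-24],
    )
    return [row[name] for name in FEATURE_ORDER]
```

Building the row by name and emitting it through a single ordered list avoids the silent column mix-ups that are easy to introduce when a 14-element list is typed by hand.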

Usage

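```python
import pickle

import numpy as np

# Load the +6h forecaster and the city label encoder.
with open("forecaster/xgb_6h.pkl", "rb") as f:
    model = pickle.load(f)

with open("encoders/city_encoder.pkl", "rb") as f:
    city_encoder = pickle.load(f)

# One row of the 14 input features, in the order listed under Input Features:
# six pollutants, hour, month, day_of_week, is_weekend, city_enc, AQI lags 1/6/24.
features = [95.4, 142.3, 620.0, 28.5, 12.1, 45.2, 14, 4, 0, 0,
            city_encoder.transform(["Delhi"])[0], 187, 181, 174]

# predict expects a 2-D array of shape (n_samples, n_features).
predicted_aqi = model.predict(np.array(features).reshape(1, -1))
print(predicted_aqi)
```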
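
A forecast AQI value is often easier to act on once it is placed in a named band. The sketch below uses the six bands of CPCB's national AQI scale (breakpoints at 50, 100, 200, 300 and 400). Note that this is the standard published scale, not necessarily the 4-class scheme used by the shipped classifier, whose labels are in `classifier_metadata.json`. `aqi_band` is a hypothetical helper.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first five CPCB national AQI bands.
BAND_UPPER_BOUNDS = [50, 100, 200, 300, 400]
BAND_NAMES = ["Good", "Satisfactory", "Moderately polluted",
              "Poor", "Very poor", "Severe"]


def aqi_band(aqi):
    """Return the CPCB band name for an AQI value (above 400 is Severe)."""
    if aqi < 0:
        raise ValueError("AQI cannot be negative")
    return BAND_NAMES[bisect_left(BAND_UPPER_BOUNDS, aqi)]


print(aqi_band(187))  # Moderately polluted
```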

Training Data

Source: rachitgoyell/vayu-raw on Hugging Face
Size: 846,372 hourly readings
Cities: 29 Indian urban centres
Period: 2015–2024
Pollutants: PM2.5, PM10, CO, NO2, SO2, O3

License

MIT
