# ModernTCN for Probabilistic Multivariate Forecasting

The backbone of this model is based on:

- Donghao Luo, Xue Wang. ModernTCN: A Modern Pure Convolution Structure for General Time Series Analysis. ICLR 2024 Spotlight.
This version is adapted for probabilistic forecasting. Instead of only predicting future values, it predicts a Gaussian distribution for each future step and each variable.
Give the model a window of past values and it returns:
- `loc`: the predicted mean
- `scale`: the predicted standard deviation
That means you get both a forecast and uncertainty bands.
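Concretely, each future step and variable gets its own Gaussian parameterized by `loc` and `scale`. A minimal sketch of how those two tensors can be wrapped in a `torch.distributions.Normal` (the tensors here are synthetic stand-ins for real model outputs):

```python
import torch

# Synthetic stand-ins for model outputs, shaped
# [batch, prediction_length, channels].
loc = torch.zeros(2, 96, 7)
scale = torch.ones(2, 96, 7)

# One independent Gaussian per future step and per variable.
dist = torch.distributions.Normal(loc, scale)

forecast = dist.mean        # point forecast; equals loc
draw = dist.sample()        # one Monte Carlo trajectory
log_prob = dist.log_prob(torch.zeros_like(loc))  # per-step log-likelihood

print(forecast.shape)  # torch.Size([2, 96, 7])
```

Anything you can do with a `Normal` (sampling, quantiles, log-probabilities) follows directly from these two outputs.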
The current checkpoint in this repo is a starter model trained on the hourly ETTh1 benchmark. It is a practical first release, not a heavily tuned benchmark run.
## Intended use
This model is a good fit for:
- energy and load forecasting
- traffic and mobility forecasting
- retail demand forecasting
- industrial sensor forecasting
- other regularly sampled multivariate operational time series
## Inputs and outputs

Input:

- `past_values`: `[batch_size, context_length, num_input_channels]`

Optional training target:

- `future_values`: `[batch_size, prediction_length, num_input_channels]`

Output:

- `loc`: `[batch_size, prediction_length, num_input_channels]`
- `scale`: `[batch_size, prediction_length, num_input_channels]`
- `loss`: scalar, when `future_values` is provided
## Current checkpoint

Current released checkpoint details:

- Dataset: ETTh1
- Frequency: hourly
- Context length: 512
- Prediction length: 96
- Variables: HUFL, HULL, MUFL, MULL, LUFL, LULL, OT
- External normalization: dataset-level standardization
- In-model normalization: RevIN
- Objective: Gaussian negative log-likelihood
- Optimizer: AdamW
- Learning rate: 1e-3
- Batch size: 32
- Backbone width: `stage_dims=[16, 32, 64]`
- Backbone depth: `blocks_per_stage=[1, 1, 2]`
- Best checkpoint: epoch 1
This checkpoint was trained on CPU-friendly settings so we could get a complete end-to-end model ready for the Hub.
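The Gaussian negative log-likelihood objective can be sketched as follows. This is not the repo's training code; synthetic tensors stand in for model outputs and targets, and the built-in `F.gaussian_nll_loss` (which takes a variance, not a standard deviation) is shown as an equivalent formulation:

```python
import torch
import torch.nn.functional as F

# Synthetic stand-ins for one training step.
loc = torch.randn(32, 96, 7)
scale = torch.rand(32, 96, 7) + 0.1   # std must stay positive
future_values = torch.randn(32, 96, 7)

# NLL averaged over batch, steps, and channels.
nll = -torch.distributions.Normal(loc, scale).log_prob(future_values).mean()

# Equivalent built-in: expects variance, so square the scale;
# full=True keeps the 0.5*log(2*pi) constant.
nll_builtin = F.gaussian_nll_loss(loc, future_values, scale ** 2, full=True)
```

Minimizing this loss pushes `loc` toward the targets while letting `scale` widen where the data is genuinely noisy.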
## Evaluation

These are the numbers for the current starter checkpoint:
| Split | Dataset | Horizon | MAE | RMSE | NLL |
|---|---|---|---|---|---|
| Validation | ETTh1 | 96 | 1.6252 | 3.0474 | 0.8281 |
| Test | ETTh1 | 96 | 1.7555 | 3.0745 | 1.0163 |
For reference, this is a probabilistic adaptation of ModernTCN, so point-forecast metrics from the original paper are only a rough reference point, not a direct apples-to-apples comparison.
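For clarity on what the table reports, here is a minimal sketch of how MAE, RMSE, and NLL are typically computed from the model's `loc`/`scale` outputs (synthetic tensors stand in for real predictions and targets):

```python
import torch

# Synthetic stand-ins, shaped [batch, prediction_length, channels].
loc = torch.randn(8, 96, 7)
scale = torch.rand(8, 96, 7) + 0.1
target = torch.randn(8, 96, 7)

mae = (loc - target).abs().mean()                 # mean absolute error
rmse = ((loc - target) ** 2).mean().sqrt()        # root mean squared error
nll = -torch.distributions.Normal(loc, scale).log_prob(target).mean()
```

MAE and RMSE score the point forecast only; NLL additionally penalizes miscalibrated `scale`, which is why it is the more informative metric for a probabilistic model.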
## How to use it

```python
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "your-username/modern-tcn-probabilistic-etth1-cpu-starter",
    trust_remote_code=True,
)

# ETTh1 checkpoint: context of 512 steps, 7 variables
past_values = torch.randn(1, 512, 7)
outputs = model(past_values=past_values)

print(outputs.loc.shape)    # [1, 96, 7]
print(outputs.scale.shape)  # [1, 96, 7]
```
If you want simple Gaussian uncertainty bands:

```python
lower_95 = outputs.loc - 1.96 * outputs.scale
upper_95 = outputs.loc + 1.96 * outputs.scale
```
If you want forecast samples:

```python
samples = model.sample(past_values, num_samples=100)
print(samples.shape)
```
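Samples also support non-Gaussian-looking summaries such as empirical quantile bands. A sketch, assuming the sample layout is `[num_samples, batch, prediction_length, channels]` (a synthetic tensor stands in for `model.sample(...)`):

```python
import torch

# Synthetic stand-in for model.sample(past_values, num_samples=100),
# assumed layout: [num_samples, batch, prediction_length, channels].
samples = torch.randn(100, 1, 96, 7)

# Empirical 95% band and median from quantiles along the sample axis.
lower_95 = samples.quantile(0.025, dim=0)
upper_95 = samples.quantile(0.975, dim=0)
median = samples.quantile(0.5, dim=0)
```

Unlike the `loc ± 1.96 * scale` band above, empirical quantiles carry over unchanged if you post-process samples (e.g. sum channels, or invert a normalization) before summarizing.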
## Train on your own data

The training script supports:

- `.csv` with a header row
- `.npy` shaped `[time, channels]`
- `.npz` with a `values` array or a single 2D array
Example:

```shell
python scripts/train.py \
  --data-path data/series.csv \
  --timestamp-column timestamp \
  --value-columns load,temperature,price \
  --context-length 512 \
  --prediction-length 96 \
  --epochs 20 \
  --batch-size 32 \
  --output-dir runs/modern-tcn-probabilistic
```
After training, the script writes:

- `runs/.../best_model`
- `runs/.../last_model`
- `runs/.../history.json`
- `runs/.../summary.json`
- `runs/.../data_config.json`
If you want dataset-level scaling:

```shell
python scripts/train.py --data-path data/series.csv --normalization standard
```
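Dataset-level standardization means fitting the per-channel mean and standard deviation on the training split only, then reusing those statistics everywhere. This is not the script's internals, just a sketch of the idea with a hypothetical synthetic series:

```python
import numpy as np

# Hypothetical multivariate series shaped [time, channels].
series = np.random.randn(1000, 3)
train = series[:700]  # statistics come from the training split only

mean = train.mean(axis=0)
std = train.std(axis=0) + 1e-8   # guard against constant channels

# The same statistics are applied to train, validation, test,
# and later to inference inputs.
normalized = (series - mean) / std
```

Keeping the statistics fixed (and saved, e.g. alongside `data_config.json`) is what lets forecasts be mapped back to the original units at inference time.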
## Limitations
- The model expects the same variable ordering used in training.
- Uncertainty quality still needs calibration checks before real deployment.
- Performance can drop under strong distribution shift.
- The current checkpoint is a starter CPU-trained run, not a fully tuned benchmark model.
- The Keras implementation mirrors the architecture, but the Hugging Face packaging path is centered on PyTorch.
## Citation

If you use this model, please cite the original ModernTCN paper:

```bibtex
@inproceedings{
donghao2024moderntcn,
title={Modern{TCN}: A Modern Pure Convolution Structure for General Time Series Analysis},
author={Donghao Luo and Xue Wang},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=vpJMJerXHU}
}
```