ModernTCN for Probabilistic Multivariate Forecasting

The backbone of this model is based on:

  • Donghao Luo, Xue Wang. ModernTCN: A Modern Pure Convolution Structure for General Time Series Analysis. ICLR 2024 (Spotlight).

This version is adapted for probabilistic forecasting. Instead of only predicting future values, it predicts a Gaussian distribution for each future step and each variable.

Give the model a window of past values and it returns:

  • loc: the predicted mean
  • scale: the predicted standard deviation

That means you get both a forecast and uncertainty bands.

The current checkpoint in this repo is a starter model trained on the hourly ETTh1 benchmark. It is a practical first release, not a heavily tuned benchmark run.

Intended use

This model is a good fit for:

  • energy and load forecasting
  • traffic and mobility forecasting
  • retail demand forecasting
  • industrial sensor forecasting
  • other regularly sampled multivariate operational time series

Inputs and outputs

Input:

  • past_values: [batch_size, context_length, num_input_channels]

Optional training target:

  • future_values: [batch_size, prediction_length, num_input_channels]

Output:

  • loc: [batch_size, prediction_length, num_input_channels]
  • scale: [batch_size, prediction_length, num_input_channels]
  • loss: scalar, when future_values is provided
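To make these shapes concrete, here is a minimal sliding-window sketch that turns a [time, channels] array into past_values / future_values pairs with the lengths used by this checkpoint. The helper name make_windows is hypothetical and not part of this repo:

```python
import torch

def make_windows(series: torch.Tensor, context_length: int, prediction_length: int):
    # series: [time, channels] -> past [N, context, C], future [N, pred, C]
    time_steps, _ = series.shape
    total = context_length + prediction_length
    past, future = [], []
    for start in range(time_steps - total + 1):
        past.append(series[start : start + context_length])
        future.append(series[start + context_length : start + total])
    return torch.stack(past), torch.stack(future)

series = torch.randn(700, 7)  # e.g. 700 hourly steps, 7 ETTh1 variables
past_values, future_values = make_windows(series, context_length=512, prediction_length=96)
print(past_values.shape)    # torch.Size([93, 512, 7])
print(future_values.shape)  # torch.Size([93, 96, 7])
```

Each window's future_values starts exactly where its past_values ends, which is the layout the model expects during training.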

Current checkpoint

Current released checkpoint details:

  • Dataset: ETTh1
  • Frequency: hourly
  • Context length: 512
  • Prediction length: 96
  • Variables: HUFL, HULL, MUFL, MULL, LUFL, LULL, OT
  • External normalization: dataset-level standardization
  • In-model normalization: RevIN
  • Objective: Gaussian negative log-likelihood
  • Optimizer: AdamW
  • Learning rate: 1e-3
  • Batch size: 32
  • Backbone width: stage_dims=[16, 32, 64]
  • Backbone depth: blocks_per_stage=[1, 1, 2]
  • Best checkpoint: epoch 1

This checkpoint was trained with CPU-friendly settings so we could get a complete end-to-end model onto the Hub.

Evaluation

These are the numbers for the current starter checkpoint:

Split        Dataset   Horizon   MAE      RMSE     NLL
Validation   ETTh1     96        1.6252   3.0474   0.8281
Test         ETTh1     96        1.7555   3.0745   1.0163

For reference, this is a probabilistic adaptation of ModernTCN, so point-forecast metrics from the original paper are only a rough reference point, not a direct apples-to-apples comparison.
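The metrics above follow standard formulas: MAE and RMSE are computed on the predicted mean, and NLL on the full Gaussian. A self-contained sketch using dummy tensors in place of real model outputs:

```python
import torch

def point_and_nll_metrics(loc, scale, target):
    # MAE / RMSE evaluate the point forecast (the predicted mean);
    # NLL evaluates the full predictive Gaussian.
    mae = (loc - target).abs().mean()
    rmse = ((loc - target) ** 2).mean().sqrt()
    nll = -torch.distributions.Normal(loc, scale).log_prob(target).mean()
    return mae.item(), rmse.item(), nll.item()

torch.manual_seed(0)
loc = torch.randn(32, 96, 7)
scale = torch.rand(32, 96, 7) + 0.5   # strictly positive std
target = loc + scale * torch.randn_like(loc)

mae, rmse, nll = point_and_nll_metrics(loc, scale, target)
print(f"MAE={mae:.4f} RMSE={rmse:.4f} NLL={nll:.4f}")
```

The same NLL expression, averaged over the training batch, is the Gaussian negative log-likelihood objective listed in the checkpoint details.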

How to use it

import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "your-username/modern-tcn-probabilistic-etth1-cpu-starter",
    trust_remote_code=True,
)

past_values = torch.randn(1, 512, 7)
outputs = model(past_values=past_values)

print(outputs.loc.shape)
print(outputs.scale.shape)

If you want simple Gaussian uncertainty bands:

lower_95 = outputs.loc - 1.96 * outputs.scale
upper_95 = outputs.loc + 1.96 * outputs.scale

If you want forecast samples:

samples = model.sample(past_values, num_samples=100)
print(samples.shape)
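If you prefer empirical bands over the closed-form Gaussian ones, you can take quantiles across the sample dimension. This sketch assumes the samples are stacked along a leading num_samples axis, as in the model.sample call above, and uses random tensors as a stand-in:

```python
import torch

torch.manual_seed(0)
# stand-in for model.sample(...): [num_samples, batch, prediction_length, channels]
samples = torch.randn(100, 1, 96, 7)

lower_95 = torch.quantile(samples, 0.025, dim=0)
median = torch.quantile(samples, 0.5, dim=0)
upper_95 = torch.quantile(samples, 0.975, dim=0)

print(lower_95.shape)  # torch.Size([1, 96, 7])
```

Sample-based bands also work for derived quantities (sums over horizons, maxima, and so on) where the Gaussian formula no longer applies directly.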

Train on your own data

The training script supports:

  • .csv with a header row
  • .npy shaped [time, channels]
  • .npz with a values array or a single 2D array
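A hypothetical loader mirroring the three formats above might look like this; the training script's actual implementation may differ, and the CSV branch assumes all selected columns are numeric (a string timestamp column would need to be dropped first):

```python
import os
import tempfile
import numpy as np

def load_series(path):
    # Hypothetical sketch: return a float32 array shaped [time, channels].
    if path.endswith(".csv"):
        # assumes all columns are numeric after the header row
        return np.genfromtxt(path, delimiter=",", skip_header=1, dtype=np.float32)
    if path.endswith(".npy"):
        return np.load(path).astype(np.float32)
    if path.endswith(".npz"):
        data = np.load(path)
        arr = data["values"] if "values" in data else data[data.files[0]]
        return arr.astype(np.float32)
    raise ValueError(f"unsupported file type: {path}")

tmp = tempfile.mkdtemp()
npy_path = os.path.join(tmp, "series.npy")
np.save(npy_path, np.random.rand(100, 3))
series = load_series(npy_path)
print(series.shape)  # (100, 3)
```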

Example:

python scripts/train.py \
  --data-path data/series.csv \
  --timestamp-column timestamp \
  --value-columns load,temperature,price \
  --context-length 512 \
  --prediction-length 96 \
  --epochs 20 \
  --batch-size 32 \
  --output-dir runs/modern-tcn-probabilistic

After training, the script writes:

  • runs/.../best_model
  • runs/.../last_model
  • runs/.../history.json
  • runs/.../summary.json
  • runs/.../data_config.json

If you want dataset-level scaling:

python scripts/train.py --data-path data/series.csv --normalization standard
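Dataset-level standardization fits a per-channel mean and standard deviation on the training split, applies them to every split, and inverts them on the predictions. A minimal sketch of that idea (the script's internals may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
series = rng.normal(10.0, 3.0, size=(1000, 7)).astype(np.float32)

train = series[:800]                       # fit statistics on the training split only
mean = train.mean(axis=0, keepdims=True)
std = train.std(axis=0, keepdims=True) + 1e-8

normalized = (series - mean) / std         # feed this to the model
restored = normalized * std + mean         # invert on model predictions

print(np.allclose(restored, series, atol=1e-4))  # True
```

Fitting on the training split only avoids leaking validation and test statistics into the model.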

Limitations

  • The model expects the same variable ordering used in training.
  • Uncertainty quality still needs calibration checks before real deployment.
  • Performance can drop under strong distribution shift.
  • The current checkpoint is a starter CPU-trained run, not a fully tuned benchmark model.
  • The Keras implementation mirrors the architecture, but the Hugging Face packaging path is centered on PyTorch.
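One simple calibration check, as mentioned above, is the empirical coverage of the 95% interval: the fraction of held-out targets that actually fall inside loc ± 1.96 · scale. A self-contained sketch using synthetic, perfectly calibrated data in place of real model outputs:

```python
import torch

torch.manual_seed(0)
loc = torch.zeros(10000)
scale = torch.ones(10000)
# synthetic targets drawn from the predictive distribution itself
target = torch.distributions.Normal(loc, scale).sample()

lower = loc - 1.96 * scale
upper = loc + 1.96 * scale
coverage = ((target >= lower) & (target <= upper)).float().mean()
print(f"empirical 95% coverage: {coverage:.3f}")
```

On real held-out data, coverage well below 0.95 indicates overconfident scales; well above, underconfident ones.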

Citation

If you use this model, please cite the original ModernTCN paper:

@inproceedings{
  donghao2024moderntcn,
  title={Modern{TCN}: A Modern Pure Convolution Structure for General Time Series Analysis},
  author={Donghao Luo and Xue Wang},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=vpJMJerXHU}
}
Model details

  • Format: safetensors
  • Parameters: 543k
  • Tensor type: F32