@@ -1,10 +1,134 @@
 ---
 tags:
-- model_hub_mixin
 - pytorch_model_hub_mixin
 ---
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Code: [More Information Needed]
-- Paper: [More Information Needed]
-- Docs: [More Information Needed]

 ---
 tags:
+- time-series-forecasting
+- foundation-models
+- pretrained-models
+- time-series
+- timeseries
+- forecasting
+- observability
+- safetensors
 - pytorch_model_hub_mixin
+license: apache-2.0
+pipeline_tag: time-series-forecasting
+thumbnail: https://corp.dd-static.net/img/about/presskit/kit/press_kit.png
+model-index:
+- name: Toto-2.0-313m
+  results:
+    - task:
+        type: time-series-forecasting
+      dataset:
+        name: BOOM
+        type: BOOM
+      metrics:
+        - name: CRPS
+          type: CRPS
+          value: 0.351
+        - name: MASE
+          type: MASE
+          value: 0.585
+      source:
+        name: BOOM 💥 Observability Time-Series Forecasting Leaderboard
+        url: https://huggingface.co/spaces/Datadog/BOOM
+    - task:
+        type: time-series-forecasting
+      dataset:
+        name: GIFT-Eval
+        type: GIFT-Eval
+      metrics:
+        - name: CRPS
+          type: CRPS
+          value: 0.481
+        - name: MASE
+          type: MASE
+          value: 0.703
+      source:
+        name: GIFT-Eval Time Series Forecasting Leaderboard
+        url: https://huggingface.co/spaces/Salesforce/GIFT-Eval
 ---
+# Toto-2.0-313m
+Toto (**T**ime Series **O**ptimized **T**ransformer for [**O**bservability](https://www.datadoghq.com/knowledge-center/observability/)) is a family of time series foundation models for multivariate forecasting developed by [Datadog](https://www.datadoghq.com/). **Toto 2.0** is the current generation, featuring u-μP-scaled transformers ranging from 4M to 2.5B parameters.
+---
+## ✨ Key Features
+- **Zero-Shot Forecasting**: Forecast without fine-tuning on your specific time series.
+- **Multivariate Support**: Efficiently process multiple variates by alternating attention over the time and variate axes.
+- **Probabilistic Predictions**: Generate point forecasts and uncertainty estimates via a quantile output head.
+- **Decoder-Only Architecture**: Support variable prediction horizons and context lengths.
+- **u-μP Scaling**: Use the unit-scaled maximal update parametrization (u-μP) so that training stays stable and hyperparameters transfer across model sizes.
+<div style="width: 100%; margin: auto; padding: 1rem;">
+  <img src="figures/architecture.png" alt="Toto 2.0 architecture" style="width: 100%; height: auto;" />
+  <em style="display: block; margin-top: 0.5rem; text-align: center;">
+    Overview of the Toto 2.0 architecture.
+  </em>
+</div>
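Quantile forecasts such as those produced by the quantile output head are commonly scored with the quantile (pinball) loss, and twice its average over a grid of levels is a standard discrete approximation to CRPS. The sketch below illustrates that calculation in plain Python. It is not Toto's evaluation code: the function names are hypothetical, and leaderboard scores may be aggregated and normalized differently.

```python
from statistics import mean

# Quantile levels listed in this model card's inference example.
QUANTILE_LEVELS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]


def pinball_loss(y_true, y_pred, level):
    """Quantile (pinball) loss for one observation at one quantile level."""
    diff = y_true - y_pred
    return max(level * diff, (level - 1.0) * diff)


def approx_crps(y_true, quantile_preds, levels=QUANTILE_LEVELS):
    """Approximate CRPS as twice the mean pinball loss (hypothetical helper).

    y_true:         list of H observed values
    quantile_preds: one list of H predictions per quantile level
    """
    losses = [
        pinball_loss(y, q, level)
        for level, preds in zip(levels, quantile_preds)
        for y, q in zip(y_true, preds)
    ]
    return 2.0 * mean(losses)


# A forecast whose quantiles all equal the observations scores zero.
observed = [1.0, 2.0, 3.0]
perfect = [list(observed) for _ in QUANTILE_LEVELS]
print(approx_crps(observed, perfect))  # 0.0
```

Under-prediction is penalized more heavily at high quantile levels and over-prediction at low ones, which is what pushes each predicted quantile toward its nominal coverage.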
+---
+## ⚡ Quick Start
+Inference code is available on [GitHub](https://github.com/DataDog/toto).
+### Installation
+```bash
+pip install "toto-2 @ git+https://github.com/DataDog/toto.git#subdirectory=toto2"
+```
+### Inference Example
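+```python
+import torch
+from toto2 import Toto2Model
+# Use a GPU when one is available; otherwise fall back to CPU (expect slower inference).
+device = "cuda" if torch.cuda.is_available() else "cpu"
+model = Toto2Model.from_pretrained("Datadog/Toto-2.0-313m")
+model = model.to(device).eval()
+# Context of shape (batch, n_variates, time_steps); random data stands in for a real series.
+target = torch.randn(1, 1, 512, device=device)
+# Boolean mask with the same shape as the target; here every time step is marked True.
+target_mask = torch.ones_like(target, dtype=torch.bool)
+# Integer series IDs of shape (batch, n_variates).
+series_ids = torch.zeros(1, 1, dtype=torch.long, device=device)
+# Returns quantiles of shape (9, batch, n_variates, horizon)
+# Quantile levels: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
+quantiles = model.forecast(
+    {"target": target, "target_mask": target_mask, "series_ids": series_ids},
+    horizon=96,
+)
+```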
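A common next step is to turn the quantile tensor into a point forecast and a prediction interval. The sketch below shows one way to do that, assuming the first axis follows the quantile levels in the order given in the comments above. `summarize_quantiles` is a hypothetical helper, not part of the `toto2` package, and a random stand-in tensor replaces the model output so the snippet runs without downloading weights.

```python
import torch

# Quantile levels from the inference example, assumed to match the first-axis order.
QUANTILE_LEVELS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]


def summarize_quantiles(quantiles, lower=0.1, upper=0.9):
    """Split a (9, batch, n_variates, horizon) tensor into median and interval bounds."""
    if quantiles.shape[0] != len(QUANTILE_LEVELS):
        raise ValueError("expected one slice per quantile level on the first axis")
    median = quantiles[QUANTILE_LEVELS.index(0.5)]
    low = quantiles[QUANTILE_LEVELS.index(lower)]
    high = quantiles[QUANTILE_LEVELS.index(upper)]
    return median, low, high


# Stand-in for the model output: values sorted along the quantile axis so they are monotone.
fake_quantiles = torch.sort(torch.randn(9, 1, 1, 96), dim=0).values
median, low, high = summarize_quantiles(fake_quantiles)
print(median.shape)  # torch.Size([1, 1, 96])
```

With the default arguments, `low` and `high` bound a nominal 80% prediction interval and `median` serves as the point forecast.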
+For more examples, see the [Quick Start notebook](https://github.com/DataDog/toto/blob/main/toto2/notebooks/quick_start.ipynb) and the [GluonTS integration notebook](https://github.com/DataDog/toto/blob/main/toto2/notebooks/gluonts_integration.ipynb).
+---
+## 💾 Available Checkpoints
+| Checkpoint | Parameters |
+|---|---|
+| [Toto-2.0-4m](https://huggingface.co/Datadog/Toto-2.0-4m) | 4M |
+| [Toto-2.0-22m](https://huggingface.co/Datadog/Toto-2.0-22m) | 22M |
+| [Toto-2.0-313m](https://huggingface.co/Datadog/Toto-2.0-313m) | 313M |
+| [Toto-2.0-1B](https://huggingface.co/Datadog/Toto-2.0-1B) | 1B |
+| [Toto-2.0-2.5B](https://huggingface.co/Datadog/Toto-2.0-2.5B) | 2.5B |
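When a deployment has a parameter budget, the table above can drive checkpoint selection programmatically. The sketch below is illustrative only: `pick_checkpoint` is a hypothetical helper, the repository IDs are taken from the table links, and the parameter counts are the rounded nominal sizes listed there rather than exact counts.

```python
# (repository ID, nominal parameter count) pairs from the checkpoint table.
CHECKPOINTS = [
    ("Datadog/Toto-2.0-4m", 4_000_000),
    ("Datadog/Toto-2.0-22m", 22_000_000),
    ("Datadog/Toto-2.0-313m", 313_000_000),
    ("Datadog/Toto-2.0-1B", 1_000_000_000),
    ("Datadog/Toto-2.0-2.5B", 2_500_000_000),
]


def pick_checkpoint(max_params):
    """Return the largest listed checkpoint whose nominal size fits the budget."""
    fitting = [(name, size) for name, size in CHECKPOINTS if size <= max_params]
    if not fitting:
        raise ValueError(f"no checkpoint has at most {max_params} parameters")
    return max(fitting, key=lambda item: item[1])[0]


print(pick_checkpoint(500_000_000))  # Datadog/Toto-2.0-313m
```

The returned ID can be passed to `Toto2Model.from_pretrained` in the same way as in the inference example.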
+---
+## 🔗 Additional Resources
+- **[Blog Post](https://www.datadoghq.com/blog/ai/toto-2/)**
+- **[GitHub Repository](https://github.com/DataDog/toto)**
+- **[BOOM Dataset](https://huggingface.co/datasets/Datadog/BOOM)**
+- **[Toto 1.0 Weights](https://huggingface.co/Datadog/Toto-Open-Base-1.0)**
+---
+## 📖 Citation
+```bibtex
+(citation coming soon)
+```