Use lowercase 'm' for the family range; drop saturation hedge
README.md CHANGED

```diff
@@ -64,7 +64,7 @@ model-index:
 
 # Toto-2.0-4m
 
-Toto (Time Series Optimized Transformer for [Observability](https://www.datadoghq.com/knowledge-center/observability/)) is a family of time series foundation models for multivariate forecasting developed by [Datadog](https://www.datadoghq.com/). Toto 2.0 is the current generation, featuring u-μP-scaled transformers ranging from
+Toto (Time Series Optimized Transformer for [Observability](https://www.datadoghq.com/knowledge-center/observability/)) is a family of time series foundation models for multivariate forecasting developed by [Datadog](https://www.datadoghq.com/). Toto 2.0 is the current generation, featuring u-μP-scaled transformers ranging from 4m to 2.5B parameters, all trained from a single recipe. Forecast quality improves reliably with parameter count across the family.
 
 The family sets a new state of the art on three forecasting benchmarks: [BOOM](https://huggingface.co/spaces/Datadog/BOOM), our observability benchmark; [GIFT-Eval](https://huggingface.co/spaces/Salesforce/GIFT-Eval), the standard general-purpose benchmark; and the recent contamination-resistant [TIME](https://arxiv.org/abs/2602.12147) benchmark.
```