Emaad committed on
Commit 4d2cdba · verified · 1 Parent(s): c6347cf

Use non-breaking hyphens in Model column to stop unwanted line breaks

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -123,11 +123,11 @@ All five Toto 2.0 sizes share the same training recipe; pick a size based on you
 
 | Model | Params | Single-pass latency<br>(1,024 horizon) | Block decoding<br>(block=768) | Recommended for |
 |---|---|---|---|---|
-| [Toto-2.0-4m](https://huggingface.co/Datadog/Toto-2.0-4m) | 4m | ~3.8 ms | ~10.0 ms | Edge / CPU deployment; tightest latency or memory budgets. |
-| [Toto-2.0-22m](https://huggingface.co/Datadog/Toto-2.0-22m) | 22m | ~5.0 ms | ~12.8 ms | Efficient default — matches or beats Toto 1.0 quality with ~7× fewer parameters. |
-| [Toto-2.0-313m](https://huggingface.co/Datadog/Toto-2.0-313m) | 313m | ~15.4 ms | ~32.4 ms | Strong general-purpose checkpoint; top-3 foundation model on GIFT-Eval. |
-| [Toto-2.0-1B](https://huggingface.co/Datadog/Toto-2.0-1B) | 1B | ~20.9 ms | ~46.3 ms | Best quality / cost tradeoff for production workloads. |
-| [Toto-2.0-2.5B](https://huggingface.co/Datadog/Toto-2.0-2.5B) | 2.5B | ~36.2 ms | ~78.0 ms | Highest accuracy; #1 foundation model on every benchmark. |
+| [Toto‑2.0‑4m](https://huggingface.co/Datadog/Toto-2.0-4m) | 4m | ~3.8 ms | ~10.0 ms | Edge / CPU deployment; tightest latency or memory budgets. |
+| [Toto‑2.0‑22m](https://huggingface.co/Datadog/Toto-2.0-22m) | 22m | ~5.0 ms | ~12.8 ms | Efficient default — matches or beats Toto 1.0 quality with ~7× fewer parameters. |
+| [Toto‑2.0‑313m](https://huggingface.co/Datadog/Toto-2.0-313m) | 313m | ~15.4 ms | ~32.4 ms | Strong general-purpose checkpoint; top-3 foundation model on GIFT-Eval. |
+| [Toto‑2.0‑1B](https://huggingface.co/Datadog/Toto-2.0-1B) | 1B | ~20.9 ms | ~46.3 ms | Best quality / cost tradeoff for production workloads. |
+| [Toto‑2.0‑2.5B](https://huggingface.co/Datadog/Toto-2.0-2.5B) | 2.5B | ~36.2 ms | ~78.0 ms | Highest accuracy; #1 foundation model on every benchmark. |
 
 > Single-pass decoding fills the entire horizon in one forward pass and is recommended up to ~768 steps. Block decoding generates the horizon in 768-step segments conditioned on the previous segment's median (with KV caching); it is slower but more stable at long horizons. Both modes use the same checkpoint.
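The change swaps the ASCII hyphen-minus (U+002D) in the link labels for the non-breaking hyphen (U+2011), which renders identically but forbids a line break, while the URLs in parentheses keep ASCII hyphens so the links stay valid. A minimal sketch of how such a substitution could be scripted (the function name and regex are illustrative, not part of this commit):

```python
import re

# U+2011 NON-BREAKING HYPHEN: looks like "-", but browsers will not wrap after it.
NBH = "\u2011"

def unbreak_link_labels(markdown: str) -> str:
    """Replace ASCII hyphens with non-breaking hyphens in the display text
    of markdown links, leaving the URL in parentheses untouched."""
    def fix(match: re.Match) -> str:
        label, url = match.group(1), match.group(2)
        return f"[{label.replace('-', NBH)}]({url})"
    # [label](url) — label has no "]", url has no ")"
    return re.sub(r"\[([^\]]+)\]\(([^)]+)\)", fix, markdown)

row = "| [Toto-2.0-4m](https://huggingface.co/Datadog/Toto-2.0-4m) | 4m |"
fixed = unbreak_link_labels(row)
assert "(https://huggingface.co/Datadog/Toto-2.0-4m)" in fixed  # URL unchanged
assert f"Toto{NBH}2.0{NBH}4m" in fixed                          # label unbreakable
```

An alternative with the same effect would be wrapping the label in a `<span style="white-space: nowrap">`, but a character-level fix keeps the markdown plain.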