Emaad committed on
Commit
2eb0725
·
verified ·
1 Parent(s): f993da4

Tighten spacing: drop section HRs; turn post-image prose into figcaption

Files changed (1)
  1. README.md +6 -15
README.md CHANGED
@@ -42,24 +42,21 @@ model-index:
  >
  > For real workloads, please use the base [Toto 2.0 collection](https://huggingface.co/collections/Datadog/toto-20). The base checkpoints are pretrained without any public data, generalize to every benchmark we have evaluated, and are what we recommend deploying.
 
- ---
-
  ## ✨ What this is
 
  A single Toto 2.0 2.5B base checkpoint finetuned on a mix that **includes the GIFT-Eval training split**, used to probe how far the base model can be pushed on a single in-distribution benchmark.
 
- ![GIFT-Eval bar metrics — Toto 2.0 2.5B-FT highlighted](assets/bar_metrics_gift_eval.png)
-
- On the full GIFT-Eval leaderboard (foundation models + finetuned + ensemble + agentic), Toto-2.0-2.5B-FT places **#2 on CRPS rank, MASE rank, and #3 on raw CRPS / MASE**, behind only the [Toto 2.0 Family-and-Friends](https://huggingface.co/Datadog/Toto-2.0-Family-and-Friends) ensemble.
-
- ---
 
  ## Finetuning recipe
 
  Starting from a fully-decayed [Toto-2.0-2.5B](https://huggingface.co/Datadog/Toto-2.0-2.5B) base checkpoint, we finetuned for 10,000 steps on a mix designed to expose the model to in-distribution structure without overfitting to GIFT-Eval alone:
 
  | Source | Share |
- |---|---|
  | GIFT-Eval Pretrain | 45% |
  | Datadog 5-minute+ observability metrics | 25% |
  | GIFT-Eval train split | 15% |
@@ -71,8 +68,6 @@ The public portion (45% GIFT-Eval Pretrain) is drawn from the Toto 1.0 mix of GI
 
  NorMuon and AdamW learning rates were both dropped by roughly an order of magnitude from pretraining (to 0.05 and 0.001 respectively). All other architecture and inference settings match the base 2.5B model.
 
- ---
-
  ## ⚡ Quick Start
 
  ```python
@@ -87,20 +82,16 @@ model = model.to("cuda").eval()
 
  See the base [Toto-2.0-2.5B](https://huggingface.co/Datadog/Toto-2.0-2.5B) model card for the full inference example.
 
- ---
-
  ## 🔗 Additional Resources
 
  - **Technical Report** — *(coming soon)*
  - [Blog Post](https://www.datadoghq.com/blog/ai/toto-2/)
  - [Base model: Toto-2.0-2.5B](https://huggingface.co/Datadog/Toto-2.0-2.5B) — the unfinetuned checkpoint, which is what we recommend deploying
- - [Toto 2.0 Collection](https://huggingface.co/collections/Datadog/toto-20) — all five base sizes (4M → 2.5B)
  - [Toto 2.0 Family-and-Friends](https://huggingface.co/Datadog/Toto-2.0-Family-and-Friends) — companion FFORMA-ensemble submission, also benchmark-only
  - [GIFT-Eval benchmark](https://huggingface.co/spaces/Salesforce/GIFT-Eval) — leaderboard hosting this submission
  - [GitHub Repository](https://github.com/DataDog/toto)
 
- ---
-
  ## 📝 License
 
  Apache 2.0.
 
  >
  > For real workloads, please use the base [Toto 2.0 collection](https://huggingface.co/collections/Datadog/toto-20). The base checkpoints are pretrained without any public data, generalize to every benchmark we have evaluated, and are what we recommend deploying.
 
  ## ✨ What this is
 
  A single Toto 2.0 2.5B base checkpoint finetuned on a mix that **includes the GIFT-Eval training split**, used to probe how far the base model can be pushed on a single in-distribution benchmark.
 
+ <figure>
+ <img src="assets/bar_metrics_gift_eval.png" alt="GIFT-Eval bar metrics — Toto 2.0 2.5B-FT highlighted">
+ <figcaption>On the full GIFT-Eval leaderboard (foundation models + finetuned + ensemble + agentic), Toto-2.0-2.5B-FT places <b>#2 on CRPS rank, MASE rank, and #3 on raw CRPS / MASE</b>, behind only the <a href="https://huggingface.co/Datadog/Toto-2.0-Family-and-Friends">Toto 2.0 Family-and-Friends</a> ensemble.</figcaption>
+ </figure>
 
  ## Finetuning recipe
 
  Starting from a fully-decayed [Toto-2.0-2.5B](https://huggingface.co/Datadog/Toto-2.0-2.5B) base checkpoint, we finetuned for 10,000 steps on a mix designed to expose the model to in-distribution structure without overfitting to GIFT-Eval alone:
 
  | Source | Share |
+ |---|---:|
  | GIFT-Eval Pretrain | 45% |
  | Datadog 5-minute+ observability metrics | 25% |
  | GIFT-Eval train split | 15% |
 
 
  NorMuon and AdamW learning rates were both dropped by roughly an order of magnitude from pretraining (to 0.05 and 0.001 respectively). All other architecture and inference settings match the base 2.5B model.
 
  ## ⚡ Quick Start
 
  ```python
 
 
  See the base [Toto-2.0-2.5B](https://huggingface.co/Datadog/Toto-2.0-2.5B) model card for the full inference example.
 
  ## 🔗 Additional Resources
 
  - **Technical Report** — *(coming soon)*
  - [Blog Post](https://www.datadoghq.com/blog/ai/toto-2/)
  - [Base model: Toto-2.0-2.5B](https://huggingface.co/Datadog/Toto-2.0-2.5B) — the unfinetuned checkpoint, which is what we recommend deploying
+ - [Toto 2.0 Collection](https://huggingface.co/collections/Datadog/toto-20) — all five base sizes (4m → 2.5B)
  - [Toto 2.0 Family-and-Friends](https://huggingface.co/Datadog/Toto-2.0-Family-and-Friends) — companion FFORMA-ensemble submission, also benchmark-only
  - [GIFT-Eval benchmark](https://huggingface.co/spaces/Salesforce/GIFT-Eval) — leaderboard hosting this submission
  - [GitHub Repository](https://github.com/DataDog/toto)
 
  ## 📝 License
 
  Apache 2.0.