Commit 1b16e96 · nielsr (HF Staff) · verified · 1 parent: 9165737

Improve model card metadata and add paper reference

This PR improves the model card for `Toto-1.0-QA-Experimental` by:
- Updating the `pipeline_tag` to `image-text-to-text` for better discoverability.
- Adding `library_name: transformers` as the model is compatible with the Transformers library.
- Moving the paper reference from the YAML metadata to the Markdown section per Hugging Face recommendations.
- Adding the full list of authors and linking the official repository.
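
As a quick sanity check of the metadata changes listed above, the following minimal sketch reads the top-level scalar fields from a model card's YAML front matter and confirms the new values. It is an illustration only: `front_matter_fields` is a hypothetical helper, not part of any Hugging Face library, the sample `card` string is an abbreviated stand-in for the real README, and the parser handles only simple `key: value` lines rather than full YAML.

```python
def front_matter_fields(readme_text):
    """Return top-level scalar `key: value` pairs from a model card's front matter.

    Hypothetical helper for illustration; it is not a full YAML parser.
    """
    lines = readme_text.splitlines()
    # Front matter must open with a `---` line at the very top of the file.
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter: end of the front matter
        # Skip list items, indented lines, comments, and lines without a key.
        if line.startswith(("-", " ", "#")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


# Abbreviated stand-in for the updated README.md (assumption, not the full card).
card = """---
license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers
tags:
- time-series
---

# Toto-1.0-QA-Experimental
"""

fields = front_matter_fields(card)
print(fields["pipeline_tag"], fields["library_name"])
```

Keys that introduce a list, such as `tags`, come back with an empty string because their values live on the following list-item lines, which this sketch deliberately skips.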

Files changed (1)
  1. README.md +30 -35
README.md CHANGED
```diff
@@ -1,5 +1,15 @@
 ---
-model_id: Toto-1.0-QA-Experimental
+base_model:
+- Qwen/Qwen3-VL-32B-Instruct
+- Datadog/Toto-Open-Base-1.0
+datasets:
+- Datadog/ARFBench
+license: apache-2.0
+metrics:
+- accuracy
+- f1
+pipeline_tag: image-text-to-text
+library_name: transformers
 tags:
 - visual-question-answering
 - time-series
@@ -9,56 +19,41 @@ tags:
 - anomaly-reasoning
 - arfbench
 - observability
-paper:
-- https://arxiv.org/abs/2604.21199
-datasets:
-- Datadog/ARFBench
+model_id: Toto-1.0-QA-Experimental
 leaderboards:
 - ARFBench
-license: apache-2.0
-pipeline_tag: visual-question-answering
-metrics:
-- accuracy
-- f1
-base_model:
-- Qwen/Qwen3-VL-32B-Instruct
-- Datadog/Toto-Open-Base-1.0
 ---
 
 # Toto-1.0-QA-Experimental
 
-`Toto-1.0-QA-Experimental` is a hybrid time-series foundation model (TSFM) and vision-language model (VLM) for ARFBench. It achieves comparable macro F1 and accuracy to top frontier models on ARFBench:
+`Toto-1.0-QA-Experimental` is a hybrid time-series foundation model (TSFM) and vision-language model (VLM) for ARFBench.
 
-|![arfbench-accuracy-f1-combined](https://cdn-uploads.huggingface.co/production/uploads/681d68309722c5341cd3fa59/Fs1zeUOkZ6G_yPpOyvlYq.png)|
-|:-:|
-|Overall accuracy and F1 on the ARFBench time series question-answering benchmark, as of paper release. Toto-1.0-QA-Experimental achieves the top accuracy and comparable F1 to top frontier models.|
+The model was introduced in the paper [ARFBench: Benchmarking Time Series Question Answering Ability for Software Incident Response](https://arxiv.org/abs/2604.21199).
 
+**Authors:** Stephan Xie, Ben Cohen, Mononito Goswami, Junhong Shen, Emaad Khwaja, Chenghao Liu, David Asker, Othmane Abou-Amal, Ameet Talwalkar.
 
-It combines:
+## Model Description
 
-- a vision-language backbone (`Qwen/Qwen3-VL-32B-Instruct`) for image-conditioned question answering,
-- Toto time-series representations (`Datadog/Toto-Open-Base-1.0`),
-- lightweight projection modules that inject time-series signals into VLM inference.
+The model achieves comparable macro F1 and accuracy to top frontier models on ARFBench by combining:
+- A vision-language backbone (`Qwen/Qwen3-VL-32B-Instruct`) for image-conditioned question answering.
+- Toto time-series representations (`Datadog/Toto-Open-Base-1.0`).
+- Lightweight projection modules that inject time-series signals into VLM inference.
+
+|![arfbench-accuracy-f1-combined](https://cdn-uploads.huggingface.co/production/uploads/681d68309722c5341cd3fa59/Fs1zeUOkZ6G_yPpOyvlYq.png)|
+|:-:|
+|Overall accuracy and F1 on the ARFBench time series question-answering benchmark, as of paper release. Toto-1.0-QA-Experimental achieves the top accuracy and comparable F1 to top frontier models.|
 
 |![toto-vlm-arch](https://cdn-uploads.huggingface.co/production/uploads/681d68309722c5341cd3fa59/VOihICj_-HTNdbNyNseD_.png)|
 |:-:|
 |Overview of the Toto-1.0-QA-Experimental Architecture.|
 
-This model repository stores inference artifacts, including:
-
-- `vlm/` (merged vision-language model weights),
-- `ts_modules.pt` (time-series modules),
-- `config.json` and processor files.
+This model repository stores inference artifacts, including merged vision-language model weights, time-series modules, and configuration files.
 
 ---
 
 ## Basic Inference Example
 
-The example below assumes you already have:
-
-- time-series tensors,
-- one or more image paths,
-- a text question.
+The example below assumes you already have time-series tensors, one or more image paths, and a text question. The required components are available in the [official Github repository](https://github.com/DataDog/arfbench).
 
 ```python
 import torch
@@ -168,10 +163,10 @@ Running Toto-1.0-QA-Experimental typically requires multi-GPU setup (tested on 4
 
 ## Resources
 
-- [ARFBench Paper](https://arxiv.org/abs/2604.21199)
-- [Dataset](https://huggingface.co/datasets/Datadog/ARFBench)
-- [Leaderboard](https://huggingface.co/spaces/Datadog/ARFBench)
-- [Code](https://github.com/DataDog/arfbench)
+- **Paper:** [ARFBench on ArXiv](https://arxiv.org/abs/2604.21199)
+- **Code:** [GitHub - DataDog/arfbench](https://github.com/DataDog/arfbench)
+- **Dataset:** [Datadog/ARFBench](https://huggingface.co/datasets/Datadog/ARFBench)
+- **Leaderboard:** [ARFBench Space](https://huggingface.co/spaces/Datadog/ARFBench)
 
 ---
 
```