Model,Model Type,Overall F1,Presence,Identification,Start Time,End Time,Magnitude,Categorization,Correlation,Indicator
Random Choice,Baseline,22.5,45.6,21.2,18.9,18.2,20.4,21.7,15.8,17.8
Per-category Frequent Choice,Baseline,17.3,45.9,10.8,16.3,14.1,6.0,14.6,12.0,13.1
Non-domain Experts (n=2),Baseline,61.3,68.0,79.0,67.4,67.2,40.3,61.2,58.4,62.4
Domain Experts (n=2),Baseline,64.6,76.1,77.5,74.2,72.6,51.8,67.3,64.1,57.6
Model-Expert Oracle,Baseline,82.8,89.0,68.3,83.4,1.0,67.0,75.6,94.4,77.8
Qwen3 32B,LLM,36.1,55.7,28.4,26.6,26.9,31.4,36.8,32.3,35.4
GPT-5 (text),LLM,43.8,66.1,38.1,27.9,27.0,44.8,47.6,38.0,42.4
Qwen3-VL 8B,VLM,34.7,63.5,28.6,21.8,23.5,47.0,42.8,33.1,13.8
Claude Sonnet 4.5,VLM,37.9,63.2,16.8,33.2,31.3,49.3,49.8,33.8,19.8
GPT-4o,VLM,42.4,64.2,34.6,30.3,36.1,51.8,50.8,40.1,27.2
GPT-4.1,VLM,44.0,65.1,29.2,33.5,32.7,63.7,55.9,42.9,23.3
Qwen3-VL 32B,VLM,45.1,65.1,25.0,30.8,46.7,46.9,49.0,47.5,34.7
Claude Opus 4.6,VLM,46.7,65.8,34.3,36.1,45.1,53.8,59.2,51.6,24.1
Gemini 3 Pro,VLM,49.6,67.8,38.6,43.3,57.1,50.3,54.5,57.0,29.2
GPT-5.4,VLM,51.4,62.6,29.6,53.3,55.1,51.7,54.1,47.7,49.1
GPT-5,VLM,51.9,66.8,32.8,44.2,47.8,59.1,57.0,49.0,45.9
OpenTSLM 1B (TS-LLM),Post-trained TSFM,1.2,0.0,8.2,2.7,0.0,6.0,0.0,0.0,0.0
ChatTS 8B (TS-LLM),Post-trained TSFM,22.1,48.1,22.2,15.0,14.4,27.9,17.9,21.4,9.2
Toto-1.0-Qwen3 32B (TSFM-LLM),Post-trained TSFM,33.9,59.9,17.5,41.3,23.0,35.9,66.2,18.6,14.1
Qwen3-VL 32B (post-trained),Post-trained TSFM,46.6,69.7,40.5,37.2,36.7,48.9,50.3,46.8,33.9
Toto-1.0-QA-Experimental 32B (TSFM-VLM),Post-trained TSFM,48.9,66.3,46.9,23.0,48.8,54.1,58.4,44.2,42.7