Model,Model Type,Overall Accuracy,Presence,Identification,Start Time,End Time,Magnitude,Categorization,Correlation,Indicator
Random Choice,Baseline,24.5,50.0,20.0,20.0,20.0,20.0,20.0,20.0,20.0
Per-category Frequent Choice,Baseline,45.1,84.7,36.8,35.7,34.4,17.1,32.7,42.9,48.5
Non-domain Experts (n=2),Baseline,69.7,80.4,66.7,64.3,68.8,60.5,61.5,72.1,72.0
Domain Experts (n=2),Baseline,72.7,89.3,77.8,67.9,75.0,60.5,72.4,74.4,68.3
Model-Expert Oracle,Baseline,87.2,96.4,77.8,78.6,100.0,68.4,84.6,95.4,85.4
Qwen3 32B (text),LLM,47.9,80.9,28.9,27.3,35.5,37.3,39.8,50.9,46.3
GPT-5 (text),LLM,56.4,82.6,47.4,29.6,38.7,51.4,50.0,56.9,59.0
Qwen3-VL 8B,VLM,45.3,80.2,26.3,25.0,31.3,57.9,45.2,57.1,17.8
Claude Sonnet 4.5,VLM,47.2,83.8,18.4,30.4,37.5,53.9,53.8,58.8,17.2
GPT-4o,VLM,47.2,79.3,39.5,35.7,43.8,61.8,51.9,45.3,23.9
GPT-4.1,VLM,47.9,80.2,28.9,33.9,40.6,68.4,56.7,45.9,23.3
Qwen3-VL 32B,VLM,52.8,80.2,23.7,33.9,56.3,59.2,50.0,61.8,36.2
Claude Opus 4.6,VLM,54.8,88.3,31.6,37.5,53.1,57.9,63.5,65.9,25.2
Gemini 3 Pro,VLM,58.1,82.9,28.9,44.6,62.5,56.7,54.8,71.2,41.1
GPT-5.4,VLM,61.3,81.1,31.6,63.6,65.6,57.9,56.7,61.8,60.7
GPT-5,VLM,62.7,82.0,31.6,44.6,68.8,65.8,59.6,63.5,61.3
OpenTSLM (TS-LLM),Post-trained TSFM,0.8,0.0,0.0,3.6,0.0,5.3,0.0,0.0,0.0
ChatTS (TS-LLM),Post-trained TSFM,31.1,59.5,15.8,16.1,15.6,28.9,20.2,40.0,14.7
Toto-1.0-Qwen3 (TSFM-LLM),Post-trained TSFM,48.8,82.9,10.5,35.7,34.4,47.4,71.2,41.8,35.6
Qwen3-VL 32B (post-trained),Post-trained TSFM,56.9,84.7,36.8,41.1,43.8,63.2,52.9,67.6,39.3
Toto-1.0-QA-Experimental (TSFM-VLM),Post-trained TSFM,63.9,84.7,47.4,26.8,59.4,64.5,66.3,68.8,60.1