hassansh committed on
Commit 90eaeb1 · verified · 1 Parent(s): 7299986

Update README.md

Files changed (1)
  1. README.md +21 -8
README.md CHANGED
@@ -44,14 +44,27 @@ ZAYA1-8B-VL builds upon and uses our [ZAYA1-7B LLM](https://huggingface.co/Zyphr

  ZAYA1-8B-VL is trained only upon open data. Detailed dataset descriptions can be found in the accompanying technical report.

-
- Model AI2D (test) ChartQA (test) DocVQA (test) InfoVQA (test) TextVQA (val) OCRBench VQA v2.0 (val) MathVista (mini) MMMU (val) SEED (image) Blink (val) RealWorldQA CountBenchQA PixMoCount (test) Point-Bench (avg) RefCOCO (avg)
- ZAYA1-VL-8B-A1B 87.5 82.2 92.5 74 74.4 79.8 80 64 46 72.7 45.9 65 88.1 83.1 58 84.3
- MolmoE-8B-A1B 73.6 77.9 77.7 53.9 78.1 55 82.8 39.1 -- 68.7 -- 60.4 77.4 45.2 58 --
- InternVL3.5-20B-A4B 85.5 87 92.9 78.1 78.5 86.7 78.4 73.5 72.6 76.8 58.9 71.2 82.1 47.3 -- 89.1
- Qwen3.5-2B 78.6 78.4 79 83.1 78.3 52.9 49.2 75.8 61 69 84.2 65.5 40.6 80.1
- Molmo2-4B 85.4 86.1 87.8 78.6 83.1 62 85.3 56.5 48.8 78 63.5 73.8 91.2 87 68.5 --
- Qwen3.5-4B 83.7 82.4 81.1 85.3 80.4 82.3 56.9 76.6 56.8 74.2 84.8 84.2 64.4 87.7
+ | Eval | ZAYA1-VL-8B-A1B | MolmoE-8B-A1B | InternVL3.5-20B-A4B | Qwen3.5-2B | Molmo2-4B | Qwen3.5-4B |
+ |---|---:|---:|---:|---:|---:|---:|
+ | AI2D (test) | **87.5** | <u>73.6</u> | 85.5 | 78.6 | 85.4 | 83.7 |
+ | ChartQA (test) | 82.2 | <u>77.9</u> | **87.0** | 78.4 | 86.1 | 82.4 |
+ | DocVQA (test) | 92.5 | <u>77.7</u> | 92.9 | -- | 87.8 | -- |
+ | InfoVQA (test) | 74.0 | <u>53.9</u> | 78.1 | -- | 78.6 | -- |
+ | TextVQA (val) | <u>74.4</u> | 78.1 | 78.5 | 79.0 | **83.1** | 81.1 |
+ | OCRBench | 79.8 | <u>55.0</u> | **86.7** | 83.1 | 62.0 | 85.3 |
+ | VQA v2.0 (val) | 80.0 | 82.8 | 78.4 | 78.3 | **85.3** | 80.4 |
+ | MathVista (mini) | 64.0 | <u>39.1</u> | 73.5 | 52.9 | 56.5 | **82.3** |
+ | MMMU (val) | 46.0 | -- | **72.6** | 49.2 | <u>48.8</u> | 56.9 |
+ | SEED (image) | 72.7 | <u>68.7</u> | 76.8 | 75.8 | **78.0** | 76.6 |
+ | Blink (val) | <u>45.9</u> | -- | 58.9 | 61.0 | **63.5** | 56.8 |
+ | RealWorldQA | 65.0 | <u>60.4</u> | 71.2 | 69.0 | 73.8 | **74.2** |
+ | CountBenchQA | 88.1 | 77.4 | 82.1 | 84.2 | **91.2** | 84.8 |
+ | PixMoCount (test) | 83.1 | <u>45.2</u> | 47.3 | 65.5 | **87.0** | 84.2 |
+ | Point-Bench (avg) | 58.0 | 58.0 | -- | <u>40.6</u> | **68.5** | 64.4 |
+ | RefCOCO (avg) | 84.3 | -- | **89.1** | <u>80.1</u> | -- | 87.7 |
+
+
+ (based on VLMEvalKit)

  ## Quick start
