Instructions for using Zyphra/ZAYA1-VL-8B with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use Zyphra/ZAYA1-VL-8B with Transformers:
Use a pipeline as a high-level helper:

```python
from transformers import pipeline

# The model is tagged image-text-to-text, so the pipeline accepts chat-style
# messages that mix image URLs and text.
pipe = pipeline("image-text-to-text", model="Zyphra/ZAYA1-VL-8B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
pipe(text=messages)
```

Load the model directly:

```python
from transformers import AutoModelForImageTextToText, AutoProcessor

# AutoModelForImageTextToText matches the model's image-text-to-text task;
# a seq2seq auto class would not load a vision-language checkpoint correctly.
processor = AutoProcessor.from_pretrained("Zyphra/ZAYA1-VL-8B")
model = AutoModelForImageTextToText.from_pretrained("Zyphra/ZAYA1-VL-8B", dtype="auto")
```
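To run generation with the directly loaded model and processor, the snippet below is a minimal sketch using the standard Transformers chat-template path; it assumes the checkpoint ships a chat template that accepts image URLs, and reuses the example image from above.

```python
import torch

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]

# Tokenize the chat (image included) into model inputs.
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens, skipping the prompt.
print(processor.decode(output_ids[0, inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```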
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Zyphra/ZAYA1-VL-8B with vLLM:
Install from pip and serve the model:
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Zyphra/ZAYA1-VL-8B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Zyphra/ZAYA1-VL-8B",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Describe this image in one sentence." },
          { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
        ]
      }
    ]
  }'
```
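Instead of curl, you can call the same endpoint from Python with the official `openai` client. This is a minimal sketch assuming vLLM's default port 8000; the same pattern works for the SGLang server shown below by switching the base URL to port 30000.

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible server; no real API key is needed locally.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Zyphra/ZAYA1-VL-8B",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url", "image_url": {"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```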
- SGLang
How to use Zyphra/ZAYA1-VL-8B with SGLang:
Install from pip and serve the model:
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "Zyphra/ZAYA1-VL-8B" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Zyphra/ZAYA1-VL-8B",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Describe this image in one sentence." },
          { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
        ]
      }
    ]
  }'
```

Use Docker images
```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
  --model-path "Zyphra/ZAYA1-VL-8B" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Zyphra/ZAYA1-VL-8B",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Describe this image in one sentence." },
          { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
        ]
      }
    ]
  }'
```

- Docker Model Runner
How to use Zyphra/ZAYA1-VL-8B with Docker Model Runner:
```bash
docker model run hf.co/Zyphra/ZAYA1-VL-8B
```
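Docker Model Runner also exposes an OpenAI-compatible endpoint. The sketch below is an assumption-heavy example: it presumes the Model Runner TCP endpoint is enabled on port 12434 with the `/engines/v1` path described in Docker's documentation, which may differ on your setup.

```python
from openai import OpenAI

# Assumed Docker Model Runner endpoint; enable TCP host access in Docker
# Desktop settings if requests to this port fail.
client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="hf.co/Zyphra/ZAYA1-VL-8B",
    messages=[{"role": "user", "content": "Describe the Statue of Liberty in one sentence."}],
)
print(response.choices[0].message.content)
```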
Update README.md
README.md CHANGED
```diff
@@ -40,17 +40,17 @@ ZAYA1-VL-8B is trained only upon open data. Detailed dataset descriptions can be
 |---|---:|---:|---:|---:|---:|---:|
 | AI2D (test) | **87.5** | <u>73.6</u> | 78.6 | 85.5 | 85.4 | 83.7 |
 | ChartQA (test) | 82.2 | <u>77.9</u> | 78.4 | **87.0** | 86.1 | 82.4 |
-| DocVQA (test) | 92.5 | <u>77.7</u> | -- | 92.9 | 87.8 | -- |
+| DocVQA (test) | 92.5 | <u>77.7</u> | -- | **92.9** | 87.8 | -- |
-| InfoVQA (test) | 74.0 | <u>53.9</u> | -- | 78.1 | 78.6 | -- |
+| InfoVQA (test) | 74.0 | <u>53.9</u> | -- | 78.1 | **78.6** | -- |
 | TextVQA (val) | <u>74.4</u> | 78.1 | 79.0 | 78.5 | **83.1** | 81.1 |
 | OCRBench | 79.8 | <u>55.0</u> | 83.1 | **86.7** | 62.0 | 85.3 |
-| VQA v2.0 (val) | 80.0 | 82.8 | 78.3 | 78.4 | **85.3** | 80.4 |
+| VQA v2.0 (val) | 80.0 | 82.8 | <u>78.3</u> | 78.4 | **85.3** | 80.4 |
 | MathVista (mini) | 64.0 | <u>39.1</u> | 52.9 | 73.5 | 56.5 | **82.3** |
-| MMMU (val) | 46.0 | -- | 49.2 | **72.6** |
+| MMMU (val) | <u>46.0</u> | -- | 49.2 | **72.6** | 48.8 | 56.9 |
 | SEED (image) | 72.7 | <u>68.7</u> | 75.8 | 76.8 | **78.0** | 76.6 |
 | Blink (val) | <u>45.9</u> | -- | 61.0 | 58.9 | **63.5** | 56.8 |
 | RealWorldQA | 65.0 | <u>60.4</u> | 69.0 | 71.2 | 73.8 | **74.2** |
-| CountBenchQA | 88.1 | 77.4 | 84.2 | 82.1 | **91.2** | 84.8 |
+| CountBenchQA | 88.1 | <u>77.4</u> | 84.2 | 82.1 | **91.2** | 84.8 |
 | PixMoCount (test) | 83.1 | <u>45.2</u> | 65.5 | 47.3 | **87.0** | 84.2 |
 | Point-Bench (avg) | 58.0 | 58.0 | <u>40.6</u> | -- | **68.5** | 64.4 |
 | RefCOCO (avg) | 84.3 | -- | <u>80.1</u> | **89.1** | -- | 87.7 |
```