
ZAYA1-8B

ZAYA1-8B is a small mixture-of-experts (MoE) language model with 760M active parameters and 8.4B total parameters, trained end-to-end by Zyphra. ZAYA1-8B sets a new standard of intelligence efficiency for its parameter count through a combination of novel architecture and innovations in pretraining and post-training.
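To make the active/total split concrete, the sketch below computes the share of the model's weights that participate in each token. It uses only the two parameter counts stated above; the function name is illustrative, and the ratio is a rough indicator rather than a measure of inference cost.

```python
# Parameter counts as stated in the model card.
ACTIVE_PARAMS = 760e6   # parameters used for each token
TOTAL_PARAMS = 8.4e9    # parameters stored in the model


def active_fraction(active: float, total: float) -> float:
    """Return the share of parameters used for each token."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share per token: {fraction:.1%}")
```

Roughly one in eleven stored parameters is used per token, which is why the model's compute per token resembles that of a much smaller dense model.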

ZAYA1-8B excels at detailed long-form reasoning, especially for mathematical and coding tasks. It punches well above its weight in these areas, and its inference efficiency and small size make it well suited to test-time compute harnesses.
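As an illustration of what a test-time compute harness can look like, here is a minimal sketch of majority voting over several sampled answers. This is one common technique, not necessarily the harness used by Zyphra. The `sample_answer` callable is hypothetical: it stands in for whatever code queries the model and extracts a final answer string.

```python
import itertools
from collections import Counter
from typing import Callable, List


def majority_vote(answers: List[str]) -> str:
    """Return the most frequent answer after trimming whitespace."""
    counts = Counter(answer.strip() for answer in answers)
    return counts.most_common(1)[0][0]


def solve_with_voting(
    sample_answer: Callable[[str], str],  # hypothetical: prompt -> final answer
    prompt: str,
    n_samples: int = 8,
) -> str:
    """Sample the model n_samples times and return the majority answer."""
    answers = [sample_answer(prompt) for _ in range(n_samples)]
    return majority_vote(answers)


# Stand-in sampler for demonstration only; a real one would call the model.
_canned = itertools.cycle(["42", "41", "42"])


def fake_sampler(prompt: str) -> str:
    return next(_canned)


result = solve_with_voting(fake_sampler, "What is 6 * 7?", n_samples=6)
print(result)
```

The cost of this approach grows linearly with `n_samples`, so a model with few active parameters per token can afford more samples for the same compute budget.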

Due to its small total parameter count, ZAYA1-8B can also be deployed on-device for local LLM applications.
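For a rough sense of on-device feasibility, the sketch below estimates the memory needed to hold the weights at several numeric precisions. The precision list is an illustrative assumption, not a list of published variants, and the estimate ignores the KV cache, activations, and quantization metadata. It uses the total parameter count because a mixture-of-experts model typically keeps all experts in memory even though only some are active per token.

```python
# Total parameter count as stated in the model card.
TOTAL_PARAMS = 8.4e9

# Assumed storage cost per parameter for some common formats (illustrative).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Return the approximate weight storage in GiB."""
    return n_params * bytes_per_param / 2**30


estimates = {
    name: weight_memory_gib(TOTAL_PARAMS, size)
    for name, size in BYTES_PER_PARAM.items()
}

for name, gib in estimates.items():
    print(f"{name}: {gib:.1f} GiB")
```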

Learn more in our technical report and blog.

This is the post-trained reasoning version of ZAYA1-8B. The pretraining base can be found here.

Performance

ZAYA1-8B performs strongly on challenging mathematical, reasoning, and coding benchmarks. On mathematical benchmarks it is competitive with models several times its size, including frontier-scale reasoning models.

[Figure: ZAYA1-8B test-time compute comparison]

[Figure: bar chart comparing ZAYA1-8B with Western open-source models]

We first compare ZAYA1-8B with the SOTA Qwen3 and Qwen3.5 model series of approximately the same parameter count and with the recently released Gemma4 models, and then with a variety of larger open-weights models.

In-class comparison against open-source reasoning models

| Category | Benchmark | ZAYA1-8B (0.7B / 8.0B) | Qwen3-4B-Thinking-2507 (4.0B / 4.0B) | Qwen3.5-4B (4.0B / 4.0B) | Gemma-4-E4B-it (4.0B / 8.0B*) |
|---|---|---|---|---|---|
| Math | AIME'26 | 89.1 | 77.5 | 84.5 | 50.3 |
| Math | HMMT Feb.'26 | 71.6 | 60.8 | 63.6 | 32.1 |
| Math | IMO-AnswerBench | 59.3 | 50.9 | 48.7 | 27.3 |
| Math | APEX-shortlist | 32.2 | 16.9 | -- | 6.1 |
| Code | LiveCodeBench-v6 | 65.8 | 54.2 | -- | 54.2 |
| Knowledge | GPQA-Diamond | 71.0 | 66.5 | 76.2 | 57.4 |
| Knowledge | MMLU-Pro | 74.2 | 74.3 | 79.1 | 70.2 |
| Instruction | IFEval | 85.58 | 86.8 | 89.8 | 88.50 |
| Instruction | IFBench | 52.56 | 52.9 | 59.2 | 42.67 |
| Style & chat | EQBench | 72.95 | 79.6 | 79.5 | 80.15 |
| Style & chat | Creative Writing v3 | 62.97 | 58.6 | 72.9 | 83.75 |
| Agentic | BFCL-v4 | 39.22 | 49.7 | 45.2 | 31.7 |
| Agentic | τ² | 43.12 | 52.9 | 82.1 | 37.7 |

Scaling comparison against larger open-source reasoning models

| Model | Active | Total | AIME'26 | HMMT'26 | LCB-v6 | IFEval | GPQA-D | MMLU-Pro |
|---|---|---|---|---|---|---|---|---|
| ZAYA1-8B | 0.7B | 8B | 89.1 | 71.6 | 63.8 | 85.8 | 71.0 | 74.2 |
| Arcee-Trinity-Mini | 3B | 26B | 59.6 | 36.9 | 33.3 | 62.0 | 46.8 | 70.6 |
| N3-Nano-30B | 3B | 30B | 90.1 | 75.5 | 64.6 | 92.8 | 75.1 | 78.9 |
| OLMo-3.1-32B-Think | 32B | 32B | 78.9 | 50.6 | 58.3 | 93.2 | 59.6 | 75.8 |
| Qwen3-Next-80B-A3B-Think | 3B | 80B | 90.2 | 79.3 | 67.8 | 88.5 | 76.7 | 82.6 |
| Intellect-3 | 12B | 106B | 86.3 | 72.2 | 66.8 | 81.2 | 74.6 | 82.3 |
| Mistral-Small-4-119B | 6B | 119B | 86.4 | 70.6 | 57.9 | 84.0 | 77.2 | 81.6 |

All numbers were produced with the Zyphra evaluation harness. Models are ordered by total parameter count.

