gemma4-31B-it-speculator.eagle3
This is a preliminary model release, we will continue to train the model and improve the acceptance rates in the next few days.
Model Overview
- Verifier: google/gemma-4-31b-it
- Speculative Decoding Algorithm: EAGLE-3
- Model Architecture: Eagle3Speculator
- Release Date: 04/09/2026
- Version: 1.0
- Model Developers: RedHat
This is a speculator model designed for use with google/gemma-4-31b-it, based on the EAGLE-3 speculative decoding algorithm.
It was trained using the speculators library on a combination of the Magpie-Align/Magpie-Llama-3.1-Pro-300K-Filtered dataset and the train_sft split of the HuggingFaceH4/ultrachat_200k dataset. Training data used Magpie + UltraChat with responses from the gemma-4-31B-it model (no reasoning).
This model should be used with the google/gemma-4-31b-it chat template, specifically through the /chat/completions endpoint.
vLLM version
UPDATE: Now supported on vllm-main!
Use with vLLM
vllm serve google/gemma-4-31b-it \
--tensor-parallel-size 2 \
--speculative-config '{
"model": "RedHatAI/gemma-4-31B-it-speculator.eagle3",
"num_speculative_tokens": 3,
"method": "eagle3"
}' \
--no-enable-prefix-caching \
--max-num-seqs 64 \
--enforce-eager
Evaluations
Model / run:
vLLM: UPDATE: Now supported on vllm-main!
Training data: Magpie + UltraChat; responses from the gemma 4 31B it model (no reasoning).
Use cases
| Use Case | Dataset | Number of Samples |
|---|---|---|
| Coding | HumanEval | 164 |
| Math Reasoning | math_reasoning | 80 |
| Question Answering | qa | 80 |
| MT_bench (Question) | question | 80 |
| RAG | rag | 80 |
| Summarization | summarization | 80 |
| Translation | translation | 80 |
Acceptance lengths (draft length, temperature=default)
| Dataset | k=1 | k=2 | k=3 | k=4 | k=5 |
|---|---|---|---|---|---|
| HumanEval | 1.86 | 2.55 | 3.10 | 3.50 | 3.80 |
| math_reasoning | 1.87 | 2.59 | 3.15 | 3.59 | 3.93 |
| qa | 1.64 | 2.01 | 2.22 | 2.34 | 2.38 |
| question | 1.73 | 2.21 | 2.53 | 2.71 | 2.83 |
| rag | 1.72 | 2.21 | 2.50 | 2.65 | 2.80 |
| summarization | 1.60 | 1.92 | 2.07 | 2.15 | 2.20 |
| translation | 1.69 | 2.13 | 2.41 | 2.57 | 2.68 |
- Downloads last month
- 3,230