# EAGLE3-Llama-3.1-8B-Instruct

A 0.4B-parameter EAGLE3 draft model for speculative decoding, trained against the target model `meta-llama/Llama-3.1-8B-Instruct`.

## Method

EAGLE3 baseline: an autoregressive draft model combined with tree-structured speculative decoding.
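EAGLE-style methods verify a batch of drafted tokens against the target model with a rejection-sampling rule that preserves the target distribution. Below is a minimal chain-draft sketch of that verify step over toy distributions in pure Python; EAGLE3 itself drafts a tree, and the actual implementation lives in the linked repo.

```python
import random

def verify_draft(draft_tokens, q_probs, p_probs, rng=random.random):
    """Chain-draft speculative-decoding verification (toy sketch).

    draft_tokens: tokens proposed by the draft model
    q_probs[i][t]: draft-model probability of token t at step i
    p_probs[i][t]: target-model probability of token t at step i
    Returns the accepted prefix; on the first rejection a replacement
    token is sampled from the residual distribution, so the target
    distribution is preserved exactly.
    """
    accepted = []
    for i, tok in enumerate(draft_tokens):
        p, q = p_probs[i][tok], q_probs[i][tok]
        if rng() < min(1.0, p / q):  # accept with probability min(1, p/q)
            accepted.append(tok)
            continue
        # Rejected: resample from the normalized residual max(0, p - q).
        residual = {t: max(0.0, p_probs[i][t] - q_probs[i][t])
                    for t in p_probs[i]}
        z = sum(residual.values())
        r, acc = rng() * z, 0.0
        for t, w in residual.items():
            acc += w
            if r < acc:
                accepted.append(t)
                break
        return accepted
    return accepted
```

If all drafted tokens pass, a full implementation also samples one bonus token from the target's next-step distribution, so every verification pass emits at least one token.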

## Usage

End-to-end training and inference code: https://github.com/shiweijiezero/SpecBlock

Quick evaluation with the Hugging Face backend:

```shell
python benchmarks_hf/run_eval.py \
    --algorithm eagle3 \
    --model-path meta-llama/Llama-3.1-8B-Instruct \
    --draft-model-path <local-clone-of-this-repo> \
    --benchmark-list mtbench:80 humaneval:164 gsm8k:200 \
    --output ./hf_results/eagle3_llama.jsonl
```
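Speculative-decoding benchmarks like the ones above are typically summarized by the mean number of accepted draft tokens per verification step. As a back-of-envelope estimate (not part of this repo's eval script), a chain draft of depth `k` whose tokens are each accepted independently with rate `alpha` emits the following expected number of tokens per target forward pass:

```python
def tokens_per_target_pass(alpha: float, k: int) -> float:
    """Expected tokens emitted per target-model forward pass, assuming a
    chain draft of depth k and an independent per-token acceptance rate
    alpha. Geometric series 1 + alpha + ... + alpha^k, i.e.
    (1 - alpha**(k + 1)) / (1 - alpha) for alpha < 1; the leading 1 is
    the token the target always contributes (bonus token or resample).
    """
    return sum(alpha ** i for i in range(k + 1))
```

For example, `alpha = 0.8` with depth 4 gives roughly 3.4 tokens per target pass, which is where most of the wall-clock speedup comes from.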

## Citation

```bibtex
@misc{shi2026specblockblockiterativespeculativedecoding,
      title={SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting},
      author={Weijie Shi and Qiang Xu and Fan Deng and Yaguang Wu and Jiarun Liu and Yehong Xu and Hao Chen and Jia Zhu and Jiajie Xu and Xiangjun Huang and Jian Yang and Xiaofang Zhou},
      year={2026},
      eprint={2605.07243},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2605.07243}
}
```