---
base_model: Qwen/Qwen3-8B
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- speculative-decoding
- specblock
- draft-model
---

# SpecBlock-Qwen3-8B

SpecBlock draft model for speculative decoding, trained against the target model [`Qwen/Qwen3-8B`](https://huggingface.co/Qwen/Qwen3-8B).

This model was introduced in the paper [SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting](https://huggingface.co/papers/2605.07243).

## Method

SpecBlock: multi-block test-time training with cross-slot hidden injection between decoder layers and dynamic tree drafting.
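
For readers new to the setting, the generic draft-then-verify loop that a draft model plugs into can be sketched as below. This is not the SpecBlock algorithm: it uses a single chain of proposals with greedy verification, and it omits block iteration, hidden injection, and tree drafting entirely. The names `speculative_step`, `generate`, `toy_target`, and `toy_draft` are hypothetical and exist only for this illustration; the "models" are toy functions over integer tokens.

```python
from typing import Callable, List

# Hypothetical type for this sketch: maps a token prefix to one greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_step(prefix: List[int], draft: NextToken, target: NextToken, k: int) -> List[int]:
    """Run one draft-then-verify step and return the newly accepted tokens."""
    # 1) The cheap draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    # 2) The target checks each proposal in order. A real system would score
    #    all k positions in one batched forward pass; this toy calls it per position.
    accepted: List[int] = []
    for tok in proposed:
        expected = target(prefix + accepted)
        if tok != expected:
            # First mismatch: keep the target's token and discard the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # 3) Every proposal matched, so the target contributes one extra token.
    accepted.append(target(prefix + accepted))
    return accepted


def generate(prompt: List[int], draft: NextToken, target: NextToken,
             k: int, max_new_tokens: int) -> List[int]:
    """Repeat speculative steps until max_new_tokens tokens have been produced."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        out.extend(speculative_step(out, draft, target, k))
    return out[: len(prompt) + max_new_tokens]


def toy_target(tokens: List[int]) -> int:
    # Toy target: counts upward modulo 10.
    return (tokens[-1] + 1) % 10


def toy_draft(tokens: List[int]) -> int:
    # Toy draft: agrees with the target except right after token 4.
    return 0 if tokens[-1] == 4 else (tokens[-1] + 1) % 10


if __name__ == "__main__":
    print(generate([0], toy_draft, toy_target, k=3, max_new_tokens=8))
```

Because verification here is greedy, the generated sequence is the same one the target would produce on its own; the draft only affects how many tokens are accepted per verification step.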

## Usage

End-to-end training and inference code can be found in the official repository: https://github.com/shiweijiezero/SpecBlock

Quick eval with the HF backend:

```bash
python benchmarks_hf/run_eval.py \
  --algorithm specblock \
  --model-path Qwen/Qwen3-8B \
  --draft-model-path <local-clone-of-this-repo> \
  --benchmark-list mtbench:80 humaneval:164 gsm8k:200 \
  --output ./hf_results/specblock_qwen3.jsonl
```

## Citation

```bibtex
@misc{shi2026specblockblockiterativespeculativedecoding,
      title={SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting},
      author={Weijie Shi and Qiang Xu and Fan Deng and Yaguang Wu and Jiarun Liu and Yehong Xu and Hao Chen and Jia Zhu and Jiajie Xu and Xiangjun Huang and Jian Yang and Xiaofang Zhou},
      year={2026},
      eprint={2605.07243},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2605.07243}
}
```