Gemma-4-E4B-DECKARD-HERETIC-NVFP4
NVFP4-quantized EAGLE-style speculative-decoding drafter for the standard (not abliterated/uncensored) Gemma 4 31B DECKARD HERETIC. Pair it with the matching target model to accelerate single-stream decoding on Blackwell-class GPUs (DGX Spark / RTX PRO 6000 / RTX 5090 / B100 / B200).
For the abliterated/uncensored variant of this drafter, see AEON-7/Gemma-4-E4B-DECKARD-HERETIC-Uncensored-NVFP4. For the target it accelerates, see the gemma-4-31B-it-speculator.eagle3-NVFP4 collection on this profile.
Files
- model.safetensors: NVFP4-quantized weights
- hf_quant_config.json: modelopt quant config
- chat_template.jinja: Gemma 4 chat template
- config.json / generation_config.json / tokenizer.* / processor_config.json
Quick start (vLLM, as drafter)
vllm serve <target-model-id> \
--speculative-config '{"method":"eagle3","model":"AEON-7/Gemma-4-E4B-DECKARD-HERETIC-NVFP4","num_speculative_tokens":3}' \
--trust-remote-code
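The JSON passed to --speculative-config is easy to mis-quote when edited by hand. The following is a minimal sketch, not part of the original release, that assembles the same JSON from shell variables and prints the resulting command without running it. The variable names (TARGET_MODEL, DRAFTER, NUM_SPEC_TOKENS, SPEC_CONFIG) are illustrative assumptions, and <target-model-id> remains a placeholder to fill in.

```shell
# Illustrative variable names; only the JSON keys and values come from the quick start.
TARGET_MODEL='<target-model-id>'                    # placeholder, replace with the target model
DRAFTER='AEON-7/Gemma-4-E4B-DECKARD-HERETIC-NVFP4'  # this drafter
NUM_SPEC_TOKENS=3                                   # value used in the quick start

# Build the speculative-config JSON with printf so the quoting stays consistent.
SPEC_CONFIG=$(printf '{"method":"eagle3","model":"%s","num_speculative_tokens":%d}' \
  "$DRAFTER" "$NUM_SPEC_TOKENS")

# Print the command for inspection instead of executing it.
printf "vllm serve %s --speculative-config '%s' --trust-remote-code\n" \
  "$TARGET_MODEL" "$SPEC_CONFIG"
```

Changing NUM_SPEC_TOKENS is then a one-line edit. Which value works best depends on the workload and is not specified by this card.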
License
Inherits the Gemma Terms of Use. Use of this model is subject to those terms.
☕ Support the work
If this release has been useful, tips are deeply appreciated — they go directly toward more compute, more models, and more open releases.
- ₿ Bitcoin (BTC): bc1q09xmzn00q4z3c5raene0f3pzn9d9pvawfm0py4
- Ξ Ethereum (ETH): 0x1512667F6D61454ad531d2E45C0a5d1fd82D0500
- ◎ Solana (SOL): DgQsjHdAnT5PNLQTNpJdpLS3tYGpVcsHQCkpoiAKsw8t
- ⓜ Monero (XMR): 836XrSKw4R76vNi3QPJ5Fa9ugcyvE2cWmKSPv3AhpTNNKvqP8v5ba9JRL4Vh7UnFNjDz3E2GXZDVVenu3rkZaNdUFhjAvgd
Ethereum L2s (Base, Arbitrum, Optimism, Polygon, etc.) and EVM-compatible tokens can be sent to the same Ethereum address.