What's the difference between mistralai/Mistral-Small-4-119B-2603-eagle (this) and mistralai/Mistral-Small-4-119B-2603?
#6
by evewashere - opened
:)
Hi,
https://huggingface.co/mistralai/Mistral-Small-4-119B-2603 is the standalone model; you can just use it as is.
https://huggingface.co/mistralai/Mistral-Small-4-119B-2603-eagle can be considered an "add-on" that speeds up inference for the Mistral-Small-4-119B-2603 model when you use speculative decoding. It's not a model on its own: if you check the weights, there are only 2 transformer layers. It's just a drafting head on top of the main model (see the EAGLE paper; it's a bit similar to Medusa or DeepSeek V3's MTP modules if you're familiar with them, but used for inference).
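To make the "drafting head" idea concrete, here is a minimal sketch of greedy draft-and-verify speculative decoding. It is a toy illustration, not EAGLE itself and not how any particular serving engine implements it: `target_next` and `draft_next` are hypothetical stand-ins for the full model and the draft head, and real EAGLE drafts from the target model's hidden features rather than from tokens alone. The point it shows is that the cheap drafter proposes several tokens, the big model checks them, and the final output is the same as decoding with the big model alone.

```python
from typing import Callable, List, Tuple

NextToken = Callable[[List[int]], int]


def target_next(tokens: List[int]) -> int:
    # Hypothetical stand-in for the large model: a deterministic next token.
    return (tokens[-1] * 3 + 1) % 11


def draft_next(tokens: List[int]) -> int:
    # Hypothetical stand-in for the draft head: cheap, and usually (not always)
    # agrees with the target. It is deliberately wrong at every 4th position.
    guess = (tokens[-1] * 3 + 1) % 11
    return guess if len(tokens) % 4 else (guess + 1) % 11


def greedy_decode(model: NextToken, prompt: List[int], n_new: int) -> List[int]:
    # Baseline: one call to the model per generated token.
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[int], n_new: int, k: int = 4
) -> Tuple[List[int], int]:
    tokens = list(prompt)
    goal = len(prompt) + n_new
    verify_rounds = 0
    while len(tokens) < goal:
        # 1) The drafter proposes k tokens autoregressively (cheap).
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))

        # 2) The target verifies the proposal. A real system scores all
        #    positions in a single forward pass; the toy loops instead.
        verify_rounds += 1
        accepted: List[int] = []
        for tok in proposal:
            expected = target(tokens + accepted)
            if tok != expected:
                accepted.append(expected)  # replace the first wrong draft token
                break
            accepted.append(tok)
        else:
            # Every draft token matched: the target adds one extra token.
            accepted.append(target(tokens + accepted))
        tokens.extend(accepted)
    return tokens[:goal], verify_rounds


if __name__ == "__main__":
    out, rounds = speculative_decode(target_next, draft_next, [1], n_new=12)
    print(out == greedy_decode(target_next, [1], 12), rounds)
```

In this sketch each verification round stands for one pass of the large model, so the speedup comes from needing fewer rounds than generated tokens. How much you gain in practice depends on how often the draft head's guesses are accepted.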
On a side note, I agree it can be confusing, because the first paragraphs of the README for the eagle head are the same as those for Mistral-Small-4-119B-2603, which makes it look like a separate model.