supported eagle context size?
#5
by Jannik2099 - opened
Thanks for the model, it's a superb step up from Mistral 3 all around.
On the model card, you recommend serving speculative decoding with
```json
--speculative_config {
  ...
  "max_model_len": 16384
}
```
Does the EAGLE head only support 16k context, or was it trained for 256k context like the base model, with 16k merely the recommended setting because you see diminishing returns above that?
If VRAM allows it, do you recommend serving with 256k EAGLE context?
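For context, a serve invocation along these lines might look like the sketch below. The model names, draft method, and token count are placeholders for illustration, not values from this thread or the model card; only `max_model_len` inside `--speculative_config` is the setting being asked about.

```shell
# Hypothetical sketch: serve a base model with an EAGLE draft head in vLLM,
# raising the draft model's context window to match the base model's 256k.
# All model names and the "eagle3" method are placeholders, not from the thread.
vllm serve some-org/base-model \
  --max-model-len 262144 \
  --speculative_config '{
    "method": "eagle3",
    "model": "some-org/eagle-head",
    "num_speculative_tokens": 3,
    "max_model_len": 262144
  }'
```

If the head was only trained on shorter sequences, the question is whether raising `max_model_len` here degrades acceptance rates rather than whether it runs at all.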
Hey, you can actually remove it (I did in the model card); the EAGLE head should work properly :)
Edit: I actually got confused by some remapping; I added it back while I investigate a bit more.