Eagle model

#3
by sulpher - opened

I'm not sure right now to what extent this is already supported in llama.cpp or other engines, but will you also be providing quants of the Eagle model?

The Eagle model is only a few hundred megabytes in size, so there is not much to quantize. Also, llama.cpp does not currently have any support for Eagle speculative decoding.

But llama.cpp does have general support for speculative decoding models. Since there is essentially no documentation on the Eagle model (at least I could not find any), I wonder whether it could work, with minor changes, as a general speculative decoding model?
