Description inconsistent with the evaluation results
#3
by Michalea - opened
Thank you for your contribution!
I have an issue. In the evaluation section you have:
"The Eagle acceptance rate benchmark results (MT-Bench) with draft length 1 are presented in the table below for medium reasoning
It has higher acceptance rate at single predicted token."
- the results are worse than those in https://huggingface.co/nvidia/gpt-oss-120b-Eagle3-short-context, in spite of using 3x more data and draft length 1 vs. 3.
What is the reason? Is it perhaps a mistake in the README?
The README for the other EAGLE3 models reports the acceptance rate when using 3 drafted tokens. For a fair comparison, run the other models with a single draft token; you should find that the acceptance rate of this model is higher.
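For intuition, here is a minimal sketch of why a number reported at draft length 3 is not directly comparable to one reported at draft length 1. It assumes the reported acceptance rate scales with the mean number of accepted draft tokens per verification step, and that each drafted token is accepted independently with the same per-token probability (real EAGLE3 drafting uses trees, so this is only an approximation); the `alpha` value is hypothetical and not taken from either model card:

```python
# Sketch: comparing "acceptance rate" across draft lengths.
# Simplifying assumption: each drafted token is accepted independently with
# per-token probability `alpha`. With chain drafting, position i is accepted
# only if positions 1..i-1 were, so P(accept position i) = alpha**i, and the
# expected accepted draft tokens per step is the geometric sum below.

def expected_accepted(alpha: float, draft_len: int) -> float:
    """Expected accepted draft tokens per step under the i.i.d. assumption."""
    return sum(alpha ** i for i in range(1, draft_len + 1))

alpha = 0.7  # hypothetical per-token acceptance rate, for illustration only

print(expected_accepted(alpha, draft_len=1))  # 0.7
print(expected_accepted(alpha, draft_len=3))  # 0.7 + 0.49 + 0.343 = 1.533
```

With the same underlying draft quality, the draft-length-3 run reports a number more than twice as large, so comparing it against a draft-length-1 figure from another README makes this model look worse than it is.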