context length/number of generated tokens during training

#4
by Michalea - opened

Hello,
I would like to ask how many generated tokens were used to train the EAGLE-3 head.
Most EAGLE-3 heads for this model are trained only on the first 2-4k tokens.
