Attention Drift Collection: models trained as part of the "Attention Drift: What Speculative Decoding Models Learn" paper, shared so the paper's experiments can be reproduced. • 13 items • Updated 3 days ago