RoPE theta 5mln instead of 1mln

#15
by Michalea - opened

Hello, does anyone know why theta is 5mln while the context is 262k? Qwen recommends using a theta of 5mln with a 1mln context.
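
For reference, a minimal sketch of one way to compare the two base values: it computes the wavelength (in tokens) of each rotary frequency pair using the standard RoPE formula, `inv_freq_i = theta ** (-2i / head_dim)`. The `HEAD_DIM = 128` value and the `rope_wavelengths` helper are assumptions for illustration only and are not taken from this model's config or code.

```python
import math


def rope_wavelengths(theta: float, head_dim: int) -> list[float]:
    """Wavelength, in tokens, of each RoPE frequency pair (hypothetical helper)."""
    # Standard RoPE: inv_freq_i = theta ** (-2i / head_dim) for i = 0 .. head_dim/2 - 1,
    # so the wavelength of pair i is 2 * pi / inv_freq_i.
    return [2 * math.pi * theta ** (2 * i / head_dim) for i in range(head_dim // 2)]


HEAD_DIM = 128  # assumption: check the model's config for the real head dimension

for theta in (1_000_000, 5_000_000):
    longest = rope_wavelengths(theta, HEAD_DIM)[-1]
    print(f"theta={theta:>9,}: longest wavelength ~ {longest:,.0f} tokens")
```

A larger theta stretches the slowest-rotating pairs over more tokens, which is the usual motivation for raising it when targeting longer contexts. This only illustrates the relationship; it does not say which value this model was trained with or why.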