why only 13b active on the flash?
#2
by szilard995 - opened
Qwen 27B eats this alive. 13B active is not enough. We need 50B at least.
This is a Flash version. Why don't you consider the Pro version?
Also, Qwen3.5 only activates 17B params.
3.5 is old and the pro is huge wtf
Yes, exactly. Who can even fit 1.6T? This is not optimized for local use at all. Give us a 200B total 100B dense.
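For a rough sense of the "who can even fit 1.6T?" point, here is a back-of-the-envelope sketch of weight-only memory at a few common precisions. The function name and the chosen bit widths are illustrative assumptions, not anything from the model card. It counts weights only and ignores KV cache, activations, and runtime overhead. In a mixture-of-experts model the total parameter count drives storage, while the active count mainly affects per-token compute.

```python
def weight_memory_gb(total_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return total_params * bits_per_param / 8 / 1e9


# Sizes mentioned in the thread: 1.6T total (Pro) and the requested 200B total.
for label, params in [("1.6T total", 1.6e12), ("200B total", 200e9)]:
    for bits in (16, 8, 4):  # assumed precisions: 16-bit, 8-bit, 4-bit
        gb = weight_memory_gb(params, bits)
        print(f"{label} @ {bits}-bit: ~{gb:,.0f} GB")
```

Under these assumptions, even a 4-bit copy of 1.6T parameters needs roughly 800 GB for weights alone, while a 200B-total model at 4-bit comes to roughly 100 GB.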