Why only 13B active on the Flash?

#2
by szilard995 - opened

Qwen 27B eats this alive. 13B active is not enough. We need at least 50B.

This is a Flash version, why don't you consider the Pro version?
Also, Qwen3.5 only activates 17B params.

3.5 is old and the pro is huge wtf

Yes, exactly. Who can even fit 1.6T? This is not optimized for local use at all. Give us 200B total, 100B dense.
