Qwen3.5-397B-A17B

#1
by Georgiy1108 - opened

Hello. How did the launch settings for this model differ from those of the previous launch? How many attempts did it take you to find the optimal parameters? I want to train a heavier Qwen3.5-397B-A17B model.

This one has the vision weights as well; the rest is exactly the same. If you use vLLM, the older one will not work because its vision config is missing, but this one will.
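
If you want to check which of the two checkpoints you have locally, something like the sketch below may help. It only inspects the checkpoint's `config.json`. Note the assumptions: the key name `vision_config` is a guess based on common Hugging Face multimodal configs and is not confirmed for this model, and the helper names `has_vision_config` and `check_checkpoint` are made up for illustration.

```python
import json
from pathlib import Path


def has_vision_config(config: dict, key: str = "vision_config") -> bool:
    """Return True if the config has a non-empty entry under `key`.

    `key` defaults to "vision_config", which is an assumed name;
    adjust it if the checkpoint uses a different field.
    """
    return bool(config.get(key))


def check_checkpoint(model_dir: str) -> bool:
    """Load <model_dir>/config.json and report whether it has a vision section."""
    config_path = Path(model_dir) / "config.json"
    with config_path.open(encoding="utf-8") as f:
        return has_vision_config(json.load(f))
```

Call `check_checkpoint` with the path to a downloaded checkpoint directory. A `False` result would be consistent with the older upload described above, but it is not proof that vLLM will reject the model.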
