Plans for 122B?
Hey, just tested 9B and it is great. I was wondering if you would do 122B eventually?
Also, is this ablit/heretic... or some other dataset finetune?
Thank you.
Hello,
Depending on how many people actually make use of them, I may do 122B as well, or stop at 35B and only do the 4B, 9B, 27B, and 35B models.
This is not Heretic. I use my own methodologies and datasets for prompts.
Please consider it.
I am currently still working 16 hours a day on 35b-a3b. Well, more specifically I'm working on a specialized technique for maximizing the experts. Once 35b is out, I might do 122b at fp8 (so all quants up to that, no bf16).
I don't have the VRAM for bf16.
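For a rough sense of the gap between the two formats, here is a back-of-the-envelope sketch. It assumes "122B" means roughly 122 billion total parameters and counts weights only (no activations, KV cache, or framework overhead), so real requirements would be higher. The function name is just illustrative.

```python
# Weight-only memory estimate. Assumption: 2 bytes per parameter for bf16,
# 1 byte per parameter for fp8; everything else (activations, cache) is ignored.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return approximate weight memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Assumed nominal parameter count for a "122B" model.
    for dtype in ("bf16", "fp8"):
        print(dtype, weight_memory_gb(122e9, dtype), "GB")
```

Under these assumptions the bf16 weights alone need about twice the memory of the fp8 weights.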
Thank you, I'm really looking forward to 35b.
Great job, thank you and the Qwen team.
Hello everyone, I can confirm 122B is well underway now, and I will hopefully be able to release one SOON at the quality people have gotten used to with my releases.
Trials take me about 10 minutes each so it's a tad bit slower :)