Plans for 122B?

#4
by SerySmith - opened

Hey, just tested 9B, and it is great, I was hoping/wondering if you would do 122B eventually?

Also, is this an abliteration/Heretic-style model... or some other dataset finetune?
Thank you.

Owner

Hello,

Depending on how many people will actually make use of them, I may do 122B as well, or stop at 35B and only do the 4-9-27-35B models.

This is not Heretic. I use my own methodologies and datasets for prompts.

Please consider it.

I would use it, @HauhauCS. It's the perfect size for those of us rocking RTX 6000 Pros or Sparks.

Owner

I am currently still working 16 hours a day on 35b-a3b. More specifically, I'm working on a specialized technique for maximizing the experts. Once 35b is out, I might do 122b at fp8 (so all quants up to that, no bf16).

I don't have the VRAM for bf16.

Thank you, I'm really looking forward to 35b.

great job, thank you and qwen team.

Hello everyone, I can confirm 122b is well underway now, and I will hopefully soon be able to release a model of the quality people have come to expect from my releases.

Trials take me about 10 minutes each, so progress is a tad slower :)
