I hope for 27B from you (nt)
by datayoda - opened
Hi, I'm honestly not sure whether dense models benefit from this quantization scheme the way MoEs do. Probably, since attention tensors are present in basically every model. But the BPW tradeoff between FFN and attention weights shifts with model size: it's easy to say "just add 6 GiB of tensors" when the model is already 100 GB+ of MoE FFNs; the balance isn't the same for a 27B model.
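A back-of-envelope way to see the point above: the absolute cost (in GiB) of bumping attention tensors to a higher BPW is similar for a dense 27B and a huge MoE, but the *relative* cost is very different. The parameter splits and BPW values below are purely illustrative assumptions, not measurements from any real checkpoint:

```python
GIB = 1024**3

def quant_size_gib(params: float, bpw: float) -> float:
    """Approximate on-disk size of `params` weights at `bpw` bits per weight."""
    return params * bpw / 8 / GIB

def extra_cost(attn_params, ffn_params, ffn_bpw, attn_lo, attn_hi):
    """Absolute and relative size increase from bumping attention BPW."""
    base = quant_size_gib(attn_params, attn_lo) + quant_size_gib(ffn_params, ffn_bpw)
    bumped = quant_size_gib(attn_params, attn_hi) + quant_size_gib(ffn_params, ffn_bpw)
    return bumped - base, (bumped - base) / base

# Dense 27B: assume roughly 1/3 attention, 2/3 FFN (illustrative split).
dense = extra_cost(9e9, 18e9, ffn_bpw=4.0, attn_lo=4.0, attn_hi=8.0)

# Large MoE: modest shared attention, huge expert FFNs (illustrative split).
moe = extra_cost(12e9, 220e9, ffn_bpw=4.0, attn_lo=4.0, attn_hi=8.0)

print(f"dense 27B: +{dense[0]:.1f} GiB ({dense[1]:.1%} larger)")
print(f"large MoE: +{moe[0]:.1f} GiB ({moe[1]:.1%} larger)")
```

Under these made-up splits, both models pay a few extra GiB for 8-bit attention, but that's only ~5% growth for the MoE versus ~33% growth for the dense 27B.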
I might try quanting the 27B and see what the results look like.
Thanks! Your stuff is just so good :)
Np- thanks for trying!
