Ablation studies on effects of quantization on SSM weights?

#15
by dinerburger - opened

Has the Unsloth team run any ablation studies on the effects of quantizing the SSM weights? Looking at it, both ssm_ba.weight and ssm_out.weight represent a very small portion of the overall model size, so leaving them in BF16 stands to improve the accuracy of those layers for little cost in file size. However, I'm hesitant to go too hard in the paint on this without better data. As part of your Dynamic Quant 2.0 strategy, it'd be really great to see how these weights contribute to overall accuracy. Thanks again for all your hard work!
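To make the "little cost" claim concrete, here's a minimal back-of-the-envelope sketch of the size overhead from keeping SSM tensors in full BF16 while quantizing everything else to roughly 4 bits per weight. All tensor shapes below are hypothetical placeholders, not measurements from any actual checkpoint, and the ~4.5 bits/param figure is just a typical ballpark for a 4-bit quant with scales:

```python
# Hedged sketch: estimate the file-size cost of keeping an SSM tensor
# in BF16 instead of quantizing it to ~4-bit like the rest of the model.
# Parameter counts are hypothetical, for illustration only.

BYTES_BF16 = 2.0      # BF16: 2 bytes per parameter
BYTES_Q4 = 4.5 / 8    # ~4.5 bits per parameter for a typical 4-bit quant

# Hypothetical per-layer parameter counts (not from a real model).
tensors = {
    "ssm_out.weight": 1024 * 2048,        # assumed small SSM projection
    "attn_qkv.weight": 3 * 4096 * 4096,   # fused QKV projection
    "ffn_up.weight": 4096 * 14336,        # MLP up-projection
    "ffn_down.weight": 14336 * 4096,      # MLP down-projection
}
keep_bf16 = {"ssm_out.weight"}

def model_bytes(keep: set[str]) -> float:
    """Total serialized size, with tensors in `keep` left in BF16."""
    return sum(
        n * (BYTES_BF16 if name in keep else BYTES_Q4)
        for name, n in tensors.items()
    )

all_q4 = model_bytes(set())       # everything quantized
mixed = model_bytes(keep_bf16)    # SSM tensor kept in BF16
overhead = (mixed - all_q4) / all_q4
print(f"size overhead from keeping the SSM tensor in BF16: {overhead:.1%}")
```

With these made-up shapes the overhead works out to about 3%, which is the kind of number an ablation would need to weigh against any accuracy gain; real checkpoints would of course need real tensor shapes plugged in.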