Doesn't seem to work.

#1
by Nick-Co - opened

On a Mac mini M4 Pro with 24 GB, this just outputs nonsense in oMLX v0.3.6.

I’m going to upload an updated quant soon. I don’t know what happened to this one; oQ2 is the only quant that spits out garbage on my setup (with models requantized from scratch).

Thanks for letting me know, I wouldn’t have caught it because I don’t use this model in my stack.

It should work now. I just updated and it passed my smoke test.

Yep, it works now; thank you!

It still spits out the occasional formatting issue or weird word here and there, but that is most likely due to the 3-bit quant, as these Gemma models seem particularly sensitive to such compression.

Maybe the oQ4 (sadly, I can't run it) or an oQ3.5e would be more accurate, but that will have to wait for a hardware upgrade.

bearzi changed discussion status to closed
