Doesn't seem to work.
On a Mac mini M4 Pro with 24 GB, this just outputs nonsense in MLX v0.3.6.
I’m going to upload an updated quant soon. I don’t know what happened to this one; Q2 is the only quant that spits out garbage on my setup (with models requantized from scratch).
Thanks for letting me know, I wouldn’t have caught it because I don’t use this model in my stack.
It should work now. I just updated and it passed my smoke test.
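For anyone curious, a smoke test for a quant like this can be as simple as generating a short completion and flagging obviously garbled output. Here is a minimal, hypothetical heuristic sketch; the `looks_garbled` name and the thresholds are my own, not part of MLX or any library:

```python
def looks_garbled(text: str, max_repeat_ratio: float = 0.5) -> bool:
    """Crude heuristic: flag output that is empty, dominated by one
    repeated token, or full of non-printable characters.
    Thresholds are hypothetical and tuned only for a quick sanity check."""
    if not text.strip():
        return True
    # Too many non-printable characters suggests broken decoding
    nonprint = sum(1 for c in text if not c.isprintable() and c not in "\n\t")
    if nonprint / len(text) > 0.2:
        return True
    # One token dominating the output suggests a degenerate repeat loop
    tokens = text.split()
    most_common = max(tokens.count(t) for t in set(tokens))
    return most_common / len(tokens) > max_repeat_ratio

# Sanity checks on obvious cases
print(looks_garbled("The quick brown fox jumps over the lazy dog."))  # False
print(looks_garbled("ko ko ko ko ko ko ko ko"))  # True
```

Feeding the model a fixed prompt and running its output through a check like this catches the worst-case "spits garbage" failures, though of course it won't catch subtler quality loss.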
Yep, it works now; thank you!
It still spits out the occasional formatting issue or weird word here and there, but that is most likely due to the 3-bit quant; these Gemma models seem particularly sensitive to such compression.
Maybe the Q4 (which I sadly can't run) or a Q3.5e would be more accurate, but that will have to wait for a hardware upgrade.