thetom-ai/MiniMax-M2.7-ConfigI-MLX

Text Generation

turboquant-plus

Mixture of Experts

4-bit precision


Awesome Quant - great performance on M4 Max

#1

by Narutoouz - opened 6 days ago • edited 6 days ago

I am getting 51 tokens/s at first, which then falls to 15-20 tokens/s.
Local AI coding on Apple silicon has come a long way.

Thanks for making this quant and enabling the MLX community to enjoy a wonderful model on their own hardware!

Can you also make a Gemma 26B it quant of this sort?

Thank you

Owner 4 days ago

https://huggingface.co/thetom-ai/Gemma-4-26B-A4B-it-ConfigI-MLX
Untested so far. Got lots on my plate.
