GGUF quants available — all sizes Q2_K through Q8_0 + BF16

#8 by dennny123

Hey! Just finished quantizing M2.7 to GGUF with all the standard quant levels:

  • BF16 (full precision, ~427GB)
  • Q8_0 (~243GB)
  • Q6_K (~188GB)
  • Q5_K_M (~162GB)
  • Q4_K_M (~138GB)
  • Q3_K_M (~109GB)
  • Q2_K (~83GB)

Uploading now; most should be available within the next couple of hours.

https://huggingface.co/dennny123/MiniMax-M2.7-GGUF
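For anyone deciding which file fits their hardware, the listed sizes can be turned into a rough bits-per-weight estimate. This is a sketch under an assumption: BF16 stores 2 bytes per weight, so the ~427 GB BF16 file implies roughly 213.5B weights; the thread doesn't state the model's actual parameter count, so treat the numbers as approximate.

```python
# Rough bits-per-weight (bpw) estimates from the file sizes listed above.
# Assumption: BF16 = 2 bytes/weight, so ~427 GB implies ~213.5B weights.
# The exact parameter count of MiniMax-M2.7 is not stated in this thread.

SIZES_GB = {
    "BF16": 427, "Q8_0": 243, "Q6_K": 188, "Q5_K_M": 162,
    "Q4_K_M": 138, "Q3_K_M": 109, "Q2_K": 83,
}

# Billions of weights, inferred from the BF16 file size (assumption).
PARAMS_B = SIZES_GB["BF16"] / 2

def bits_per_weight(size_gb: float) -> float:
    """Approximate average bits stored per weight for a given file size."""
    return size_gb * 8 / PARAMS_B

for name, gb in SIZES_GB.items():
    print(f"{name}: ~{bits_per_weight(gb):.2f} bpw")
```

For example, this puts Q4_K_M at roughly 5.2 bpw, which is in the usual ballpark for K-quants (the file also contains metadata and non-quantized tensors, so real bpw differs slightly).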

Just curious: who would need BF16 (427 GB) when the original release is only 230 GB (FP8)? What would the use case be?

These are plain quantizations with no imatrix, right? Then thanks, but no thanks. The MiniMax model is prone to catastrophic errors when its experts are quantized wholesale without an importance matrix, so that's a no from me.
