GGUF quants available — all sizes Q2_K through Q8_0 + BF16

#8 by dennny123

Hey! Just finished quantizing M2.7 to GGUF with all the standard quant levels:

  • BF16 (full precision, ~427GB)
  • Q8_0 (~243GB)
  • Q6_K (~188GB)
  • Q5_K_M (~162GB)
  • Q4_K_M (~138GB)
  • Q3_K_M (~109GB)
  • Q2_K (~83GB)

Uploading now; most should be available within the next couple of hours.

https://huggingface.co/dennny123/MiniMax-M2.7-GGUF
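For anyone deciding which file fits their hardware, the listed sizes can be turned into a rough bits-per-weight estimate. This is a sketch under an assumption: BF16 stores 2 bytes per weight, so the ~427 GB BF16 file implies roughly 213.5B weights; the thread doesn't state the model's actual parameter count, so treat the numbers as approximate.

```python
# Rough bits-per-weight (bpw) estimates from the file sizes listed above.
# Assumption: BF16 = 2 bytes/weight, so ~427 GB implies ~213.5B weights.
# The exact parameter count of MiniMax-M2.7 is not stated in this thread.

SIZES_GB = {
    "BF16": 427, "Q8_0": 243, "Q6_K": 188, "Q5_K_M": 162,
    "Q4_K_M": 138, "Q3_K_M": 109, "Q2_K": 83,
}

# Billions of weights, inferred from the BF16 file size (assumption).
PARAMS_B = SIZES_GB["BF16"] / 2

def bits_per_weight(size_gb: float) -> float:
    """Approximate average bits stored per weight for a given file size."""
    return size_gb * 8 / PARAMS_B

for name, gb in SIZES_GB.items():
    print(f"{name}: ~{bits_per_weight(gb):.2f} bpw")
```

For example, this puts Q4_K_M at roughly 5.2 bpw, which is in the usual ballpark for K-quants (the file also contains metadata and non-quantized tensors, so real bpw differs slightly).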

Just curious: who would need BF16 (427 GB) when the original release is only 230 GB (FP8)? What would the use case be?

These are plain quantizations with no imatrix, right? Then thanks, but no thanks. The MiniMax model is prone to catastrophic errors when its experts are quantized wholesale without an importance matrix, so that's a no from me.
