Model is too big

#1
by flrn-pjd - opened

Hi, something doesn't seem right with the quantization: why is this 3-bit quantized model 113 GB when the original MXFP4 model is 63 GB?
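
For reference, a rough back-of-envelope check (a sketch; it assumes ~117B parameters for the 120B model and MLX's group-wise affine quantization, which stores an fp16 scale and bias per group of 64 weights) puts a 3-bit model at roughly 51 GB, so 113 GB is clearly off:

```python
# Rough expected on-disk size for a 3-bit MLX quantization
# (assumptions: ~117B parameters, group size 64, fp16 scale + bias per group).
params = 117e9                       # assumed parameter count
bits_per_weight = 3 + 2 * 16 / 64    # 3 payload bits + scale/bias overhead = 3.5
expected_gb = params * bits_per_weight / 8 / 1e9
print(f"expected ~{expected_gb:.1f} GB")  # ~51 GB, nowhere near 113 GB
```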

Yeah, I just requantized the model and got the same size, so it seems like a bug in mlx-lm. The bug was introduced when MXFP4 support was added to gpt-oss in 344ea959ac3e6b33f188ae383a255a1bccffc013; at 8a7d6720f8471be05bf9ef016535c7b7ed3c5303, the commit just before MXFP4, the 3-bit model was 51.2 GB. Oddly, NexVeridian/gpt-oss-safeguard-120b-3bit does seem to be the correct size.
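
For anyone who wants to reproduce this, requantizing with mlx-lm looks roughly like the sketch below (the source repo and output path are assumptions, and the exact `convert` signature may differ across mlx-lm versions). Running it at the two commits above should show the size jump:

```python
# Requantize the base model to 3-bit with mlx-lm (sketch; paths are examples).
from mlx_lm import convert

convert(
    "openai/gpt-oss-safeguard-120b",       # assumed source repo for this model
    mlx_path="gpt-oss-safeguard-120b-3bit",  # example output directory
    quantize=True,
    q_bits=3,
    q_group_size=64,
)
```

The same thing should work from the command line with `mlx_lm.convert --hf-path openai/gpt-oss-safeguard-120b -q --q-bits 3`.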
