Model is too big

#1
by flrn-pjd - opened

Hi, something doesn't seem right with the quantization: why is this 3-bit quantized model 113 GB when the original MXFP4 model is 63 GB?
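
For reference, a rough back-of-envelope check (a sketch; it assumes ~117B parameters for the 120B model and MLX's group-wise affine quantization, which stores an fp16 scale and bias per group of 64 weights) puts a 3-bit model at roughly 51 GB, so 113 GB is clearly off:

```python
# Rough expected on-disk size for a 3-bit MLX quantization
# (assumptions: ~117B parameters, group size 64, fp16 scale + bias per group).
params = 117e9                       # assumed parameter count
bits_per_weight = 3 + 2 * 16 / 64    # 3 payload bits + scale/bias overhead = 3.5
expected_gb = params * bits_per_weight / 8 / 1e9
print(f"expected ~{expected_gb:.1f} GB")  # ~51 GB, nowhere near 113 GB
```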

Yeah, I just requantized the model and got the same size, so it seems like a bug in mlx-lm. The bug was introduced when MXFP4 support was added to gpt-oss in 344ea959ac3e6b33f188ae383a255a1bccffc013; at 8a7d6720f8471be05bf9ef016535c7b7ed3c5303, the commit just before MXFP4, the 3-bit model was 51.2 GB. Oddly, NexVeridian/gpt-oss-safeguard-120b-3bit does seem to be the correct size.
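
For anyone who wants to reproduce this, requantizing with mlx-lm looks roughly like the sketch below (the source repo and output path are assumptions, and the exact `convert` signature may differ across mlx-lm versions). Running it at the two commits above should show the size jump:

```python
# Requantize the base model to 3-bit with mlx-lm (sketch; paths are examples).
from mlx_lm import convert

convert(
    "openai/gpt-oss-safeguard-120b",       # assumed source repo for this model
    mlx_path="gpt-oss-safeguard-120b-3bit",  # example output directory
    quantize=True,
    q_bits=3,
    q_group_size=64,
)
```

The same thing should work from the command line with `mlx_lm.convert --hf-path openai/gpt-oss-safeguard-120b -q --q-bits 3`.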
