Weight size and VRAM usage much higher than the original model

#2 · opened by SteveImmanuel

Hi, your model seems to do better at not refusing requests. I am curious, however, why your final weights are much bigger than the original gpt-oss-120b. Is it because they are unquantized, or does your approach actually require storing more weights? Thanks
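For anyone wondering about the size gap: some back-of-the-envelope arithmetic, assuming a round ~120B parameters (the real count differs slightly) and the MXFP4 layout from the OCP Microscaling spec (4-bit elements plus one 8-bit shared scale per 32-element block):

```python
N = 120e9  # assumed parameter count for a ~120B model (round number, not exact)

# Unquantized bf16: 2 bytes per weight
bf16_gb = N * 2 / 1e9

# MXFP4: 0.5 bytes per weight + one 1-byte (E8M0) scale per 32-element block
mxfp4_gb = (N * 0.5 + N / 32) / 1e9

print(f"bf16:  {bf16_gb:.0f} GB")   # → bf16:  240 GB
print(f"mxfp4: {mxfp4_gb:.1f} GB")  # → mxfp4: 63.8 GB
print(f"ratio: {bf16_gb / mxfp4_gb:.2f}x")  # → ratio: 3.76x
```

So a finetune saved in bf16 is expected to be roughly 4x the size of the original mxfp4 checkpoint even with identical weights; the extra size doesn't by itself imply more parameters.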

If you quantize it to mxfp4 yourself, you shouldn't have any problems with that.

Has anyone got an mxfp4 version of this model? Not GGUF, I mean.
