Request: MXFP4 GGUF Quantization for Blackwell GPUs

#2
by Z-Fire821 - opened

Hi OpenMOSE team,

Thanks for this amazing Qwen3-VL-REAP model!

I was wondering if there are any plans to provide a GGUF version with MXFP4 (Microscaling Float 4) quantization?

Since many of us are now running Blackwell-based GPUs, MXFP4 could significantly improve vision-token processing speed and overall prompt-eval performance on these cards.

It would be awesome if you (or anyone in the community, such as noctrex) could look into this!

Best regards,
Johannes

Hi!
Sorry, I don't have a Blackwell GPU, so I haven't been able to test MXFP4 quantization. I'll try it out once I have access to one.
