MXFP4 vs other 4-bit quant algos?
#3
by dinerburger - opened
Hey noctrex. I was wondering what your thought process was for selecting MXFP4 as the tensor format. Was it primarily a speed concern?
Thanks again and keep up the great work!
Well, no, it's not about speed; MXFP4 is actually a little slower than Q4. I just like the technology. It looks like the new standard going forward for the next few years: the fact that NVIDIA supports hardware-accelerated FP4, after FP8, seems to show where things are heading. Also, being floating point instead of integer, it should retain some of the details better.
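For anyone who wants to see the floating-point vs integer point in concrete terms, here is a rough sketch, not the actual llama.cpp kernels. It assumes the MXFP4 layout from the OCP Microscaling spec (blocks of 32 elements, one shared power-of-two scale, FP4 E2M1 elements) and compares it against a simplified symmetric int4 scheme with one float scale per block. The function names are made up for illustration, the int4 version is a stand-in rather than any real Q4 format, and ties are rounded toward the smaller magnitude for simplicity.

```python
import math

# Magnitudes an FP4 E2M1 element can represent (1 sign, 2 exponent, 1 mantissa bit).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 32  # number of elements sharing one scale in MXFP4
E2M1_EMAX = 2    # exponent of the largest E2M1 value (6 = 1.5 * 2**2)


def quantize_mxfp4_block(block):
    """Quantize one block MXFP4-style; return (scale, dequantized values)."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    # Shared power-of-two scale that puts the largest element in E2M1's top binade.
    scale = 2.0 ** (math.floor(math.log2(amax)) - E2M1_EMAX)
    out = []
    for v in block:
        mag = min(abs(v) / scale, E2M1_VALUES[-1])  # clamp to the max magnitude, 6
        # Nearest representable magnitude; min() keeps the smaller value on ties.
        q = min(E2M1_VALUES, key=lambda c: abs(c - mag))
        out.append(math.copysign(q, v) * scale)
    return scale, out


def quantize_int4_block(block):
    """Simplified symmetric int4 (levels -7..7, one float scale per block).

    Illustrative stand-in only; real Q4 formats differ in their details.
    """
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    scale = amax / 7.0
    out = [max(-7, min(7, round(v / scale))) * scale for v in block]
    return scale, out


if __name__ == "__main__":
    # One outlier plus many small weights: the E2M1 grid is denser near zero,
    # so the small values survive, while the uniform int4 grid rounds them up.
    block = [6.0] + [0.5] * (BLOCK_SIZE - 1)
    print(quantize_mxfp4_block(block)[1][:2])
    print(quantize_int4_block(block)[1][:2])
```

This block is picked to flatter the floating-point grid. On weights that are spread more evenly across the range, the uniform integer grid can come out ahead, so treat it as an illustration of the idea rather than a benchmark.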
Gotcha, thanks, appreciate you taking the time! I'm sorta new to this scene, so I'm trying to hoover up all I can from more established members.
Thanks again!
dinerburger changed discussion status to closed