NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4

#1
by Maverobot - opened

Is it possible to convert this native 4-bit version to MLX format? I think it would perform better than a quantized copy of the full-size model.
