Is it possible to convert this native 4-bit version to MLX format? I think it would perform better than a quantized copy of the full-size model.