model speed feedback
#1
by shibuzhang - opened
I'd like to report an issue: on my Mac Studio (M3 Ultra, 512 GB), this model only achieves about 7 tokens per second, which is significantly slower than the official version's typical speed. Nevertheless, thank you for releasing this model; its quality is excellent.
I just tested on the same Mac you have, using LM Studio with the LM Studio MLX 1.3.0 runtime. The output is around 26.8 tokens per second.
Thank you for the reply. It's likely I had an issue with my local files; let me double-check the hash value of the downloaded model first.
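For anyone else who wants to run the same check, here is a minimal sketch of how the local files could be verified. It assumes the published checksums are SHA-256 hex digests that you copy by hand into the `expected` dictionary; the helper names `sha256_of_file` and `find_mismatches`, the file names, and the directory path are all hypothetical and not taken from this repository.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(model_dir, expected):
    """Compare local files against expected digests.

    `expected` maps a file name (relative to model_dir) to its expected
    SHA-256 hex digest. Returns a list of (file name, reason) tuples.
    """
    mismatches = []
    for name, want in expected.items():
        file_path = Path(model_dir) / name
        if not file_path.is_file():
            mismatches.append((name, "missing"))
        elif sha256_of_file(file_path) != want.lower():
            mismatches.append((name, "hash mismatch"))
    return mismatches


if __name__ == "__main__":
    # Hypothetical values: replace the path, file names, and digests with
    # the ones published for the model you downloaded.
    expected = {
        "model-00001-of-00002.safetensors": "<expected sha256 hex digest>",
    }
    for name, reason in find_mismatches("/path/to/model", expected):
        print(f"{name}: {reason}")
```

An empty result means every listed file exists and matches its expected digest. A mismatch or missing file would point to a corrupted or incomplete download, although it would not by itself explain a speed difference.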
shibuzhang changed discussion status to closed