model speed feedback

by shibuzhang - opened Mar 8

Mar 8

I'd like to report an issue: on my Mac Studio (M3 Ultra, 512 GB), this model only achieves about 7 tokens per second, which is significantly slower than the official version's typical speed. Nevertheless, thank you for releasing this model; its quality is excellent.

cs2764

Owner Mar 8

I just tested on the same Mac you have, using LM Studio with the LM Studio MLX 1.3.0 runtime. The output is around 26.8 tokens per second.

shibuzhang

Mar 8

Thank you for the reply. It's likely I had an issue with my local files; let me double-check the hash value of the downloaded model first.
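For anyone else wanting to run the same check, here is a minimal sketch of hashing the downloaded files with Python's standard library. The helper names `sha256_of_file` and `hash_model_dir` and the example directory path are hypothetical, and it assumes the expected SHA-256 values can be read from the model's file listing.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks so that
    multi-gigabyte weight files are never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_model_dir(model_dir):
    """Map each file under model_dir (relative path) to its SHA-256 digest."""
    root = Path(model_dir)
    return {
        str(p.relative_to(root)): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python check_hashes.py /path/to/downloaded/model
    if len(sys.argv) > 1:
        for name, digest in hash_model_dir(sys.argv[1]).items():
            print(f"{digest}  {name}")
```

Any digest that differs from the published value points to a corrupted or incomplete download, in which case re-downloading that file is the first thing to try.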

shibuzhang changed discussion status to closed Mar 8
