Doesn't work for me

#1
by x-polyglot-x - opened

I get repeated errors in MLX Studio. I downloaded the model from within that software, and it seems to load but never responds to the chat messages.

Here's a common error:

```
Error invoking remote method 'chat:sendMessage': Error: Failed to send message: API error: 404 - {"error": "Expected shape (248320, 256) but received shape (248320, 1024) for parameter language_model.model.embed_tokens.weight"}
```
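For context on that shape mismatch, one plausible reading (an assumption on my part, not confirmed from MLX Studio's code) is that the runtime expected a quantized embedding matrix but received an unquantized one. MLX-style quantization packs `32 // bits` low-bit values into each uint32 column, so at 8 bits a (248320, 1024) weight is stored as (248320, 256): exactly the two shapes in the error. A minimal sketch of the packing arithmetic:

```python
import numpy as np

# Hypothetical sketch (not MLX Studio's actual loader): with 8-bit
# quantization, four 8-bit values fit in each uint32 word, so the
# packed column count is hidden_size / 4.
vocab, hidden, bits = 248320, 1024, 8
vals_per_word = 32 // bits            # 4 values per uint32
packed_cols = hidden // vals_per_word  # 256, the "expected" shape's columns
print((vocab, packed_cols))

# Tiny concrete demo of the packing itself: eight 8-bit values -> two uint32s.
w = np.arange(8, dtype=np.uint32)
packed = w[0::4] | (w[1::4] << 8) | (w[2::4] << 16) | (w[3::4] << 24)
print(w.size, "values packed into", packed.size, "uint32 words")
```

If that reading is right, it would explain why a mismatch between the file's quantization and what the runtime expects shows up as a 4x difference in column count.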

The methodology seems interesting and I'd like to try it.

Hey, this is a big worry. Can you send me the logs from the top-right corner?

Can you please make sure you have the latest version of MLX Studio? This was an issue 2-3 patches ago.

Hi,

I'll see if I can help. I enabled "verbose" mode, but I'm not seeing a "top-right corner" for logs (only things there are light / dark mode, and a ? button that takes me to the "version" of MLX Studio).

I'm on Version 1.3.6 (just downloaded it today and the model).

Hope this helps! I like the layout and potential features, so I'm excited to get it working. No rush!

I’m using it on a Mac Studio now and I can’t reproduce it, which is why I’m worried. Can you do a few things for me?

1.) Try turning off paged caching
2.) Turn off KV cache quantization
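For readers wondering what the second setting does (general background on the technique, not MLX Studio's specific implementation): KV cache quantization stores the attention keys and values in low precision, e.g. 4-bit integers with a per-group scale and offset, shrinking cache memory at the cost of rounding error. A rough sketch of the idea, with all names hypothetical:

```python
import numpy as np

# Hypothetical sketch of per-group 4-bit quantization as used for KV caches.
# Each group of 32 values shares a scale and offset; values round to 0..15.
def quantize(x, bits=4, group=32):
    q_max = (1 << bits) - 1
    g = x.reshape(-1, group)
    lo = g.min(axis=1, keepdims=True)
    hi = g.max(axis=1, keepdims=True)
    scale = (hi - lo) / q_max
    scale[scale == 0] = 1.0                      # avoid divide-by-zero on flat groups
    q = np.round((g - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo, shape):
    return (q * scale + lo).reshape(shape)

rng = np.random.default_rng(0)
k = rng.standard_normal((8, 64)).astype(np.float32)  # stand-in for cached keys
q, s, lo = quantize(k)
k_hat = dequantize(q, s, lo, k.shape)
err = np.abs(k - k_hat).max()                        # rounding error from 4-bit storage
```

The memory saving (4 bits vs. 16 per value) is why it's attractive on tight-fit machines, but the extra rounding is also why disabling it is a reasonable first debugging step.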

If you go to the “server” tab and click on the started model, there is a “logs” button in the top-right corner which should show the logs.

OK, good news: I followed your advice and made those two changes.

Now, it is working great! :)

Thanks for such a quick response. Hovering around 25 tokens/sec on an M4 Max with 128 GB of memory. It's a tight fit, but it's working now! Now comes the tweaking and testing phase...

Cheers!

I’ll look into this, and I appreciate your words - this kind of use case, for people with 128 GB who are tired of disk streaming, is exactly the audience I was aiming for.

dealignai changed discussion status to closed
