MiMo V2 Flash IQ5_K
#1
by geveent - opened
Hi John! Happy New Year!
Thank you for making MiMo V2 Flash available to us. I just wanted to check in.
With an Intel Xeon system using 8-channel memory and a single RTX 5090, I was able to reach 28 t/s for token generation (TG).
Wow, very nice! It does seem like a rather fast model, but in my limited experience it had issues with tool calling in my pydantic-ai agentic setup. A ~2bpw quant of glm-4.7 works fine with the same test setup, so I'm not sure what is going on.
I'd be interested to hear how the model is working for you, as I'd love the speed if it were more reliable in my testing.
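For anyone who wants to compare notes on the tool calling issue, here is a minimal sketch of a check that could be run on a raw response before it reaches the agent framework. It assumes the server returns OpenAI-style chat completion JSON (`choices[0].message.tool_calls`, with `function.arguments` as a JSON string), which this thread does not confirm. The `check_tool_calls` helper and the `get_weather` tool name are hypothetical and only for illustration.

```python
import json


def check_tool_calls(response: dict, known_tools: set[str]) -> list[str]:
    """Return a list of problems found in an OpenAI-style chat completion.

    Assumption: the response follows the OpenAI chat completions layout.
    An empty list means every tool call in the first choice looks well formed.
    """
    choices = response.get("choices") or []
    if not choices:
        return ["response has no choices"]

    message = choices[0].get("message") or {}
    tool_calls = message.get("tool_calls") or []
    if not tool_calls:
        return ["no tool_calls in the first choice"]

    problems = []
    for i, call in enumerate(tool_calls):
        fn = call.get("function") or {}
        name = fn.get("name")
        if name not in known_tools:
            problems.append(f"call {i}: unknown tool name {name!r}")
        try:
            # Arguments are expected to be a JSON-encoded string.
            args = json.loads(fn.get("arguments") or "")
        except (json.JSONDecodeError, TypeError):
            problems.append(f"call {i}: arguments are not valid JSON")
            continue
        if not isinstance(args, dict):
            problems.append(f"call {i}: arguments are not a JSON object")
    return problems


# Hypothetical example: a well-formed call and a malformed one.
good = {"choices": [{"message": {"tool_calls": [
    {"function": {"name": "get_weather", "arguments": '{"city": "Oslo"}'}}
]}}]}
bad = {"choices": [{"message": {"tool_calls": [
    {"function": {"name": "get_weather", "arguments": '{"city": "Oslo"'}}
]}}]}

print(check_tool_calls(good, {"get_weather"}))
print(check_tool_calls(bad, {"get_weather"}))
```

Running the same prompts through both quants and logging what this reports might help tell whether the model emits malformed calls or whether the problem sits elsewhere in the stack (chat template, parser, or agent framework).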