MTP support in model

#5
by scottgl - opened

I swear I had it working yesterday, but now Claude is telling me that there are no MTP weights in your quantization. Could you check if this is true? I'm only getting 11 t/s without MTP.
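For anyone who wants to verify this locally, here is a minimal sketch that lists candidate MTP tensors from the checkpoint's `model.safetensors.index.json`. It assumes the repository ships a sharded safetensors index and that MTP tensor names contain the substring `mtp`. That naming is an assumption, not something confirmed in this thread, and `find_mtp_tensors` is a hypothetical helper name.

```python
import json
from pathlib import Path


def find_mtp_tensors(index_path, marker="mtp"):
    """Return the sorted tensor names in a safetensors index that contain `marker`.

    Assumption: MTP tensors can be recognised by a substring in their name.
    Adjust `marker` if the checkpoint uses a different naming scheme.
    """
    index = json.loads(Path(index_path).read_text())
    # A sharded safetensors index maps each tensor name to the shard file holding it.
    weight_map = index.get("weight_map", {})
    return sorted(name for name in weight_map if marker in name.lower())
```

Calling `find_mtp_tensors("model.safetensors.index.json")` on a downloaded copy returns an empty list if no tensor name matches, which would be consistent with the MTP weights having been dropped during quantization.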

Same issue. MTP fails in vLLM.

Owner

Hello, this has been brought to my attention. I am on holiday but will be home in about 12 hours. After that I will make a patch and upload a new revision with MTP included.

I already created a version with MTP weights, which I might repost as a fork of your model, if you don't mind. Upload in progress:
https://huggingface.co/scottgl/Qwen3.5-122B-A10B-MTP-NVFP4

Owner

MTP is now active. I have also updated the example script in the PR to include MTP weights:
https://github.com/vllm-project/llm-compressor/pull/2383

There are pending PRs in vLLM to fix the MTP tool calling parsers. MTP works with plain chat requests, but in my testing it breaks with tool calling. I am waiting for those PRs to be merged into nightly.
