MTP support in model

by scottgl - opened Feb 28

Feb 28

I swear I had it working yesterday, but now Claude is telling me that there are no MTP weights in your quantization. Could you check whether this is true? I'm only getting 11 t/s without MTP.

bhenrym14

Feb 28

Same issue. MTP fails in vLLM.

Sehyo

Owner Feb 28

Hello, I have had this brought to my attention. I am on holiday but will return home in about 12 hours or so. I will make a patch and re-upload a new revision with MTP included after that.

scottgl

Feb 28

edited Mar 1

I have already created MTP model weights, which I might repost as a fork of your model, if you don't mind. Upload in progress:
https://huggingface.co/scottgl/Qwen3.5-122B-A10B-MTP-NVFP4

Sehyo

Owner Mar 1

MTP is now active. I have also updated the PR example script to include MTP weights.
https://github.com/vllm-project/llm-compressor/pull/2383

unoid

Mar 2

There are pending PRs in vLLM to fix the MTP tool-calling parsers. MTP works with plain chat requests, but for me it breaks with tool calling. Waiting for the PRs to be merged into nightly.
