nvidia/Nemotron-Cascade-2-30B-A3B · Add documentation on how to use with vLLM to README.md

by stelterlab - opened 30 days ago

Please add the necessary arguments for running this model with vLLM and SGLang. They seem to be:

    --tool-call-parser qwen3_coder \
    --enable-auto-tool-choice \
    --reasoning-parser qwen3 \
    --trust-remote-code

for vLLM.

Hi @stelterlab, thanks for pointing this out. We will add the arguments for vLLM and SGLang serving.

One correction: the reasoning parser should be deepseek_r1 or nemotron_v3, not qwen3.
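
Putting the flags from this thread together, a vLLM launch command might look like the sketch below. It is not a confirmed recipe. It assumes the standard `vllm serve` entry point and takes the model ID from the page title. The flags are the ones quoted above, with the reasoning parser changed as suggested in the correction. SGLang arguments are not given in this thread, so they are left out.

```shell
# Sketch only: flags as quoted in this thread, reasoning parser per the correction.
# Assumes a vLLM build that recognizes these parser names.
vllm serve nvidia/Nemotron-Cascade-2-30B-A3B \
    --tool-call-parser qwen3_coder \
    --enable-auto-tool-choice \
    --reasoning-parser deepseek_r1 \
    --trust-remote-code
```

To try the other parser named in the correction, replace `deepseek_r1` with `nemotron_v3`, provided the installed vLLM version supports it.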
