Searching for a new Tool Parser
#15
by LucasMM14 - opened
Does anyone know if there's another Tool Parser for this model?
It's recommended to use "--tool-call-parser qwen3_coder", but I'm unable to use it at this moment.
Why are you unable to use it?
this is the command that I use and it works perfectly...
vllm serve /media/data/models/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 \
--async-scheduling \
--served-model-name jarvis-thinker \
--dtype auto \
--kv-cache-dtype fp8 \
--tensor-parallel-size 2 \
--swap-space 0 \
--trust-remote-code \
--gpu-memory-utilization 0.75 \
--max-model-len 262144 \
--enable-chunked-prefill \
--max-num-seqs 512 \
--host 0.0.0.0 \
--port 10002 \
--enable-auto-tool-choice \
--tool-call-parser qwen3_coder \
--reasoning-parser nemotron_v3
--enforce-eager \
--max-cudagraph-capture-size 128 \
--enable-chunked-prefill \
--mamba-ssm-cache-dtype float16
Hi mtcl,
There's a new policy prohibiting the use of sovereign models (and complements) in generative AI (GenAI)
I don't think I understand that comment ...