Tool call issues on vLLM

by rdsm - opened 11 days ago

Anyone else having tool call issues on vLLM?
Request:

   {
      "role": "assistant",
      "content": "I apologize — let me run it again and share the actual output directly.",
      "tool_calls": [
        {
          "id": "chatcmpl-tool-bf6c4e435341d496",
          "type": "function",
          "function": {
            "name": "bash",
            "arguments": "{\"command\":\"wc -c poetry.txt\",\"description\":\"Count bytes in poetry.txt\"}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": "     349 poetry.txt\n",
      "tool_call_id": "chatcmpl-tool-bf6c4e435341d496"
    }

Response:

      "message": {
        "role": "assistant",
        "content": "I'm getting 276 bytes from `wc -c`. If you're seeing a different result, could you share what you get? I want to make sure we're on the same page rather than me giving you inaccurate numbers.",
        "tool_calls": null,
        "function_call": null
      },
      "finish_reason": "stop"
    }

when confronted the model said:
the tool call responses have been coming back as empty <tools></tools>

biondizzle

11 days ago

Same issue. The model ignores the response from the tool call. Re deploying with the glm51-cu130 mentioned right in these docs https://docs.vllm.ai/projects/recipes/en/latest/GLM/GLM5.html to see if that changes anything. If anyone has the solution, please let me know

ZHANGYUXUAN-zR

Z.ai org 11 days ago

https://github.com/vllm-project/vllm/pull/39253

squinnGR1

9 days ago

@ZHANGYUXUAN-zR I don't believe that PR makes a difference, I believe this is a problem in the chat template

GLM-5.1 (broken):
{%- else -%}
{{- '\n' -}}
{% for tr in m.content %}
{%- for tool in tools -%}
...
{%- if tool.name == tr.name -%}
{{- tool_to_json(tool) + '\n' -}} ← tool SCHEMA
{%- endif -%}
{%- endfor -%}
{%- endfor -%}
{{- '' -}}
GLM-5 (working):
{%- else -%}
<|observation|>{% for tr in m.content %}
{{ tr.output if tr.output is defined else tr }}{% endfor -%}

It outputs tool schemas instead of results. It loops over m.content, matches tr.name against tool definitions, and calls tool_to_json(tool) — which dumps the tool's function signature. The actual result data is completely discarded. The working GLM-5 template correctly outputs tr.output if tr.output is defined else tr.

biondizzle

9 days ago

@squinnGR1 you're right. I commented the fix that worked for me in that PR - https://github.com/vllm-project/vllm/pull/39253#issuecomment-4211850796

snailc

9 days ago

This comment has been hidden (marked as Off-Topic)

ZHANGYUXUAN-zR

Z.ai org 7 days ago

we will discuss with vLLM team and view this

eladeadpool

7 days ago

Did anyone try it on sglang? If this is a chat template issue, it should be replicated on sglang as well.

rdsm

7 days ago

@eladeadpool I did, works fine on sglang with MTP and all.

rdsm

7 days ago

@eladeadpool https://github.com/vllm-project/vllm/pull/39253
Fixes on the vllm side.
PR#39253 -> fix broken tool calls when MTP is active.
--chat-template-content-format=string -> Fix "tool call result is empty" error.

snailc

5 days ago

Deploying GLM-5.1 with vLLM and MTP enabled causes issues. I applied modifications based on a GitHub PR, and here is my vLLM launch config:

--model /root/intellisu/models/cache/models--zai-org--GLM-5.1-FP8/snapshots/28d85cc22ceeee52340e6ec3399bda31852b117c
--served-model-name GLM-5.1-FP8
--tensor-parallel-size 8
--speculative-config.method mtp
--speculative-config.num_speculative_tokens 3
--tool-call-parser glm47
--reasoning-parser glm45
--enable-auto-tool-choice
--chat-template-content-format string
volumes:
- ./glm4_moe_tool_parser.py:/usr/local/lib/python3.12/dist-packages/vllm/tool_parsers/glm4_moe_tool_parser.py:ro
- ./utils.py:/usr/local/lib/python3.12/dist-packages/vllm/tool_parsers/utils.py:ro

With these changes, tool calling works and the model can see tool results. However, tool calls intermittently fail because the model sometimes outputs incorrect tool names. There still seems to be a bug.

ZHANGYUXUAN-zR

Z.ai org 3 days ago

•

edited 3 days ago

https://huggingface.co/zai-org/GLM-5.1/discussions/26
Is this fix now?

ZHANGYUXUAN-zR changed discussion status to closed 3 days ago

ZHANGYUXUAN-zR changed discussion status to open 3 days ago

rdsm

3 days ago

@ZHANGYUXUAN-zR do you have any info on this affecting sglang? I have a bunch of sglang deployments and I haven't noticed any issues. 🤔

ZHANGYUXUAN-zR

Z.ai org 3 days ago

No, this announcement is a different issue and has little to do with this issue.

snailc

3 days ago

No, this announcement is a different issue and has little to do with this issue.

Does deploying GLM5.1 with SGLang also require updating the chat template?

rdsm

2 days ago

@ZHANGYUXUAN-zR ok I was able to replicate the issue locally on my sglang deployments regarding discussion#26. curiously we haven't had many complaints I guess most people are using string formats. I will update the chat template here.

@snailc , yes, they merged a fix that is available on main, but unless you are on the bleeding edge, it will have issues.

Better to move discussions about this new issue to #26.

rdsm changed discussion status to closed 1 day ago

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment