Function call token ordering mismatch with Harmony format and chat template

#218
by bharathi1604 - opened

I noticed a mismatch between the function-call format emitted by the current
chat_template.jinja and the expected Harmony-style serialization.

Template-generated example:

<|start|>assistant to=functions.GET_ORDER_STATUS<|channel|>commentary json<|message|>"{"order_id": "ORD_12345"}"<|call|>

However, the Harmony reference format expects the following structure:

<|start|>assistant<|channel|>commentary to=functions.GET_ORDER_STATUS <|constrain|>json<|message|>{"order_id":"ORD_12345"}<|call|>

Key differences observed:

  1. to=functions.* appears after <|channel|>commentary in Harmony,
    but before it in the current template.
  2. The content type (json) is wrapped inside <|constrain|> in Harmony,
    whereas the template emits it as a raw token.
  3. The arguments payload is not double-JSON-encoded in Harmony.
  4. Token ordering is strict in Harmony and differs from the template output.
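To make the expected ordering concrete, here is a minimal Python sketch of a serializer that follows the Harmony-style structure described above. The helper name `render_harmony_tool_call` is hypothetical (not part of any official library); the token strings are taken from the examples in this thread.

```python
import json

def render_harmony_tool_call(recipient: str, arguments: dict) -> str:
    """Hypothetical helper: serialize a tool call in the Harmony-style
    token order described above (sketch, not an official implementation)."""
    # Arguments are JSON-encoded exactly once, not double-encoded (difference 3).
    payload = json.dumps(arguments, separators=(",", ":"))
    return (
        "<|start|>assistant"
        "<|channel|>commentary "   # channel token comes first (difference 1)
        f"to={recipient} "         # recipient follows the channel, not the role
        "<|constrain|>json"        # content type wrapped in <|constrain|> (difference 2)
        f"<|message|>{payload}"
        "<|call|>"
    )

call = render_harmony_tool_call("functions.GET_ORDER_STATUS",
                                {"order_id": "ORD_12345"})
print(call)
# Unlike the current template output, <|channel|> precedes to=functions.*:
assert call.index("<|channel|>") < call.index("to=functions.")
```

Running this reproduces the reference serialization from the second example above, which may help when writing validation checks against Harmony-compliant parsers.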

This mismatch likely explains discrepancies when validating or consuming
function-call outputs against Harmony-compliant parsers.

This observation is based directly on the official template at:
https://huggingface.co/openai/gpt-oss-20b/blob/main/chat_template.jinja

Please look into this soon, because the mismatch may affect tool calling when fine-tuning.

I have run into the same problem!

Is this the reason why function calling doesn't work properly when serving the GPT-OSS model with vLLM, even when using the 'openai' tool-calling parser?

If the tool-call format is wrong, then vLLM can't parse the tool call correctly. However, I can use vLLM with the current template and call functions correctly, even though the template differs from the official one; I think that reflects the generalization ability of the model.