Function call token ordering mismatch with Harmony format and chat template
I noticed a mismatch between the function-call format emitted by the current
chat_template.jinja and the expected Harmony-style serialization.
Template-generated example:
<|start|>assistant to=functions.GET_ORDER_STATUS<|channel|>commentary json<|message|>"{"order_id": "ORD_12345"}"<|call|>
However, the Harmony reference format expects the following structure:
<|start|>assistant<|channel|>commentary to=functions.GET_ORDER_STATUS <|constrain|>json<|message|>{"order_id":"ORD_12345"}<|call|>
Key differences observed:
- `to=functions.*` appears after `<|channel|>commentary` in Harmony, but before it in the current template.
- The content type (`json`) is wrapped inside `<|constrain|>` in Harmony, whereas the template emits it as a raw token.
- The arguments payload is not double-JSON-encoded in Harmony.
- Token ordering is strict in Harmony and differs from the template output.
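The differences above can be made concrete with a small sketch that renders both serializations for the example call. The token names are taken from this issue; the two render helpers are hypothetical illustrations, not part of any official library.

```python
import json

def render_template_style(name: str, args: dict) -> str:
    # Current chat_template.jinja ordering: recipient before <|channel|>,
    # raw "json" token, and the arguments JSON wrapped in extra quotes.
    payload = f'"{json.dumps(args)}"'  # extra quoting, as in the issue example
    return (f"<|start|>assistant to=functions.{name}"
            f"<|channel|>commentary json<|message|>{payload}<|call|>")

def render_harmony_style(name: str, args: dict) -> str:
    # Harmony reference ordering: recipient after <|channel|>commentary,
    # content type wrapped in <|constrain|>, single-encoded arguments.
    payload = json.dumps(args, separators=(",", ":"))
    return (f"<|start|>assistant<|channel|>commentary to=functions.{name}"
            f" <|constrain|>json<|message|>{payload}<|call|>")

args = {"order_id": "ORD_12345"}
print(render_template_style("GET_ORDER_STATUS", args))
print(render_harmony_style("GET_ORDER_STATUS", args))
```

Running this reproduces the two strings above and makes the ordering mismatch easy to assert in a test: in the template output `to=functions.*` precedes `<|channel|>`, while in Harmony it follows it.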
This mismatch likely explains discrepancies when validating or consuming
function-call outputs against Harmony-compliant parsers.
This observation is based directly on the official template at:
https://huggingface.co/openai/gpt-oss-20b/blob/main/chat_template.jinja
Please look into this soon, since this may affect tool calling during fine-tuning.
I have met the same problem!
Is this the reason why function calling doesn't work properly when serving the GPT-OSS model with vLLM, even when using the 'openai' tool-calling parser?
If the tool-call format is wrong, then vLLM can't parse the tool call correctly. However, I can use vLLM with the current template and call functions correctly, even though the template differs from the official format; I think this is due to the model's ability to generalize.