Fixes tool calling bleeding into <think> blocks
Please do a sanity check on this file before merging. It seems to work well for me but I've been wrong before.
While I'm trying to figure out how to edit a PR, this is the diff that needs to be applied on top of it:
diff --git a/qwen3-think.jinja b/qwen3-think.jinja
index debbb44..693ac65 100644
--- a/qwen3-think.jinja
+++ b/qwen3-think.jinja
@@ -73,8 +73,6 @@
{{- '<|im_start|>' + message.role }}
{%- if message.reasoning_content is defined and message.reasoning_content %}
{{- '\n<think>\n' + message.reasoning_content + '\n</think>\n' }}
- {%- else %}
- {{- '\n<think>\n</think>\n' }}
{%- endif %}
{%- if message.content is defined and message.content is string and message.content | trim | length > 0 %}
{{- '\n' + (message.content | trim) + '\n' }}
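To make the effect of this hunk easier to eyeball, here is a rough Python sketch that mimics only the lines shown in the diff. It is not the template itself: `render_assistant_head` and the `emit_empty_think` flag are hypothetical names made up for this illustration, and everything outside the hunk (tool calls, `<|im_end|>`, the generation prompt) is left out on purpose.

```python
def render_assistant_head(message: dict, emit_empty_think: bool) -> str:
    """Mimic the start of a message as rendered by the hunk above.

    emit_empty_think=True  -> behaviour before the diff (the else branch exists)
    emit_empty_think=False -> behaviour after the diff (the else branch removed)
    Hypothetical helper for illustration only, not part of the template.
    """
    out = "<|im_start|>" + message["role"]

    reasoning = message.get("reasoning_content")
    if reasoning:
        # Real reasoning is always wrapped in a think block.
        out += "\n<think>\n" + reasoning + "\n</think>\n"
    elif emit_empty_think:
        # The two removed lines: an empty think block for every message
        # that carries no reasoning_content.
        out += "\n<think>\n</think>\n"

    content = message.get("content")
    # Same guard as the template: content must be a non-blank string.
    if isinstance(content, str) and content.strip():
        out += "\n" + content.strip() + "\n"
    return out


msg = {"role": "assistant", "content": "Hi"}
before = render_assistant_head(msg, emit_empty_think=True)
after = render_assistant_head(msg, emit_empty_think=False)
print(repr(before))
print(repr(after))
```

With the else branch gone, a message without reasoning_content no longer gets an empty think block in front of its content.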
Should be ready to review now 🤞.
Will review in a bit. I'm nowhere near my computer for now, in bed with a severe fever 😥. Will review and merge ASAP.
No rush, people can fetch it from here if they urgently need it. Get well soon! 💉
Hi, would it be possible to do Opus-distilled versions of Qwen3 Next 80B Instruct and Thinking?
Please Qwopus3.6 36B A3B.
Merged! Thanks @codyknowscode for the fix — the empty <think></think> removal and content guard are solid.
I applied one small follow-up fix on top: the last-item handler for tool responses had been changed to auto-append <|im_start|>assistant\n<think>\n, which caused a double generation prompt when add_generation_prompt=True. I reverted that specific line back to just <|im_end|>\n, since the add_generation_prompt block at the bottom already handles it correctly.
Tested against 7 scenarios (simple chat, thinking, tool calls with/without reasoning, parallel tool calls, empty reasoning_content, full tool cycle with generation prompt). All clean.
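For anyone who wants to check the double generation prompt themselves, a small helper along these lines might be useful. It is only a sketch that inspects an already rendered prompt string; it does not render the template. The function name and the hand-written sample strings (including the tool response wrapper) are placeholders assumed for illustration, not output captured from the actual template.

```python
# Generation prompt as quoted in this thread.
GEN_PROMPT = "<|im_start|>assistant\n<think>\n"


def trailing_generation_prompts(rendered: str, prompt: str = GEN_PROMPT) -> int:
    """Count how many consecutive copies of `prompt` end `rendered`.

    0 -> the model is not put into thinking mode at all
    1 -> what add_generation_prompt=True is expected to produce
    2 -> the doubled prompt described above
    """
    count = 0
    while prompt and rendered.endswith(prompt):
        rendered = rendered[: -len(prompt)]
        count += 1
    return count


# Hand-written placeholder for the end of a rendered tool response turn.
tool_turn = "<|im_start|>user\n<tool_response>\n42\n</tool_response><|im_end|>\n"

good = tool_turn + GEN_PROMPT                  # one generation prompt
doubled = tool_turn + GEN_PROMPT + GEN_PROMPT  # handler and bottom block both fire
missing = tool_turn                            # neither fires

for name, text in [("good", good), ("doubled", doubled), ("missing", missing)]:
    print(name, trailing_generation_prompts(text))
```

Running it over the real rendered output with add_generation_prompt set to both True and False should show whether the tool response handler, the bottom block, or both are emitting the prompt.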
Amazing! Thanks
@samuelcardillo actually, reverting that makes it fail far too often at performing tool calls from inside the think block. I'm getting a lot of these, especially at larger context sizes:
Will try out a few things and open a PR when I figure out how to do it properly.
@samuelcardillo do you have a script I can run against the template to reproduce your findings? I can't seem to figure it out: without <|im_start|>assistant\n<think>\n there, it never actually enters thinking after a tool call (I'm running this in OpenCode).
