
Chat Template

A fixed Jinja chat template for Qwen 3.5 in LM Studio. The official template crashes on tool calls, doesn't support the "developer" role, and has no way to toggle thinking. This one handles all of that.

What's broken in the official template

  1. Tool calls crash. The official template uses the Jinja `|items` dictionary filter and `|safe`, neither of which exists in LM Studio's C++ Jinja runtime. Any tool call triggers an out-of-bounds error.
  2. No "developer" role. Modern APIs sometimes send message.role == "developer". The official template raises an exception and dies.
  3. Empty tool outputs crash. LM Studio mishandles empty JSON blocks in certain mappings, causing Jinja to bail out.
  4. Thinking + tool calls = infinite loop. LM Studio's internal parser sees tool call syntax inside the thinking block and truncates the output in a loop (#827, #1592, #453).

What this template does

LM Studio-compatible tool arguments

Replaced |items iteration with direct dictionary key lookups. Swapped is sequence for is iterable (which LM Studio actually supports). Added length assertions to catch empty payloads before they crash.
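The shape of the fix can be sketched in plain Python (the helper name and message layout here are illustrative, not taken from the template): read known keys directly instead of iterating, serialize with a JSON dump instead of `|safe`, and guard against empty payloads.

```python
import json

def render_tool_call(call: dict) -> str:
    # Read known keys directly instead of iterating with the |items filter,
    # and serialize with json.dumps rather than relying on |safe.
    fn = call["function"]
    args = fn.get("arguments") or {}
    if isinstance(args, str):  # some runtimes pass arguments pre-serialized
        args = json.loads(args)
    payload = {"name": fn["name"], "arguments": args}
    assert len(payload["name"]) > 0  # catch empty payloads before emitting
    return "<tool_call>\n" + json.dumps(payload) + "\n</tool_call>"
```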

"developer" role support

Intercepts "developer" messages and maps them to "system" internally. No crash, no data loss.
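The mapping is trivial; a minimal Python equivalent of what the template does (function name is illustrative):

```python
def normalize_role(message: dict) -> dict:
    # Map the OpenAI-style "developer" role onto "system";
    # the message content is preserved untouched.
    if message.get("role") == "developer":
        return {**message, "role": "system"}
    return message
```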

Thinking toggle from any message

Drop <|think_on|> or <|think_off|> anywhere in a system or user prompt. The template detects the tag, strips it from context so the model never sees it, and flips the thinking mode.

System: You are a coding assistant. <|think_off|>
User: Check the weather in Paris.

The tag disappears. The model answers fast, no internal reasoning.

System: You are a coding assistant. <|think_on|>
User: Implement a red-black tree in Rust.

The model thinks first, then answers.

This matters because LM Studio's parser can't handle tool calls inside thinking blocks. When using tools, always add <|think_off|> to your prompt.
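The detect-and-strip logic amounts to the following sketch in Python (the function and regex are illustrative; the template implements this in Jinja):

```python
import re

TOGGLE = re.compile(r"<\|think_(on|off)\|>")

def extract_think_toggle(text: str, default: bool = True):
    # Find the last toggle tag, strip every occurrence so the model
    # never sees it, and return (cleaned_text, thinking_enabled).
    matches = TOGGLE.findall(text)
    enabled = (matches[-1] == "on") if matches else default
    return TOGGLE.sub("", text).strip(), enabled
```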

Empty thinking blocks in history

The official template wraps every past assistant turn in think tags, even when there's no reasoning content. In multi-turn conversations those empty blocks accumulate, waste tokens, and break prefix caching. This template checks for reasoning_content before emitting the tags. One-line fix, significant impact on cache stability at long context. (Qwen3.5-35B-A3B#68)
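In Python terms, the conditional looks like this (a sketch of the template's behavior, not its literal Jinja):

```python
def render_assistant_turn(message: dict) -> str:
    # Only wrap past reasoning in think tags when there is actual content;
    # empty blocks waste tokens and invalidate the prefix cache.
    reasoning = (message.get("reasoning_content") or "").strip()
    body = message.get("content", "")
    if reasoning:
        return f"<think>\n{reasoning}\n</think>\n{body}"
    return body
```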

No false positives

Earlier community templates stripped the string /think from all text, which broke legitimate paths like cd /mnt/project/think. The <|think_on|> / <|think_off|> syntax is delimited and specific enough that it will not appear naturally in code or conversation.
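The difference is easy to demonstrate (illustrative Python, not part of the template):

```python
# Naive substring stripping corrupts legitimate text:
path = "cd /mnt/project/think"
stripped = path.replace("/think", "")  # the old community-template bug

# The delimited tags cannot collide with ordinary paths or prose:
collides = "<|think_off|>" in path
```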

Comparison

| Feature | Official | LuffyTheFox | mod-ellary | Pneuny | This |
| --- | --- | --- | --- | --- | --- |
| Tool arguments work | Crashes | Fixed | Missing | Fixed | Fixed |
| `\|safe` removed | Crashes | Fixed | Missing | Fixed | Fixed |
| "developer" role | Missing | Missing | Missing | Missing | Added |
| Thinking toggle | None | None | /think (system only) | None | `<\|think_off\|>` anywhere |
| Empty think in history | Broken | Broken | Tags omitted | Broken | Fixed |
| Text safety | N/A | N/A | Breaks on /think in paths | N/A | Safe |
| Clean instructions | Yes | Yes | Yes | Injects "I cannot call a tool" | Yes |

Install in LM Studio

  1. Open LM Studio
  2. Go to the My Models tab (or the right-side panel in Chat)
  3. Select your Qwen 3.5 model
  4. Scroll to Prompt Template
  5. Delete the default template, paste this one in
  6. Save

References