v16 - Reasoning despite setting --reasoning off in llama.cpp

#18
by diroka - opened

Am I doing something wrong, or is this expected behaviour? It is having a lot of problems with tool calling; so far I have tested web search and browser navigation in Hermes, with Qwen3.6 35B. I also see thinking blocks very often, although I turned thinking off by setting --reasoning off in the llama.cpp command.

Edit: the thinking blocks don't appear at all with the built-in template. I'm using the UD-IQ4_NL-XL quant.

froggeric changed discussion status to closed

Solved in the final v16 release, which I have now promoted to the official release.

Great, thanks for the response! I tried it, and it looks better when reasoning is on. When reasoning is off, however, tool calls fail and it just stops working with API call errors.
