How to combine `thinking on/off` prompt with existing system prompt.

by michaelfeil - opened Apr 1, 2025

Apr 1, 2025

What's a good system prompt format, ideally one that you have trained the model with?

thinking = "on"  # or "off"
# option 1: toggle and persona in a single system message
[{"role": "system", "content": f"detailed thinking {thinking}. You are an expert in math."}, {"role": "user", "content": "Solve x*(sin(x)+2)=0"}]
# option 2: toggle and persona in two separate system messages
[{"role": "system", "content": f"detailed thinking {thinking}"}, {"role": "system", "content": "You are an expert in math."}, {"role": "user", "content": "Solve x*(sin(x)+2)=0"}]

Best
michaelfeil

einsteiner1983

NVIDIA org Apr 2, 2025

Option 1 is better than option 2. However, the model was not trained with system prompts other than "detailed thinking on/off", so it might be better to move the persona into the user message:

[{"role": "system", "content": f"detailed thinking {thinking}."}, {"role": "user", "content": "You are an expert in math. Solve x*(sin(x)+2)=0"}]
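The suggestion above can be wrapped in a small helper. This is only a sketch: `build_messages` and its parameters are hypothetical names, not part of any NVIDIA or Hugging Face API. It keeps the system prompt to the bare toggle and puts the persona at the start of the user turn.

```python
from typing import Dict, List


def build_messages(question: str, thinking: bool, persona: str = "") -> List[Dict[str, str]]:
    """Keep the system prompt to the bare toggle; put any persona in the user turn."""
    mode = "on" if thinking else "off"
    # Persona (if any) is prepended to the question; strip() handles the empty case.
    user_content = f"{persona} {question}".strip()
    return [
        {"role": "system", "content": f"detailed thinking {mode}"},
        {"role": "user", "content": user_content},
    ]


messages = build_messages(
    "Solve x*(sin(x)+2)=0",
    thinking=True,
    persona="You are an expert in math.",
)
```

The resulting `messages` list can be passed to whatever chat client or chat template you already use.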

chjkh8113

Feb 6

Hey @michaelfeil, I had the same question and ran some tests against the NIM API. @einsteiner1983 is right: custom personas get ignored when thinking is ON.

Quick findings:

| System Prompt | Thinking | Persona Works? |
|---|---|---|
| `"detailed thinking on"` | ON | ❌ generic output |
| `"detailed thinking off. You are an expert..."` | OFF | ✅ persona followed |
| `"detailed thinking on. You are an expert..."` | ON | ❌ still generic |

Whether the persona is prepended or appended to the toggle doesn't matter. Putting the persona in the user message doesn't help either.

Also tested:

- `/think` and `/no_think` tags: completely ignored (the system prompt wins)
- Budget hints ("think briefly"): don't work. "minimal reasoning" actually produced 21% MORE thinking 🙃
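A comparison like the budget-hint one could be quantified roughly as follows. This is a minimal sketch, not necessarily how the linked test scripts do it: it assumes the reasoning is wrapped in `<think>...</think>` tags in the response text and uses a whitespace word count as a crude stand-in for tokens. The function names are made up for illustration.

```python
import re

# Assumption: reasoning appears between <think> and </think> in the raw response.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def thinking_words(response_text: str) -> int:
    """Count whitespace-separated words inside all <think>...</think> spans."""
    return sum(len(span.split()) for span in THINK_RE.findall(response_text))


def relative_change(baseline: int, variant: int) -> float:
    """Percent change of variant relative to baseline (positive means more thinking)."""
    if baseline == 0:
        raise ValueError("baseline has no thinking content to compare against")
    return 100.0 * (variant - baseline) / baseline
```

Run the same question with and without the hint, pass both responses through `thinking_words`, and feed the two counts to `relative_change`. A single pair of runs is noisy, so averaging over several prompts is probably sensible.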

Workaround: a two-phase approach. The first call uses thinking=ON for the reasoning; the second call uses thinking=OFF plus the persona for formatting.
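Here is a rough sketch of that two-phase flow. `chat` is a hypothetical callable you supply (for example a thin wrapper around your API client) that takes a message list and returns the reply text. The `<think>...</think>` stripping and the wording of the rewrite instruction are assumptions, not something taken from the model card.

```python
import re
from typing import Callable, Dict, List

Message = Dict[str, str]
ChatFn = Callable[[List[Message]], str]  # hypothetical: sends messages, returns reply text


def strip_thinking(text: str) -> str:
    """Drop <think>...</think> spans so only the final answer is passed on."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


def two_phase(chat: ChatFn, question: str, persona: str) -> str:
    # Phase 1: thinking on, no persona, so the model only has to reason.
    draft = chat([
        {"role": "system", "content": "detailed thinking on"},
        {"role": "user", "content": question},
    ])
    answer = strip_thinking(draft)

    # Phase 2: thinking off plus persona, used only to restyle the phase-1 answer.
    rewrite_request = (
        f"Question: {question}\n\n"
        f"Draft answer: {answer}\n\n"
        "Rewrite the draft answer in your own voice without changing its content."
    )
    return chat([
        {"role": "system", "content": f"detailed thinking off. {persona}"},
        {"role": "user", "content": rewrite_request},
    ])
```

The trade-off is two round trips per question, and the second call can still alter the content, so it may be worth spot-checking that the final answer matches the draft.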

For budget control: vLLM ≥0.15 has max_think_tokens via PR #20859 if you're self-hosting.

Test scripts: https://github.com/chjkh8113/nemotron-community-testing
