Reasoning content leaks into `message.content` with JSON schema response format

#18
by Tikhonum - opened

I’m using an OpenAI-compatible Chat Completions API endpoint with structured output (response_format = json_schema).

Expected behavior:

  • The final JSON output should be returned in choices[0].message.content.
  • Reasoning/thinking text should be separated (e.g., in reasoning_content) and never mixed into content.

Actual behavior:

  • In some responses, reasoning text is injected directly into choices[0].message.content before or around the JSON payload.
  • This breaks JSON parsing, even though response_format is set and the schema is valid.

Context:

  • API style: OpenAI Chat Completions
  • Structured output: enabled via response_format: { type: "json_schema", ... }
  • Thinking mode: enabled on the model side
  • Parsing flow expects message.content to be strict JSON
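For reference, a minimal sketch of the request body I'm sending (the model name and schema here are illustrative placeholders, not my actual values):

```python
import json

# Hypothetical request body for an OpenAI-compatible Chat Completions
# endpoint with structured output. Model name and schema are placeholders.
payload = {
    "model": "example-thinking-model",  # placeholder; thinking mode enabled server-side
    "messages": [
        {"role": "user", "content": "Return the city and country as JSON."}
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "location",
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
                "required": ["city", "country"],
            },
        },
    },
}

# The parsing flow then expects choices[0].message.content to be strict JSON,
# i.e. directly loadable with json.loads() and matching the schema above.
print(json.dumps(payload, indent=2))
```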

Could you please clarify whether this is expected behavior for thinking-enabled models, and whether there is a reliable way to enforce strict JSON-only message.content while keeping the reasoning in a separate field?
Thanks!

Same issue here.

  1. enable_thinking: true
  2. Structured output requested in prompt.

When both of the above conditions are met, message.content begins with <think>...</think>, which breaks JSON parsing. Is this a chat-template issue, or something else?
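As a stopgap, stripping the <think>...</think> block before parsing should work. A minimal sketch, assuming the reasoning is wrapped in literal <think> tags as observed above (adjust the pattern if your model uses different delimiters):

```python
import json
import re

def parse_structured_content(content: str) -> dict:
    """Remove any <think>...</think> block, then parse the remainder as JSON.

    Assumes reasoning text is delimited by literal <think> tags; this is a
    workaround, not a fix for the underlying template/serialization issue.
    """
    cleaned = re.sub(r"<think>.*?</think>", "", content, flags=re.DOTALL).strip()
    return json.loads(cleaned)

# Example of the broken content shape described above:
raw = '<think>reasoning text here...</think>{"city": "Paris", "country": "France"}'
print(parse_structured_content(raw))  # {'city': 'Paris', 'country': 'France'}
```

Note the re.DOTALL flag, which lets the pattern match reasoning that spans multiple lines.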
