MiMo-V2.5's reasoning process is not separated from content in Claude Code

#5
by Kinfai - opened

When MiMo-V2.5 is deployed via vLLM, its reasoning process (chain-of-thought) is emitted inline in the final output, rather than being separated out and hidden as is typically the case.
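One plausible cause is that vLLM was launched without a reasoning parser, so the chain-of-thought is never split into the separate `reasoning_content` field that clients like Claude Code collapse. A hedged sketch of a launch command (the model path is a placeholder, and whether an existing vLLM parser covers MiMo-V2.5's thinking format is an assumption):

```shell
# Illustrative only: replace the model path with your actual checkpoint.
# --reasoning-parser tells vLLM to extract the chain-of-thought into a
# separate reasoning_content field instead of leaving it inline in content.
# The parser name below is an assumption; pick whichever parser matches
# the model's thinking-tag format.
vllm serve /path/to/MiMo-V2.5 \
  --reasoning-parser deepseek_r1
```

If the parser is already enabled and the leakage persists, the model itself may be emitting malformed thinking delimiters that the parser cannot match.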

In contrast, deploying Qwen3.5-122B-A10B-FP8 using the same Docker image works correctly, and Claude Code is able to properly collapse these "thinking" code blocks.


The model may have broken out of its CoT and then begun thinking again.
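As a stopgap while the server-side parsing is sorted out, the thinking spans can be stripped client-side. A minimal sketch, assuming the model wraps its reasoning in `<think>…</think>` tags (the tag name is an assumption, not confirmed in this thread); it also handles the "broke out of its CoT" case where only an orphan closing tag survives:

```python
import re

def strip_think_blocks(text: str) -> str:
    """Remove <think>...</think> spans from model output.

    If the model broke out of its CoT and only a stray closing tag
    remains, drop everything up to that orphan tag as well.
    """
    # Remove well-formed thinking blocks (non-greedy, across newlines).
    cleaned = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # Handle an orphan closing tag whose opening tag was lost.
    if "</think>" in cleaned:
        cleaned = cleaned.split("</think>", 1)[1]
    return cleaned.strip()
```

For example, `strip_think_blocks("<think>plan steps</think>Answer")` returns `"Answer"`, and an input that starts mid-thought such as `"leftover reasoning</think>Final"` returns `"Final"`.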
