Think token
I'm not sure whether this is a bug or not, but why are <think> and </think> not single tokens? <think> gets split into <th and ink>.
The tokenizer only contains an entry for the first piece:
{
"rank": 48250,
"token_bytes": "PHRo",
"token_str": "<th"
},
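For anyone who wants to double-check the entry above, here is a minimal sketch that decodes the token_bytes field. It assumes the field is base64-encoded, which matches the token_str shown; the variable names are just for illustration.

```python
import base64

# Value copied from the tokenizer entry above (rank 48250).
token_bytes = "PHRo"

# Assumption: token_bytes is base64-encoded raw bytes of the token.
decoded = base64.b64decode(token_bytes).decode("utf-8")
print(decoded)  # prints: <th
```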
Because of this split, it is harder to extract reasoning tokens in streaming mode for consumer applications.
To reproduce, ask the model a question related to the <think> token itself. For example: "Write a regex to parse <think></think> tags. Provide an example." The model's thinking output might then contain literal <think> </think> tags before the real closing </think> tag, which could cause parsing issues during streaming.
As a solution, the reasoning output should be wrapped in dedicated single <think> and </think> tokens, while any text inside the reasoning can still use the regular <th token (the one that exists for the HTML tag). If you have any better ideas, let me know.
Btw, the Qwen3 model has the opposite problem. It has special <think> and </think> tokens, but it doesn't have regular tokens, like <th, to show when it's just thinking about the tag rather than delimiting its reasoning.
Here's what the <think> and </think> tokens look like:
{
"id": 151667,
"content": "<think>",
...
},
{
"id": 151668,
"content": "</think>",
...
}
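With dedicated tokens like these, a client that has access to token IDs can split on the IDs instead of on text. A minimal sketch, assuming the stream exposes raw token IDs; split_by_ids is a hypothetical helper, and the IDs 1, 2, 3 in the usage line are placeholders, not real vocabulary entries.

```python
# IDs copied from the two entries above.
THINK_OPEN_ID = 151667
THINK_CLOSE_ID = 151668

def split_by_ids(token_ids):
    """Split a token ID sequence into (reasoning_ids, answer_ids).

    Hypothetical helper: only the special IDs switch mode, so a literal
    tag built from regular tokens never ends the reasoning early.
    """
    reasoning, answer = [], []
    in_think = False
    for tid in token_ids:
        if tid == THINK_OPEN_ID:
            in_think = True
        elif tid == THINK_CLOSE_ID:
            in_think = False
        elif in_think:
            reasoning.append(tid)
        else:
            answer.append(tid)
    return reasoning, answer

# Placeholder IDs 1, 2, 3 stand in for ordinary text tokens.
reasoning_ids, answer_ids = split_by_ids([151667, 1, 2, 151668, 3])
print(reasoning_ids, answer_ids)
```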