`.cw` is output at the end of the reasoning content

#33
by owao - opened

Please see https://huggingface.co/unsloth/Qwen3.5-27B-GGUF/discussions/16
At first I started the discussion there because I suspected a quantization issue. But @turboderp informed me that the behavior is also observable with the unquantized model.
I think it is worth investigating.

By the way, thank you so much to the Qwen team for releasing this model, it is amazing!

Yep, I've also seen this in EXL3 quants (various bitrates), in the unquantized model running in the EXL3 framework (mixed-precision FP16, BF16 and FP32), and in Transformers with default settings. I've confirmed it in the 9B, 27B, 35B-A3B and 122B-A10B variants; I haven't tested the other variants.

Here's an easy way to reproduce:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

with torch.inference_mode():
    model_id = "/mnt/str/models/qwen3.5-9b/hf"
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    inputs = tokenizer.encode(
        tokenizer.apply_chat_template(
            [
                {"role": "system", "content": "You are a helpful AI assistant."},
                {"role": "user", "content": "Hello."},
            ],
            tokenize=False,
            add_generation_prompt=True
        ),
        add_special_tokens=True,
        return_tensors="pt"
    ).to(model.device)

    for _ in range(10):
        outputs = model.generate(input_ids=inputs, max_new_tokens=4096, do_sample=True)
        response = tokenizer.decode(outputs[0])
        print(outputs)
        print(response)
        if ".cw" in response:
            break

It usually only takes a couple of attempts before you see a response ending along the lines of:

    Actually, adding a bit more warmth is good. "Hello! πŸ‘‹ How's your day going? Let me know if there's anything I can help you with."

    Okay, I'll go with a friendly, open-ended greeting.cw
</think>

Hello! πŸ‘‹ How can I assist you today?<|im_end|>

The token IDs in question are always:

  6558  " Let"
   579  "'s"
  3165  " write"
   421  " that"
   508  ".c"     <---
    86  "w"      <---
   198  "\n"
248069  "</think>"
   271  "\n\n"
  9419  "Hello"

When sampling token 508, the raw top-10 probabilities look like:

0.538101553 :    508  ".c"
0.460262626 :     13  "."
0.000453809 :   6539  ".cl"
0.000347947 :  18208  ".cs"
0.000177714 :  47800  ".ct"
0.000050916 :    721  ".m"
0.000046359 :   6616  ".co"
0.000036673 :    890  ".l"
0.000033917 :  55167  ".ctrl"
0.000030641 :  73036  ".cf"

It seems to be confined to just before the </think> tag though, and it doesn't seem to affect the performance of the model overall.
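Since it's confined to that position, one possible workaround (a sketch, not something I've tested against this model) would be banning the offending sequence at sampling time, e.g. via the `bad_words_ids` argument to `generate` in Transformers. The core idea is just masking banned IDs out of the distribution and renormalizing:

```python
def ban_tokens(probs: dict[int, float], banned: set[int]) -> dict[int, float]:
    """Zero out banned token IDs and renormalize the remaining probabilities."""
    kept = {tid: p for tid, p in probs.items() if tid not in banned}
    total = sum(kept.values())
    return {tid: p / total for tid, p in kept.items()}

# Top-2 probabilities from the listing above; banning 508 (".c")
# shifts essentially all the mass onto 13 (".").
top2 = {508: 0.538101553, 13: 0.460262626}
print(ban_tokens(top2, {508}))
# {13: 1.0}
```

Note that banning 508 unconditionally would also break legitimate uses of ".c"; real logits processors operate on sequences, so banning the pair (508, 86) would be the safer option.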

Great, thorough insights, including many tests I couldn't run locally! Thanks for the effort and time spent gathering all this useful info and bringing more visibility to this curious behavior @turboderp, really happy to see it!

Maybe it would be interesting to see what the probability of .c is in a "passing" case, to see whether it drops significantly depending on context or always stays high. Did you observe anything regarding this, turbo?

Here are a few more appearances of token 508:

[images: three screenshots of further generations sampling token 508]

Here are some that don't sample the .cw, but it still shows up with some small probability, even under different tokenizations, which strongly hints at the training data (as opposed to a tokenization bug or computational issue). Note the cw option right after ).:

[images: three screenshots of generations where .cw is not sampled but still appears with small probability]

The last one was writing the last line in italics, hence the *. Note that .c is still in the top five, it just apparently makes much more sense to finish the * span first.

Great compilation of examples! I'm wondering whether poking one of the base models around those .c and cw tokens could yield some clues/insights about possible dataset contamination. I'll give it a try and update if I find anything worthwhile.

Got a new one! .cs
Here: "Ok, generating output.cs"

data: {"id":"chatcmpl-a6f03d265c6eee3b","object":"chat.completion.chunk","created":1774978308,"model":"z-lab/Qwen3.5-27B-PARO","choices":[{"index":0,"delta":{"reasoning":" Okay"},"logprobs":null,"finish_reason":null,"token_ids":null}]}

data: {"id":"chatcmpl-a6f03d265c6eee3b","object":"chat.completion.chunk","created":1774978308,"model":"z-lab/Qwen3.5-27B-PARO","choices":[{"index":0,"delta":{"reasoning":","},"logprobs":null,"finish_reason":null,"token_ids":null}]}

data: {"id":"chatcmpl-a6f03d265c6eee3b","object":"chat.completion.chunk","created":1774978308,"model":"z-lab/Qwen3.5-27B-PARO","choices":[{"index":0,"delta":{"reasoning":" generating"},"logprobs":null,"finish_reason":null,"token_ids":null}]}

data: {"id":"chatcmpl-a6f03d265c6eee3b","object":"chat.completion.chunk","created":1774978308,"model":"z-lab/Qwen3.5-27B-PARO","choices":[{"index":0,"delta":{"reasoning":" output"},"logprobs":null,"finish_reason":null,"token_ids":null}]}

data: {"id":"chatcmpl-a6f03d265c6eee3b","object":"chat.completion.chunk","created":1774978308,"model":"z-lab/Qwen3.5-27B-PARO","choices":[{"index":0,"delta":{"reasoning":".cs"},"logprobs":null,"finish_reason":null,"token_ids":null}]}

data: {"id":"chatcmpl-a6f03d265c6eee3b","object":"chat.completion.chunk","created":1774978308,"model":"z-lab/Qwen3.5-27B-PARO","choices":[{"index":0,"delta":{"reasoning":"\n"},"logprobs":null,"finish_reason":null,"token_ids":null}]}
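For anyone reassembling chunks like these: each SSE `data:` line carries a JSON payload whose `reasoning` delta can simply be concatenated. A minimal stdlib parser, assuming the chunk shape shown above (field names other than `choices`/`delta`/`reasoning` omitted for brevity):

```python
import json

def join_reasoning_deltas(sse_text: str) -> str:
    """Concatenate the `reasoning` deltas from an SSE chat-completions stream."""
    out = []
    for line in sse_text.splitlines():
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {}).get("reasoning")
            if delta:
                out.append(delta)
    return "".join(out)

stream = """\
data: {"choices":[{"index":0,"delta":{"reasoning":" Okay"}}]}
data: {"choices":[{"index":0,"delta":{"reasoning":","}}]}
data: {"choices":[{"index":0,"delta":{"reasoning":" generating"}}]}
data: {"choices":[{"index":0,"delta":{"reasoning":" output"}}]}
data: {"choices":[{"index":0,"delta":{"reasoning":".cs"}}]}
"""
print(join_reasoning_deltas(stream))
# -> " Okay, generating output.cs"
```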

New one:

"Ready to write.be"

Maybe they are hiding a distress message.
