Model gets confused?

#1
by ehrrh - opened

Mid-generation the model randomly emits "<|im_start|>user" and then writes the prompt back, or just writes the prompt back. This is on oobabooga's webui with exllamav3-0.0.25; updating to exllamav3-0.0.26 gave the same result. turboderp/Qwen3.5-35B-A3B-exl3:4.09bpw works fine. (I've also been using Nemotron-3-Nano-30B-A3B-UD-Q4_K_XL.gguf, Qwen3-30B-A3B-Instruct-2507-UD-Q5_K_XL.gguf, Qwen3.5-27B-heretic-v2-IQ4_XS.gguf, Qwen3.5-27B-heretic-v2.i1-Q6_K.gguf, Qwen3.5-35B-A3B-heretic-v2.i1-Q4_K_M.gguf, and shisa-v2.1-unphi4-14b_Q8_0.gguf, and they all work fine.)

I've only tested the 5bpw quants, both hb6 and hb8. I can't say it was thorough testing, but I didn't encounter such a problem after generating multiple responses and swipes on an existing ~60k-context chat. I did have some Chinese characters leaking in, though, which the Instruct model didn't. Sorry, I have no idea what might be causing your problem. It sounds like a wrong/broken template, but if you use chat completion, that shouldn't be an issue.
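For context on the broken-template theory: Qwen models use a ChatML-style template, where every turn is wrapped in `<|im_start|>` / `<|im_end|>` markers. A minimal sketch of that format (a hypothetical helper, not oobabooga's actual templating code) shows why a malformed template, such as a missing `<|im_end|>` terminator, could lead the model to generate the `<|im_start|>user` marker itself:

```python
# Minimal sketch of a ChatML-style prompt, the template family Qwen models use.
# The helper name and structure here are illustrative, not any frontend's API.
def build_chatml_prompt(messages):
    """Render a list of {'role', 'content'} dicts into a ChatML string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # Open the assistant turn so the model continues from here; if this framing
    # is wrong or missing, the model may emit turn markers like <|im_start|>user.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

With chat completion the backend renders this from the model's own bundled template, which is why a user-side template mistake is unlikely in that mode.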

If it is still relevant, perhaps you could get help on the official Exllama Discord server: https://discord.gg/NSFwVuCjRq

Yeah, I talked with turboderp and it was a bug in the exllamav3 kernel; it will be fixed in 0.0.27. Both presence penalty and frequency penalty triggered the problem.
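For anyone curious what those samplers do: a hedged sketch of how presence and frequency penalties are conventionally applied (OpenAI-style; this is generic illustrative code, not the exllamav3 kernel). Each token's logit is lowered by the frequency penalty times its occurrence count, plus a flat presence penalty once it has appeared at all. A bug in this step can skew the distribution enough to surface tokens like the template markers above.

```python
from collections import Counter

def apply_penalties(logits, generated_ids, presence_penalty, frequency_penalty):
    """Return a new logits list with per-token penalties subtracted.

    Illustrative only: real backends do this on GPU tensors, not Python lists.
    """
    counts = Counter(generated_ids)
    out = list(logits)
    for token_id, count in counts.items():
        out[token_id] -= frequency_penalty * count  # scales with repetition
        out[token_id] -= presence_penalty           # flat, once per seen token
    return out

# Token 2 appeared twice, token 3 once; tokens 0 and 1 are untouched.
logits = [1.0, 2.0, 3.0, 4.0]
penalized = apply_penalties(logits, [2, 2, 3],
                            presence_penalty=0.5, frequency_penalty=0.25)
print(penalized)  # [1.0, 2.0, 2.0, 3.25]
```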

ehrrh changed discussion status to closed
