Broken output on koboldcpp

#1
by Dzondo58 - opened

Backend: koboldcpp-1.111

The SHA256 checksum of Gemma-4-31B-Cognitive-Unshackled-Q4_K_S.gguf matches.
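For anyone else verifying their download, here is a minimal sketch of how the checksum can be computed in Python. The expected hash below is a placeholder, not the real checksum for this model; check it against the value listed on the model page.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream the file in 1 MiB chunks so a large GGUF doesn't fill RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

if __name__ == "__main__":
    # Placeholder hash: substitute the value from the model card.
    expected = "<expected-sha256-from-model-card>"
    actual = sha256_of("Gemma-4-31B-Cognitive-Unshackled-Q4_K_S.gguf")
    print("OK" if actual == expected else f"MISMATCH: {actual}")
```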

Result:

Input: {"n": 1, "max_context_length": 12288, "max_length": 1024, "rep_pen": 1.1, "temperature": 1.1, "top_p": 0.95, "top_k": 64, "top_a": 0, "typical": 1, "tfs": 1, "rep_pen_range": 360, "rep_pen_slope": 0.7, "sampler_order": [6, 0, 1, 3, 4, 2, 5], "memory": "", "trim_stop": true, "genkey": "KCPP4434", "min_p": 0, "dynatemp_range": 0, "dynatemp_exponent": 1, "smoothing_factor": 0, "smoothing_curve": 1, "nsigma": 0, "banned_tokens": [], "render_special": false, "logprobs": false, "presence_penalty": 0, "logit_bias": {}, "adaptive_target": -1, "adaptive_decay": 0.9, "stop_sequence": ["<turn|>\n<|turn>user", "<turn|>\n<|turn>model\n<|channel>thought\n<channel|>"], "use_default_badwordsids": false, "bypass_eos": false, "prompt": "<turn|>\n<|turn>user\nHello<turn|>\n<|turn>model\n<|channel>thought\n<channel|>"}

Processing Prompt (10 / 10 tokens)
Generating (1024 / 1024 tokens)
[06:40:15] CtxLimit:1039/12288, Amt:1024/1024, Init:0.00s, Process:0.04s (277.78T/s), Generate:44.42s (23.05T/s), Total:44.46s
Output: '’’ l’’’ a’ l’ l’ l la’ l’ and l’ l’ l’ l’ l’ l l’ l' l’ and l’ l’ l’ l’ l’ l' l’ and l’ l’ l’ l’ l’ l’ l’ l’ and l’ l’ l’ l’ l’’ and’’’’’’ l’’ l’ l’ l’ l’’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ and l’ l’ l’ l’ l’ l’ l’ l’ and l’ l’ l’ l’ l’ l’ l’ l’ and la’ l’ l’ l’ l’ l’ l’ l’ and l’ l’ l’ l’ l’ l’ l’ l’ and and l’ l’ l’ l’ l’ l’ l’ l’ and l’ l’ l’ l’ l’ l’ l’ l’ and l’ l’ l’ l’ l’ l’ l’ l’ and l’ l’ l’ l’ l’ l’ l’ l’ and l’ l’ l’ l’ l’ l’ l’ l’ and l’ l’ l’ l’ l’ l’ l’ l’ and and’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ l’ and’ a’ l’ l’ l’ l’ l’ l’ l’ l’ l’ and’ a l’ l’ l’ l’ l’ l’ la’ and' l’ l’ l’ l’ l’ and' and' l’ l’ l’ and' l’ l’ l’ and’ l’ l’ l’ and’ l’ l’ l’ and l’ l’ l’ l’ l’ l’ l’ l’ and l’ l’ l’ and' and and l’ l’ and’ l’ l’ l’ and' l’ l’ and’ l’ l’ and l’ l’ and’ l’ l’ and l’ l’ and’ and’ and’’’’ l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l l’ l’ l l’l’ l’ l’ a’ l’’ a’ l’’ la l l l and as a result of the above, I am unable to provide you with a helpful response.’’’'’’’ and as a result of the above, I cannot be’’’’ and as a result of the above’’ and as a result of the above'’ and as a result of the above’' and as a result of the above’ and as a result of the above’' and as a result of the above and as a result’' and as a result’'’’’ and as a result’'’’’’’ and as a result and as a result and as a result’ and as a result and as a result’ and as a result and as a result’ and’ as a result and as 
a result’'’’ and as a result and as a result’ l
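For reference, a sketch of how this request can be reproduced against a local koboldcpp instance via its KoboldAI-compatible `/api/v1/generate` endpoint. The host and port are assumptions (koboldcpp defaults to 5001), and only the key sampler settings from the payload above are mirrored.

```python
import json
import urllib.request

PROMPT = "<turn|>\n<|turn>user\nHello<turn|>\n<|turn>model\n<|channel>thought\n<channel|>"

def build_payload(prompt, max_length=1024):
    """Mirror the main sampler settings from the report above."""
    return {
        "prompt": prompt,
        "max_context_length": 12288,
        "max_length": max_length,
        "temperature": 1.1,
        "top_p": 0.95,
        "top_k": 64,
        "rep_pen": 1.1,
        "rep_pen_range": 360,
        "stop_sequence": ["<turn|>\n<|turn>user"],
    }

def generate(payload, url="http://localhost:5001/api/v1/generate"):
    """POST the payload and return the generated text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"][0]["text"]

if __name__ == "__main__":
    print(generate(build_payload(PROMPT)))
```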

I briefly tested it in SillyTavern on the llama.cpp backend with chat completion (not text completion). How often does it break?

s1arsky changed discussion status to closed
s1arsky changed discussion status to open

Ok. I am 6.5k tokens of context into an RP and it works without problems on my side. (I don't use thinking.)

It turned out the problem was specific to my backend: I tested the model in LM Studio and it worked fine. It seems koboldcpp still has some problems with Gemma 4 support despite the hotfixes.

Dzondo58 changed discussion status to closed
