Metadata Error: Circular Jinja Template Loop
The GGUF metadata for Apriel-1.6-15b-Thinker triggers a circular error in llama-server: it asks for --jinja and then immediately fails, recommending --no-jinja. This forces the model into a raw-completion loop (e.g., tennis trivia hallucinations) instead of the assistant role.

EVIDENCE:
Prompt: Hello!
Response: and "carlton"? A question about the 1974 US Open (tennis) – who won? The dataset is from a trivia quiz. Okay, let's tackle this query. The user asks: "'apriel' and 'carlton'?" Probably they are referring to two players in the (response interrupted)
Using Llama.cpp version: 8298 (f90bd1dd8) built with GNU 15.2.0 for Windows AMD64
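For anyone who wants to see exactly what --jinja is choking on, here is a minimal sketch for dumping the embedded chat template straight out of the GGUF metadata. It assumes the gguf Python package (gguf-py, shipped with the llama.cpp repo) and a local copy of the file; the path and the field-access details are illustrative, not verified against this exact model.

```python
# Sketch: print the chat template embedded in a GGUF file so the Jinja
# that llama-server tries to parse can be inspected directly.
# Assumes `pip install gguf` (gguf-py from the llama.cpp repo).
from gguf import GGUFReader

MODEL_PATH = "path/to/Apriel-1.6-15b-Thinker.gguf"  # adjust to your file

reader = GGUFReader(MODEL_PATH)
field = reader.fields.get("tokenizer.chat_template")
if field is None:
    print("No tokenizer.chat_template key in the GGUF metadata.")
else:
    # For string fields, the last index in field.data points at the
    # uint8 array in field.parts that holds the template text.
    template = bytes(field.parts[field.data[-1]]).decode("utf-8")
    print(template)
```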
Sorry, but this is our decensored / abliterated Apriel 1.6 Thinker. On my end the model behaves normally and the Jinja template works perfectly.
If you are using the official Apriel 1.6, then yes, it has issues with its Jinja template, so llama-server will automatically fall back to ChatML instead.
I recommend that you try the decensored version of this GGUF. According to UGI (Uncensored) benchmarks, this model has shown improved performance compared to the official version.
Thank you for your comments and suggestions. I updated to llama-server build 8589. The circular error is cleared, but llama-server still cannot parse the Jinja template in Apriel-1.6-15b-Thinker-Magic_alpha-decensored_MPOA-Q4_K_M.gguf. I tried ChatML, but it does not process the special tokens; see the example below (and the standalone Jinja check sketched after it). I also tried the Apriel chat template, but llama.cpp has apparently removed it.
User:
Hello!
Assistant:
Here are my reasoning steps:
The user just says "Hello!" So we respond appropriately. There's no instruction beyond that.
We should be mindful of the system messages: The conversation is between a human and an assistant, with instructions to answer politely etc. We can ask how we can help or say hello back.
Given typical guidelines for friendly responses:
User said Hello! Should reply "Hello!" maybe also follow up asking if they need assistance.
We must not break any policies; there's no special context needed.
Thus respond: "Hello! How can I assist you today?" Possibly mention the name? Not required.
Will keep it short.
<|im_end|>
Here are my reasoning steps:
The user says Hello!. We should greet back. The conversation is simple, we just answer with a greeting and ask how to help.
\boxed{Hello! How can I assist you today?}
Sorry for the inconvenience.
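To double-check whether the template itself is at fault, here is a minimal sketch that renders a chat template against a toy conversation with plain jinja2. Note that llama.cpp uses its own built-in Jinja engine (minja) rather than the Python jinja2 package, so behaviour can differ; the file name, the stub raise_exception helper, and the template variables below are assumptions for illustration only.

```python
# Sketch: render a chat template against a toy conversation with plain jinja2.
# This is NOT the engine llama.cpp uses, but it surfaces syntax errors and
# shows where the reasoning text ends up in the rendered prompt.
import jinja2

with open("apriel_template.jinja", "r", encoding="utf-8") as f:
    template_text = f.read()

env = jinja2.Environment(trim_blocks=True, lstrip_blocks=True)
# Chat templates often call helpers that the HF tokenizer normally provides.
env.globals["raise_exception"] = lambda msg: (_ for _ in ()).throw(jinja2.TemplateError(msg))

try:
    template = env.from_string(template_text)
except jinja2.TemplateSyntaxError as e:
    print(f"Syntax error on line {e.lineno}: {e.message}")
else:
    rendered = template.render(
        messages=[{"role": "user", "content": "Hello!"}],
        add_generation_prompt=True,
        bos_token="<s>",
        eos_token="</s>",
    )
    print(rendered)
```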
Yes, the thinking-parsing issue is an inherent flaw in this model. I’ve tried tweaking its Jinja template, but unfortunately nothing has worked.
Words from its “thinking mode” always leak into the responses.
Not kidding, I’m stuck here. If you have a solution, it would be a huge help to me and to many other users of Apriel 1.6.
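One stopgap, not a fix: strip the leaked reasoning on the client side before displaying the reply. A minimal sketch, assuming the leak always looks like the transcript above (a "Here are my reasoning steps:" preamble, an optional <|im_end|>, and a final \boxed{...} answer), which may not hold for every response.

```python
# Sketch: client-side cleanup of a response where the "thinking" leaked
# into the visible text, as in the transcript above. The markers assumed
# here ("Here are my reasoning steps:", <|im_end|>, \boxed{...}) are taken
# from that one example and may not cover every output.
import re

def strip_leaked_thinking(text: str) -> str:
    # Prefer the final boxed answer if the model produced one.
    boxed = re.findall(r"\\boxed\{(.*?)\}", text, flags=re.DOTALL)
    if boxed:
        return boxed[-1].strip()
    # Otherwise drop everything up to the last end-of-turn marker.
    if "<|im_end|>" in text:
        return text.rsplit("<|im_end|>", 1)[-1].strip()
    return text.strip()

raw = """Here are my reasoning steps:
The user says Hello!. We should greet back.
\\boxed{Hello! How can I assist you today?}"""
print(strip_leaked_thinking(raw))  # -> Hello! How can I assist you today?
```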
There might be a bug in llama.cpp. Well, there are definitely bugs in llama.cpp, and its Jinja parser is not heavy-duty. Have you tried using a high-powered AI assistant? Claude, maybe? Gemini can also deal with this sort of thing; you just have to keep it from wandering. I don't personally do JSON, Bash, or Python, but with 60 years of programming experience, if an AI can look over the syntax and make some suggestions, I can generally get something together. I haven't gotten into model manipulation or building these things. Do other versions of Apriel have the same issue? If you want to send me the template, I'll take a look. If you have a template from another version that works, that would be handy, but not required. I have no way to test it; I'd just be looking over the syntax.