If I have the very old original version gguf, do I still need to update now?

#6
by johnlaborxxx - opened

Hi @llmfan46 , coming from https://huggingface.co/llmfan46/gemma-4-31B-it-uncensored-heretic-GGUF/discussions/5 discussion.

I have the very very old version you initially uploaded, basically the previous version in the post above.

Is there any reason for me to update to the latest version you uploaded? I know you reverted the tokenizer to 5.5.0, but I don't know whether there are any other changes in the llama.cpp build you used for the latest version. Thanks.

Is the original version working well for you? Any issues?

I don't see any issues so far; the only reason I'm asking is this post:
https://huggingface.co/llmfan46/gemma-4-26B-A4B-it-ultra-uncensored-heretic/discussions/1

which leads to
https://huggingface.co/llmfan46/gemma-4-31B-it-uncensored-heretic-GGUF/discussions/5

I think the original version works fine in SillyTavern 1.17 through text completion, without issues.
But those two posts make me question my setup.
I am not familiar with llama.cpp directly, as I mainly use koboldcpp, which uses llama.cpp internally as its upstream.
So I wonder if a new GGUF created with the newer llama.cpp fix would improve generation further.

If this is a dumb question please ignore me πŸ˜„

Thanks!

No, it's a totally fine question. I think you should stick with the original version. Or you can keep the old version (don't delete it), download the newest up-to-date version, and test them side by side to see whether it's better or worse than the old one. The only things in the new version you'd otherwise be missing are the official updated chat_template.jinja and the accompanying updated tokenizer_config.json; that's it.
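As a side note, for anyone who wants to check which chat template a given GGUF actually embeds: the template is stored in the file's `tokenizer.chat_template` metadata key, so you can compare the old and new downloads directly. A minimal stdlib-only sketch (assuming a little-endian GGUF v2+ file; `read_chat_template` is just an illustrative helper name, not a llama.cpp API):

```python
import struct

GGUF_MAGIC = b"GGUF"

# Byte sizes of fixed-width GGUF value types, indexed by type id
# (0/1: u8/i8, 2/3: u16/i16, 4/5: u32/i32, 6: f32, 7: bool, 10/11: u64/i64, 12: f64)
_SCALAR_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}

def _read_string(f):
    # GGUF string: uint64 length followed by UTF-8 bytes
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8")

def _skip_value(f, vtype):
    # Skip over a metadata value we don't care about
    if vtype in _SCALAR_SIZES:
        f.seek(_SCALAR_SIZES[vtype], 1)
    elif vtype == 8:  # string
        (n,) = struct.unpack("<Q", f.read(8))
        f.seek(n, 1)
    elif vtype == 9:  # array: uint32 element type, uint64 count, then elements
        etype, count = struct.unpack("<IQ", f.read(12))
        for _ in range(count):
            _skip_value(f, etype)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")

def read_chat_template(path):
    """Return the embedded chat template string, or None if absent."""
    with open(path, "rb") as f:
        if f.read(4) != GGUF_MAGIC:
            raise ValueError("not a GGUF file")
        version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
        for _ in range(n_kv):
            key = _read_string(f)
            (vtype,) = struct.unpack("<I", f.read(4))
            if key == "tokenizer.chat_template" and vtype == 8:
                return _read_string(f)
            _skip_value(f, vtype)
    return None
```

Running this on both the old and the new GGUF and diffing the two strings shows exactly what the template update changed, without re-downloading anything else.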

Got it, I just took a look.
If I use the koboldcpp Lite GUI, they have already updated their own chat template.
If I use ST in front of koboldcpp: with text completion I already have an instruct template with the <BOS> token inserted, and with chat completion I am using a custom-made template from the ST Discord channel.

So I guess I don't really need the updated Jinja template and tokenizer config.
I will definitely let you know if I see any issues. It is such a new model, and with changes being committed every day it can be overwhelming to keep up.

Thanks!

johnlaborxxx changed discussion status to closed
