Best dude so far! Better than the others.
Thanks bruh! Yours are better than 90% of them! Finally, somebody debloated the damn tokenizer, loved you for that! You are better than unsloth and bartowski in my book. If someone fixes the tokenizer and minds it, that's a good dude, a legend in my book! Hope you keep adding more GGUFs, loved all of them, your GGUFs!
Hello JanjanJean,
It's nice to "meet" you and I must say: wow... Thank you so much for the praise!
Your message came as quite a surprise to me personally, but also a significant motivational boost.
Funny enough, I have been really dissatisfied with the GGUF quants I have released so far, for one specific reason: I didn't embed the tokenizer chat template quite how I had wanted to. I have since written new templates that I will be pushing as remade quant GGUF files for every release I currently have available. The goal is that end users and community members such as yourself will hopefully no longer need to attempt any sort of manual setup in order to use the models to their very impressive potential.
Personally, I have found the MistralAI models to be simply amazing, but I am also a bit partial to DeepSeek-R1-Qwen3-8B (it reasons, and does a very good job of it) as well as the Qwen3 models (non-VL). That said, over the last couple of months I've found that some pulls from HuggingFace would result in a poorly performing model, so I have spent January working on what I consider (in my mind) a very streamlined, high-quality tokenizer chat template. I wrote it using inspiration from a few others' templates, but as a complete rewrite, so as to provide something unique rather than rehashing others' work.
Anyway, my apologies for such a long-winded reply. I merely wanted to say that I look forward to seeing you and others here check out the overhauled quant GGUF files across my releases over the next couple of days, as I replace the current ones after a little more testing.
Wishing you a great week,
- Jon Z (EnlistedGhost)
Thanks for the new Q4_K_M update! I tested all your quants, from deep coding to roleplay with my custom app. I loved the Q4_K_M and Q3_K_XL.
Still, I loved the Q5_K_XL too, though the thing takes things too literally haha. Very balanced for VRAM to quality.
Keep it up! Loved how you balance the weights and focus on the Q8_0 outputs.
And thanks for the Pixtral 12B too. I wonder if you can do the abliterated one of the 14B? I saw it had a less robotic tone and fewer refusals.
Well anyways, thanks! I tested all your GGUFs from Qwen to the 24B for coding and stuff, they are the bomb. My 4TB HDD is literally full of your GGUFs lol!
My apologies for getting back to you so much later, but if you are interested, have a look at the recently updated GGUF files and give the model card (the main "page") a quick read-through if possible, as it goes over the new changes, such as a tokenizer chat template that I am finally happy with (lol).
Thank you once again for your kind words of support! They truly do mean a lot. I look forward to hopefully hearing further positive feedback, and to refining the release until everyone who decides to download and run it has a positive experience!
Your requested reasoning/thinking conversions are up next, so if you're still looking for those to be completed, they soon will be.
Wishing you a wonderful end of February,
- Jon Z (EnlistedGhost)
PS (EDIT): Please let me know if there's a good email where you can be reached. I can supply you with a system prompt that will achieve your request without the need for ablation.
Thanks! I've also been using a lot of your new updates. The one that I loved was that weird Q4_K_XL though, it had the right balance ha. I've also been using the new lower version of it, and collecting your new updates and testing them out. It would be cool if you also had something like a 12B Rocinante reasoning model, or any reasoning models that I could collect.
I don't mean to sound like a broken record, but thank you once again!
Due to "popular demand" (your stated preference for the previous 9.41GB Q4_K_XL over the recent 10.1GB one), I have replaced the recent 10.1GB Q4_K_XL GGUF quant with the previous Q4_K_XL, altered to carry the recently updated chat_template, which offers faster, more reliable, and streamlined inferencing. This way the quant adheres to my current updates while offering the mix that you preferred.
As for why the older GGUF quant (now once again the current one) felt better despite being smaller: the answer is actually quite interesting. With the recent 10.1GB Q4_K_XL, I had gone heavy-handed with Q5_K, mixing in roughly 70/30 Q5_K/Q4_K, which produced a larger file than the first 9.41GB Q4_K_XL. The first iteration was a more balanced blend of nearly 50/50 Q4_K/Q6_K, rather than the "almost a Q5_K_XS" that has now been replaced by the restored original.
I agree with you that the previous Q4_K_XL was producing higher-quality interactions and responses, so to make a long story short: the previous Q4_K_XL was slightly altered to include the new chat_template and re-uploaded, replacing the one I had recently made available. Please give it a pull and let me know if it's the old friend you remember from before! Going forward, any Q4_K_XL quants I create will use this Q4_K/Q6_K blend.
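For anyone curious how the mix ratios translate into file sizes, here is a rough back-of-envelope sketch. The bits-per-weight figures are approximate ballpark values commonly cited for llama.cpp k-quants, not measurements taken from these specific releases, and real GGUF files also carry metadata plus differently-quantized embedding/output tensors, so treat the results as estimates only:

```python
# Rough GGUF size estimator for mixed-precision quants.
# BPW values are approximate llama.cpp ballpark figures (assumption),
# not exact numbers from the releases discussed above.
BPW = {"Q4_K": 4.5, "Q5_K": 5.5, "Q6_K": 6.6, "Q8_0": 8.5}

def est_size_gb(n_params_b: float, mix: dict) -> float:
    """Estimate file size in GB for n_params_b billion weights
    quantized with the given {quant_type: fraction} mix."""
    assert abs(sum(mix.values()) - 1.0) < 1e-6, "fractions must sum to 1"
    avg_bpw = sum(BPW[q] * frac for q, frac in mix.items())
    # params * bits-per-weight -> bits -> bytes -> GB
    return n_params_b * 1e9 * avg_bpw / 8 / 1e9

# Example: compare two hypothetical 12B mixes
heavy_q5 = est_size_gb(12, {"Q5_K": 0.7, "Q4_K": 0.3})   # ~5.2 bpw average
balanced = est_size_gb(12, {"Q4_K": 0.5, "Q6_K": 0.5})   # ~5.55 bpw average
```

Note that with these generic per-weight averages a 50/50 Q4_K/Q6_K blend can land close to, or even above, a 70/30 Q5_K/Q4_K blend; the actual on-disk sizes depend on which tensors (attention vs. FFN vs. embeddings) get which type.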
To quickly touch on your request for a reasoning model (or a few): I have gone ahead and pulled the safetensors for the following models and will be converting them tonight (within the next 12-16 hours, if things work the way they should):
- Mag-Mell-Reasoner-12B
- Ministral-3-Reasoning (3B, 8B, and 14B)
- Rework of my current DeepSeek-R1 (8B and 14B)
- Rocinante-X-12B-v1 (a more recent Rocinante that seems to be decently liked)
- Rocinante-12B-v1.1
- MagMell-GnR-Roc-12b (Mag-Mell-GunsnRoses-Rocinante mergekit merge)
Additionally, I've had interesting results altering the embedded system prompt within a model's chat_template to include a request for the model to "think" (reason), and I may post reasoning-altered versions of non-reasoning models (we will see how that goes).
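The idea above can be sketched in plain Python. This is a minimal, illustrative mock of injecting a "think first" instruction into a ChatML-style prompt, mimicking what an edited chat_template would render; the instruction wording and the helper names are assumptions, not the actual template shipped with these releases:

```python
# Illustrative sketch only: the THINK_HINT wording and function names
# are hypothetical, showing where a reasoning request would be injected
# into the system turn of a ChatML-style prompt.
THINK_HINT = "Before answering, reason step by step inside <think> tags."

def build_prompt(system: str, user: str, add_reasoning: bool = False) -> str:
    """Render a single-turn ChatML-style prompt, optionally appending
    the reasoning instruction to the system message."""
    if add_reasoning:
        system = f"{system}\n{THINK_HINT}"
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )
```

In a real GGUF the equivalent change would live in the Jinja `tokenizer.chat_template` metadata rather than Python code, but the effect on the rendered prompt is the same.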
PS: I couldn't find a 12B Rocinante reasoning model; they all seem to be standard non-thinking Rocinante LLMs. If you have a link to a reasoning version, I will happily include it in the conversion list.
EDIT (March 6th, 2026): My apologies, lol. I will be converting and quanting the listed models shortly; I wound up falling asleep the other night. Anyway, expect more releases soon.