Ollama Support
Thank you for providing the quantized model files. The official Ollama repository already includes this model, but I noticed that the UD-Q4_K_XL version you provided occupies less space. I would like to download your GGUF file and run it on Ollama. Could you please provide the corresponding Ollama model template and relevant parameter settings?
It does not fully work in Ollama at the moment because of potential chat template incompatibility issues. If you are using Ollama, just use Ollama's own version of the model.
OK, thanks.
I was able to make it work with this modelfile:
FROM hf.co/unsloth/GLM-4.7-Flash-GGUF:Q4_K_XL
RENDERER glm-4.7
PARSER glm-4.7
TEMPLATE "[gMASK]<sop>{{ if .System }}<|system|>
{{ .System }}{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}{{ end }}<|assistant|>
{{ .Response }}"
PARAMETER stop <|user|>
PARAMETER temperature 1.0
PARAMETER top_p 0.95
PARAMETER min_p 0.01
PARAMETER repeat_penalty 1.0
And then execute something like this:
ollama create GLM-4.7-Flash-GGUF:Q4_K_XL -f glm47-flash.modelfile
I think the important part is adding the RENDERER glm-4.7 and PARSER glm-4.7 lines.
Does this template support tool use? This model has strong capabilities for multi-turn tool use, so it would be helpful to provide an appropriate Ollama template.
I have no clue. Try it out. It's basically the original template of this Unsloth GGUF, enriched with the RENDERER, PARSER, temperature, top_p, min_p and repeat_penalty settings.
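For anyone who wants to try it out, here is a minimal sketch of a tool-use check against Ollama's /api/chat endpoint. It assumes Ollama is running locally on its default port and that the model was created under the name from the ollama create command above. The get_weather tool is purely hypothetical and only there to see whether the model responds with a tool call; I have not confirmed that this modelfile actually produces tool calls.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/chat"
# Model name taken from the `ollama create` command in this thread.
MODEL = "GLM-4.7-Flash-GGUF:Q4_K_XL"


def build_tool_request(model, question):
    """Build a non-streaming /api/chat payload that offers one tool."""
    return {
        "model": model,
        "stream": False,
        "messages": [{"role": "user", "content": question}],
        "tools": [
            {
                "type": "function",
                "function": {
                    # Hypothetical tool, used only to probe tool calling.
                    "name": "get_weather",
                    "description": "Get the current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "city": {"type": "string", "description": "City name"}
                        },
                        "required": ["city"],
                    },
                },
            }
        ],
    }


def extract_tool_calls(response):
    """Return the list of tool calls in a chat response (empty if none)."""
    return response.get("message", {}).get("tool_calls") or []


def ask(payload, url=OLLAMA_URL):
    """POST the payload to Ollama and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as resp:
        return json.load(resp)


if __name__ == "__main__":
    reply = ask(build_tool_request(MODEL, "What is the weather in Berlin?"))
    calls = extract_tool_calls(reply)
    if calls:
        for call in calls:
            fn = call["function"]
            print("tool call:", fn["name"], fn["arguments"])
    else:
        # No structured tool call came back; show the plain text instead.
        print("no tool call, plain answer:", reply["message"]["content"])
```

If the script prints a tool call with a city argument, the template, renderer and parser are at least handling single-turn tool calls. Multi-turn use would additionally need the tool result sent back as a message with role "tool", which this sketch does not cover.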