Ollama Support

#15

by yqchen-sci - opened Jan 25

Jan 25

Thank you for providing the quantized model files. The official Ollama repository already includes this model, but I noticed that the UD-Q4_K_XL version you provided occupies less space. I would like to download your GGUF file and run it on Ollama. Could you please provide the corresponding Ollama model template and relevant parameter settings?

shimmyshimmer

Unsloth AI org Jan 25

It does not fully work in Ollama at the moment because of potential chat template incompatibilities. If you are using Ollama, just use Ollama's own version of the model.

yqchen-sci

Jan 27

OK, thanks.

distahl

Jan 31

edited Jan 31

I was able to make it work with this Modelfile:

FROM hf.co/unsloth/GLM-4.7-Flash-GGUF:Q4_K_XL

RENDERER glm-4.7
PARSER glm-4.7

TEMPLATE "[gMASK]<sop>{{ if .System }}<|system|>
{{ .System }}{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}{{ end }}<|assistant|>
{{ .Response }}"

PARAMETER stop <|user|>

PARAMETER temperature 1.0
PARAMETER top_p 0.95
PARAMETER min_p 0.01
PARAMETER repeat_penalty 1.0

And then execute something like this:
ollama create GLM-4.7-Flash-GGUF:Q4_K_XL -f glm47-flash.modelfile

I think the important additions are the RENDERER glm-4.7 and PARSER glm-4.7 lines.
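To make the TEMPLATE string easier to read, here is a minimal Python sketch of the prompt it would produce for a single turn. The helper `render_prompt` is hypothetical and only mirrors the template text shown above. It is not part of Ollama, and it does not cover what the RENDERER and PARSER lines do.

```python
def render_prompt(prompt: str, system: str = "", response: str = "") -> str:
    """Mirror the Modelfile TEMPLATE for one turn (illustrative helper, not an Ollama API)."""
    parts = ["[gMASK]<sop>"]
    if system:
        # {{ if .System }}<|system|>\n{{ .System }}{{ end }}
        parts.append("<|system|>\n" + system)
    if prompt:
        # {{ if .Prompt }}<|user|>\n{{ .Prompt }}{{ end }}
        parts.append("<|user|>\n" + prompt)
    # <|assistant|>\n{{ .Response }}
    parts.append("<|assistant|>\n" + response)
    return "".join(parts)


# Show the raw prompt string, including the newline after each role tag.
print(repr(render_prompt("Hello", system="Be brief.")))
```

The template has no slot for tool definitions or earlier turns, which is why the stop parameter on <|user|> matters: generation ends when the model starts writing the next user turn.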

yqchen-sci

Jan 31

Does this template support tool use? This model has strong capabilities for multi-turn tool use, so it would be helpful to provide an appropriate Ollama template.

distahl

Jan 31

I have no clue. Try it out. It's basically the original template from this Unsloth GGUF, extended with the RENDERER, PARSER, temperature, top_p, min_p and repeat_penalty settings.
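One way to try it out is to send a chat request with a tool definition to a local Ollama server and check whether the reply contains a tool call. The sketch below is only a starting point under several assumptions: Ollama is listening on its default local address, the model was created under the name from the ollama create command above, and `get_weather` is a made-up tool used purely for the test.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # assumption: default local Ollama address
MODEL_NAME = "GLM-4.7-Flash-GGUF:Q4_K_XL"  # name used in the `ollama create` command


def build_chat_request(model: str, question: str) -> dict:
    """Build a non-streaming chat payload with one hypothetical tool attached."""
    return {
        "model": model,
        "stream": False,
        "messages": [{"role": "user", "content": question}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool, for testing only
                    "description": "Get the current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }


def extract_tool_calls(response: dict) -> list:
    """Return (name, arguments) pairs for any tool calls in a chat response."""
    calls = response.get("message", {}).get("tool_calls") or []
    return [(c["function"]["name"], c["function"].get("arguments", {})) for c in calls]


def send_chat(payload: dict, url: str = OLLAMA_CHAT_URL) -> dict:
    """POST the payload to the chat endpoint and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=300) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

With the server running, `extract_tool_calls(send_chat(build_chat_request(MODEL_NAME, "What is the weather in Paris?")))` returning a non-empty list would suggest tool definitions reach the model with this Modelfile. An empty list would suggest they do not, at least for this prompt.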
