Error when pulling with Ollama
Hello,
I always get this error when trying to pull this model with Ollama.
Am I doing something wrong?
ollama run hf.co/HauhauCS/Qwen3.5-9B-Uncensored-HauhauCS-Aggressive:Q8_0
pulling manifest
pulling 99e7f2201c00: 100% ████████████████████████████████████████████████████████████ 9.5 GB
pulling 05f662501f8b: 100% ████████████████████████████████████████████████████████████ 921 MB
pulling 31ccbc936038: 100% ████████████████████████████████████████████████████████████ 481 B
verifying sha256 digest
writing manifest
success
Error: 500 Internal Server Error: unable to load model: G:\LLM\Ollama_Models\blobs\sha256-99e7f2201c0046b05d2825e4d8be6a2efad2b87b071cd55d37bdd9fbe201a58b
GPU NVIDIA RTX 4060 Ti
16GB
ollama version is 0.19.0
Try LM Studio
Yes, it works in LM Studio, I know, but if your whole environment is built around Ollama, I'd rather have it working inside Ollama...
I just want to know if there is a reason it's not working in Ollama or if it's something I do wrong.
It's the first model I've had problems with. And if it doesn't work with Ollama, maybe the author can mention it in the description to let others know.
I agree with you and I don't know exactly, but my best guess is that it's an incompatibility between this specific Hugging Face GGUF build and Ollama's current model loader. The model likely uses a newer or slightly modified Qwen3.5 architecture (e.g. different metadata fields, KV head settings, or tokenizer config) that Ollama doesn't fully support yet, while LM Studio (via llama.cpp) is more tolerant of these variations. So the model downloads fine, but Ollama fails during initialization when parsing the architecture or loading tensors into memory.
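One way to narrow this down is to check that the downloaded blob is at least a well-formed GGUF file before blaming the loader. A minimal Python sketch that reads the fixed GGUF header (layout per the GGUF spec for v2/v3; the function name is mine, and the path would be your local blob or `.gguf` file):

```python
import struct

def read_gguf_header(path):
    """Read the fixed GGUF header fields: magic, version, tensor count, metadata KV count."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file (magic={magic!r})")
        # Little-endian: uint32 version, then two uint64 counts (GGUF v2/v3 layout)
        version, = struct.unpack("<I", f.read(4))
        tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
    return {"version": version, "tensor_count": tensor_count, "kv_count": kv_count}
```

If the magic or version looks wrong, the file itself is damaged; if the header parses cleanly, the failure is more likely a metadata field or architecture string that Ollama's loader doesn't recognize yet.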
Short version
If ollama run hf.co/... fails on Windows 11 with:
Error: 500 Internal Server Error: unable to load model
this worked for me:
- Download the main `.gguf` model file manually from Hugging Face in the browser.
- Create a `Modelfile` that points directly to that local `.gguf`.
- Import it with `ollama create`.
- Run the local model instead of `hf.co/...`.
Commands:
cd D:\AI\ollama
notepad Modelfile
ollama create hauhau-qwen35 -f .\Modelfile
ollama list
ollama show hauhau-qwen35
ollama run hauhau-qwen35
Example Modelfile:
FROM D:\AI\ollama\Qwen3.5-9B-Uncensored-HauhauCS-Aggressive-BF16.gguf
PARAMETER temperature 1
PARAMETER top_k 20
PARAMETER top_p 0.95
PARAMETER presence_penalty 1.5
That solved the issue for me.
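Since Ollama names the blob by its SHA-256, it can also help to confirm the manually downloaded `.gguf` isn't corrupted by hashing it and comparing against the checksum Hugging Face shows on the file page. A small Python sketch (the function name and path are mine; GGUF files are large, so the file is hashed in chunks):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream the file in 1 MiB chunks and return its hex SHA-256 digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare the result against the checksum listed on the Hugging Face file page:
# print(sha256_of(r"D:\AI\ollama\Qwen3.5-9B-Uncensored-HauhauCS-Aggressive-BF16.gguf"))
```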
Full version
I had this problem on Windows 11 when trying to run the model directly from Hugging Face with:
ollama run hf.co/...
Ollama downloaded the files successfully, but then failed with:
Error: 500 Internal Server Error: unable to load model
What worked for me was not using hf.co/... directly.
Instead, I downloaded the actual .gguf model file manually from the Hugging Face page, saved it locally, and created a local Ollama model from that file.
Steps
- Download the main `.gguf` file manually from Hugging Face using your browser. Example path: D:\AI\ollama\Qwen3.5-9B-Uncensored-HauhauCS-Aggressive-BF16.gguf
- Create a `Modelfile` like this:
FROM D:\AI\ollama\Qwen3.5-9B-Uncensored-HauhauCS-Aggressive-BF16.gguf
PARAMETER temperature 1
PARAMETER top_k 20
PARAMETER top_p 0.95
PARAMETER presence_penalty 1.5
- Run:
cd D:\AI\ollama
ollama create hauhau-qwen35 -f .\Modelfile
ollama list
ollama show hauhau-qwen35
ollama run hauhau-qwen35
Result
After that, the model loaded correctly and also worked fine in the Ollama UI.
Note
For me, the direct hf.co/... route failed with the 500 error, but importing the local .gguf through Modelfile worked immediately.
If someone else hits the same issue on Windows, this workaround may help.