Error when pulling with Ollama
Hello,
I always get this error when trying to pull this model with Ollama.
Am I doing something wrong?
ollama run hf.co/HauhauCS/Qwen3.5-9B-Uncensored-HauhauCS-Aggressive:Q8_0
pulling manifest
pulling 99e7f2201c00: 100% ████████████████████████████████████████████████████████████ 9.5 GB
pulling 05f662501f8b: 100% ████████████████████████████████████████████████████████████ 921 MB
pulling 31ccbc936038: 100% ████████████████████████████████████████████████████████████ 481 B
verifying sha256 digest
writing manifest
success
Error: 500 Internal Server Error: unable to load model: G:\LLM\Ollama_Models\blobs\sha256-99e7f2201c0046b05d2825e4d8be6a2efad2b87b071cd55d37bdd9fbe201a58b
GPU NVIDIA RTX 4060 Ti
16GB
ollama version is 0.19.0
Try LM Studio
Yes, it works in LM Studio, I know, but if your whole environment is built around Ollama, I'd rather have it working inside Ollama...
I just want to know if there is a reason it's not working in Ollama or if it's something I do wrong.
It's the first model I've had problems with. And if it doesn't work with Ollama, maybe the author can mention it in the description to let others know.
I agree with you and I don't know exactly, but my best guess is that it's an incompatibility between this specific Hugging Face GGUF build and Ollama's current model loader. The model likely uses a newer or slightly modified Qwen3.5 architecture (e.g. different metadata fields, KV head settings, or tokenizer config) that Ollama doesn't fully support yet, while LM Studio (via llama.cpp) is more tolerant of these variations. So the model downloads fine, but Ollama fails during initialization when parsing the architecture or loading tensors into memory.
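One way to narrow this down is to check that the downloaded blob is at least a well-formed GGUF file before blaming the loader. A minimal Python sketch that reads the fixed GGUF header (layout per the GGUF spec for v2/v3; the function name is mine, and the path would be your local blob or `.gguf` file):

```python
import struct

def read_gguf_header(path):
    """Read the fixed GGUF header fields: magic, version, tensor count, metadata KV count."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file (magic={magic!r})")
        # Little-endian: uint32 version, then two uint64 counts (GGUF v2/v3 layout)
        version, = struct.unpack("<I", f.read(4))
        tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
    return {"version": version, "tensor_count": tensor_count, "kv_count": kv_count}
```

If the magic or version looks wrong, the file itself is damaged; if the header parses cleanly, the failure is more likely a metadata field or architecture string that Ollama's loader doesn't recognize yet.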
Short version
If ollama run hf.co/... fails on Windows 11 with:
Error: 500 Internal Server Error: unable to load model
this worked for me:
- Download the main `.gguf` model file manually from Hugging Face in the browser.
- Create a `Modelfile` that points directly to that local `.gguf`.
- Import it with `ollama create`.
- Run the local model instead of `hf.co/...`.
Commands:
cd D:\AI\ollama
notepad Modelfile
ollama create hauhau-qwen35 -f .\Modelfile
ollama list
ollama show hauhau-qwen35
ollama run hauhau-qwen35
Example Modelfile:
FROM D:\AI\ollama\Qwen3.5-9B-Uncensored-HauhauCS-Aggressive-BF16.gguf
PARAMETER temperature 1
PARAMETER top_k 20
PARAMETER top_p 0.95
PARAMETER presence_penalty 1.5
That solved the issue for me.
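Since Ollama names the blob by its SHA-256, it can also help to confirm the manually downloaded `.gguf` isn't corrupted by hashing it and comparing against the checksum Hugging Face shows on the file page. A small Python sketch (the function name and path are mine; GGUF files are large, so the file is hashed in chunks):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream the file in 1 MiB chunks and return its hex SHA-256 digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare the result against the checksum listed on the Hugging Face file page:
# print(sha256_of(r"D:\AI\ollama\Qwen3.5-9B-Uncensored-HauhauCS-Aggressive-BF16.gguf"))
```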
Full version
I had this problem on Windows 11 when trying to run the model directly from Hugging Face with:
ollama run hf.co/...
Ollama downloaded the files successfully, but then failed with:
Error: 500 Internal Server Error: unable to load model
What worked for me was not using hf.co/... directly.
Instead, I downloaded the actual .gguf model file manually from the Hugging Face page, saved it locally, and created a local Ollama model from that file.
Steps
- Download the main `.gguf` file manually from Hugging Face using your browser. Example path: D:\AI\ollama\Qwen3.5-9B-Uncensored-HauhauCS-Aggressive-BF16.gguf
- Create a `Modelfile` like this:
FROM D:\AI\ollama\Qwen3.5-9B-Uncensored-HauhauCS-Aggressive-BF16.gguf
PARAMETER temperature 1
PARAMETER top_k 20
PARAMETER top_p 0.95
PARAMETER presence_penalty 1.5
- Run:
cd D:\AI\ollama
ollama create hauhau-qwen35 -f .\Modelfile
ollama list
ollama show hauhau-qwen35
ollama run hauhau-qwen35
Result
After that, the model loaded correctly and also worked fine in the Ollama UI.
Note
For me, the direct hf.co/... route failed with the 500 error, but importing the local .gguf through Modelfile worked immediately.
If someone else hits the same issue on Windows, this workaround may help.