How do I load e.g. gpt-oss-20b-UD-Q6_K_XL in llama.cpp?
Every time I try to run the model with "llama-server" or "llama-cli" I get:
gguf_init_from_file_impl: tensor 'blk.0.ffn_down_exps.weight' has invalid ggml type 39 (NONE)
gguf_init_from_file_impl: failed to read tensor info
llama_model_load: error loading model: llama_model_loader: failed to load model from ....\LLM\gpt-oss-20b-UD-Q6_K_XL.gguf
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '....\LLM\gpt-oss-20b-UD-Q6_K_XL.gguf'
main: error: unable to load model
and
gguf_init_from_file_impl: tensor 'blk.0.ffn_down_exps.weight' has invalid ggml type 39 (NONE)
gguf_init_from_file_impl: failed to read tensor info
llama_model_load: error loading model: llama_model_loader: failed to load model from gpt-oss-20b-UD-Q6_K_XL.gguf
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model 'gpt-oss-20b-UD-Q6_K_XL.gguf'
srv load_model: failed to load model, 'gpt-oss-20b-UD-Q6_K_XL.gguf'
srv operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
Does anyone know how to fix this? Is this an issue in llama.cpp (https://github.com/ggml-org/llama.cpp) or with the UD dynamic quant (https://www.unsloth.ai/blog/dynamic-v2)?
Thanks in advance.
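For what it's worth, that error usually means the loader doesn't recognize the tensor's quantization type rather than that the file is corrupt, but you can at least confirm the download itself is a valid GGUF file by reading its fixed header. Here's a minimal sketch based on the published GGUF file format (magic, uint32 version, uint64 tensor count, uint64 KV count, all little-endian); the path at the bottom is just an example:

```python
import struct

def read_gguf_header(path):
    """Read the fixed-size GGUF header: magic, version, tensor count, KV count."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file (magic = {magic!r})")
        # version: uint32; tensor_count and metadata_kv_count: uint64 (little-endian)
        version, tensor_count, kv_count = struct.unpack("<IQQ", f.read(20))
    return {"version": version, "tensors": tensor_count, "kv_pairs": kv_count}

# Example (adjust the path to wherever your download lives):
# print(read_gguf_header("gpt-oss-20b-UD-Q6_K_XL.gguf"))
```

If this raises, the file is truncated or mis-downloaded; if it prints sane numbers, the file is fine and the problem is an outdated llama.cpp build that doesn't know the newer ggml type yet.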
llama-server -hf unsloth/gpt-oss-20b-GGUF:gpt-oss-20b-Q6_K.gguf (plus any other flags you need) is probably the easiest way.
Nvm, I didn't read your comment properly :D Have you rebuilt llama.cpp recently? That's the first thing I'd try here, given "has invalid ggml type 39 (NONE)".
Not yet. Thanks for answering though. I'm waiting for a llama.cpp update that supports the UD quant version. llama.cpp is updated automatically on my Windows host (installed via winget), so for now I'd probably have to build the repo manually. I'll just try again later, and if it works I'll close this thread.
I just ran the model you tried (gpt-oss-20b-UD-Q6_K_XL.gguf) and had no issues (apart from the result being pretty bad in terms of code analysis, lol).
If you are getting the llama.cpp server from winget, you may have to wait until that package is updated. I manually build from the master branch about once a week and, as I said, didn't have any issues.
Here are my build options if you want to try that (I'm on Linux, but you could just as easily do this in a WSL terminal; it would probably take even longer, but...):
mkdir build
cd build
cmake .. \
  -DLLAMA_CURL=ON \
  -DBUILD_SHARED_LIBS=ON \
  -DGGML_CUDA=ON \
  -DGGML_CUDA_FA=ON \
  -DGGML_CUDA_FA_ALL_QUANTS=ON \
  -DGGML_CUDA_F16=ON \
  -DGGML_CUDA_GRAPHS=ON \
  -DLLAMA_ENABLE_MTMD=1
cmake --build . --config Release
A couple of those flags aren't strictly needed, but whatever. The longest part is always the CUDA compilation.
Thank you. Indeed, this has been solved by issuing "winget upgrade --all" in an elevated PowerShell window; an outdated llama.cpp version (along with several other apps) was the cause. Closing this since the model is running now.