Why are there two ggufs for each precision level?
The first is an empty GGUF containing only metadata, and the other is the one that actually contains the tensors. Is there a specific reason for this? Can I combine them into one?
@Jianqiao1 Yes, you can combine them into one if you want. I split them like that in case I need to push a metadata update, e.g. following an upstream chat template fix from the model provider. That way it's easier for me to update just that first split, and people who want the fix only need to re-download that first split instead of downloading the entire model again.
Locally you are free to combine them; if I did have an update to distribute, you might have to split the file again and then re-combine it.
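For the curious, the reason a metadata-only first split is possible is visible right in the GGUF header: the fixed fields are the magic, the format version, the tensor count, and the metadata key/value count, so the first split can simply carry zero tensors and all the metadata. A minimal sketch that builds and parses a fake header (illustrative only, not reading a real file; the field layout follows the GGUF spec):

```python
import struct

def parse_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header: magic, version, tensor count, KV count."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version, "tensor_count": n_tensors, "kv_count": n_kv}

# Fake metadata-only first split: GGUF v3, 0 tensors, 42 metadata keys.
header = struct.pack("<4sIQQ", b"GGUF", 3, 0, 42)
info = parse_gguf_header(header)
print(info)  # {'version': 3, 'tensor_count': 0, 'kv_count': 42}
```

This is also why naively concatenating splits is unsafe: each split file starts with its own header like this one.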
OK, got it. Thanks so much!
Thanks for your effort. Could you also please reupload merged GGUF files? Merging multiple GGUF files myself and then running the result seems to cause problems in LM Studio:
llama_model_load: error loading model: missing tensor 'blk.0.ffn_gate_exps.weight'
llama_model_load_from_file_impl: failed to load model
2026-03-20 23:37:01 [DEBUG]
common_init_from_params: failed to load model 'C:\Users\user1.cache\lm-studio\models\AesSedai\Qwen3.5-35B-A3B-GGUF\Qwen3.5-35B-A3B-Q4_K_M-Merged.gguf'
srv load_model: failed to load model, 'C:\Users\user1.cache\lm-studio\models\AesSedai\Qwen3.5-35B-A3B-GGUF\Qwen3.5-35B-A3B-Q4_K_M-Merged.gguf': error loading model: missing tensor 'blk.0.ffn_gate_exps.weight'
2026-03-20 23:37:01 [DEBUG]
[LLMProcess] Failed to load model _0x54111c [Error]: Failed to load model.
at _0x5573b9.loadModel (C:\Program Files\LM Studio\resources\app.webpack\lib\llmworker.js:1:611743)
at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
at async _0x5573b9.handleMessage (C:\Program Files\LM Studio\resources\app.webpack\lib\llmworker.js:1:603939) {
cause: 'Failed to load model',
suggestion: undefined,
errorData: undefined,
data: undefined,
displayData: undefined,
title: 'Failed to load model.'
}
@abdussamed41 I updated the quants with the fused up + gate; that error looks like you may have been mixing an old unfused shard with one of the new fused shards. Try downloading both shards from Q4_K_M again and re-merging them.
I say this because the new quants with fused up + gate don't have an ffn_gate_exps tensor like the one your error mentions; it's now supposed to be ffn_up_gate_exps.
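A quick way to see whether shards got mixed is to look at the tensor names and check which FFN expert layout they use. A toy check of that logic (the name suffixes match the error log above; the function itself is just an illustration, not part of llama.cpp):

```python
def ffn_layout(tensor_names):
    """Classify an MoE GGUF's expert FFN layout from its tensor names."""
    fused = any(n.endswith(".ffn_up_gate_exps.weight") for n in tensor_names)
    separate = any(n.endswith(".ffn_gate_exps.weight") for n in tensor_names)
    if fused and separate:
        return "mixed (shards from different uploads?)"
    if fused:
        return "fused"
    if separate:
        return "separate"
    return "unknown"

old = ["blk.0.ffn_gate_exps.weight", "blk.0.ffn_up_exps.weight"]
new = ["blk.0.ffn_up_gate_exps.weight"]
print(ffn_layout(old))        # separate
print(ffn_layout(new))        # fused
print(ffn_layout(old + new))  # mixed (shards from different uploads?)
```

If the merged file reports "mixed", one of the shards predates the fused update and should be re-downloaded.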
I deleted the GGUFs I had downloaded through LM Studio's library, downloaded the models manually from Hugging Face, and put them in a folder under models. Now LM Studio is able to run the model without merging as well.
Thanks
Just FYI, under Linux merging files is easy with the 'cat' command:
cat myfile_00001.gguf myfile_00002.gguf > myfile.gguf
@BingoBird you don't want to cat the GGUFs together like that. These were split using llama-gguf-split, so you would want to use that tool to merge them back together into one file, e.g.:
$ ./build/bin/llama-gguf-split --merge /path/to/myfile_00001.gguf /path/to/merged.gguf