I'm getting this error when running Q4_K_M

#1
by daniel7789 - opened

I'm getting this error using llama.cpp: missing tensor 'blk.64.ssm_conv1d.weight'

Are you using the standard llama.cpp? Give it a try with the turbo-tan/llama.cpp-tq3 fork. I think it's because of the MTP layer, which I don't think is supported by upstream llama.cpp for qwen3.5 models.

I tried that too; it showed the exact same error.

Can also confirm it's missing the 'blk.64.ssm_conv1d.weight' tensor. Tried on both the TQ3 fork and vanilla llama.cpp and got the same error.

Hey, sorry everyone. I'm experimenting with MTP and I uploaded the models with MTP included. The Q4_K model should now be without MTP and should work on default llama.cpp (not tested). You could try redownloading the file and testing. If you want to experiment with MTP, I've also renamed the old model for you to play with.
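If you want to check whether a downloaded GGUF actually contains the reported tensor before loading it, you can inspect the tensor names in the file. A minimal sketch, assuming you list the names with the `gguf` Python package that ships with llama.cpp (the sample tensor list below is illustrative, not taken from the actual model):

```python
# Sketch: find which block indices contain a given tensor, so you can tell
# whether e.g. 'blk.64.ssm_conv1d.weight' is actually present in the file.
# With a real file you would obtain the names via:
#   from gguf import GGUFReader
#   names = [t.name for t in GGUFReader("model-Q4_K_M.gguf").tensors]

def blocks_with(names, suffix):
    """Return sorted block indices whose tensor name ends with `suffix`."""
    return sorted(
        int(n.split(".")[1])
        for n in names
        if n.startswith("blk.") and n.endswith(suffix)
    )

# Illustrative tensor list: blocks 0, 63 and 64 carry ssm_conv1d weights.
names = [
    "token_embd.weight",
    "blk.0.ssm_conv1d.weight",
    "blk.0.attn_q.weight",
    "blk.63.ssm_conv1d.weight",
    "blk.64.ssm_conv1d.weight",  # extra MTP block in this hypothetical file
]
print(blocks_with(names, "ssm_conv1d.weight"))  # -> [0, 63, 64]
```

If the block index from the error message (64 here) is missing from the output, the file genuinely lacks that tensor and redownloading the fixed upload should help.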

Runs great! Thanks for the re-upload.

Please share the results; it's my first experiment with abliteration.

Still experimenting with what I can/can't do, so not much to report back on that.

I am, however, only getting about 13-14 tps vs about 25 on the 4-bit quant on 1x 3090. Not sure if it's relevant or useful, but thought I'd add it.

Are you using the TQ version with the turbo-tan llama.cpp and comparing it to standard llama.cpp with Q4? I'm running a local fork with a lot of modifications, and on DGX I'm now at 14.5 tok/sec almost flat (vs 9-11 tok/sec with default Q4_K).