I'm getting this error when running the Q4_K_M with llama.cpp: missing tensor 'blk.64.ssm_conv1d.weight'
Are you using the standard llama.cpp? Give it a try with the turbo-tan/llama.cpp-tq3 fork. I think it's because of the MTP layer, which I don't think is supported by upstream llama.cpp for Qwen3.5 models.
I tried that too; it showed the exact same error.
Can also confirm it's missing the 'blk.64.ssm_conv1d.weight' tensor. Tried on both the TQ3 fork and vanilla llama.cpp and got the same error.
Hey, sorry everyone. I'm experimenting with MTP, and I uploaded the models with the MTP layer included. The Q4_K model should now be without MTP and should work on default llama.cpp (not tested). You could try redownloading the file and testing. If you want to experiment with MTP, I also renamed the old model for you to play with.
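If you want to verify which file you have before loading it, a quick sketch like this can list the tensors inside the GGUF and flag the one from the error message. It assumes the `gguf` package from llama.cpp's gguf-py is installed (`pip install gguf`); the helper function itself is just a name check, not anything llama.cpp-specific.

```python
import sys

def missing_tensors(present, required):
    """Return the names from `required` that are not in `present`."""
    present = set(present)
    return [name for name in required if name not in present]

if __name__ == "__main__" and len(sys.argv) > 1:
    # GGUFReader ships with llama.cpp's gguf-py package
    from gguf import GGUFReader

    reader = GGUFReader(sys.argv[1])
    names = [t.name for t in reader.tensors]
    # The tensor llama.cpp complained about in this thread:
    missing = missing_tensors(names, ["blk.64.ssm_conv1d.weight"])
    if missing:
        print("old MTP-less load will fail, file is missing:", missing)
    else:
        print("tensor present; this looks like the MTP variant")
```

Run it as `python check_gguf.py model-Q4_K_M.gguf` to see whether you downloaded the re-uploaded file or the old one (script name and interpretation of the result are assumptions, not from the thread).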
Runs great! Thanks for the re-upload.
Please share the results; it's my first experiment with abliteration.
Still experimenting with what I can and can't do, so not much to report back on yet.
I am, however, only getting about 13-14 tps vs. about 25 on the 4-bit quant on 1x 3090. Not sure if it's relevant or useful, but thought I'd add it.
Are you using the TQ version with the turbo-tan llama.cpp fork and comparing it to standard llama.cpp with Q4? I'm running a local fork with a lot of modifications, and on a DGX I'm now at 14.5 tok/sec, almost flat (vs. 9-11 tok/sec with the default Q4_K).