Does this support DSA/lightning attention?
#1
by segmond - opened
Thanks for the release. Does this support DSA/lightning attention? When you get the bandwidth, can you also do the same for DeepSeek-Math-v2, since it's based on v3.2?
Oh hey @segmond! For now, mainline llama.cpp works for this, so DSA is disabled until support is added.
I tried it and it works well!
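For anyone who wants to try it before DSA support lands, here's a minimal sketch of loading a quant through llama-cpp-python, which wraps mainline llama.cpp. The filename and settings below are placeholders, not the actual upload names:

```python
# Minimal sketch: running the GGUF through llama-cpp-python, which wraps
# mainline llama.cpp. The model filename below is a placeholder, not the
# real upload name.
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-V3.2-UD-Q2_K_XL.gguf",  # hypothetical filename
    n_ctx=8192,       # context window; raise it if you have the memory
    n_gpu_layers=-1,  # offload every layer to GPU where possible
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}]
)
print(out["choices"][0]["message"]["content"])
```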
However, wait a bit, since these are still experimental: I had to cook up the chat template from scratch because there wasn't a Jinja file.
I would wait until, say, the UD-Q2_K_XL quant gets uploaded; then it'll be a go-ahead.
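On the chat-template point, here's a rough sketch of what hand-rolling one looks like with jinja2. The role tokens follow the usual DeepSeek style but are illustrative only; the template actually shipped with the quants may differ:

```python
# Illustrative only: rendering a hand-written DeepSeek-style chat template
# with jinja2. The exact tokens in the real template may differ.
from jinja2 import Template

CHAT_TEMPLATE = Template(
    "{% for m in messages %}"
    "{% if m['role'] == 'user' %}<｜User｜>{{ m['content'] }}"
    "{% elif m['role'] == 'assistant' %}"
    "<｜Assistant｜>{{ m['content'] }}<｜end▁of▁sentence｜>"
    "{% endif %}"
    "{% endfor %}"
    "<｜Assistant｜>"  # trailing tag cues the model to reply
)

messages = [{"role": "user", "content": "What is 2 + 2?"}]
print(CHAT_TEMPLATE.render(messages=messages))
```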
Thanks! I've been running it for a while now using a hack I saw on LocalLLaMA. It's an amazing model. Thanks again; folks are going to love this.
segmond changed discussion status to closed