Can it be updated for the new, faster, llama.cpp implementation

#3
by juanml82 - opened

A recent llama.cpp update makes Qwen 3 Next faster (https://github.com/ggml-org/llama.cpp/pull/18683), but it requires updated GGUFs. Unsloth has already released updated GGUFs for the regular model. Would you be able to release updated GGUFs of this abliterated model as well, so it benefits from the speedup?

Due to the large number of models we host on hf.co, our available storage has become limited. You can either quantize and upload these models to hf.co yourself, or test them locally.
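For anyone wanting to do this themselves, a rough sketch of the usual llama.cpp workflow is below. All paths and repo names are placeholders, and this assumes a local llama.cpp checkout that already includes the PR linked above (build it with CMake so `llama-quantize` is available):

```shell
# Download the abliterated model weights to a local directory (placeholder repo name)
huggingface-cli download your-username/the-abliterated-model --local-dir ./model-dir

# Convert the HF safetensors checkpoint to a GGUF file using llama.cpp's converter
python convert_hf_to_gguf.py ./model-dir --outfile model-f16.gguf --outtype f16

# Quantize the f16 GGUF down to a smaller format, e.g. Q4_K_M
./build/bin/llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M

# Upload the result to your own hf.co repo
huggingface-cli upload your-username/your-gguf-repo model-Q4_K_M.gguf
```

Since the linked PR changes the GGUF layout for Qwen 3 Next, the conversion step has to be run with a llama.cpp version that contains that change; GGUFs produced by older converters won't pick up the speedup.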
