LM Studio RotorQuant KV cache setting missing
Hey,
Sorry for probably being stupid, but I just can't find the RotorQuant KV cache setting you are referencing. Where is it supposed to be? I'm on Windows with LM Studio 0.4.11 (Build 1) and the CUDA 12 llama.cpp (Windows) v2.13.0 runtime, which uses llama.cpp release b8733 (commit d6f3030).
I can't see anything about a RotorQuant KV cache in the logs while loading the model, and the standard LM Studio settings don't seem to have that option either.
If you specifically want RotorQuant, you'd need to build the forked llama.cpp yourself from source and run it via the CLI, bypassing LM Studio entirely. We will correct the instructions. The GGUF file itself is a standard GGUF, so the weight quantization (Q5_K_M) works regardless of which KV cache strategy you use.
Hi! Thank you for sharing the Qwen3.5-27B-RotorQuant-GGUF quantization; it's a really interesting approach.
I'd love to learn how to create similar quantized GGUF versions myself. Could you point me in the right direction?
