block size [32]
#1
by bobchenyx - opened
https://github.com/ggml-org/llama.cpp/blob/master/gguf-py/gguf/constants.py#L3371
Due to the model's architecture and tensor shapes, only quantization types with a block size of 32 appear to be compatible at this point.
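A minimal sketch of why this constraint arises, assuming block sizes like those in llama.cpp (Q4_0/Q8_0 use 32-element blocks, while K-quants use 256-element superblocks): a quant type only applies when its block size evenly divides a tensor's row length. The type subset and the example dimension below are illustrative, not taken from this model.

```python
# Illustrative subset of GGUF quant types and their block sizes.
BLOCK_SIZES = {"Q4_0": 32, "Q8_0": 32, "Q4_K": 256, "Q6_K": 256}

def compatible_types(last_dim: int) -> list[str]:
    """Return quant types whose block size divides the tensor's row length."""
    return [t for t, bs in BLOCK_SIZES.items() if last_dim % bs == 0]

# A hypothetical row length divisible by 32 but not by 256:
# only the block-size-32 types remain.
print(compatible_types(2880))
```

With a row length like 2880 (90 blocks of 32, but not a whole number of 256-element superblocks), only the block-size-32 types survive the check, which matches the observation above.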