block size [32]

#1
by bobchenyx - opened
Moxin Organization org

https://github.com/ggml-org/llama.cpp/blob/master/gguf-py/gguf/constants.py#L3371

Due to the model's architecture and tensor shapes, only quantization types with a block size of 32 appear to be compatible at this point.
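To illustrate the constraint (a sketch, not from this thread): in llama.cpp, each tensor row is quantized in fixed-size blocks, so a quant type is only usable when its block size divides the row length. Legacy quants such as Q4_0/Q5_0/Q8_0 use blocks of 32 weights, while k-quants such as Q4_K/Q6_K use super-blocks of 256, so shapes that are multiples of 32 but not of 256 rule the k-quants out. The block-size table below is an illustrative subset, and the row sizes are hypothetical examples, not this model's actual dimensions.

```python
# Illustrative subset of llama.cpp quant block sizes (legacy vs. k-quants).
BLOCK_SIZE = {
    "Q4_0": 32, "Q5_0": 32, "Q8_0": 32,      # legacy quants: blocks of 32
    "Q4_K": 256, "Q5_K": 256, "Q6_K": 256,   # k-quants: super-blocks of 256
}

def compatible_quants(row_size: int) -> list[str]:
    """Return quant types whose block size evenly divides the row length."""
    return [q for q, bs in BLOCK_SIZE.items() if row_size % bs == 0]

# Hypothetical row size: multiple of 32 but not of 256,
# so only the block-size-32 types qualify.
print(compatible_quants(1600))  # ['Q4_0', 'Q5_0', 'Q8_0']

# A row size divisible by 256 would admit every type listed above.
print(compatible_quants(4096))
```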
