Q8_0 and Q5_0_custom static quants for this merge. Also included: an overall 4.8bpw quant for IK_Llama.cpp and Croco.cpp, targeting users with 48GB of VRAM.
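As a rough guide to what a bits-per-weight (bpw) figure means for VRAM, the sketch below estimates weight size as parameters × bpw / 8 bytes. The parameter count `N_PARAMS` is a hypothetical placeholder, not a figure from this card, and the helper name `quant_size_gib` is made up for illustration. The 8.5 bpw value for Q8_0 follows from its block layout (32 int8 weights plus one fp16 scale per block). The estimate covers weights only; KV cache and compute buffers need additional memory.

```python
def quant_size_gib(n_params: float, bpw: float) -> float:
    """Approximate size in GiB of n_params weights stored at bpw bits per weight."""
    return n_params * bpw / 8 / 2**30


# Hypothetical parameter count, for illustration only (not stated in this card).
N_PARAMS = 70e9

# Q8_0 stores 32 int8 weights + one fp16 scale per block: 34 bytes / 32 weights = 8.5 bpw.
for name, bpw in [("Q8_0", 8.5), ("4.8bpw", 4.8)]:
    print(f"{name}: ~{quant_size_gib(N_PARAMS, bpw):.1f} GiB of weights")
```

Substituting the real parameter count of the merge gives a quick check of whether a given quant leaves enough headroom for context on a 48GB setup.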