This repository contains a Q3_K_M GGUF quantization of miromind-ai/MiroThinker-1.7-mini.
File: MiroThinker-1.7-mini.Q3_K_M.gguf
Quantization: Q3_K_M
Toolchain: Unsloth + llama.cpp
Size: 14.71 GB (13.70 GiB)
License: apache-2.0

This quant is aimed at fitting a 16 GB GPU while keeping better quality than lower-bit options. In local testing, it loaded successfully with llama.cpp on an AMD Radeon RX 9060 XT through the Vulkan backend.
llama-cli \
-m MiroThinker-1.7-mini.Q3_K_M.gguf \
--device Vulkan1 \
-ngl 999 \
-c 128
If VRAM is tight on your setup, reduce the context length (-c) first, then the number of GPU-offloaded layers (-ngl).
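The reason context length comes first is that the KV cache grows linearly with it, while the model weights are a fixed cost. A rough back-of-envelope sketch, assuming fp16 KV cache and placeholder architecture numbers (the layer count, KV-head count, and head dimension below are illustrative assumptions, not taken from this model's config):

```python
def kv_cache_bytes(ctx_len, n_layers=48, n_kv_heads=8, head_dim=128, bytes_per_el=2):
    """Approximate K + V cache size in bytes (fp16 elements assumed).

    The leading 2 accounts for storing both the K and the V tensors.
    All architecture parameters here are hypothetical placeholders.
    """
    return 2 * n_layers * ctx_len * n_kv_heads * head_dim * bytes_per_el

model_gib = 13.70   # file size of this Q3_K_M quant, from the table above
budget_gib = 16.0   # target GPU VRAM

for ctx in (128, 4096, 32768):
    total_gib = model_gib + kv_cache_bytes(ctx) / 2**30
    fits = "fits" if total_gib < budget_gib else "over budget"
    print(f"ctx={ctx:>6}: ~{total_gib:.2f} GiB ({fits})")
```

Under these assumptions, a tiny context adds only a few tens of MiB on top of the weights, while a 32k context alone costs several GiB, which is why trimming -c is the cheaper first lever before giving up offloaded layers.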
Base model: miromind-ai/MiroThinker-1.7-mini (3-bit quantization)