Q4_K_M / Q6_K quantizations verified on Ollama 0.20.2

#8
by hero775 - opened

Hi,

For anyone looking for E4B GGUFs that work on Ollama 0.20+, we've published Q4_K_M and Q6_K quantizations built from the official Google weights:

https://huggingface.co/batiai/gemma-4-E4B-it-GGUF

| Quant | Size | Mac mini M4 (16GB) | M4 Max (128GB) |
|---|---|---|---|
| Q4_K_M | 5.0GB | 57.1 t/s | 84.0 t/s |
| Q6_K | 5.8GB | 45.0 t/s | 77.4 t/s |

Both quants fit on a 16GB Mac. Tool calling and Korean output have been verified.

ollama pull batiai/gemma4-e4b:q4
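
If you want to reproduce the tool-calling check yourself, here's a minimal sketch of a request body for Ollama's `/api/chat` endpoint with a tool attached. The `get_weather` tool schema is just an illustrative example (not part of the model card), and the endpoint/port are Ollama's defaults:

```python
import json

# Model tag from the pull command above.
MODEL = "batiai/gemma4-e4b:q4"

def build_chat_request(prompt: str) -> dict:
    """Build an /api/chat request body with one example tool attached.

    The get_weather tool below is a made-up example for testing whether
    the model emits a tool_calls entry; swap in your own schema.
    """
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

body = json.dumps(build_chat_request("What's the weather in Seoul?"))
# POST `body` to http://localhost:11434/api/chat (e.g. with curl or urllib)
# and check the response message for a `tool_calls` entry.
```

If the model supports tool calling, the response's `message` object should contain a `tool_calls` list naming `get_weather` with a `city` argument rather than a plain-text answer.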

Quantized directly from the official weights with llama.cpp build 400ac8e, which includes the Gemma 4 fixes; these are not re-quantized from existing GGUFs.
