Q4_K_M / Q6_K quantizations verified on Ollama 0.20.2
#8 opened by hero775
Hi,
For anyone looking for E4B GGUFs that work on Ollama 0.20+, we have Q4_K_M and Q6_K quantizations built from official Google weights:
https://huggingface.co/batiai/gemma-4-E4B-it-GGUF
| Quant | Size | Mac mini M4, 16GB (t/s) | M4 Max, 128GB (t/s) |
|---|---|---|---|
| Q4_K_M | 5.0 GB | 57.1 | 84.0 |
| Q6_K | 5.8 GB | 45.0 | 77.4 |
Both quantizations fit on a 16GB Mac. Tool calling and Korean output have been verified.

```
ollama pull batiai/gemma4-e4b:q4
```
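As a rough sanity check on the "fits on 16GB" claim, the arithmetic can be sketched as below. The KV-cache and OS-headroom figures are illustrative assumptions, not measured values; actual usage depends on context length and what else is running.

```python
# Rough check that a quantized GGUF fits in unified memory on a 16 GB Mac.
# kv_cache_gb and os_headroom_gb are illustrative assumptions, not measurements.

def fits_in_ram(model_gb: float, ram_gb: float = 16.0,
                kv_cache_gb: float = 1.0, os_headroom_gb: float = 4.0) -> bool:
    """Return True if model weights plus KV cache fit under available RAM."""
    return model_gb + kv_cache_gb <= ram_gb - os_headroom_gb

print(fits_in_ram(5.0))  # Q4_K_M -> True
print(fits_in_ram(5.8))  # Q6_K   -> True
```

Even with generous headroom, both quants leave several gigabytes free on a 16GB machine.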
Quantized with llama.cpp build 400ac8e (latest at the time of writing, including the Gemma 4 fixes), directly from the official weights rather than re-quantized from existing GGUFs.