Q4_K_M / Q6_K quantizations verified on Ollama 0.20.2
#8 opened by hero775
Hi,
For anyone looking for E4B GGUFs that work on Ollama 0.20+, we have Q4_K_M and Q6_K quantizations built from official Google weights:
https://huggingface.co/batiai/gemma-4-E4B-it-GGUF
| Quant | Size | Mac mini M4, 16GB (t/s) | M4 Max, 128GB (t/s) |
|---|---|---|---|
| Q4_K_M | 5.0 GB | 57.1 | 84.0 |
| Q6_K | 5.8 GB | 45.0 | 77.4 |
Both quantizations fit on a 16GB Mac. Tool calling and Korean output have been verified.

```
ollama pull batiai/gemma4-e4b:q4
```
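As a rough sanity check on the "fits on 16GB" claim, the arithmetic can be sketched as below. The KV-cache and OS-headroom figures are illustrative assumptions, not measured values; actual usage depends on context length and what else is running.

```python
# Rough check that a quantized GGUF fits in unified memory on a 16 GB Mac.
# kv_cache_gb and os_headroom_gb are illustrative assumptions, not measurements.

def fits_in_ram(model_gb: float, ram_gb: float = 16.0,
                kv_cache_gb: float = 1.0, os_headroom_gb: float = 4.0) -> bool:
    """Return True if model weights plus KV cache fit under available RAM."""
    return model_gb + kv_cache_gb <= ram_gb - os_headroom_gb

print(fits_in_ram(5.0))  # Q4_K_M -> True
print(fits_in_ram(5.8))  # Q6_K   -> True
```

Even with generous headroom, both quants leave several gigabytes free on a 16GB machine.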
Quantized with llama.cpp build 400ac8e (latest at the time of writing, including the Gemma 4 fixes), directly from the official weights rather than re-quantized from existing GGUFs.