# Google Gemma 3 27B GGUF Quantized Models

This repository contains GGUF quantized versions of Google's Gemma 3 27B pretrained model, optimized for efficient deployment across various hardware configurations.

## Quantization Results
| Model | Size (GB) | Compression Ratio | Size Reduction |
|-------|-----------|-------------------|----------------|
| Q8_0  | 26.7      | 53%               | 47%            |
| Q6_K  | 20.6      | 41%               | 59%            |
| Q5_K  | 17.9      | 36%               | 64%            |
| Q4_K  | 15.4      | 31%               | 69%            |
| Q3_K  | 12.5      | 25%               | 75%            |
| Q2_K  | 9.8       | 19%               | 81%            |
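The compression figures follow directly from the quantized file sizes and the full-precision baseline they imply (roughly 50.4 GB for F16, inferred from the table rather than stated in this README). A minimal sketch of the arithmetic:

```python
# Quantized file sizes in GB, taken from the table above.
sizes_gb = {
    "Q8_0": 26.7, "Q6_K": 20.6, "Q5_K": 17.9,
    "Q4_K": 15.4, "Q3_K": 12.5, "Q2_K": 9.8,
}

# Assumed F16 baseline (~50.4 GB), inferred from the table's ratios,
# not stated in this README.
F16_GB = 50.4

def compression_stats(size_gb: float, baseline_gb: float = F16_GB) -> tuple[int, int]:
    """Return (compression ratio %, size reduction %) versus the F16 baseline."""
    ratio = round(100 * size_gb / baseline_gb)
    return ratio, 100 - ratio

for name, size in sizes_gb.items():
    ratio, reduction = compression_stats(size)
    print(f"{name}: {ratio}% of F16, {reduction}% smaller")
```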
## Quality vs. Size Trade-offs

- **Q8_0**: Near-lossless quality; minimal degradation compared to F16
- **Q6_K**: Very good quality; slight degradation in rare cases
- **Q5_K**: Good quality; a strong balance between size and performance
- **Q4_K**: Decent quality; noticeable degradation, but still usable for most tasks
- **Q3_K**: Reduced quality; more significant degradation
- **Q2_K**: Heavily reduced quality; substantial degradation, but the smallest file size
## Recommendations

- **Maximum quality**: use F16 or Q8_0
- **Balanced quality and size** (most use cases): use Q5_K or Q6_K
- **Minimum size**: use Q3_K, or Q2_K under extreme size constraints; Q2_K is the smallest file but comes with significant quality degradation
## Usage with llama.cpp

These models can be used with llama.cpp and its various interfaces. Example:

```bash
./llama-gemma3-cli \
    --model Google.Gemma-3-27b-pt.q5_k.gguf \
    --ctx-size 4096 \
    --temp 0.7 \
    --prompt "Write a short story about a robot who discovers it has feelings."
```
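When launching the CLI from a script, the same invocation can be assembled as an argument list, which avoids shell-quoting issues in the prompt. A minimal sketch; the binary name and flags are taken from the example above, and the helper function is illustrative:

```python
def build_cmd(model_path: str, prompt: str,
              ctx_size: int = 4096, temp: float = 0.7) -> list[str]:
    """Assemble the llama.cpp CLI invocation shown above as an argv list."""
    return [
        "./llama-gemma3-cli",
        "--model", model_path,
        "--ctx-size", str(ctx_size),
        "--temp", str(temp),
        "--prompt", prompt,
    ]

cmd = build_cmd("Google.Gemma-3-27b-pt.q5_k.gguf",
                "Write a short story about a robot who discovers it has feelings.")
# import subprocess; subprocess.run(cmd, check=True)  # uncomment to launch
```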
## License

These models are released under the same Gemma license as the original model.
## Original Model Information

These quantized models are derived from Google's Gemma 3 27B pretrained model.
### Model Specifications
- Architecture: Gemma 3
- Size Label: 27B
- Type: Multimodal (text and image input, text output)
- Context Length: 128K tokens
- Training Data: 14 trillion tokens across web documents, code, mathematics, and images
- Languages: Support for over 140 languages
### Capabilities
- Text generation and question answering
- Image analysis and understanding
- Reasoning and factuality
- Multilingual support
- STEM and code tasks
### Benchmark Results

The pre-trained Gemma 3 27B model achieves the following benchmark results:
#### Reasoning and Factuality
- HellaSwag: 85.6% (10-shot)
- BoolQ: 82.4% (0-shot)
- TriviaQA: 85.5% (5-shot)
- Natural Questions: 36.1% (5-shot)
- BIG-Bench Hard: 77.7% (few-shot)
#### STEM and Code
- MMLU: 78.6% (5-shot)
- MATH: 50.0% (4-shot)
- GSM8K: 82.6% (8-shot)
- HumanEval: 48.8% (0-shot)
#### Multilingual
- Global-MMLU-Lite: 75.7%
- XQuAD (all): 76.8%
#### Multimodal
- DocVQA: 85.6% (val)
- TextVQA: 68.6% (val)
- MMMU: 56.1% (pre-trained)
## Citation & Attribution

```bibtex
@article{gemma_2025,
    title={Gemma 3},
    url={https://goo.gle/Gemma3Report},
    publisher={Kaggle},
    author={Gemma Team},
    year={2025}
}

@misc{gemma3_quantization_2025,
    title={Quantized Versions of Google's Gemma 3 27B Model},
    author={Lex-au},
    year={2025},
    month={March},
    note={Quantized models (Q8_0, Q6_K, Q5_K, Q4_K, Q3_K, Q2_K) derived from Google's Gemma 3 27B},
    url={https://huggingface.co/lex-au}
}
```