# Google Gemma 3 27B GGUF Quantized Models

This repository contains GGUF quantized versions of Google's Gemma 3 27B pretrained model, optimized for efficient deployment across various hardware configurations.

## Quantization Results
| Model | Size (GB) | Compression Ratio | Size Reduction |
|-------|-----------|-------------------|----------------|
| Q8_0  | 26.7      | 53%               | 47%            |
| Q6_K  | 20.6      | 41%               | 59%            |
| Q5_K  | 17.9      | 36%               | 64%            |
| Q4_K  | 15.4      | 31%               | 69%            |
| Q3_K  | 12.5      | 25%               | 75%            |
| Q2_K  | 9.8       | 19%               | 81%            |
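The compression figures follow directly from the quantized file sizes and the full-precision baseline they imply (roughly 50.4 GB for F16, inferred from the table rather than stated in this README). A minimal sketch of the arithmetic:

```python
# Quantized file sizes in GB, taken from the table above.
sizes_gb = {
    "Q8_0": 26.7, "Q6_K": 20.6, "Q5_K": 17.9,
    "Q4_K": 15.4, "Q3_K": 12.5, "Q2_K": 9.8,
}

# Assumed F16 baseline (~50.4 GB), inferred from the table's ratios,
# not stated in this README.
F16_GB = 50.4

def compression_stats(size_gb: float, baseline_gb: float = F16_GB) -> tuple[int, int]:
    """Return (compression ratio %, size reduction %) versus the F16 baseline."""
    ratio = round(100 * size_gb / baseline_gb)
    return ratio, 100 - ratio

for name, size in sizes_gb.items():
    ratio, reduction = compression_stats(size)
    print(f"{name}: {ratio}% of F16, {reduction}% smaller")
```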
## Quality vs. Size Trade-offs

- **Q8_0**: Near-lossless quality; minimal degradation compared to F16
- **Q6_K**: Very good quality; slight degradation in rare cases
- **Q5_K**: Good quality; a strong balance between size and performance
- **Q4_K**: Decent quality; noticeable degradation, but still usable for most tasks
- **Q3_K**: Reduced quality; more significant degradation
- **Q2_K**: Heavily reduced quality; substantial degradation, but the smallest file size
## Recommendations

- **Maximum quality**: use F16 or Q8_0
- **Balanced quality and size** (most use cases): use Q5_K or Q6_K
- **Minimum size**: use Q3_K, or Q2_K under extreme size constraints; Q2_K is the smallest file but comes with significant quality degradation
## Usage with llama.cpp

These models can be used with llama.cpp and its various interfaces. Example:

```bash
./llama-gemma3-cli \
    --model Google.Gemma-3-27b-pt.q5_k.gguf \
    --ctx-size 4096 \
    --temp 0.7 \
    --prompt "Write a short story about a robot who discovers it has feelings."
```
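When launching the CLI from a script, the same invocation can be assembled as an argument list, which avoids shell-quoting issues in the prompt. A minimal sketch; the binary name and flags are taken from the example above, and the helper function is illustrative:

```python
def build_cmd(model_path: str, prompt: str,
              ctx_size: int = 4096, temp: float = 0.7) -> list[str]:
    """Assemble the llama.cpp CLI invocation shown above as an argv list."""
    return [
        "./llama-gemma3-cli",
        "--model", model_path,
        "--ctx-size", str(ctx_size),
        "--temp", str(temp),
        "--prompt", prompt,
    ]

cmd = build_cmd("Google.Gemma-3-27b-pt.q5_k.gguf",
                "Write a short story about a robot who discovers it has feelings.")
# import subprocess; subprocess.run(cmd, check=True)  # uncomment to launch
```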
## License

These models are released under the same Gemma license as the original model.
## Original Model Information

These quantized models are derived from Google's Gemma 3 27B pretrained model.
### Model Specifications
- Architecture: Gemma 3
- Size Label: 27B
- Type: Multimodal (text and image input, text output)
- Context Length: 128K tokens
- Training Data: 14 trillion tokens across web documents, code, mathematics, and images
- Languages: Support for over 140 languages
### Capabilities
- Text generation and question answering
- Image analysis and understanding
- Reasoning and factuality
- Multilingual support
- STEM and code tasks
### Benchmark Results

The pre-trained Gemma 3 27B model achieves the following benchmark results:
#### Reasoning and Factuality
- HellaSwag: 85.6% (10-shot)
- BoolQ: 82.4% (0-shot)
- TriviaQA: 85.5% (5-shot)
- Natural Questions: 36.1% (5-shot)
- BIG-Bench Hard: 77.7% (few-shot)
#### STEM and Code
- MMLU: 78.6% (5-shot)
- MATH: 50.0% (4-shot)
- GSM8K: 82.6% (8-shot)
- HumanEval: 48.8% (0-shot)
#### Multilingual
- Global-MMLU-Lite: 75.7%
- XQuAD (all): 76.8%
#### Multimodal
- DocVQA: 85.6% (val)
- TextVQA: 68.6% (val)
- MMMU: 56.1% (pre-trained)
## Citation & Attribution

```bibtex
@article{gemma_2025,
    title={Gemma 3},
    url={https://goo.gle/Gemma3Report},
    publisher={Kaggle},
    author={Gemma Team},
    year={2025}
}

@misc{gemma3_quantization_2025,
    title={Quantized Versions of Google's Gemma 3 27B Model},
    author={Lex-au},
    year={2025},
    month={March},
    note={Quantized models (Q8_0, Q6_K, Q5_K, Q4_K, Q3_K, Q2_K) derived from Google's Gemma 3 27B},
    url={https://huggingface.co/lex-au}
}
```