gemma-4-31B-it-F32-GGUF

Gemma-4-31B-it from Google is the flagship dense model in the Gemma 4 family: 31 billion parameters tuned for workstation and server deployment, with a 256K-token context window and multimodal input (text plus images at variable aspect ratios and resolutions). Its agentic feature set includes step-by-step thinking modes, multilingual OCR and handwriting recognition, document/PDF parsing, UI and screen analysis, chart comprehension, and precise object detection with pointing.

Designed to bridge edge and cloud performance, the instruction-tuned variant delivers frontier-level reasoning that rivals proprietary models 5-10x its size across coding, math, multilingual tasks (140+ languages), and multimodal workflows, while retaining Google's production-grade safety alignment for enterprise use. With Apache 2.0 licensing and optimizations for NVIDIA and AMD GPUs via vLLM and llama.cpp, it enables high-quality local inference on consumer hardware for autonomous agents, function calling, structured data extraction, and complex planning without a cloud dependency. It sits above its more efficient MoE siblings in the family, trading throughput for maximum output quality in reasoning-heavy applications.

Quick start with llama.cpp

llama-server -hf prithivMLmods/gemma-4-31B-it-F32-GGUF:F32
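The repository also ships mmproj vision-projector files, which llama.cpp needs alongside the main model for image input. A minimal sketch of a multimodal run, assuming a llama.cpp build that includes the llama-mtmd-cli tool and that the GGUF files have already been downloaded locally (photo.jpg is a placeholder path):

```shell
# Describe an image using the Q8_0 weights plus the f16 vision projector.
# photo.jpg is a placeholder; point it at any local image file.
llama-mtmd-cli \
  -m gemma-4-31B-it.Q8_0.gguf \
  --mmproj gemma-4-31B-it.mmproj-f16.gguf \
  --image photo.jpg \
  -p "Describe this image."
```

Recent llama-server builds accept the same --mmproj flag, exposing the vision capability through the OpenAI-compatible HTTP endpoint.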

Model Files

| File Name | Quant Type | File Size |
| --- | --- | --- |
| gemma-4-31B-it.BF16.gguf | BF16 | 61.4 GB |
| gemma-4-31B-it.F16.gguf | F16 | 61.4 GB |
| gemma-4-31B-it.F32.gguf | F32 | 123 GB |
| gemma-4-31B-it.Q8_0.gguf | Q8_0 | 32.6 GB |
| gemma-4-31B-it.mmproj-bf16.gguf | mmproj-bf16 | 1.2 GB |
| gemma-4-31B-it.mmproj-f16.gguf | mmproj-f16 | 1.2 GB |
| gemma-4-31B-it.mmproj-f32.gguf | mmproj-f32 | 2.3 GB |
| gemma-4-31B-it.mmproj-q8_0.gguf | mmproj-q8_0 | 810 MB |
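The sizes above follow almost directly from the parameter count times the bits per weight of each format. A rough sanity check with POSIX shell and awk (the 8.5 bits/weight figure for Q8_0 is an approximation, since that format stores small per-block scales on top of the 8-bit weights):

```shell
# Rough on-disk size in decimal GB: params * bits-per-weight / 8 bits-per-byte.
approx_size_gb() {
  awk -v n="$1" -v b="$2" 'BEGIN { printf "%.1f\n", n * b / 8 / 1e9 }'
}

approx_size_gb 31e9 32    # prints 124.0 (table lists 123 GB)
approx_size_gb 31e9 16    # prints 62.0  (table lists 61.4 GB)
approx_size_gb 31e9 8.5   # prints 32.9  (table lists 32.6 GB)
```

These estimates cover weights only; the KV cache and activations need additional memory that grows with context length.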

Quants Usage

(Sorted by size, not necessarily by quality. IQ quants are often preferable to similar-sized non-IQ quants.)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

[Graph: ikawrakow's quant-type comparison]
