m51Lab-NorskGemma4-31B-GGUF

GGUF quantized versions of m51Lab-NorskGemma4-31B for local inference with llama.cpp, Ollama, LM Studio, and other GGUF-compatible tools.

NorEval average score: 0.836, the highest among the open-source Norwegian models compared below.

Model                      Params     NorEval Avg
m51Lab-NorskGemma4-31B     31B        0.836
m51Lab-NorskMistral-119B   119B MoE   0.764
NorMistral-11B-thinking    11B        0.731

See the full model card for complete benchmark details and training methodology.

Available Files

File                          Quant    Size    RAM needed   Description
NorskGemma4-31B-Q4_K_M.gguf   Q4_K_M   18 GB   ~22 GB       Recommended: good balance of quality and speed
NorskGemma4-31B-Q8_0.gguf     Q8_0     31 GB   ~35 GB       High quality, near-lossless
NorskGemma4-31B-F16.gguf      F16      58 GB   ~62 GB       16-bit floating point, not quantized
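
The RAM column above can be turned into a quick selection rule. The sketch below uses a hypothetical helper, `pick_quant`, which is not part of any tool mentioned here. It prints the largest file whose approximate RAM requirement fits in a given whole number of gigabytes. The thresholds are the table's approximate figures; actual memory use also grows with context size, so treat the result as a starting point.

```shell
# Hypothetical helper: pick the largest quant that fits in the given memory.
# Argument: available RAM or VRAM in whole GB (integer).
# Thresholds are the approximate "RAM needed" values from the table above.
pick_quant() {
  if [ "$1" -ge 62 ]; then
    echo "NorskGemma4-31B-F16.gguf"
  elif [ "$1" -ge 35 ]; then
    echo "NorskGemma4-31B-Q8_0.gguf"
  elif [ "$1" -ge 22 ]; then
    echo "NorskGemma4-31B-Q4_K_M.gguf"
  else
    # Nothing in the table fits; report on stderr and signal failure.
    echo "no listed quant fits in $1 GB" >&2
    return 1
  fi
}
```

For example, `pick_quant 24` prints `NorskGemma4-31B-Q4_K_M.gguf`, which can then be passed to the download command in the Usage section.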

Usage

llama.cpp

# Download
huggingface-cli download dervig/m51Lab-NorskGemma4-31B-GGUF \
  NorskGemma4-31B-Q4_K_M.gguf --local-dir .

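# Run (GPU accelerated)
# Prompt (Nynorsk): "What is the capital of Norway?"
# -n 256: generate at most 256 tokens; -ngl 99: offload up to 99 layers to the GPU
# -c 4096: context window of 4096 tokens; --jinja: use the chat template stored in the GGUF
./llama-cli -m NorskGemma4-31B-Q4_K_M.gguf \
  -p "Kva er hovudstaden i Noreg?" \
  -n 256 -ngl 99 -c 4096 --jinja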

Ollama

# Create Modelfile
cat > Modelfile << 'EOF'
FROM ./NorskGemma4-31B-Q4_K_M.gguf
PARAMETER temperature 0.7
PARAMETER num_ctx 4096
SYSTEM Du er ein hjelpsom norsk AI-assistent.
EOF

# Create and run
ollama create norskgemma4 -f Modelfile
ollama run norskgemma4 "Forklar kva Stortinget er."
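
Once the model has been created, it can also be queried over Ollama's local HTTP API instead of the interactive CLI. This is a minimal sketch, assuming the Ollama server is running on its default port 11434 and the prompt contains no double quotes or backslashes. The helper names `build_request` and `ask_ollama` are illustrative, and `build_request` does no JSON escaping.

```shell
# Build the JSON body for Ollama's /api/generate endpoint.
# $1 = model name, $2 = prompt (assumed to need no JSON escaping).
build_request() {
  printf '{"model":"%s","prompt":"%s","stream":false}\n' "$1" "$2"
}

# Send the request to a locally running Ollama server (default port 11434).
# With "stream": false the server replies with a single JSON object.
ask_ollama() {
  curl -s http://localhost:11434/api/generate -d "$(build_request "$1" "$2")"
}
```

Calling `ask_ollama norskgemma4 "Forklar kva Stortinget er."` should return a JSON object whose `response` field holds the generated text.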

LM Studio

  1. Download the Q4_K_M or Q8_0 file
  2. Place it in your LM Studio models directory
  3. Select the model and start chatting in Norwegian

Performance

Tested with llama.cpp on 1x NVIDIA H100 (t/s = tokens per second):

Quant    Prompt speed   Generation speed
Q4_K_M   138 t/s        66 t/s
Q8_0     202 t/s        56 t/s
F16      7 t/s          40 t/s
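
These throughput figures give a rough idea of end-to-end latency. The sketch below assumes a simple linear model: prompt tokens are processed at the prompt speed and output tokens at the generation speed, ignoring model load time and other overhead. The helper name `estimate_seconds` is illustrative, and the numbers only apply to hardware comparable to the H100 used above.

```shell
# Estimate total seconds for one request.
# Arguments: prompt_tokens gen_tokens prompt_tps gen_tps
estimate_seconds() {
  # LC_ALL=C keeps the decimal point independent of the user's locale.
  LC_ALL=C awk -v p="$1" -v g="$2" -v ps="$3" -v gs="$4" \
    'BEGIN { printf "%.1f\n", p / ps + g / gs }'
}

# Q4_K_M: 1000 prompt tokens at 138 t/s plus 256 generated tokens at 66 t/s
estimate_seconds 1000 256 138 66   # prints 11.1
```

The same call with the Q8_0 figures, `estimate_seconds 1000 256 202 56`, prints `9.5`.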

Credits

Built by m51.ai Lab. Based on Google Gemma 4 31B-it, fine-tuned with data from NbAiLab, and evaluated on NorEval (developed by LTG, University of Oslo).
