Recommended way to run this model:
llama-server -hf ggml-org/gemma-4-31B-it-GGUF
Then open http://localhost:8080 in your browser to use the built-in web UI.
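Besides the web UI, llama-server exposes an OpenAI-compatible HTTP API. A minimal sketch of querying it with curl, assuming the server is running on the default port 8080 (the prompt text is just an example):

```shell
# Send a chat completion request to the local llama-server instance.
# Requires the server started with the command above to be running.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello! What can you do?"}
    ]
  }'
```

The response is a JSON object in the OpenAI chat-completions format, so existing OpenAI-compatible client libraries can be pointed at `http://localhost:8080/v1` as the base URL.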
Available quantizations: 4-bit, 8-bit, 16-bit
See also: chat template, base model