How to Run GLM-5 Locally Guide! πŸ”₯

#4 Β· by danielhanchen (Unsloth AI org) Β· edited Feb 15

Hey guys, we made many tutorials on running our dynamic GGUFs and FP8 quants, plus how to use the model in Claude Code and OpenAI Codex, how to tool-call with the model, and more!

Guide: https://unsloth.ai/docs/models/glm-5
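For reference, a minimal sketch of pulling one of the dynamic GGUFs and serving it with llama.cpp. The repo name, quant tag, and shard filename below are placeholders/assumptions; check the guide for the exact names.

```shell
# Sketch: download a dynamic GGUF quant and serve it with llama.cpp.
# Repo and quant names are assumptions; see the guide for the real ones.
pip install -U "huggingface_hub[cli]"
huggingface-cli download unsloth/GLM-5-GGUF \
    --include "*Q4_K_XL*" \
    --local-dir GLM-5-GGUF

# Serve it; point --model at the first shard of the downloaded quant
# (placeholder filename below). Offload as many layers as your VRAM
# allows with --n-gpu-layers, and set only the context you need.
./llama-server \
    --model GLM-5-GGUF/<first-shard>.gguf \
    --ctx-size 60000 \
    --n-gpu-layers 99 \
    --port 8080
```

Once llama-server is up, it exposes an OpenAI-compatible endpoint on the chosen port, which is what tools like Claude Code and Codex can point at.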


danielhanchen pinned discussion

Getting a solid 12 tk/s on my server with Q4_K_XL at 60k context! As always, thanks for the quick quants. This model is great!

> Getting a solid 12 tk/s on my server with Q4_K_XL at 60k context! As always, thanks for the quick quants. This model is great!

What kind of hardware are you running?

> What kind of hardware are you running?

3x3090, 1x4090, Intel QYFS / ASUS Sage W790E Motherboard with 512GB DDR5

> Getting a solid 12 tk/s on my server with Q4_K_XL at 60k context! As always, thanks for the quick quants. This model is great!

There is a GLM-5-NVFP4 quant under 440 GB, which is about the same size and may run much faster than 12 tk/s.
