Can you also make zai-org/GLM-4.7-Flash the same way as you made this model?

#6
by dibu28 - opened

Can you also make zai-org/GLM-4.7-Flash the same way as you made this model?

I'm getting great performance with your byteshape/Qwen3-30B-A3B-Instruct-2507-GGUF on just an RTX 2060 12GB, more than 68 tps.
I wonder if it's possible to get the same quality and performance for zai-org/GLM-4.7-Flash as you did for this model?

ByteShape org

Thanks for sharing your experience, glad it’s running so well on your setup.
GLM-4.7-Flash is on our radar, and with many great models released recently, we’re aiming to release a well-evaluated selection of them.

Running on my Dell laptop, CPU only:

.\llama-server.exe -m Qwen3-30B-A3B-Instruct-2507-Q3_K_S-2.70bpw.gguf -c 8192 -fa on -ngl 0

530-token prompt > prefill in 8 seconds, 65 tokens/second
5-token reply > generation speed 27 tokens/second
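For more reproducible numbers than eyeballing the server log, llama.cpp also ships a `llama-bench` tool that measures prefill and generation throughput separately. A minimal sketch against the same quant file (the model path is taken from the command above; the `-p`/`-n` token counts here are arbitrary choices, not values from this thread):

```shell
# Benchmark the same GGUF with llama.cpp's llama-bench:
#   -p 512  -> measure prompt-processing (prefill) throughput over 512 tokens
#   -n 128  -> measure token-generation throughput over 128 tokens
#   -ngl 0  -> keep all layers on CPU, matching the run above
./llama-bench -m Qwen3-30B-A3B-Instruct-2507-Q3_K_S-2.70bpw.gguf -p 512 -n 128 -ngl 0
```

It prints a table with separate pp (prefill) and tg (generation) tokens/second rows, which makes comparisons across quants and machines easier.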

[Screenshot: prefill throughput]

[Screenshot: generation throughput]
