Can you also make zai-org/GLM-4.7-Flash the same way as you made this model?

#6
by dibu28 - opened

Can you also make zai-org/GLM-4.7-Flash the same way as you made this model?

I'm getting great performance with your byteshape/Qwen3-30B-A3B-Instruct-2507-GGUF on just an RTX 2060 12GB, more than 68 tps.
I wonder if it's possible to get the same quality and performance for zai-org/GLM-4.7-Flash as you did for this model?

ByteShape org

Thanks for sharing your experience, glad it’s running so well on your setup.
GLM-4.7-Flash is on our radar, and with many great models released recently, we’re aiming to release a well-evaluated selection of them.

Running on my Dell laptop, CPU only:

.\llama-server.exe -m Qwen3-30B-A3B-Instruct-2507-Q3_K_S-2.70bpw.gguf -c 8192 -fa on -ngl 0

530-token prompt > prefill in 8 seconds, 65 tokens/second
5-token reply > generation speed 27 tokens/second
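For more reproducible numbers than eyeballing the server log, llama.cpp also ships a `llama-bench` tool that measures prefill and generation throughput separately. A minimal sketch against the same quant file (the model path is taken from the command above; the `-p`/`-n` token counts here are arbitrary choices, not values from this thread):

```shell
# Benchmark the same GGUF with llama.cpp's llama-bench:
#   -p 512  -> measure prompt-processing (prefill) throughput over 512 tokens
#   -n 128  -> measure token-generation throughput over 128 tokens
#   -ngl 0  -> keep all layers on CPU, matching the run above
./llama-bench -m Qwen3-30B-A3B-Instruct-2507-Q3_K_S-2.70bpw.gguf -p 512 -n 128 -ngl 0
```

It prints a table with separate pp (prefill) and tg (generation) tokens/second rows, which makes comparisons across quants and machines easier.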

[Screenshot: prefill throughput]

[Screenshot: generation throughput]
