Not supported on llama.cpp
#11
by RealBiggly - opened
I have the latest llama.cpp but it says not supported.
I am running the latest llama.cpp build (March 3, 2026) and it runs... but incredibly slowly, so something is up. I can run the 4-bit Unsloth quant of GLM 5 at 3 t/s, but this model at 8-bit gets only about 2 t/s with the same flags.
@jeffwadsworth The performance you see sounds reasonable to me? A dense 27B model at 8-bit reads about 35% more weight data per token than GLM 5's 40B active parameters at 4-bit, so a drop from 3 t/s to roughly 2 t/s is in line with a bandwidth-bound workload.
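To make the comparison concrete, here is a rough back-of-the-envelope sketch. It assumes decoding is memory-bandwidth-bound (each generated token reads all active weights once) and uses the parameter counts and quant widths mentioned in the thread; the function name and exact figures are illustrative, not measured.

```python
def active_weight_gb(params_billions: float, bits: int) -> float:
    """Gigabytes of weight data read per generated token.

    params_billions: active parameter count in billions
    bits: quantization width per parameter
    """
    return params_billions * bits / 8  # billions of params * bytes per param

# Assumed figures from the thread:
#   GLM 5: 40B active parameters (MoE), 4-bit quant
#   this model: 27B parameters, dense, 8-bit quant
glm5 = active_weight_gb(40, 4)   # 20.0 GB per token
this_model = active_weight_gb(27, 8)  # 27.0 GB per token

print(f"GLM 5:      {glm5:.1f} GB/token")
print(f"27B @ 8bit: {this_model:.1f} GB/token")
print(f"ratio:      {this_model / glm5:.2f}x")  # 1.35x more data per token

# If GLM 5 decodes at 3 t/s on the same hardware, the bandwidth-bound
# estimate for this model is:
print(f"expected:   {3 * glm5 / this_model:.1f} t/s")
```

The estimate lands around 2.2 t/s, which is close to the ~2 t/s reported, so slow decoding here may simply be memory bandwidth rather than a bug.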