amazing model

#5
by Tugay31 - opened

What a great model for local AI. Thank you, Qwen & Alibaba.

Awesome for local coding with Zed and Goose.

It fits in VRAM using Q3_K_M on an RTX 5060 (16 GB); I get 48k context and 20+ tokens/s on llama.cpp. It's tight, but it can get useful work done, and it makes every token count (very little re-reading or repetition). I've been using gpt-oss 20B and GLM 4.7 Flash (Q4), but after a day of testing this is better.
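For anyone wanting to reproduce this kind of setup, a rough sketch of a llama.cpp server launch along these lines (the GGUF filename is illustrative; adjust `-ngl` and context size to your VRAM):

```shell
# Serve a Q3_K_M quant with a 48k context window, all layers on GPU.
# -c 49152  : 48k context
# -ngl 99   : offload all layers to the GPU
# -fa       : flash attention, reduces KV-cache memory pressure
llama-server \
  -m ./model-Q3_K_M.gguf \
  -c 49152 \
  -ngl 99 \
  -fa
```

With 16 GB of VRAM the Q3_K_M weights plus a 48k KV cache are a tight fit, so if you hit out-of-memory errors, lowering `-c` is usually the first knob to turn.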

