amazing model
#5
by Tugay31 - opened
What a great model for local AI. Thank you, Qwen & Alibaba.
Awesome for local coding with Zed and Goose.
It fits in VRAM using Q3_K_M on an RTX 5060 (16 GB): I get 48k context and 20+ tokens/s on llama.cpp. It's tight, but it can get useful work done, and it makes every token count (very little re-reading or repetition). I've been using gpt-oss 20B and GLM 4.7 Flash (Q4), but after a day of testing this is better.
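For anyone wanting to try the same setup, a minimal llama.cpp server invocation might look like the sketch below. The GGUF filename is a placeholder (not from this post), and `-ngl 99` assumes all layers fit on the GPU; `-c 49152` matches the 48k context mentioned above.

```shell
# Minimal sketch, assuming a Q3_K_M GGUF of the model and the
# llama-server binary from llama.cpp. Filename is hypothetical.
llama-server \
  -m ./model-Q3_K_M.gguf \
  -ngl 99 \
  -c 49152
```

If VRAM is tight at 48k context, lowering `-c` or quantizing the KV cache are the usual knobs to try.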