Coding parameters used for Goose and Zed

#3
by dugrema - opened

The model card recommends this for coding tasks:

Thinking mode for precise coding tasks (e.g. WebDev): temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0
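For reference, these samplers map directly onto the JSON body of llama.cpp's OpenAI-compatible `/v1/chat/completions` endpoint. A minimal sketch of the request payload, assuming a local llama-server (the prompt is a placeholder; the extra sampler fields beyond the OpenAI set are accepted by llama.cpp's server, not by OpenAI's API):

```python
import json

# Thinking-mode sampling settings quoted from the model card.
payload = {
    "messages": [{"role": "user", "content": "placeholder coding prompt"}],
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,            # llama.cpp-specific extension field
    "min_p": 0.0,           # llama.cpp-specific extension field
    "presence_penalty": 0.0,
    "repeat_penalty": 1.0,  # llama.cpp-specific extension field
}

body = json.dumps(payload)
```

You would POST `body` to a running llama-server; the point here is just which knobs the recommendation corresponds to.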

I found that with the UD Q4 version, the model gets stuck in thinking on simple prompts like "Hi", or after a few steps on more complex topics. I just increased the presence penalty to 0.1 and it seemed to work well enough. I had it fix an old, unfinished Python implementation of Pacman with Zed. This is the first medium-size model (<100B params) I've used that does not struggle at all with Zed's edit_file tool. It works REALLY well.
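For anyone wondering why a small presence penalty can break those thinking loops: in OpenAI-style sampling, every token that has already appeared in the output gets a flat subtraction from its logit, so tokens a loop keeps reusing gradually lose probability. A toy sketch of the mechanism (not llama.cpp's actual implementation; dicts stand in for tensors):

```python
def apply_presence_penalty(logits, generated_ids, penalty=0.1):
    """Subtract a flat penalty from the logit of every token already emitted.

    logits: dict mapping token id -> raw logit (toy stand-in for a tensor).
    generated_ids: token ids produced so far.
    """
    seen = set(generated_ids)
    return {tok: (logit - penalty if tok in seen else logit)
            for tok, logit in logits.items()}

# Token 7 has already been emitted (repeatedly), so its logit drops;
# unseen token 9 is untouched. The penalty is flat, not per-occurrence.
adjusted = apply_presence_penalty({7: 2.0, 9: 1.5},
                                  generated_ids=[7, 7, 7], penalty=0.1)
```

With penalty=0.1 the nudge is small, which is presumably why it unsticks loops without noticeably hurting code generation.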

Full params for coding on llama.cpp (latest main-branch build, 8148, which fixes the template warnings):

-fitt 100 --fit-ctx 131072 --temp 0.6 --top-k 20 --top-p 0.95 --min-p 0.0 --repeat-penalty 1.0 --presence-penalty 0.1 --batch-size 2048 --ubatch-size 512 --n-predict 20000

That gives me 30-40 tok/s on an RTX 5060. I figure a presence penalty isn't ideal for code and could have side effects. Right now I just get the occasional stoppage in the middle of a task, and I don't feel like calling up Ralph.
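Throughput figures like these are easy to sanity-check against the token count and wall-clock time llama.cpp prints at the end of a run. A trivial helper (the function name is mine, not llama.cpp's):

```python
def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Generation throughput: tokens emitted divided by wall-clock seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return n_tokens / elapsed_s

# e.g. 1200 tokens in 40 s is 30 tok/s, the low end of the range quoted above
rate = tokens_per_second(1200, 40.0)
```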

Does anyone have any better parameter combination for coding?

Same issue here, but it happened on the 35BA3B UD-Q4_K_XL.
I had to switch to Qwen3 Coder Next to finish the rest of the work.

I found that UD-Q6_K_XL works properly with --presence-penalty 0.0, as recommended by Qwen. That's what I'll go with. I get about 30 tokens/s on an RTX 5060 with 16 GB of VRAM and these settings:

-fitt 100 --fit-ctx 65536 --temp 0.6 --top-k 20 --top-p 0.95 --min-p 0.0 --repeat-penalty 1.0 --presence-penalty 0.0 --ubatch-size 256 --n-predict 20000

That leaves 23 MoE layers overflowing to RAM; from what I see, that's about 16 GB of additional memory used by llama.cpp.
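A back-of-the-envelope check on that overflow figure, assuming the extra RAM is spread roughly evenly across the offloaded expert layers (illustrative numbers only; real per-layer sizes vary with the quant):

```python
def ram_per_offloaded_layer(total_overflow_gb: float, n_layers: int) -> float:
    """Rough average RAM cost per MoE layer kept on the CPU side."""
    return total_overflow_gb / n_layers

# ~16 GB spread over 23 offloaded layers comes out to roughly 0.7 GB per layer,
# a plausible size for one quantized expert layer of a ~30B-class MoE model.
per_layer = round(ram_per_offloaded_layer(16.0, 23), 2)
```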
