gemma-4-26B-A4B-it
Could you throw these tunes (and possibly the Opus finetune) at the 26B MoE model? Allegedly it performs much better on Strix Halo/Spark systems than the 31B dense model. FWIW, the 31B dense model is already outperforming Qwen 3.5.
Separately (and this may be a local optimization problem), gemma-4 absolutely crawls once input goes past the ~30k-token range, in a way that other models like GPT-OSS/Qwen3.5 didn't. I haven't had a chance to test vanilla Gemma 4, so this is more an anecdotal FYI than a bug report.
Working on a Gemma 4 19B-A4B fine tune right now; it will be out later today, provided it pans out okay.
26B / 21B A4Bs are on the horizon.
RE: crawls: make sure your llama.cpp / AI app is up to date, as there have been major changes in the past few days to address these issues.
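If it helps to verify whether the update fixed it, here's a rough timing sketch using llama-cpp-python that measures prompt-processing speed at increasing context sizes; the model filename and dummy token ids are placeholders, not a real release. Throughput should stay roughly flat past 30k tokens on a current build.

```python
# Minimal prompt-processing benchmark sketch (assumes llama-cpp-python is
# installed; "gemma-4-26b-a4b-it.gguf" is a hypothetical local path).
import time
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-4-26b-a4b-it.gguf",  # placeholder path
    n_ctx=40960,       # enough room for a 30k+ token prompt
    n_gpu_layers=-1,   # offload all layers if memory allows
    verbose=False,
)

# Time prompt ingestion at increasing lengths to see where it falls off a cliff.
for n_tokens in (4096, 16384, 32768):
    # Arbitrary valid token id repeated as filler; any in-vocab id works here.
    prompt_tokens = [llm.token_bos()] + [1000] * (n_tokens - 1)
    llm.reset()  # clear the KV cache between runs
    start = time.perf_counter()
    llm.eval(prompt_tokens)
    elapsed = time.perf_counter() - start
    print(f"{n_tokens:>6} tokens: {elapsed:6.1f}s ({n_tokens / elapsed:,.0f} tok/s)")
```

If tokens/sec collapses between the 16k and 32k runs on the latest build, that would point to something beyond the recent llama.cpp fixes.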