E4B version too?
Please :-)
for sure!
Heads up from LLMfan46:
GGUFs of gemma-4-E4B-it created with transformers 5.5.0, 5.5.2, or 5.5.3 do not work and fail to load in LM Studio; only GGUFs made with transformers 5.5.1 work with this model. No idea why, since Gemma 4 31B and Gemma 4 26B GGUFs work with no issue under transformers 5.5.3.
Not sure if it affects E4B as well, though.
Thanks for the heads up! FYI, we haven't released any GGUFs for this, only the safetensors at https://huggingface.co/wangzhang/gemma-4-E4B-it-abliterated, which load fine.

That said, the issue LLMfan46 is describing is almost certainly not a transformers version problem; it's an LM Studio llama.cpp runtime problem. The default GGUF runtime in LM Studio is often still pinned to llama.cpp 2.8.0, which doesn't know the gemma4 architecture at all. Switching the runtime to 2.10.1+ in LM Studio settings fixes it (see lmstudio-ai/lmstudio-bug-tracker#1728 and #1742). Also avoid the CUDA 13.2 runtime: it loads but outputs garbage.

The reason 31B/26B "just work" is that they don't hit the E4B-specific (MatFormer / per-layer-embedding) code paths in the converter, so older runtimes tolerate them. For E4B, he'll want to either re-quant from a fresh llama.cpp master or grab Unsloth's prebuilt unsloth/gemma-4-E4B-it-GGUF.
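For anyone who wants to try the re-quant route, here's a rough sketch of the usual llama.cpp flow. This is an assumption-heavy outline, not tested against this model: the local model directory path is a placeholder for wherever you downloaded the safetensors, and exact build targets/flags can differ between llama.cpp revisions, so check the repo's README first.

```shell
# Build llama.cpp from current master (repo location may change; this is
# the address at time of writing)
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j

# Convert the HF safetensors to an f16 GGUF. The input path is a
# placeholder for a local download of wangzhang/gemma-4-E4B-it-abliterated.
python convert_hf_to_gguf.py /path/to/gemma-4-E4B-it-abliterated \
  --outfile gemma-4-E4B-it-f16.gguf --outtype f16

# Quantize, e.g. to Q4_K_M, then load the result in LM Studio with a
# 2.10.1+ runtime selected
./build/bin/llama-quantize gemma-4-E4B-it-f16.gguf \
  gemma-4-E4B-it-Q4_K_M.gguf Q4_K_M
```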