Gemma-4-26B-A4B-it-JANG_2L

A JANG-quantized build of the Gemma-4 MoE model for Apple Silicon. Created by Jinho Jang (eric@jangq.ai).

See the full JANGQ-AI collection for all quantization profiles.

Loading

from mlx_lm import load, generate

# Download (if needed) and load the quantized weights from the Hub.
model, tokenizer = load("JANGQ-AI/Gemma-4-26B-A4B-it-JANG_2L")
print(generate(model, tokenizer, "Hello", max_tokens=256))
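
Since this is an instruction-tuned checkpoint, wrapping the prompt in the model's chat template usually gives better results. A minimal sketch using the standard mlx_lm pattern (the message content here is just an example):

from mlx_lm import load, generate

model, tokenizer = load("JANGQ-AI/Gemma-4-26B-A4B-it-JANG_2L")

# Apply the chat template so the instruction-tuned model sees the expected turn markers.
messages = [{"role": "user", "content": "Summarize mixture-of-experts routing in two sentences."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print(generate(model, tokenizer, prompt=prompt, max_tokens=256))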

Stock mlx_lm picks up the multi-stop-token list ([1, 106, 50]) automatically from generation_config.json; no manual configuration is required.
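
To confirm the stop tokens yourself, you can fetch generation_config.json straight from the Hub. A small sketch, assuming the standard eos_token_id field:

import json
from huggingface_hub import hf_hub_download

# Download just the generation config and print its stop-token list.
cfg_path = hf_hub_download("JANGQ-AI/Gemma-4-26B-A4B-it-JANG_2L", "generation_config.json")
with open(cfg_path) as f:
    print(json.load(f)["eos_token_id"])  # expected: [1, 106, 50]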

Model files

Safetensors in MLX format; tensor types U32 (packed quantized weights) and F16; reported model size: 3B params.