# LetheanNetwork/lemer-mlx-8bit
Gemma 4 E2B in MLX format, 8-bit quantized, converted from LetheanNetwork/lemer's bf16 safetensors via `mlx_lm.convert`. Higher-precision sibling of LetheanNetwork/lemer-mlx (which is 4-bit). For the LEK-merged variant see lthn/lemer.
## Variants in this family
| Repo | Format | Bits | Use case |
|---|---|---|---|
| LetheanNetwork/lemer | safetensors + gguf Q4_K_M | bf16 / 4 | Source weights + llama.cpp/Ollama |
| LetheanNetwork/lemer-mlx | mlx | 4 | Apple Silicon default |
| LetheanNetwork/lemer-mlx-8bit | mlx | 8 | This repo: higher precision |
| LetheanNetwork/lemer-mlx-bf16 | mlx | bf16 | Full-precision reference |
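As a rough guide to choosing a variant, weight storage scales linearly with bits per weight. A minimal sketch of that arithmetic (the round 1e9 parameter count is an illustrative assumption, and real files add some overhead for quantization scales and metadata):

```python
def approx_model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Estimate raw weight size in GB: params * bits / 8 bytes per byte."""
    return n_params * bits_per_weight / 8 / 1e9

# Hypothetical 1e9-parameter model at this family's precisions:
for bits in (4, 8, 16):
    print(f"{bits:>2}-bit: ~{approx_model_size_gb(1e9, bits):.1f} GB")
# →  4-bit: ~0.5 GB,  8-bit: ~1.0 GB, 16-bit: ~2.0 GB
```

So the 8-bit repo trades roughly double the 4-bit footprint for precision closer to the bf16 reference.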
## Usage
```python
from mlx_lm import load, generate

model, tokenizer = load("LetheanNetwork/lemer-mlx-8bit")
response = generate(
    model, tokenizer,
    prompt=tokenizer.apply_chat_template(
        [{"role": "user", "content": "Hello"}],
        add_generation_prompt=True,
        enable_thinking=True,
    ),
    max_tokens=512,
)
```
## Provenance
- Source: LetheanNetwork/lemer bf16 safetensors (= google/gemma-4-E2B-it)
- Converter: `mlx_lm.convert` (mlx-lm, LM Studio / Apple ML Research)
- Quant: 8-bit group quantization, ~8.5 bits/weight effective
- License: Apache 2.0 (Gemma Terms of Use)
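The conversion described above can be reproduced with a command along these lines (a sketch, not the exact invocation used here; the output path is illustrative, and flags should be checked against `mlx_lm.convert --help` for your installed mlx-lm version):

```shell
# Convert the bf16 source weights to 8-bit quantized MLX format.
# --q-bits 8 matches this repo's quantization; group size is left at
# the mlx-lm default.
mlx_lm.convert \
  --hf-path LetheanNetwork/lemer \
  --mlx-path lemer-mlx-8bit \
  -q --q-bits 8
```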
## License
Apache 2.0, subject to the Gemma Terms of Use.
## Model tree

Base model: google/gemma-4-E2B-it