# Model Card for Gemma-3-Gaia-PT-BR-4b-it-mlx
This model is an MLX conversion and 4-bit quantization of the [CEIA-UFG/Gemma-3-Gaia-PT-BR-4b-it](https://huggingface.co/CEIA-UFG/Gemma-3-Gaia-PT-BR-4b-it) model. It enables fast inference on Apple Silicon while preserving the parent model's evaluation results.
## Usage

Install the dependencies:

```shell
uv add mlx mlx-lm
```
```python
from mlx_lm import load, generate

model, tokenizer = load("RGarrido03/Gemma-3-Gaia-PT-BR-4b-it-mlx")

prompt = "hello"

# Wrap the raw prompt in the model's chat template, if one is defined
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
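For quick tests without writing Python, the model can also be run through the `mlx_lm.generate` command-line entry point. This is a sketch, not part of the original card; the Portuguese prompt is an arbitrary example, and the command assumes an Apple Silicon machine with `mlx-lm` installed:

```shell
# Run one generation from the quantized model via the mlx_lm CLI
# (prompt is an arbitrary Portuguese example)
uv run mlx_lm.generate \
  --model RGarrido03/Gemma-3-Gaia-PT-BR-4b-it-mlx \
  --prompt "Qual é a capital do Brasil?" \
  --max-tokens 128
```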
## Conversion
The conversion was performed with the `mlx_lm.convert` Python module, with the following options:

- Quantization: 4-bit
- dtype: `bfloat16`

```shell
uv run mlx_lm.convert --hf-path CEIA-UFG/Gemma-3-Gaia-PT-BR-4b-it -q --mlx-path Gemma-3-Gaia-PT-BR-4b-it-mlx --dtype bfloat16
```
## Model Card Authors

- André Ribeiro (@andreribeiro87)
- Rúben Garrido (@RGarrido03)
## Model Tree

- Base model: google/gemma-3-4b-pt
- Fine-tuned: CEIA-UFG/Gemma-3-Gaia-PT-BR-4b-it