Model Card for Gemma-3-Gaia-PT-BR-4b-it-mlx

This model is an MLX conversion and 4-bit quantization of the CEIA-UFG/Gemma-3-Gaia-PT-BR-4b-it model. It enables fast inference on Apple Silicon while preserving the parent model's evaluation results.

Usage

Install the dependencies:

uv add mlx mlx-lm

Then load the model and generate a response:

from mlx_lm import load, generate

model, tokenizer = load("RGarrido03/Gemma-3-Gaia-PT-BR-4b-it-mlx")

prompt = "hello"

# Wrap the prompt in the chat template, if the tokenizer provides one
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)

Conversion

The conversion was performed with the mlx_lm.convert Python module, using the following options:

  • Quantization: 4 bits
  • dtype: bfloat16
uv run mlx_lm.convert --hf-path CEIA-UFG/Gemma-3-Gaia-PT-BR-4b-it -q --mlx-path Gemma-3-Gaia-PT-BR-4b-it-mlx --dtype bfloat16
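As a rough, back-of-the-envelope sketch of what 4-bit quantization buys on a laptop, the weight footprint drops to roughly a quarter of bfloat16. The parameter count, group size, and per-group scale overhead below are illustrative assumptions, not measured values:

```python
# Rough weight-memory estimate: 4-bit quantization vs. bfloat16.
# All figures are illustrative assumptions, not measured sizes.
params = 4_300_000_000  # ~4.3B parameters (assumed for a Gemma 3 4B-class model)
group_size = 64         # assumed quantization group size

bf16_bytes = params * 2                     # 16 bits per weight
q4_bytes = params * 0.5                     # 4 bits per weight
scale_overhead = (params / group_size) * 4  # ~4 bytes of scale/bias per group (assumed)

print(f"bf16:  {bf16_bytes / 1e9:.1f} GB")                      # bf16:  8.6 GB
print(f"4-bit: {(q4_bytes + scale_overhead) / 1e9:.1f} GB")     # 4-bit: 2.4 GB
```

The real on-disk size depends on the exact group size and which layers mlx_lm leaves unquantized, but the order of magnitude is why the quantized model fits comfortably in unified memory.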

Model Card Authors

André Ribeiro @andreribeiro87

Rúben Garrido @RGarrido03
