Gemma 3 1B Instruct for RK3588

This version of Gemma 3 has been converted with rkllm-toolkit v1.2.1 to run on the RK3588 NPU, using mixed w8a8 and w8a8_g128 quantisation.

Compatible with RKLLM runtime version: 1.2.x
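
As a rough illustration of what "1.2.x" means here, the check below accepts any version string whose major and minor parts are 1 and 2. The helper name `runtime_is_compatible` is hypothetical and not part of RKLLM; how you obtain the runtime's version string on your board is not covered and is left as an assumption.

```python
def runtime_is_compatible(runtime_version: str, required_series: str = "1.2") -> bool:
    """Return True if runtime_version belongs to the required major.minor series.

    Hypothetical helper for illustration only; it is not an RKLLM API.
    """
    # Accept strings such as "1.2.1" or "v1.2.1".
    parts = runtime_version.strip().lstrip("v").split(".")
    # Compare only the major and minor components, so any patch level matches.
    return parts[:2] == required_series.split(".")
```

For example, `runtime_is_compatible("1.2.1")` holds, while a 1.1.x runtime would be rejected.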

Useful links:

Official RKLLM GitHub

RockchipNPU Reddit

EZRKNN-LLM

Pretty much anything by these folks: marty1885 and happyme531

Conversion Python script

Based on instructions from airockchip/rknn-llm #240

`gemma-3-conversion.py`

from rkllm.api import RKLLM
from transformers import Gemma3ForCausalLM, AutoTokenizer
import safetensors
import torch

# Source checkpoint: the Unsloth upload of Gemma 3 1B Instruct
modelpath = 'unsloth/gemma-3-1b-it'

model = Gemma3ForCausalLM.from_pretrained(modelpath, device_map='cpu', torch_dtype=torch.bfloat16).eval()
tokenizer = AutoTokenizer.from_pretrained(modelpath)

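# Save a local copy of the weights and tokenizer for rkllm-toolkit to load
model.save_pretrained('llm')
tokenizer.save_pretrained('llm')

# Release the in-memory copies; the converter reloads the checkpoint from disk
del model, tokenizer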

modelpath = 'llm'
savepath = 'llm/gemma-3-1b-it-w8a8.rkllm'

llm = RKLLM()

ret = llm.load_huggingface(model=modelpath, device='cpu')
if ret != 0:
    print('Load model failed!')
    exit(ret)


ret = llm.build(
    do_quantization=True,
    optimization_level=0,
    quantized_dtype='w8a8',
    # hybrid_rate mixes w8a8 with grouped w8a8_g128 quantisation;
    # a ratio of 25% gives a good balance
    hybrid_rate=0.25,
    max_context=4096 * 4,  # 16384 tokens
    quantized_algorithm='normal',
    target_platform='rk3588',
    num_npu_core=3,
    extra_qparams=None,
    dataset=None,
)
if ret != 0:
    print('Build model failed!')
    exit(ret)

ret = llm.export_rkllm(savepath)
if ret != 0:
    print('Export model failed!')
    exit(ret)
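
After the export step it can be useful to confirm that the `.rkllm` file was actually written before copying it to the board. The following is a minimal sketch using only the Python standard library; the helper name `exported_size_mib` is hypothetical and not part of rkllm-toolkit, and the check only looks at the file on disk, not at whether the model is valid.

```python
from pathlib import Path


def exported_size_mib(path: str) -> float:
    """Return the size of an exported model file in MiB.

    Hypothetical helper for illustration; raises if the file is missing or empty.
    """
    model_file = Path(path)
    if not model_file.is_file():
        raise FileNotFoundError(f"No exported model found at {model_file}")
    size_bytes = model_file.stat().st_size
    if size_bytes == 0:
        raise ValueError(f"Exported model at {model_file} is empty")
    return size_bytes / (1024 * 1024)
```

With the script above, this would be called as `exported_size_mib('llm/gemma-3-1b-it-w8a8.rkllm')` once `export_rkllm` has returned 0.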

