TranslateGemma-4b-it for RK3588

Doesn't work (yet), so don't use it :)

TranslateGemma-4b-it converted and optimized for the Rockchip RK3588 NPU using RKLLM-Toolkit v1.2.3.

Model Information

| Parameter | Value |
| --- | --- |
| Base Model | google/translategemma-4b-it |
| Target Platform | RK3588 / RK3588S (NPU) |
| Quantization | w8a8 (8-bit weights, 8-bit activations) |
| Optimization Level | 0 (no precision optimization) |
| NPU Cores | 3 |
| Context Length | 4096 tokens |
| Model Size | ~2.5 GB |
| Conversion Toolkit | RKLLM v1.2.3 |
| Load Dtype | float16 |
| Device Used | CUDA (GPU) |

Files

  • translategemma-4b-it_w8a8_RK3588_*.rkllm - Converted model for the RK3588 NPU

Requirements

  • RK3588 or RK3588S SoC (Orange Pi 5, Orange Pi 5 Plus, etc.)
  • RKLLM Runtime v1.2.3+
  • 16 GB RAM recommended (12 GB minimum)
  • Ubuntu or Armbian Linux

Usage with RKLLama

Install RKLLama

git clone https://github.com/NotPunchnox/rkllama.git
cd rkllama
python -m pip install .

Pull Model

rkllama_client pull crimsonmythos/translategemma-4b-it_w8a8_RK3588/translategemma-4b-it_w8a8_RK3588_*.rkllm/translategemma:4b

Run Server

rkllama_server --models ~/RKLLAMA/models

Chat

rkllama_client run translategemma:4b

API Endpoints

Ollama API

curl http://localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "translategemma:4b",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": false
  }'

OpenAI Compatible

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "translategemma:4b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
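The same OpenAI-compatible endpoint can be called from Python. This is a minimal stdlib-only sketch; it assumes the server is reachable at `http://localhost:8080` as in the curl examples above (the actual HTTP call is left commented out since it needs a running server):

```python
import json
import urllib.request

def build_chat_request(model, prompt, base_url="http://localhost:8080"):
    """Build an OpenAI-style chat completion request for the rkllama server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("translategemma:4b", "Hello")

# With a running rkllama server, send it and read the reply:
# with urllib.request.urlopen(req) as resp:
#     reply = json.loads(resp.read())
#     print(reply["choices"][0]["message"]["content"])
```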

Performance (Estimated)

  • Prefill: 200-300 tokens/sec
  • Decode: 20-50 tokens/sec
  • Hardware: Orange Pi 5 Pro (RK3588S, 16GB RAM)
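The throughput estimates above translate into rough end-to-end latency as prompt_tokens/prefill_rate + output_tokens/decode_rate. A small helper, defaulting to the conservative end of the estimated ranges (actual numbers will vary with board, cooling, and context length):

```python
def estimated_latency_s(prompt_tokens, output_tokens,
                        prefill_tps=200.0, decode_tps=20.0):
    """Rough end-to-end latency (seconds) from the estimated throughput.

    Defaults are the low end of the ranges above: 200 tok/s prefill,
    20 tok/s decode.
    """
    return prompt_tokens / prefill_tps + output_tokens / decode_tps

# A 100-token prompt producing a 100-token translation:
# 100/200 + 100/20 = 0.5 + 5.0 = 5.5 seconds
print(estimated_latency_s(100, 100))
```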

Conversion Parameters

quantized_dtype: w8a8
optimization_level: 0
quantized_algorithm: normal
num_npu_core: 3
target_platform: RK3588
max_context: 4096
dtype: float16
device: cuda
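The parameters above can be collected into a single dict for a conversion script. The commented lines sketch how they would be fed to RKLLM-Toolkit; the exact method names and signatures there are assumptions, so check the toolkit's own examples before running:

```python
# Conversion parameters, exactly as listed above.
CONVERSION_PARAMS = {
    "quantized_dtype": "w8a8",
    "optimization_level": 0,
    "quantized_algorithm": "normal",
    "num_npu_core": 3,
    "target_platform": "RK3588",
    "max_context": 4096,
    "dtype": "float16",
    "device": "cuda",
}

# Hedged sketch of the conversion flow (NOT runnable as-is; the
# rkllm.api calls below are assumptions based on RKLLM-Toolkit examples):
#
# from rkllm.api import RKLLM
# llm = RKLLM()
# llm.load_huggingface(model="google/translategemma-4b-it",
#                      device=CONVERSION_PARAMS["device"])
# llm.build(do_quantization=True,
#           quantized_dtype=CONVERSION_PARAMS["quantized_dtype"],
#           quantized_algorithm=CONVERSION_PARAMS["quantized_algorithm"],
#           optimization_level=CONVERSION_PARAMS["optimization_level"],
#           num_npu_core=CONVERSION_PARAMS["num_npu_core"],
#           target_platform=CONVERSION_PARAMS["target_platform"],
#           max_context=CONVERSION_PARAMS["max_context"])
# llm.export_rkllm("./translategemma-4b-it_w8a8_RK3588.rkllm")
```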

License

  • Model: TranslateGemma (Apache 2.0) by Google
  • Conversion: RKLLM (GPL 3.0)
