TranslateGemma-4b-it for RK3588

Doesn't work (yet), so don't use it :)

TranslateGemma-4b-it converted and optimized for the Rockchip RK3588 NPU using RKLLM-Toolkit v1.2.3.

Model Information

| Parameter | Value |
| --- | --- |
| Base Model | google/translategemma-4b-it |
| Target Platform | RK3588 / RK3588S (NPU) |
| Quantization | w8a8 (8-bit weights, 8-bit activations) |
| Optimization Level | 0 (no precision optimization) |
| NPU Cores | 3 |
| Context Length | 4096 tokens |
| Model Size | ~2.5 GB |
| Conversion Toolkit | RKLLM v1.2.3 |
| Load Dtype | float16 |
| Device Used | CUDA (GPU) |

Files

  • translategemma-4b-it_w8a8_RK3588_*.rkllm - Converted model for the RK3588 NPU

Requirements

  • RK3588 or RK3588S SoC (Orange Pi 5, Orange Pi 5 Plus, etc.)
  • RKLLM Runtime v1.2.3+
  • 16 GB RAM recommended (12 GB minimum)
  • Ubuntu or Armbian Linux

Usage with RKLLama

Install RKLLama

git clone https://github.com/NotPunchnox/rkllama.git
cd rkllama
python -m pip install .

Pull Model

rkllama_client pull crimsonmythos/translategemma-4b-it_w8a8_RK3588/translategemma-4b-it_w8a8_RK3588_*.rkllm/translategemma:4b

Run Server

rkllama_server --models ~/RKLLAMA/models

Chat

rkllama_client run translategemma:4b

API Endpoints

Ollama API

curl http://localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "translategemma:4b",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": false
  }'

OpenAI Compatible

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "translategemma:4b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
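The same OpenAI-compatible endpoint can be called from Python. This is a minimal stdlib-only sketch; it assumes the server is reachable at `http://localhost:8080` as in the curl examples above (the actual HTTP call is left commented out since it needs a running server):

```python
import json
import urllib.request

def build_chat_request(model, prompt, base_url="http://localhost:8080"):
    """Build an OpenAI-style chat completion request for the rkllama server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("translategemma:4b", "Hello")

# With a running rkllama server, send it and read the reply:
# with urllib.request.urlopen(req) as resp:
#     reply = json.loads(resp.read())
#     print(reply["choices"][0]["message"]["content"])
```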

Performance (Estimated)

  • Prefill: 200-300 tokens/sec
  • Decode: 20-50 tokens/sec
  • Hardware: Orange Pi 5 Pro (RK3588S, 16GB RAM)
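The throughput estimates above translate into rough end-to-end latency as prompt_tokens/prefill_rate + output_tokens/decode_rate. A small helper, defaulting to the conservative end of the estimated ranges (actual numbers will vary with board, cooling, and context length):

```python
def estimated_latency_s(prompt_tokens, output_tokens,
                        prefill_tps=200.0, decode_tps=20.0):
    """Rough end-to-end latency (seconds) from the estimated throughput.

    Defaults are the low end of the ranges above: 200 tok/s prefill,
    20 tok/s decode.
    """
    return prompt_tokens / prefill_tps + output_tokens / decode_tps

# A 100-token prompt producing a 100-token translation:
# 100/200 + 100/20 = 0.5 + 5.0 = 5.5 seconds
print(estimated_latency_s(100, 100))
```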

Conversion Parameters

quantized_dtype: w8a8
optimization_level: 0
quantized_algorithm: normal
num_npu_core: 3
target_platform: RK3588
max_context: 4096
dtype: float16
device: cuda
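The parameters above can be collected into a single dict for a conversion script. The commented lines sketch how they would be fed to RKLLM-Toolkit; the exact method names and signatures there are assumptions, so check the toolkit's own examples before running:

```python
# Conversion parameters, exactly as listed above.
CONVERSION_PARAMS = {
    "quantized_dtype": "w8a8",
    "optimization_level": 0,
    "quantized_algorithm": "normal",
    "num_npu_core": 3,
    "target_platform": "RK3588",
    "max_context": 4096,
    "dtype": "float16",
    "device": "cuda",
}

# Hedged sketch of the conversion flow (NOT runnable as-is; the
# rkllm.api calls below are assumptions based on RKLLM-Toolkit examples):
#
# from rkllm.api import RKLLM
# llm = RKLLM()
# llm.load_huggingface(model="google/translategemma-4b-it",
#                      device=CONVERSION_PARAMS["device"])
# llm.build(do_quantization=True,
#           quantized_dtype=CONVERSION_PARAMS["quantized_dtype"],
#           quantized_algorithm=CONVERSION_PARAMS["quantized_algorithm"],
#           optimization_level=CONVERSION_PARAMS["optimization_level"],
#           num_npu_core=CONVERSION_PARAMS["num_npu_core"],
#           target_platform=CONVERSION_PARAMS["target_platform"],
#           max_context=CONVERSION_PARAMS["max_context"])
# llm.export_rkllm("./translategemma-4b-it_w8a8_RK3588.rkllm")
```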

License

  • Model: TranslateGemma (Apache 2.0) by Google
  • Conversion: RKLLM (GPL 3.0)
