# TranslateGemma-4b-it for RK3588
Doesn't work (yet), so don't use it :)
Converted TranslateGemma-4b-it model optimized for Rockchip RK3588 NPU using RKLLM-Toolkit v1.2.3.
## Model Information
| Parameter | Value |
|---|---|
| Base Model | google/translategemma-4b-it |
| Target Platform | RK3588 / RK3588S (NPU) |
| Quantization | w8a8 (8-bit weights, 8-bit activations) |
| Optimization Level | 0 (no precision optimization) |
| NPU Cores | 3 |
| Context Length | 4096 tokens |
| Model Size | ~2.5 GB |
| Conversion Toolkit | RKLLM v1.2.3 |
| Load Dtype | float16 |
| Device Used | CUDA (GPU) |
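The `w8a8` scheme above means both weights and activations are stored as signed 8-bit integers with a shared scale factor. A minimal sketch of symmetric int8 quantization, purely for illustration (RKLLM's actual kernels may use per-channel scales or a different rounding scheme):

```python
# Illustrative symmetric int8 quantization, the idea behind "w8a8":
# floats are mapped to [-127, 127] with a per-tensor scale, then
# dequantized back at compute time. Not RKLLM's actual kernel code.

def quantize_int8(values):
    """Map floats to signed int8 using one shared scale."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    return [x * scale for x in q]

weights = [0.8, -1.27, 0.05, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Round-trip error is bounded by one quantization step (= scale)
assert all(abs(a - b) <= scale for a, b in zip(weights, restored))
```

The point of quantizing activations as well as weights is that the NPU's int8 matrix units can then run the whole matmul in integer arithmetic, which is where the memory and throughput savings come from.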
## Files

- `translategemma-4b-it_w8a8_RK3588_*.rkllm` - Converted model for RK3588 NPU
## Requirements
- RK3588 or RK3588S SoC (Orange Pi 5, Orange Pi 5 Plus, etc.)
- RKLLM Runtime v1.2.3+
- 16 GB RAM recommended (12 GB minimum)
- Ubuntu/Armbian Linux OS
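A quick sanity check before loading the model on the board. This is a sketch under two assumptions: that the Rockchip kernel exposes the NPU driver version at `/sys/kernel/debug/rknpu/version` (the usual node on RK3588 BSP kernels, but the path can vary), and that total RAM is read from `/proc/meminfo`:

```python
# Sanity-check the board before loading the model.
# Assumption: the RKNPU driver exposes its version at
# /sys/kernel/debug/rknpu/version (path may vary by kernel/BSP).

from pathlib import Path

def meminfo_total_gb(text):
    """Parse MemTotal (in kB) out of /proc/meminfo content."""
    for line in text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) / 1024 / 1024
    return None

def check_board():
    npu = Path("/sys/kernel/debug/rknpu/version")
    if npu.exists():
        print("RKNPU driver:", npu.read_text().strip())
    else:
        print("RKNPU driver node not found (wrong kernel, or not an RK3588?)")
    total = meminfo_total_gb(Path("/proc/meminfo").read_text())
    if total is not None and total < 12:
        print(f"Only {total:.1f} GB RAM; 12 GB is the minimum for this model")

if __name__ == "__main__":
    check_board()
```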
## Usage with RKLLama
### Install RKLLama

```bash
git clone https://github.com/NotPunchnox/rkllama.git
cd rkllama
python -m pip install .
```
### Pull Model

```bash
rkllama_client pull crimsonmythos/translategemma-4b-it_w8a8_RK3588/translategemma-4b-it_w8a8_RK3588_*.rkllm/translategemma:4b
```
### Run Server

```bash
rkllama_server --models ~/RKLLAMA/models
```
### Chat

```bash
rkllama_client run translategemma:4b
```
## API Endpoints
### Ollama API

```bash
curl http://localhost:8080/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "translategemma:4b",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": false
  }'
```
### OpenAI Compatible

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "translategemma:4b",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
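The same OpenAI-compatible call from Python, using only the standard library. This is a sketch assuming the server is on `localhost:8080` and that the response follows the usual OpenAI chat-completions shape (`choices[0].message.content`):

```python
# Minimal stdlib client for the OpenAI-compatible endpoint shown above.
# Assumes rkllama is serving on localhost:8080 and returns the standard
# OpenAI chat-completions response shape.

import json
import urllib.request

def build_payload(model, prompt):
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt, model="translategemma:4b", host="http://localhost:8080"):
    req = urllib.request.Request(
        f"{host}/v1/chat/completions",
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("Translate to French: Good morning"))
```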
## Performance (Estimated)
- Prefill: 200-300 tokens/sec
- Decode: 20-50 tokens/sec
- Hardware: Orange Pi 5 Pro (RK3588S, 16GB RAM)
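A back-of-envelope way to turn those rates into expected latency: total time is roughly prompt length over prefill rate plus output length over decode rate. Using the midpoints of the ranges above:

```python
# Rough latency estimate from the rates above:
# time ≈ prompt_tokens / prefill_rate + output_tokens / decode_rate.

def estimate_seconds(prompt_tokens, output_tokens,
                     prefill_tps=250.0, decode_tps=35.0):
    # Defaults are midpoints of the 200-300 and 20-50 tok/s ranges.
    return prompt_tokens / prefill_tps + output_tokens / decode_tps

# e.g. a 500-token prompt with a 200-token reply:
# 500/250 + 200/35 ≈ 2.0 + 5.7 ≈ 7.7 seconds
```

Decode dominates for long replies, which is typical for NPU inference: prefill is compute-bound and parallel, decode is memory-bandwidth-bound and sequential.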
## Conversion Parameters

```yaml
quantized_dtype: w8a8
optimization_level: 0
quantized_algorithm: normal
num_npu_core: 3
target_platform: RK3588
max_context: 4096
dtype: float16
device: cuda
```
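For reference, these parameters map onto the RKLLM-Toolkit's load → build → export flow, loosely following the example scripts in airockchip/rknn-llm. Argument names and signatures may differ between toolkit versions, so treat this as a template rather than a verified conversion script:

```python
# Sketch of an RKLLM-Toolkit conversion script using the parameters
# above. Loosely based on the examples in airockchip/rknn-llm; exact
# argument names may differ between toolkit versions.

PARAMS = {
    "quantized_dtype": "w8a8",
    "optimization_level": 0,
    "quantized_algorithm": "normal",
    "num_npu_core": 3,
    "target_platform": "RK3588",
    "max_context": 4096,
}

def convert(model_dir, out_path):
    from rkllm.api import RKLLM  # RKLLM-Toolkit runs on an x86 host

    llm = RKLLM()
    llm.load_huggingface(model=model_dir, device="cuda", dtype="float16")
    llm.build(
        do_quantization=True,
        quantized_dtype=PARAMS["quantized_dtype"],
        quantized_algorithm=PARAMS["quantized_algorithm"],
        optimization_level=PARAMS["optimization_level"],
        num_npu_core=PARAMS["num_npu_core"],
        target_platform=PARAMS["target_platform"],
        max_context=PARAMS["max_context"],
    )
    llm.export_rkllm(out_path)

if __name__ == "__main__":
    convert("google/translategemma-4b-it",
            "translategemma-4b-it_w8a8_RK3588.rkllm")
```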
## License
- Model: TranslateGemma (Apache 2.0) by Google
- Conversion: RKLLM (GPL 3.0)
## References
- RKLLM: https://github.com/airockchip/rknn-llm
- RKLLama: https://github.com/NotPunchnox/rkllama
- TranslateGemma: https://huggingface.co/google/translategemma-4b-it