# FunctionGemma Robot Actions (Multilingual)

A fine-tuned FunctionGemma 270M model that converts natural language into structured robot action and emotion function calls. Supports six languages with 98% accuracy at ~59 ms per call on an NVIDIA Jetson AGX Thor.
## Supported Languages
🇬🇧 English · 🇨🇳 中文 · 🇯🇵 日本語 · 🇫🇷 Français · 🇩🇪 Deutsch · 🇪🇸 Español
## Examples

- "Can you shake hands with me?" → `robot_action(shake_hand) + show_emotion(happy)`
- "跟我握手" (Chinese: "Shake hands with me") → `robot_action(shake_hand) + show_emotion(happy)`
- "握手してください" (Japanese: "Please shake my hand") → `robot_action(shake_hand) + show_emotion(happy)`
- "Serrez-moi la main" (French: "Shake my hand") → `robot_action(shake_hand) + show_emotion(happy)`
- "Gib mir die Hand" (German: "Give me your hand") → `robot_action(shake_hand) + show_emotion(happy)`
- "Dame la mano" (Spanish: "Give me your hand") → `robot_action(shake_hand) + show_emotion(happy)`
- "我今天心情不好" (Chinese: "I'm in a bad mood today") → `robot_action(stand_still) + show_emotion(sad)`
- "あれは何ですか?" (Japanese: "What is that?") → `robot_action(stand_still) + show_emotion(confused)`
- "Raconte-moi une blague" (French: "Tell me a joke") → `robot_action(stand_still) + show_emotion(think)`
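The textual output format above can be turned into structured calls with a small parser. A minimal sketch, assuming the `robot_action(...) + show_emotion(...)` format shown in the examples; the `parse_calls` helper is hypothetical and not part of this repository:

```python
import re

# Matches calls of the form name(argument), e.g. robot_action(shake_hand).
CALL_RE = re.compile(r"(\w+)\((\w+)\)")

def parse_calls(text):
    """Parse model output into a dict mapping function name -> argument."""
    return {fn: arg for fn, arg in CALL_RE.findall(text)}

calls = parse_calls("robot_action(shake_hand) + show_emotion(happy)")
print(calls)  # {'robot_action': 'shake_hand', 'show_emotion': 'happy'}
```

A dict keyed by function name is enough here because each response contains at most one action call and one emotion call.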
## Supported Actions

| Action | Description |
|---|---|
| `shake_hand` | Handshake gesture |
| `face_wave` | Wave hello / goodbye |
| `hands_up` | Raise both hands |
| `stand_still` | Stay idle (default for general conversation) |
| `show_hand` | Show open hand / present card for payment |
| `do_payment` | Perform the payment |
| `down_payment` | Payment finished |
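On the robot side, a parsed action name can be routed through a small dispatch table. A hypothetical sketch; `RobotStub` and `dispatch` are illustrative placeholders, not the robot's SDK or any API from this repository:

```python
# Actions from the table above.
SUPPORTED_ACTIONS = {"shake_hand", "face_wave", "hands_up", "stand_still",
                     "show_hand", "do_payment", "down_payment"}

class RobotStub:
    """Placeholder controller that records the actions it is asked to perform."""
    def __init__(self):
        self.log = []
    def perform(self, action):
        self.log.append(action)

robot = RobotStub()

def dispatch(action):
    # Fall back to stand_still (the card's idle default) for anything
    # outside the supported set.
    robot.perform(action if action in SUPPORTED_ACTIONS else "stand_still")

dispatch("shake_hand")
dispatch("dance")  # unsupported -> falls back to stand_still
print(robot.log)   # ['shake_hand', 'stand_still']
```

Validating against the supported set before dispatching guards the controller against any out-of-vocabulary name the model might emit.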
## Supported Emotions

| Emotion | Animation |
|---|---|
| `happy` | Happy.riv |
| `sad` | Sad.riv |
| `excited` | Excited.riv |
| `confused` | Confused.riv |
| `curious` | Curious.riv |
| `think` | Think.riv |
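Each animation filename in the table is just the capitalized emotion name plus a `.riv` suffix, so a hypothetical one-line helper (not part of this repository) covers the whole mapping:

```python
def emotion_to_animation(emotion):
    """Map an emotion label from the table to its Rive animation filename."""
    return f"{emotion.capitalize()}.riv"

print(emotion_to_animation("confused"))  # Confused.riv
```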
Constrained decoding uses 2 forward passes instead of 33 autoregressive steps, achieving an ~18x speedup over a standard `model.generate()` call.
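The idea behind constrained decoding can be illustrated with a toy sketch: since the output vocabulary is a fixed set of action and emotion names, the decoder only has to score the allowed completions and take the argmax, rather than free-running token-by-token generation. The `score` function below is a trivial keyword heuristic standing in for a real model forward pass; everything here is illustrative, not the repository's implementation:

```python
ACTIONS = ["shake_hand", "face_wave", "hands_up", "stand_still",
           "show_hand", "do_payment", "down_payment"]
EMOTIONS = ["happy", "sad", "excited", "confused", "curious", "think"]

def score(prompt, candidate):
    # Stand-in for one model forward pass returning a likelihood score.
    # Here: a crude keyword-overlap heuristic so the sketch is runnable.
    return sum(1.0 for word in candidate.split("_") if word in prompt.lower())

def constrained_decode(prompt):
    # Pass 1: score all action candidates; pass 2: all emotion candidates.
    # With a real model, each pass scores every candidate in one batch.
    action = max(ACTIONS, key=lambda a: score(prompt, a))
    emotion = max(EMOTIONS, key=lambda e: score(prompt, e))
    return f"robot_action({action}) + show_emotion({emotion})"

print(constrained_decode("Can you shake my hand? I'm happy"))
# robot_action(shake_hand) + show_emotion(happy)
```

Because the candidate set is closed, two batched passes are enough, which is where the quoted speedup over autoregressive generation comes from.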
## Training Details
| Parameter | Value |
|---|---|
| Base model | google/functiongemma-270m-it |
| Method | LoRA (rank 8, alpha 16) |
| Training data | ~6,000 examples (545 English + ~5,450 multilingual) |
| Languages | English, Chinese, Japanese, French, German, Spanish |
| Epochs | 3 |
| Learning rate | 2e-4 |
| Batch size | 4 (effective 16 with gradient accumulation) |
| Max sequence length | 512 |
| Precision | bf16 |
| Hardware | NVIDIA RTX 5070 Ti (16 GB) |
Multilingual training data was generated using the Claude API, with two natural phrasings per language for each English prompt, resulting in diverse, natural expressions rather than literal translations.
## Usage

### Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model = AutoModelForCausalLM.from_pretrained(
    "OpenmindAGI/functiongemma-finetuned-g1-multilingual",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("OpenmindAGI/functiongemma-finetuned-g1-multilingual")
model.eval()

# Run a single prompt (exact chat-template behavior may vary by transformers version):
messages = [{"role": "user", "content": "Can you shake hands with me?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
## Citation

```bibtex
@misc{openmindagi-functiongemma-multilingual,
  title={FunctionGemma Robot Actions (Multilingual)},
  author={OpenmindAGI},
  year={2025},
  url={https://huggingface.co/OpenmindAGI/functiongemma-finetuned-g1-multilingual}
}
```
## License
Fine-tuned from google/functiongemma-270m-it under Apache 2.0.