# translategemma-4b-it-f32-GGUF
TranslateGemma-4b-it is a lightweight, 4-billion-parameter instruction-tuned translation model from Google's TranslateGemma family, built on the Gemma 3 architecture. It was trained on TPUv4p/v5p/v5e hardware using supervised fine-tuning (SFT) on human- and Gemini-synthesized parallel corpora spanning 500+ language pairs, followed by reinforcement-learning optimization with MetricX-QE and AutoMQM reward models. The result is high-fidelity text-to-text and image-to-text translation across 55 languages (including Spanish, French, Chinese, Hindi, Arabic, and several low-resource languages), with a 2K–128K-token context window and 896x896 image input support (256 tokens per image).

Despite having half the parameters of comparable baselines, the 4B model rivals 12B models on WMT24++ across all 55 languages (MetricX ↓ 5.32, COMET ↑ 81.6), making it well suited to edge, mobile, laptop, desktop, and cloud deployment. It uses a highly opinionated JSON chat template that requires `source_lang_code` and `target_lang_code` fields and accepts content items of the form `{"type": "text", "text": ...}` or `{"type": "image_url", "url": ...}`, producing output in the target language. The family ships in 4B (mobile/edge), 12B (laptops), and 27B (cloud/H100) sizes under an open license and is usable via Transformers and vLLM, prioritizing efficiency, throughput, and low latency without compromising quality for multilingual and multimodal communication.
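As a minimal sketch of the chat payload described above, the snippet below builds one user turn with the `source_lang_code`/`target_lang_code` fields and the text/image content items. Field names follow this card's description; verify them against the official TranslateGemma chat template before use, as the exact schema may differ.

```python
# Hypothetical helper illustrating the message shape described in the model
# card; not an official TranslateGemma API.

def build_translation_message(text=None, image_url=None,
                              source_lang_code="en", target_lang_code="es"):
    """Build one user turn for TranslateGemma's JSON chat template."""
    content = []
    if text is not None:
        content.append({"type": "text", "text": text})
    if image_url is not None:
        content.append({"type": "image_url", "url": image_url})
    return {
        "role": "user",
        "source_lang_code": source_lang_code,  # language of the input
        "target_lang_code": target_lang_code,  # language to translate into
        "content": content,
    }

msg = build_translation_message(text="Hello, world!",
                                source_lang_code="en",
                                target_lang_code="fr")
print(msg["content"][0]["text"])  # prints "Hello, world!"
```

A list of such messages would then be passed to a Transformers or vLLM chat endpoint that applies the model's template.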
## translategemma-4b-it [GGUF]
| File Name | Quant Type | File Size | File Link |
|---|---|---|---|
| translategemma-4b-it.IQ4_XS.gguf | IQ4_XS | 2.28 GB | Download |
| translategemma-4b-it.Q2_K.gguf | Q2_K | 1.73 GB | Download |
| translategemma-4b-it.Q3_K_L.gguf | Q3_K_L | 2.24 GB | Download |
| translategemma-4b-it.Q3_K_M.gguf | Q3_K_M | 2.1 GB | Download |
| translategemma-4b-it.Q3_K_S.gguf | Q3_K_S | 1.94 GB | Download |
| translategemma-4b-it.Q4_K_M.gguf | Q4_K_M | 2.49 GB | Download |
| translategemma-4b-it.Q4_K_S.gguf | Q4_K_S | 2.38 GB | Download |
| translategemma-4b-it.Q5_K_M.gguf | Q5_K_M | 2.83 GB | Download |
| translategemma-4b-it.Q5_K_S.gguf | Q5_K_S | 2.76 GB | Download |
| translategemma-4b-it.Q6_K.gguf | Q6_K | 3.19 GB | Download |
| translategemma-4b-it.Q8_0.gguf | Q8_0 | 4.13 GB | Download |
| translategemma-4b-it.f16.gguf | F16 | 7.77 GB | Download |
| translategemma-4b-it.f32.gguf | F32 | 15.5 GB | Download |
| translategemma-4b-it.mmproj-Q8_0.gguf | mmproj-Q8_0 | 591 MB | Download |
| translategemma-4b-it.mmproj-f16.gguf | mmproj-f16 | 851 MB | Download |
| translategemma-4b-it.mmproj-f32.gguf | mmproj-f32 | 1.67 GB | Download |
## Quants Usage

The table above is sorted by file size, not necessarily by quality; IQ-quants are often preferable to similarly sized non-IQ quants. ikawrakow has published a handy graph comparing some lower-quality quant types (lower is better); the image itself is not reproduced here.
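As a rough rule of thumb, you can pick the largest quant whose file fits your memory budget while leaving headroom for the KV cache and runtime overhead. A minimal sketch, using the file sizes from the table above (the 1 GB default headroom is an illustrative assumption, not a measured figure):

```python
# File sizes in GB, copied from the quant table above.
QUANT_SIZES_GB = {
    "Q2_K": 1.73, "Q3_K_S": 1.94, "Q3_K_M": 2.10, "Q3_K_L": 2.24,
    "IQ4_XS": 2.28, "Q4_K_S": 2.38, "Q4_K_M": 2.49, "Q5_K_S": 2.76,
    "Q5_K_M": 2.83, "Q6_K": 3.19, "Q8_0": 4.13, "f16": 7.77, "f32": 15.5,
}

def best_quant(budget_gb, headroom_gb=1.0):
    """Largest quant whose file size fits within budget minus headroom."""
    usable = budget_gb - headroom_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items()
               if size <= usable]
    return max(fitting)[1] if fitting else None

print(best_quant(4.0))   # on a 4 GB budget -> "Q5_K_M"
```

Actual memory use also depends on context length and the runtime, so treat this purely as a starting point.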
## Model tree for prithivMLmods/translategemma-4b-it-f32-GGUF

Base model: google/translategemma-4b-it