EraX-Translator-V1.0-mlx-8bit

An 8-bit MLX-VLM conversion of erax-ai/EraX-Translator-V1.0 for Apple Silicon.

Notes

  • Converted locally with mlx-vlm 0.4.0.
  • Quantization result: 9.139 bits per weight.
  • Source model architecture: Gemma3ForConditionalGeneration.
  • This checkpoint is intended for translation tasks; it was tested here on translation into Vietnamese.
  • Local smoke test passed with mlx-vlm text generation.
  • Local evaluation covered 5 translation-into-Vietnamese cases with English, German, modern Chinese, and Classical Chinese source texts.
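The bits-per-weight figure above exceeding 8 is expected: MLX affine quantization stores per-group metadata alongside the 8-bit values, and some tensors stay in bf16. A rough sketch of the arithmetic, assuming float16 scales and biases and the default group size of 64:

```python
# Sketch: why an "8-bit" MLX quantization reports more than 8 bits per weight.
# Assumption: each group of 64 weights carries a float16 scale and a float16
# bias (32 extra bits per group); tensors left in bf16 raise the average more.
def effective_bpw(q_bits: int = 8, group_size: int = 64,
                  overhead_bits: int = 32) -> float:
    """Bits per weight for a fully quantized tensor."""
    return q_bits + overhead_bits / group_size

print(effective_bpw())  # 8.5 -- the floor; unquantized layers push 9.139 above it
```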

Conversion

python3 -m mlx_vlm convert \
  --hf-path erax-ai/EraX-Translator-V1.0 \
  --mlx-path /path/to/EraX-Translator-V1.0-mlx-8bit \
  --dtype bfloat16 \
  --quantize --q-bits 8

Quick Start

python3 - <<'PY'
from mlx_vlm import load
from mlx_vlm.generate import generate

model, processor = load('/path/to/EraX-Translator-V1.0-mlx-8bit')
messages = [
    {"role": "system", "content": "Bạn là trợ lý dịch thuật nhiều ngôn ngữ. Chỉ trả về bản dịch chính xác, không giải thích thêm."},
    {"role": "user", "content": "The weather is nice today, but the traffic is terrible.\n\nDịch sang tiếng Việt."},
]
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
result = generate(model, processor, prompt, verbose=False, max_tokens=128, temperature=0.2, top_p=0.95, top_k=64)
print(result.text)
PY
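To translate several inputs with one loaded model, the message-building step from the snippet above can be factored out. A sketch (`build_messages` is a hypothetical helper, not part of mlx-vlm):

```python
# Hypothetical helper: build the same chat messages as the Quick Start
# snippet for any input text and target language.
SYSTEM_PROMPT = ("Bạn là trợ lý dịch thuật nhiều ngôn ngữ. "
                 "Chỉ trả về bản dịch chính xác, không giải thích thêm.")

def build_messages(text: str, target: str = "tiếng Việt") -> list:
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"{text}\n\nDịch sang {target}."},
    ]

# Feed each result to processor.apply_chat_template(...) and generate(...)
# exactly as in the Quick Start snippet.
```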

Validation Summary

  • Case count: 5
  • Avg generation speed: 64.37 tok/s
  • Avg wall time: 2.83s
  • Max peak memory: 6.31 GB
  • Avg similarity to reference set: 0.2025
  • Preamble leakage: 0 cases
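The metric behind the similarity figure above is not specified here. Purely as an illustration of the kind of score involved (an assumption, not the metric actually used), a character-level ratio can be computed with difflib:

```python
# Illustration only: one simple way to score candidate/reference similarity.
from difflib import SequenceMatcher

def similarity(candidate: str, reference: str) -> float:
    """Character-level similarity ratio in [0, 1]."""
    return SequenceMatcher(None, candidate, reference).ratio()

# similarity("Hôm nay trời đẹp", "Hôm nay thời tiết đẹp")  # partial match < 1.0
```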

Translation Quality

Observed locally:

  • Good fit for everyday chat, product copy, and general news translation into Vietnamese.
  • Much faster and lighter than the bf16 variant.
  • Slightly weaker than bf16 on harder German phrasing, Classical Chinese, and strict name fidelity.

Caution

This model is a translation-tuned checkpoint. It is not intended as a general-purpose coding or math model, and difficult literary or historical material may still require human review.
