---
tags:
- translation
- hy-mt
- quant
- 2bit
language:
- multilingual
base_model:
- tencent/HY-MT1.5-1.8B
---

# AngelSlim

Dedicated to building a more intuitive, comprehensive, and efficient LLM compression toolkit.

📱 Android Demo   |    📣 GGUF   |    ✒️ AngelSlim Report   |    📖 Documentation   |    🤗 AngelSlim   |    💬 WeChat

*Figure: Hy-MT1.5-1.8B translation quality scores. Source: HY-MT1.5 Technical Report.*

## 📣 Latest News

- [26/04/29] We have released **Hy-MT1.5-1.8B-2bit (574MB)** and **Hy-MT1.5-1.8B-1.25bit (440MB)**, on-device translation models supporting 33 languages, available both as raw weights and in GGUF format. We have also built an [Android Demo](https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-2bit-GGUF/resolve/main/Hy-MT-demo.apk?download=true) for you to try out. We invite you to give it a spin! 🔥🔥🔥
- [26/02/09] We released HY-1.8B-2Bit, a 2-bit on-device large language model.
- [26/01/13] We released v0.3, which supports the training and deployment of Eagle3 for LLMs/VLMs/Audio models at all scales. We also released **Sherry**, a hardware-efficient 1.25-bit quantization algorithm. [[Paper]](https://arxiv.org/abs/2601.07892) | [[Code]](https://github.com/Tencent/AngelSlim/tree/sherry/Sherry)

For more detailed information, please refer to [[AngelSlim]](https://github.com/Tencent/AngelSlim) and [[HY-MT]](https://github.com/Tencent-Hunyuan/HY-MT).

## 🌟 Hy-MT1.5-1.8B-2bit Key Features

- **World-Class Translation Quality**: Hy-MT1.5-1.8B-2bit is built on Hy-MT1.5-1.8B, a specialized translation model developed by the Tencent Hunyuan Team through a holistic multi-stage training pipeline that integrates MT-oriented pre-training, supervised fine-tuning, on-policy distillation, and reinforcement learning. The base model natively supports **33 languages**, **5 dialects/minority languages**, and **1,056 translation directions**. With only 1.8B parameters, it comprehensively outperforms much larger open-source models (e.g., Tower-Plus-72B, Qwen3-32B) and mainstream commercial translation APIs (e.g., Microsoft Translator, Doubao Translator). For full details, please refer to the [HY-MT1.5 Technical Report](https://arxiv.org/abs/2512.24092).
- **Ultra-Compact 2-bit Quantization**: Hy-MT1.5-1.8B-2bit employs industry-leading Stretched Elastic Quantization (SEQ) to quantize model weights to the level set `{-1.5, -0.5, 0.5, 1.5}`, combined with quantization-aware distillation.
This compresses the original 3.3GB FP16 model down to just **574MB** while maintaining near-lossless translation quality that surpasses models hundreds of GBs in size. The quantization details are described in the [AngelSlim Technical Report](https://arxiv.org/abs/2602.21233).
- **On-Device Deployment**: Optimized for Arm SME2-capable mobile devices (e.g., Apple M4, vivo X300), the 2-bit model enables fast, fully offline translation directly on your phone; no internet connection is required. Your data never leaves the device, ensuring complete privacy.

## 📈 Translation Benchmarks

Performance comparison of different model sizes on the Flores-200 Chinese-Foreign mutual translation benchmark:

*Figure: Performance of different model sizes on the Flores-200 Chinese-Foreign mutual translation benchmark.*
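As a rough illustration of the 2-bit scheme from the Key Features section, the sketch below rounds each weight to the nearest of the four levels `{-1.5, -0.5, 0.5, 1.5}` with one scale per weight group. The group size and the absolute-mean scale here are simple illustrative choices, not the actual SEQ/AngelSlim procedure:

```python
import numpy as np

# Level set from the Key Features section; each weight becomes scale * level.
LEVELS = np.array([-1.5, -0.5, 0.5, 1.5])

def quantize_2bit(weights: np.ndarray, group_size: int = 64):
    """Return (2-bit level indices, dequantized weights) for a flat weight tensor.

    Illustrative only: the per-group absolute-mean scale is a stand-in
    heuristic, not the scale selection used by SEQ.
    """
    w = weights.reshape(-1, group_size)
    scale = np.abs(w).mean(axis=1, keepdims=True) + 1e-8   # one scale per group
    # Round each normalized weight to the nearest of the four levels.
    idx = np.abs(w[..., None] / scale[..., None] - LEVELS).argmin(axis=-1)
    dequant = LEVELS[idx] * scale                           # reconstruct w ~ scale * level
    return idx.reshape(weights.shape).astype(np.uint8), dequant.reshape(weights.shape)
```

Since every index fits in 2 bits, the per-weight storage cost drops from 16 bits (FP16) to 2 bits plus a small per-group scale overhead.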

## ⚡ Speed Demo

Speed comparison of the 2-bit model on SME2 and Neon kernels:

*Figure: Speed comparison of the 2-bit model on SME2 and Neon kernels.*
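Part of why 2-bit kernels are so compact is that four weight indices pack into a single byte. The sketch below shows one possible packing layout; actual GGUF quantized formats also interleave per-group scales and metadata, so this is a simplified, hypothetical illustration:

```python
import numpy as np

def pack_2bit(indices: np.ndarray) -> np.ndarray:
    """Pack 2-bit indices (values 0..3, length divisible by 4) into bytes,
    four indices per byte, lowest bits first."""
    idx = indices.reshape(-1, 4).astype(np.uint8)
    return idx[:, 0] | (idx[:, 1] << 2) | (idx[:, 2] << 4) | (idx[:, 3] << 6)

def unpack_2bit(packed: np.ndarray) -> np.ndarray:
    """Inverse of pack_2bit: recover the flat index array from packed bytes."""
    out = np.stack([(packed >> s) & 0b11 for s in (0, 2, 4, 6)], axis=1)
    return out.reshape(-1)

# 1.8e9 weights * 2 bits = ~450MB of raw indices; the released 574MB file
# additionally holds scales, embeddings, and metadata (rough arithmetic only).
```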

## 📱 Demo

We provide a ready-to-use Android demo APK for offline translation. The app features a **background word extraction mode** that works across any app on your phone: browse emails, webpages, or chat messages and get instant translations without switching apps. No network required, no data collection, one-time download for permanent use.

**Download Demo:** https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT-demo.apk

### Translation Demo

*Demo device: Snapdragon 865, 8GB RAM.*

### Background Word Extraction Mode

*Demo device: Snapdragon 7+ Gen 2, 16GB RAM.*

## 📥 Download Links

- 2-bit model weights: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-2bit
- 2-bit model GGUF: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-2bit-GGUF
- 1.25-bit model weights: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit
- 1.25-bit model GGUF: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF
- Demo: https://huggingface.co/AngelSlim/Hy-MT1.5-1.8B-1.25bit-GGUF/resolve/main/Hy-MT-demo.apk

## 📄 Technical Reports

- HY-MT1.5 Technical Report: https://arxiv.org/abs/2512.24092
- AngelSlim Technical Report: https://arxiv.org/abs/2602.21233
- Sherry Paper: https://arxiv.org/abs/2601.07892

## 📝 License

The code for this project is open-sourced under the [License for AngelSlim](LICENSE).

## 🔗 Citation

```bibtex
@article{angelslim2026,
  title={AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression},
  author={Hunyuan AI Infra Team},
  journal={arXiv preprint arXiv:2602.21233},
  year={2026}
}

@misc{zheng2025hymt,
  title={HY-MT1.5 Technical Report},
  author={Mao Zheng and Zheng Li and Tao Chen and Mingyang Song and Di Wang},
  year={2025},
  eprint={2512.24092},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2512.24092},
}
```

## 💬 Technical Discussion

* AngelSlim is continuously iterating, and new features will be released soon. If you have any questions or suggestions, please open an issue on [GitHub Issues](https://github.com/Tencent/AngelSlim/issues) or join our [WeChat discussion group](https://github.com/Tencent/AngelSlim/blob/main/docs/source/assets/angel_slim_wechat.png?raw=true).