upload Hy-MT2 files


Files changed (6)

.gitattributes +1 -0
HY_MT2_0_Report.pdf +3 -0
LICENSE.txt +1 -1
README.md +16 -20
README_CN.md +5 -11
imgs/main_result.png +2 -2

.gitattributes CHANGED

@@ -36,3 +36,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
 HY_MT2_0_Technical_Report.pdf filter=lfs diff=lfs merge=lfs -text
 imgs/main_result.png filter=lfs diff=lfs merge=lfs -text
+HY_MT2_0_Report.pdf filter=lfs diff=lfs merge=lfs -text

HY_MT2_0_Report.pdf ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:47d84a5a737994ffe0bd3a74d255698273c292a3acc9dda3902a38ea2440afaf
+size 2326558
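
The three added lines are a Git LFS pointer: the repository stores only the spec version, a SHA-256 object ID, and the byte size, while the PDF itself lives in LFS storage. As a minimal sketch of how a downloaded copy could be checked against this pointer, the Python below parses the pointer text and compares size and digest. The names `LfsPointer`, `parse_lfs_pointer`, and `matches_pointer` are hypothetical helpers for illustration, not part of Git LFS or of this repository.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class LfsPointer:
    version: str  # spec URL from the "version" line
    oid: str      # hex digest, without the "sha256:" prefix
    size: int     # size of the real file in bytes


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return LfsPointer(fields["version"], digest, int(fields["size"]))


def matches_pointer(path: Path, pointer: LfsPointer) -> bool:
    """Return True if the file has the size and SHA-256 recorded in the pointer."""
    if path.stat().st_size != pointer.size:
        return False
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Hash in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer.oid


# Pointer contents copied from the diff above (without the leading "+").
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:47d84a5a737994ffe0bd3a74d255698273c292a3acc9dda3902a38ea2440afaf
size 2326558
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer.size)
```

Calling `matches_pointer(Path("HY_MT2_0_Report.pdf"), pointer)` on a local download would then report whether the file corresponds to the object this commit references; the size check runs first because it is cheap and rules out most mismatches before hashing.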

LICENSE.txt CHANGED

@@ -14,7 +14,7 @@ f.	“Materials” shall mean, collectively, Tencent’s proprietary Tencent HY
 g.	“Model Derivatives” shall mean all: (i) modifications to Tencent HY or any Model Derivative of Tencent HY; (ii) works based on Tencent HY or any Model Derivative of Tencent HY; or (iii) any other machine learning model which is created by transfer of patterns of the weights, parameters, operations, or Output of Tencent HY or any Model Derivative of Tencent HY, to that model in order to cause that model to perform similarly to Tencent HY or a Model Derivative of Tencent HY, including distillation methods, methods that use intermediate data representations, or methods based on the generation of synthetic data Outputs by Tencent HY or a Model Derivative of Tencent HY for training that model. For clarity, Outputs by themselves are not deemed Model Derivatives.
 h.	“Output” shall mean the information and/or content output of Tencent HY or a Model Derivative that results from operating or otherwise using Tencent HY or a Model Derivative, including via a Hosted Service.
 i.	“Tencent,” “We” or “Us” shall mean the applicable entity or entities in the Tencent corporate family that own(s) intellectual property or other rights embodied in or utilized by the Materials.
-j.	“Tencent HY” shall mean the large language models, text/image/video/audio/3D generation models, and multimodal large language models and their software and algorithms, including trained model weights, parameters (including optimizer states), machine-learning model code, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing made publicly available by Us, including, without limitation to, Tencent Hy-MT2-1.8B released at https://huggingface.co/tencent/Hy-MT2-1.8B, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-1.8B; Tencent Hy-MT2-7B released at https://huggingface.co/tencent/Hy-MT2-7B, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-7B; Tencent Hy-MT2-30B-A3B released at https://huggingface.co/tencent/Hy-MT2-30B-A3B, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-30B-A3B; Tencent Hy-MT2-1.8B-FP8 released at https://huggingface.co/tencent/Hy-MT2-1.8B-FP8, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-1.8B-FP8; Tencent Hy-MT2-7B-FP8 released at https://huggingface.co/tencent/Hy-MT2-7B-FP8, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-7B-FP8; Tencent Hy-MT2-30B-A3B-FP8 released at https://huggingface.co/tencent/Hy-MT2-30B-A3B-FP8, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-30B-A3B-FP8; Hy-MT2-1.8B-GGUF released at https://huggingface.co/tencent/Hy-MT2-1.8B-GGUF, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-1.8B-GGUF; Hy-MT2-7B-GGUF released at https://huggingface.co/tencent/Hy-MT2-7B-GGUF, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-7B-GGUF.
+j.	“Tencent HY” shall mean the large language models, text/image/video/audio/3D generation models, and multimodal large language models and their software and algorithms, including trained model weights, parameters (including optimizer states), machine-learning model code, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing made publicly available by Us, including, without limitation to, Tencent Hy-MT2-1.8B released at https://huggingface.co/tencent/Hy-MT2-1.8B, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-1.8B; Tencent Hy-MT2-7B released at https://huggingface.co/tencent/Hy-MT2-7B, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-7B; Tencent Hy-MT2-30B-A3B released at https://huggingface.co/tencent/Hy-MT2-30B-A3B, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-30B-A3B; Tencent Hy-MT2-1.8B-FP8 released at https://huggingface.co/tencent/Hy-MT2-1.8B-FP8, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-1.8B-FP8; Tencent Hy-MT2-7B-FP8 released at https://huggingface.co/tencent/Hy-MT2-7B-FP8, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-7B-FP8; Tencent Hy-MT2-30B-A3B-FP8 released at https://huggingface.co/tencent/Hy-MT2-30B-A3B-FP8, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-30B-A3B-FP8; Hy-MT2-1.8B-GGUF released at https://huggingface.co/tencent/Hy-MT2-1.8B-GGUF, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-1.8B-GGUF; Hy-MT2-7B-GGUF released at https://huggingface.co/tencent/Hy-MT2-7B-GGUF, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-7B-GGUF; Hy-MT2-1.8B-2bit-GGUF released at https://huggingface.co/tencent/Hy-MT2-1.8B-2bit-GGUF, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-1.8B-2bit-GGUF; Hy-MT2-1.8B-2bit-GGUF released at https://huggingface.co/tencent/Hy-MT2-1.8B-1.25bit-GGUF, https://modelscope.cn/models/Tencent-Hunyuan/Hy-MT2-1.8B-1.25bit-GGUF.
 k.	“Tencent HY Works” shall mean: (i) the Materials; (ii) Model Derivatives; and (iii) all derivative works thereof.
 l.	“Territory” shall mean the worldwide territory, excluding the territory of the European Union.
 m.	“Third Party” or “Third Parties” shall mean individuals or legal entities that are not under common control with Us or You.

README.md CHANGED

@@ -17,22 +17,16 @@
 </div>
 <p align="center">
-    🖥️&nbsp;<a href="https://aistudio.tencent.com/"><b>Official Website</b></a>&nbsp;&nbsp;|&nbsp;&nbsp;
+    🖥️&nbsp;<a href="https://aistudio.tencent.com/llm/en?tabIndex=0"><b>Official Website</b></a>&nbsp;&nbsp;|&nbsp;&nbsp;
     💬&nbsp;<a href="https://github.com/Tencent-Hunyuan/Hy-MT2"><b>GitHub</b></a>&nbsp;&nbsp;|&nbsp;&nbsp;
     🪡&nbsp;<a href="https://github.com/Tencent/AngelSlim/tree/main"><b>AngelSlim</b></a></p>
 ## Model Introduction
-**Hy-MT2** is a multilingual machine translation model series covering both Dense and MoE architectures. It includes three fast-thinking models: **Hy-MT2-1.8B, 7B, and 30B-A3B**. The series supports translation among 33 languages and 5 ethnic minority languages / Chinese dialects, as well as multilingual instruction following. The series also provides **1.25-bit extreme quantized versions** based on AngelSlim. Among them, the 1.8B model requires only 440 MB of storage and runs 1.5x faster than traditional 4-bit inference on the Apple A15 chip.
-Evaluation results show that Hy-MT2 performs strongly across multiple scenarios:
-* **General Translation (FLORES-200)**: The average performance of the three models reaches 89.9%, 97.9%, and 98.6% of **Gemini 3.1 Pro (Think)**, respectively. Among them, the 7B and A3B models outperform **DeepSeek-V4-Pro**, while the 1.8B model achieves better overall performance than commercial APIs such as Microsoft Translator.
-* **Real-World Scenarios and Professional Domains (WildMTBench/DomainMTBench)**: The GEMBA scores of the three models reach more than 96%–99% of Gemini 3.1 Pro (Think), and all of them outperform larger open-source models.
-* **Translation Instruction Following (IFMTBench)**: The models significantly outperform open-source models of the same scale, while the A3B model approaches the performance of Gemini 3.1 Pro (Think).
-In summary, Hy-MT2 is an efficient and powerful translation model series designed for complex real-world scenarios.
+Hy-MT2 is a family of “fast-thinking” multilingual translation models designed for complex real-world scenarios. It includes three model sizes: 1.8B, 7B, and 30B-A3B (MoE), all of which support translation among 33 languages and effectively follow translation instructions in multiple languages.
+For on-device deployment, AngelSlim 1.25-bit extreme quantization reduces the storage requirement of the 1.8B model to only 440 MB and improves inference speed by 1.5x.
+Multi-dimensional evaluations show that Hy-MT2 delivers outstanding performance across general, real-world business, domain-specific, and instruction-following translation tasks. The 7B and 30B-A3B models outperform open-source models such as DeepSeek-V4-Pro and Kimi K2.6 in fast-thinking mode, while the lightweight 1.8B model also surpasses mainstream commercial APIs from providers such as Microsoft and Doubao overall.
 In this release, we also open-source [IFMTBench](./IFMTBench/README.md), a benchmark for evaluating translation instruction-following capabilities.
@@ -40,7 +34,7 @@ We also welcome everyone to use our released Hy-MT2-Translator Skill, which make
 ## News
-* 2026.5.21  We open-sourced **Hy-MT2-1.8B**/**Hy-MT2-7B**/**Hy-MT2-30B-A3B** on HuggingFace and ModelScope.
+* 2026.5.21  We open-sourced **Hy-MT2-1.8B**/**Hy-MT2-7B**/**Hy-MT2-30B-A3B**/**IFMTBench** on HuggingFace and ModelScope.
 * 2025.12.30 We open-sourced **HY-MT1.5-1.8B** and **HY-MT1.5-7B** on HuggingFace and ModelScope.
 * 2025.9.1 We open-sourced **Hunyuan-MT-7B** and **Hunyuan-MT-Chimera-7B** on HuggingFace and ModelScope.
@@ -50,21 +44,23 @@ We also welcome everyone to use our released Hy-MT2-Translator Skill, which make
 <img src="imgs/main_result.png" width = "100%" />
 </div>
-For more experimental results and analysis, please refer to our [technical report](./HY_MT2_0_Technical_Report.pdf).
+For more experimental results and analysis, please refer to our [report](./HY_MT2_0_Report.pdf).
 &nbsp;
 ## Model Links
 | Model Name  | Description | Download Link |
 | ----------- | ----------- |-----------
-| Hy-MT2-1.8B  | Hunyuan 1.8B translation model |🤗 [Model](https://huggingface.co/tencent/Hy-MT2-1.8B)|
-| Hy-MT2-1.8B-FP8 | Hunyuan 1.8B translation model, FP8 quantization    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-1.8B-FP8)|
-| Hy-MT2-1.8B-GGUF | Hunyuan 1.8B translation model, llama.cpp    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-1.8B-GGUF)|
-| Hy-MT2-7B | Hunyuan 7B translation model    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-7B)|
-| Hy-MT2-7B-FP8 | Hunyuan 7B translation model, FP8 quantization     | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-7B-FP8)|
-| Hy-MT2-7B-GGUF | Hunyuan 7B translation model, llama.cpp    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-7B-GGUF)|
-| Hy-MT2-30B-A3B | Hunyuan 30B-A3B translation model    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-30B-A3B)|
-| Hy-MT2-30B-A3B-FP8 | Hunyuan 30B-A3B translation model, FP8 quantization     | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-30B-A3B-FP8)|
+| Hy-MT2-1.8B  | Hy 1.8B translation model |🤗 [Model](https://huggingface.co/tencent/Hy-MT2-1.8B)|
+| Hy-MT2-1.8B-FP8 | Hy 1.8B translation model, FP8 quantization    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-1.8B-FP8)|
+| Hy-MT2-1.8B-GGUF | Hy 1.8B translation model, llama.cpp    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-1.8B-GGUF)|
+| Hy-MT2-1.8B-2bit-GGUF | Hy 1.8B translation model, llama.cpp, 2bit    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-1.8B-2bit-GGUF)|
+| Hy-MT2-1.8B-1.25bit-GGUF | Hy 1.8B translation model, llama.cpp, 1.25bit    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-1.8B-1.25bit-GGUF)|
+| Hy-MT2-7B | Hy 7B translation model    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-7B)|
+| Hy-MT2-7B-FP8 | Hy 7B translation model, FP8 quantization     | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-7B-FP8)|
+| Hy-MT2-7B-GGUF | Hy 7B translation model, llama.cpp    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-7B-GGUF)|
+| Hy-MT2-30B-A3B | Hy 30B-A3B translation model    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-30B-A3B)|
+| Hy-MT2-30B-A3B-FP8 | Hy 30B-A3B translation model, FP8 quantization     | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-30B-A3B-FP8)|
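
The README text in this diff quotes 440 MB for the 1.25-bit build of the 1.8B model. As a rough cross-check of how storage scales with bit width, the sketch below computes the size of the weights alone. It assumes exactly 1.8e9 parameters and a single uniform bit width for every weight; neither assumption is stated in the source, and `weight_bytes` is a hypothetical helper.

```python
def weight_bytes(num_params: float, bits_per_weight: float) -> float:
    """Bytes needed to store num_params weights at a uniform bit width."""
    return num_params * bits_per_weight / 8


# Assumption: "1.8B" is taken as exactly 1.8e9 parameters.
NUM_PARAMS = 1.8e9

for bits in (16, 4, 2, 1.25):
    megabytes = weight_bytes(NUM_PARAMS, bits) / 1e6
    print(f"{bits:>5} bits/weight -> {megabytes:,.0f} MB")
```

Under these assumptions the 1.25-bit estimate comes out below the 440 MB quoted in the README. The source does not say what accounts for the difference, so this is only an order-of-magnitude check and not a description of the released file's layout.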

README_CN.md CHANGED

@@ -17,7 +17,7 @@
 </div>
 <p align="center">
-    🖥️&nbsp;<a href="https://aistudio.tencent.com/"><b>Official Website</b></a>&nbsp;&nbsp;|&nbsp;&nbsp;
+    🖥️&nbsp;<a href="https://aistudio.tencent.com/llm/zh?tabIndex=0"><b>Official Website</b></a>&nbsp;&nbsp;|&nbsp;&nbsp;
     💬&nbsp;<a href="https://github.com/Tencent-Hunyuan/Hy-MT2"><b>GitHub</b></a>&nbsp;&nbsp;|&nbsp;&nbsp;
     🪡&nbsp;<a href="https://github.com/Tencent/AngelSlim/tree/main"><b>AngelSlim</b></a></p>
@@ -25,15 +25,7 @@
 ## Model Introduction
-**Hy-MT2** is a multilingual machine translation model series covering both Dense and MoE architectures. It includes three fast-thinking models, **Hy-MT2-1.8B, 7B, and 30B-A3B**, supports translation among 33 languages plus 5 ethnic minority languages / Chinese dialects, and supports multilingual instruction following. The series provides a **1.25-bit extreme quantized version** based on AngelSlim; the 1.8B model requires only 440 MB of storage and runs 1.5x faster than traditional 4-bit inference on the Apple A15 chip.
-Evaluation results show that Hy-MT2 performs strongly across multiple scenarios:
-* **General translation (FLORES-200)**: The average performance of the three models reaches 89.9%, 97.9%, and 98.6% of **Gemini 3.1 Pro (Think)**, respectively. The 7B and A3B models outperform **DeepSeek-V4-Pro**, and the 1.8B model outperforms commercial APIs such as Microsoft Translator overall.
-* **Real-world scenarios and professional domains (WildMTBench/DomainMTBench)**: The GEMBA scores of the three models reach 96%~99% or more of Gemini 3.1 Pro (Think), and all of them outperform larger open-source models.
-* **Translation instruction following (IFMTBench)**: The models substantially outperform open-source models of the same scale, and the A3B model approaches the performance of Gemini 3.1 Pro (Think).
-In summary, Hy-MT2 is an efficient and powerful translation model series for complex real-world scenarios.
+Hy-MT2 is a family of “fast-thinking” multilingual translation models for complex real-world scenarios. It covers three sizes, 1.8B, 7B, and 30B-A3B (MoE), supports translation among 33 languages, and has strong multilingual instruction-following capability. For on-device deployment, thanks to AngelSlim 1.25-bit extreme quantization, the 1.8B model needs only 440 MB of storage and inference speed improves by 1.5x. Multi-dimensional evaluations show that Hy-MT2 performs excellently on general, real-world business, professional-domain, and instruction-following translation tasks: the 7B and 30B-A3B models outperform open-source models such as DeepSeek-V4-Pro and Kimi K2.6 in fast-thinking mode, and the lightweight 1.8B model also surpasses mainstream commercial APIs such as Microsoft and Doubao overall.
 In this release, we also open-source [IFMTBench](./IFMTBench/README_zh.md), a benchmark for evaluating translation instruction-following capabilities.
@@ -51,7 +43,7 @@
 <img src="imgs/main_result.png" width = "100%" />
 </div>
-For more experimental results and analysis, please refer to our [technical report](./HY_MT2_0_Technical_Report.pdf).
+For more experimental results and analysis, please refer to our [report](./HY_MT2_0_Report.pdf).
 &nbsp;
@@ -61,6 +53,8 @@
 | Hy-MT2-1.8B  | Hunyuan 1.8B translation model |🤗 [Model](https://huggingface.co/tencent/Hy-MT2-1.8B)|
 | Hy-MT2-1.8B-FP8 | Hunyuan 1.8B translation model, FP8 quantization    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-1.8B-FP8)|
 | Hy-MT2-1.8B-GGUF | Hunyuan 1.8B translation model, llama.cpp    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-1.8B-GGUF)|
+| Hy-MT2-1.8B-2bit-GGUF | Hunyuan 1.8B translation model, llama.cpp, 2bit    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-1.8B-2bit-GGUF)|
+| Hy-MT2-1.8B-1.25bit-GGUF | Hunyuan 1.8B translation model, llama.cpp, 1.25bit    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-1.8B-1.25bit-GGUF)|
 | Hy-MT2-7B | Hunyuan 7B translation model    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-7B)|
 | Hy-MT2-7B-FP8 | Hunyuan 7B translation model, FP8 quantization     | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-7B-FP8)|
 | Hy-MT2-7B-GGUF | Hunyuan 7B translation model, llama.cpp    | 🤗 [Model](https://huggingface.co/tencent/Hy-MT2-7B-GGUF)|

imgs/main_result.png CHANGED

Git LFS Details

SHA256: 21424944ee1f03fb9ae6217dcd49eeff69a8687fdcb1df69efa4bfbd7405de9b
Pointer size: 132 Bytes
Size of remote file: 5.52 MB

Git LFS Details

SHA256: b87606817fec4fbdaca939de433e4d2a8e92d65493f44222aaaefb381f5a8c6e
Pointer size: 132 Bytes
Size of remote file: 3.84 MB