Commit · a6bf3380
Parent(s):
Duplicate from tencent/Hy3-preview
Co-authored-by: TencentOpen <TencentOpen@users.noreply.huggingface.co>
This view is limited to 50 files because it contains too many changes.
- .gitattributes +39 -0
- LICENSE +78 -0
- README.md +262 -0
- README_CN.md +259 -0
- assets/bench_agent_overview_v3.jpg +3 -0
- assets/bench_claw_agent.png +0 -0
- assets/bench_claw_agent2.jpg +3 -0
- assets/bench_context.jpg +3 -0
- assets/bench_stem.jpg +3 -0
- assets/logo-en.png +0 -0
- assets/logo-zh.png +0 -0
- chat_template.jinja +195 -0
- config.json +46 -0
- generation_config.json +10 -0
- model-00001-of-00112.safetensors +3 -0
- model-00002-of-00112.safetensors +3 -0
- model-00003-of-00112.safetensors +3 -0
- model-00004-of-00112.safetensors +3 -0
- model-00005-of-00112.safetensors +3 -0
- model-00006-of-00112.safetensors +3 -0
- model-00007-of-00112.safetensors +3 -0
- model-00008-of-00112.safetensors +3 -0
- model-00009-of-00112.safetensors +3 -0
- model-00010-of-00112.safetensors +3 -0
- model-00011-of-00112.safetensors +3 -0
- model-00012-of-00112.safetensors +3 -0
- model-00013-of-00112.safetensors +3 -0
- model-00014-of-00112.safetensors +3 -0
- model-00015-of-00112.safetensors +3 -0
- model-00016-of-00112.safetensors +3 -0
- model-00017-of-00112.safetensors +3 -0
- model-00018-of-00112.safetensors +3 -0
- model-00019-of-00112.safetensors +3 -0
- model-00020-of-00112.safetensors +3 -0
- model-00021-of-00112.safetensors +3 -0
- model-00022-of-00112.safetensors +3 -0
- model-00023-of-00112.safetensors +3 -0
- model-00024-of-00112.safetensors +3 -0
- model-00025-of-00112.safetensors +3 -0
- model-00026-of-00112.safetensors +3 -0
- model-00027-of-00112.safetensors +3 -0
- model-00028-of-00112.safetensors +3 -0
- model-00029-of-00112.safetensors +3 -0
- model-00030-of-00112.safetensors +3 -0
- model-00031-of-00112.safetensors +3 -0
- model-00032-of-00112.safetensors +3 -0
- model-00033-of-00112.safetensors +3 -0
- model-00034-of-00112.safetensors +3 -0
- model-00035-of-00112.safetensors +3 -0
- model-00036-of-00112.safetensors +3 -0
.gitattributes
ADDED
@@ -0,0 +1,39 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
assets/bench_agent_overview_v3.jpg filter=lfs diff=lfs merge=lfs -text
assets/bench_claw_agent2.jpg filter=lfs diff=lfs merge=lfs -text
assets/bench_context.jpg filter=lfs diff=lfs merge=lfs -text
assets/bench_stem.jpg filter=lfs diff=lfs merge=lfs -text
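Each rule above routes files matching a glob pattern (the large weight shards, archives, and benchmark images) through Git LFS instead of storing them in the git object database. A rough sketch of how those patterns select paths, using Python's `fnmatch` (note this is only illustrative: real gitattributes glob semantics differ slightly from `fnmatch`, e.g. in how `**` and `/` are treated, and the pattern list here is abbreviated):

```python
import fnmatch
import posixpath

# Abbreviated subset of the LFS patterns above (illustration only).
LFS_PATTERNS = [
    "*.safetensors", "*.bin", "*.ckpt", "*.tar.*", "*tfevents*",
    "saved_model/**/*",
]

def routed_through_lfs(path: str) -> bool:
    """Return True if `path` matches one of the (abbreviated) LFS patterns."""
    for pat in LFS_PATTERNS:
        # Patterns without "/" match against the basename, as in gitattributes.
        target = path if "/" in pat else posixpath.basename(path)
        if fnmatch.fnmatch(target, pat):
            return True
    return False

print(routed_through_lfs("model-00001-of-00112.safetensors"))  # True
print(routed_through_lfs("README.md"))                         # False
```

This is why the 112 `model-*.safetensors` shards in the file list each show as `+3 -0`: the checked-in file is a 3-line LFS pointer, not the weight data itself.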
LICENSE
ADDED
@@ -0,0 +1,78 @@
TENCENT HY COMMUNITY LICENSE AGREEMENT
Tencent Hy3 preview Release Date: April 23, 2026
THIS LICENSE AGREEMENT DOES NOT APPLY IN THE EUROPEAN UNION, UNITED KINGDOM AND SOUTH KOREA AND IS EXPRESSLY LIMITED TO THE TERRITORY, AS DEFINED BELOW.
By clicking to agree or by using, reproducing, modifying, distributing, performing or displaying any portion or element of the Tencent Hy Works, including via any Hosted Service, You will be deemed to have recognized and accepted the content of this Agreement, which is effective immediately.
1. DEFINITIONS.
a. “Acceptable Use Policy” shall mean the policy made available by Tencent as set forth in the Exhibit A.
b. “Agreement” shall mean the terms and conditions for use, reproduction, distribution, modification, performance and displaying of Tencent Hy Works or any portion or element thereof set forth herein.
c. “Documentation” shall mean the specifications, manuals and documentation for Tencent Hy made publicly available by Tencent.
d. “Hosted Service” shall mean a hosted service offered via an application programming interface (API), web access, or any other electronic or remote means.
e. “Licensee,” “You” or “Your” shall mean a natural person or legal entity exercising the rights granted by this Agreement and/or using the Tencent Hy Works for any purpose and in any field of use.
f. “Materials” shall mean, collectively, Tencent’s proprietary Tencent Hy and Documentation (and any portion thereof) as made available by Tencent under this Agreement.
g. “Model Derivatives” shall mean all: (i) modifications to Tencent Hy or any Model Derivative of Tencent Hy; (ii) works based on Tencent Hy or any Model Derivative of Tencent Hy; or (iii) any other machine learning model which is created by transfer of patterns of the weights, parameters, operations, or Output of Tencent Hy or any Model Derivative of Tencent Hy, to that model in order to cause that model to perform similarly to Tencent Hy or a Model Derivative of Tencent Hy, including distillation methods, methods that use intermediate data representations, or methods based on the generation of synthetic data Outputs by Tencent Hy or a Model Derivative of Tencent Hy for training that model. For clarity, Outputs by themselves are not deemed Model Derivatives.
h. “Output” shall mean the information and/or content output of Tencent Hy or a Model Derivative that results from operating or otherwise using Tencent Hy or a Model Derivative, including via a Hosted Service.
i. “Tencent,” “We” or “Us” shall mean the applicable entity or entities in the Tencent corporate family that own(s) intellectual property or other rights embodied in or utilized by the Materials.
j. “Tencent Hy” shall mean the large language models, text/image/video/audio/3D generation models, and multimodal large language models and their software and algorithms, including trained model weights, parameters (including optimizer states), machine-learning model code, inference-enabling code, training-enabling code, fine-tuning enabling code and other elements of the foregoing made publicly available by Us, including, without limitation to, Tencent Hy3 preview released at [https://huggingface.co/tencent/Hy3-preview; https://github.com/Tencent-Hunyuan/Hy3-preview].
k. “Tencent Hy Works” shall mean: (i) the Materials; (ii) Model Derivatives; and (iii) all derivative works thereof.
l. “Territory” shall mean the worldwide territory, excluding the territory of the European Union, United Kingdom and South Korea.
m. “Third Party” or “Third Parties” shall mean individuals or legal entities that are not under common control with Us or You.
n. “including” shall mean including but not limited to.
2. GRANT OF RIGHTS.
We grant You, for the Territory only, a non-exclusive, non-transferable and royalty-free limited license under Tencent’s intellectual property or other rights owned by Us embodied in or utilized by the Materials to use, reproduce, distribute, create derivative works of (including Model Derivatives), and make modifications to the Materials, only in accordance with the terms of this Agreement and the Acceptable Use Policy, and You must not violate (or encourage or permit anyone else to violate) any term of this Agreement or the Acceptable Use Policy.
3. DISTRIBUTION.
You may, subject to Your compliance with this Agreement, distribute or make available to Third Parties the Tencent Hy Works, exclusively in the Territory, provided that You meet all of the following conditions:
a. You must provide all such Third Party recipients of the Tencent Hy Works or products or services using them a copy of this Agreement;
b. You must cause any modified files to carry prominent notices stating that You changed the files;
c. You are encouraged to: (i) publish at least one technology introduction blogpost or one public statement expressing Your experience of using the Tencent Hy Works; and (ii) mark the products or services developed by using the Tencent Hy Works to indicate that the product/service is “Powered by Tencent Hy”; and
d. All distributions to Third Parties (other than through a Hosted Service) must be accompanied by a “Notice” text file that contains the following notice: “Tencent Hy is licensed under the Tencent Hy Community License Agreement, Copyright © 2026 Tencent. All Rights Reserved. The trademark rights of “Tencent Hy” are owned by Tencent or its affiliate.”
e. In the event that You use, integrate, implement, or otherwise deploy the Tencent Hy Works, in whole or in part, to provide, enable, or support any service, product, or functionality to third parties, You shall clearly, accurately, and prominently disclose to all end users the full legal name and entity of the actual provider of such service, product, or functionality. You shall expressly and conspicuously state that Tencent is not affiliated with, associated with, sponsoring, or endorsing any such service, product, or functionality. You shall not use or display any name, logo, trademark, trade name, or other indicia of Tencent in any manner that could be construed as, or be likely to create, confusion, deception, or a false impression regarding any relationship, affiliation, sponsorship, or endorsement by Tencent.
You may add Your own copyright statement to Your modifications and, except as set forth in this Section and in Section 5, may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Model Derivatives as a whole, provided Your use, reproduction, modification, distribution, performance and display of the work otherwise complies with the terms and conditions of this Agreement (including as regards the Territory). If You receive Tencent Hy Works from a Licensee as part of an integrated end user product, then this Section 3 of this Agreement will not apply to You.
4. ADDITIONAL COMMERCIAL TERMS.
If, on the Tencent Hy version release date, the monthly active users of all products or services made available by or for Licensee is greater than 100 million monthly active users in the preceding calendar month, You must request a license from Tencent, which Tencent may grant to You in its sole discretion, and You are not authorized to exercise any of the rights under this Agreement unless or until Tencent otherwise expressly grants You such rights.
5. RULES OF USE.
a. Your use of the Tencent Hy Works must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Tencent Hy Works, which is hereby incorporated by reference into this Agreement. You must include the use restrictions referenced in these Sections 5(a) and 5(b) as an enforceable provision in any agreement (e.g., license agreement, terms of use, etc.) governing the use and/or distribution of Tencent Hy Works and You must provide notice to subsequent users to whom You distribute that Tencent Hy Works are subject to the use restrictions in these Sections 5(a) and 5(b).
b. You must not use the Tencent Hy Works or any Output or results of the Tencent Hy Works to improve any other AI model (other than Tencent Hy or Model Derivatives thereof).
c. You must not use, reproduce, modify, distribute, or display the Tencent Hy Works, Output or results of the Tencent Hy Works outside the Territory. Any such use outside the Territory is unlicensed and unauthorized under this Agreement.
6. INTELLECTUAL PROPERTY.
a. Subject to Tencent’s ownership of Tencent Hy Works made by or for Tencent and intellectual property rights therein, conditioned upon Your compliance with the terms and conditions of this Agreement, as between You and Tencent, You will be the owner of any derivative works and modifications of the Materials and any Model Derivatives that are made by or for You.
b. No trademark licenses are granted under this Agreement, and in connection with the Tencent Hy Works, Licensee may not use any name or mark owned by or associated with Tencent or any of its affiliates, except as required for reasonable and customary use in describing and distributing the Tencent Hy Works. Tencent hereby grants You a license to use “Tencent Hy” (the “Mark”) in the Territory solely as required to comply with the provisions of Section 3(c), provided that You comply with any applicable laws related to trademark protection. All goodwill arising out of Your use of the Mark will inure to the benefit of Tencent.
c. If You commence a lawsuit or other proceedings (including a cross-claim or counterclaim in a lawsuit) against Us or any person or entity alleging that the Materials or any Output, or any portion of any of the foregoing, infringe any intellectual property or other right owned or licensable by You, then all licenses granted to You under this Agreement shall terminate as of the date such lawsuit or other proceeding is filed. You will defend, indemnify and hold harmless Us from and against any claim by any Third Party arising out of or related to Your or the Third Party’s use or distribution of the Tencent Hy Works.
d. Tencent claims no rights in Outputs You generate. You and Your users are solely responsible for Outputs and their subsequent uses.
7. DISCLAIMERS OF WARRANTY AND LIMITATIONS OF LIABILITY.
a. We are not obligated to support, update, provide training for, or develop any further version of the Tencent Hy Works or to grant any license thereto.
b. UNLESS AND ONLY TO THE EXTENT REQUIRED BY APPLICABLE LAW, THE Tencent Hy WORKS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED “AS IS” WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES OF ANY KIND INCLUDING ANY WARRANTIES OF TITLE, MERCHANTABILITY, NONINFRINGEMENT, COURSE OF DEALING, USAGE OF TRADE, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING, REPRODUCING, MODIFYING, PERFORMING, DISPLAYING OR DISTRIBUTING ANY OF THE Tencent Hy WORKS OR OUTPUTS AND ASSUME ANY AND ALL RISKS ASSOCIATED WITH YOUR OR A THIRD PARTY’S USE OR DISTRIBUTION OF ANY OF THE Tencent Hy WORKS OR OUTPUTS AND YOUR EXERCISE OF RIGHTS AND PERMISSIONS UNDER THIS AGREEMENT.
c. TO THE FULLEST EXTENT PERMITTED BY APPLICABLE LAW, IN NO EVENT SHALL TENCENT OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, FOR ANY DAMAGES, INCLUDING ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, EXEMPLARY, CONSEQUENTIAL OR PUNITIVE DAMAGES, OR LOST PROFITS OF ANY KIND ARISING FROM THIS AGREEMENT OR RELATED TO ANY OF THE Tencent Hy WORKS OR OUTPUTS, EVEN IF TENCENT OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.
8. SURVIVAL AND TERMINATION.
a. The term of this Agreement shall commence upon Your acceptance of this Agreement or access to the Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein.
b. We may terminate this Agreement if You breach any of the terms or conditions of this Agreement. Upon termination of this Agreement, You must promptly delete and cease use of the Tencent Hy Works. Sections 6(a), 6(c), 7 and 9 shall survive the termination of this Agreement.
9. GOVERNING LAW AND JURISDICTION.
a. This Agreement and any dispute arising out of or relating to it will be governed by the laws of the Hong Kong Special Administrative Region of the People’s Republic of China, without regard to conflict of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement.
b. Exclusive jurisdiction and venue for any dispute arising out of or relating to this Agreement will be a court of competent jurisdiction in the Hong Kong Special Administrative Region of the People’s Republic of China, and Tencent and Licensee consent to the exclusive jurisdiction of such court with respect to any such dispute.

EXHIBIT A
ACCEPTABLE USE POLICY

Tencent reserves the right to update this Acceptable Use Policy from time to time.
Last modified: December 30, 2025

Tencent endeavors to promote safe and fair use of its tools and features, including Tencent Hy. You agree not to use Tencent Hy or Model Derivatives:
1. Outside the Territory;
2. In any way that violates any applicable national, federal, state, local, international or any other law or regulation;
3. To harm Yourself or others;
4. To repurpose or distribute output from Tencent Hy or any Model Derivatives to harm Yourself or others;
5. To override or circumvent the safety guardrails and safeguards We have put in place;
6. For the purpose of exploiting, harming or attempting to exploit or harm minors in any way;
7. To generate or disseminate verifiably false information and/or content with the purpose of harming others or influencing elections;
8. To generate or facilitate false online engagement, including fake reviews and other means of fake online engagement;
9. To intentionally defame, disparage or otherwise harass others;
10. To generate and/or disseminate malware (including ransomware) or any other content to be used for the purpose of harming electronic systems;
11. To generate or disseminate personal identifiable information with the purpose of harming others;
12. To generate or disseminate information (including images, code, posts, articles), and place the information in any public context (including through the use of bot-generated tweets), without expressly and conspicuously identifying that the information and/or content is machine generated;
13. To impersonate another individual without consent, authorization, or legal right;
14. To make high-stakes automated decisions in domains that affect an individual’s safety, rights or wellbeing (e.g., law enforcement, migration, medicine/health, management of critical infrastructure, safety components of products, essential services, credit, employment, housing, education, social scoring, or insurance);
15. In a manner that violates or disrespects the social ethics and moral standards of other countries or regions;
16. To perform, facilitate, threaten, incite, plan, promote or encourage violent extremism or terrorism;
17. For any use intended to discriminate against or harm individuals or groups based on protected characteristics or categories, online or offline social behavior or known or predicted personal or personality characteristics;
18. To intentionally exploit any of the vulnerabilities of a specific group of persons based on their age, social, physical or mental characteristics, in order to materially distort the behavior of a person pertaining to that group in a manner that causes or is likely to cause that person or another person physical or psychological harm;
19. For military purposes;
20. To engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or other professional practices.
README.md
ADDED
@@ -0,0 +1,262 @@
---
license: other
library_name: transformers
---
<p align="left">
    <a href="https://huggingface.co/tencent/Hy3-preview/blob/main/README_CN.md">中文</a> | English
</p>
<br>

<p align="center">
    <img src="assets/logo-en.png" width="400"/> <br>
</p>

<div align="center" style="line-height: 1;">

[](#license)
[](https://huggingface.co/tencent/Hy3-preview)
[](https://modelscope.cn/models/Tencent-Hunyuan/Hy3-preview)
[](https://cnb.cool/ai-models/tencent/Hy3-preview)
[](https://ai.gitcode.com/tencent_hunyuan/Hy3-preview)

</div>

<p align="center">
    🖥️ <a href="https://aistudio.tencent.com/"><b>Official Website</b></a> |
    💬 <a href="https://github.com/Tencent-Hunyuan/Hy3-preview"><b>GitHub</b></a>
</p>

---

## Table of Contents

- [Model Introduction](#model-introduction)
- [Highlights](#highlights)
- [Benchmark Results](#benchmark-results)
  - [STEM & Reasoning](#stem--reasoning)
  - [Context Learning & Instruction Following](#context-learning--instruction-following)
  - [Code & Agent](#code--agent)
- [News](#news)
- [Model Links](#model-links)
- [Quickstart](#quickstart)
- [Deployment](#deployment)
  - [vLLM](#vllm)
  - [SGLang](#sglang)
- [Training](#training)
- [Quantization](#quantization)
- [License](#license)
- [Contact Us](#contact-us)

---

## Model Introduction

**Hy3 preview** is a 295B-parameter Mixture-of-Experts (MoE) model with 21B active parameters and 3.8B MTP layer parameters, developed by the Tencent Hy Team. Hy3 preview is the first model trained on our rebuilt infrastructure, and the strongest we've shipped so far. It improves significantly on complex reasoning, instruction following, context learning, coding, and agent tasks.

| Property | Value |
|:---|:---|
| Architecture | Mixture-of-Experts (MoE) |
| Total Parameters | 295B |
| Activated Parameters | 21B |
| MTP Layer Parameters | 3.8B |
| Number of Layers (excluding MTP layer) | 80 |
| Number of MTP Layers | 1 |
| Attention Heads | 64 (GQA, 8 KV heads, head dim 128) |
| Hidden Size | 4096 |
| Intermediate Size | 13312 |
| Context Length | 256K |
| Vocabulary Size | 120832 |
| Number of Experts | 192 experts, top-8 activated |
| Supported Precisions | BF16 |

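The activation ratios the table implies can be checked with a few lines of arithmetic. This is only the arithmetic the published numbers support; the exact split between dense (shared) and routed-expert parameters is not given in the table, so no per-component breakdown is attempted here:

```python
# Ratios implied by the spec table above (illustrative arithmetic only).
TOTAL_PARAMS_B = 295   # total parameters, billions
ACTIVE_PARAMS_B = 21   # parameters activated per token, billions
NUM_EXPERTS = 192
TOP_K = 8              # experts routed per token

expert_fraction = TOP_K / NUM_EXPERTS            # fraction of routed experts used
active_ratio = ACTIVE_PARAMS_B / TOTAL_PARAMS_B  # activated / total parameters

print(f"experts used per token: {TOP_K}/{NUM_EXPERTS} = {expert_fraction:.4f}")
print(f"activated/total params: {active_ratio:.3f}")
```

So each token touches 1/24 of the routed experts, and roughly 7% of the total parameter count, which is the core efficiency argument for the MoE design.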
## Highlights
|
| 78 |
+
|
| 79 |
+
- **STEM & Reasoning** — Complex reasoning underpins everything else. Hy3 preview performs well on challenging STEM benchmarks like FrontierScience-Olympiad and IMOAnswerBench, and achieved excellent results in the Tsinghua Qiuzhen College Math PhD qualifying exam (Spring '26) and the China High School Biology Olympiad (CHSBO 2025), demonstrating generalizable reasoning capacity.
|
| 80 |
+
|
| 81 |
+
- **Context Learning & Instruction Following** — Real-world tasks require the ability to parse messy, lengthy contexts and follow complex rules. We built CL-bench and CL-bench-Life from our own business scenarios to innovatively measure context learning ability. Hy3 preview exhibits solid gains in both context learning and instruction following capabilities.
|
| 82 |
+
|
| 83 |
+
- **Code & Agent** — Coding and agents saw the biggest gains. With a rebuilt RL infrastructure and larger-scale training tasks, we posted competitive scores across mainstream coding agent benchmarks (SWE-bench Verified, Terminal-Bench 2.0) and search agent benchmarks (BrowseComp, WideSearch).
|
| 84 |
+
|
| 85 |
+
## Benchmark Results
|
| 86 |
+
|
| 87 |
+
### Pre-trained Model Performance
|
| 88 |
+
|
| 89 |
+
| Category | Benchmark (Metric) | # Shots | Kimi-K2 BASE | DeepSeek-V3 BASE | GLM-4.5 BASE | Hy3 preview-Base |
|
| 90 |
+
|---|---|---|---|---|---|---|
|
| 91 |
+
| | #ActivatedParams | - | 32B | 37B | 32B | 21B |
|
| 92 |
+
| | #TotalParams | - | 1043B | 671B | 355B | 295B |
|
| 93 |
+
| **English** | MMLU | 5-shot | **88.24** | 87.68 | 87.73 | 87.42 |
|
| 94 |
+
| | MMLU-Pro | 5-shot | **65.98** | 63.98 | 63.67 | 65.76 |
|
| 95 |
+
| | MMLU-Redux | 5-shot | **87.18** | 86.81 | 86.56 | 86.86 |
|
| 96 |
+
| | ARC-Challenge | 0-shot | **96.66** | 94.65 | 96.32 | 95.99 |
|
| 97 |
+
| | DROP | 5-shot | 86.40 | **86.50** | 82.90 | 85.50 |
|
| 98 |
+
| | PIQA | 4-shot | **84.93** | 84.22 | 84.71 | 84.39 |
|
| 99 |
+
| | SuperGPQA | 5-shot | 51.10 | 46.17 | 49.64 | **51.60** |
|
| 100 |
+
| | SimpleQA | 5-shot | **34.37** | 26.15 | 29.26 | 26.47 |
|
| 101 |
+
| **Code** | MBPP-plus | 3-shot | **81.35** | 75.47 | 78.05 | 78.71 |
|
| 102 |
+
| | CRUXEval-I | 3-shot | 68.01 | 67.79 | 68.51 | **71.19** |
|
| 103 |
+
| | CRUXEval-O | 3-shot | 69.62 | **71.00** | 67.75 | 68.38 |
|
| 104 |
+
| | LiveCodeBench-v6 | 1-shot | 30.86 | 29.31 | 27.43 | **34.86** |
|
| 105 |
+
| **Math** | GSM8K | 4-shot | 93.46 | 88.15 | 90.06 | **95.37** |
|
| 106 |
+
| | MATH | 4-shot | 71.20 | 59.37 | 61.00 | **76.28** |
|
| 107 |
+
| | CMath | 4-shot | 90.83 | 85.50 | 89.33 | **91.17** |
|
| 108 |
+
| **Chinese** | C-Eval | 5-shot | **91.51** | 90.35 | 85.84 | 89.80 |
|
| 109 |
+
| | CMMLU | 5-shot | **90.72** | 87.90 | 86.46 | 89.61 |
|
| 110 |
+
| | Chinese-simpleQA | 5-shot | **74.58** | 68.72 | 68.49 | 69.73 |
|
| 111 |
+
| **Multilingual** | MMMLU | 5-shot | 77.63 | 79.54 | 79.26 | **80.15** |
|
| 112 |
+
| | INCLUDE | 5-shot | 75.66 | 77.86 | 76.27 | **78.64** |
|
| 113 |
+
|
| 114 |
+
### Instruct Model Performance
|
| 115 |
+
|
| 116 |
+
#### STEM & Reasoning
|
| 117 |
+
|
| 118 |
+
Complex reasoning underpins everything else. Hy3 preview performs well on challenging STEM benchmarks like FrontierScience-Olympiad and IMOAnswerBench. It also achieved excellent results in the Tsinghua Qiuzhen College Math PhD qualifying exam (Spring '26) and the China High School Biology Olympiad (CHSBO 2025), demonstrating a high degree of generalizable reasoning capacity.
|
| 119 |
+
|
| 120 |
+
<p align="center"><img src="assets/bench_stem.jpg" width="800" alt="STEM & Reasoning benchmarks"/></p>

#### Context Learning & Instruction Following

Real-world tasks require the ability to parse messy, lengthy contexts and follow complex rules. We built CL-bench and CL-bench-Life from our own business scenarios as a novel measure of context-learning ability. Hy3 preview exhibits solid gains in both context learning and instruction following.

<p align="center"><img src="assets/bench_context.jpg" width="800" alt="Context Learning & Instruction Following benchmarks"/></p>

#### Code & Agent

Coding and agents saw the biggest gains. With a rebuilt RL infrastructure and larger-scale training tasks, we posted competitive scores across mainstream coding agent benchmarks (SWE-bench Verified, Terminal-Bench 2.0) and search agent benchmarks (BrowseComp, WideSearch).

<p align="center"><img src="assets/bench_agent_overview_v3.jpg" width="800" alt="Agent benchmarks overview"/></p>

Coding is about whether a model can execute in a development environment. Search is about whether it can find and combine information from the open web. Both matter for complex agent scenarios like OpenClaw. Hy3 preview scores well on ClawEval and WildClawBench — a sign that its agent capabilities are becoming practical.

<p align="center"><img src="assets/bench_claw_agent.png" width="800" alt="Claw Agent benchmarks"/></p>

Beyond public benchmarks, we built internal evaluation sets to test the model in real development scenarios. On Hy-Backend (backend-focused tasks), Hy-Vibe Bench (real-user dev workflows), and Hy-SWE Max, Hy3 preview scores competitively against other open-source models.

<p align="center"><img src="assets/bench_claw_agent2.jpg" width="800" alt="Internal benchmarks"/></p>

## News

* **[2026-04-23]** 🔥 We open-source **Hy3 preview** model weights on [Hugging Face](https://huggingface.co/tencent/Hy3-preview), [ModelScope](https://modelscope.cn/models/Tencent-Hunyuan/Hy3-preview), and [GitCode](https://ai.gitcode.com/tencent_hunyuan/Hy3-preview).

## Model Links

| Model Name | Description | Hugging Face | ModelScope | GitCode |
|:---|:---|:---:|:---:|:---:|
| Hy3 preview | Instruct model | 🤗 [Model](https://huggingface.co/tencent/Hy3-preview) | [Model](https://modelscope.cn/models/Tencent-Hunyuan/Hy3-preview) | [Model](https://ai.gitcode.com/tencent_hunyuan/Hy3-preview) |
| Hy3 preview-Base | Pre-trained base model | 🤗 [Model](https://huggingface.co/tencent/Hy3-preview-Base) | [Model](https://modelscope.cn/models/Tencent-Hunyuan/Hy3-preview-Base) | [Model](https://ai.gitcode.com/tencent_hunyuan/Hy3-preview-Base) |

## Quickstart

Deploy Hy3 preview with [vLLM](#vllm) or [SGLang](#sglang) first, then call the OpenAI-compatible API:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="hy3-preview",
    messages=[
        {"role": "user", "content": "Hello! Can you briefly introduce yourself?"},
    ],
    temperature=0.9,
    top_p=1.0,
    # reasoning_effort: "no_think" (default, direct response), "low", "high" (deep chain-of-thought)
    extra_body={"chat_template_kwargs": {"reasoning_effort": "no_think"}},
)
print(response.choices[0].message.content)
```

> **Recommended parameters**: `temperature=0.9`, `top_p=1.0`.
>
> **Reasoning mode**: Set `reasoning_effort` to `"high"` for complex tasks (math, coding, reasoning) or `"no_think"` for direct responses.

See the [Deployment](#deployment) section below for how to start the API server.

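The notes above can be wired into a small helper that picks `extra_body` per task type. This is an illustrative sketch, not part of the official API: the task names and their mapping are our own assumptions; only the three `reasoning_effort` values (`"no_think"`, `"low"`, `"high"`) come from the documentation above.

```python
# Hypothetical helper: map a task type to the documented reasoning_effort values.
# The task names and mapping are illustrative assumptions, not an official API.
EFFORT_BY_TASK = {
    "chat": "no_think",   # everyday conversation: direct response
    "summarize": "low",   # light reasoning
    "math": "high",       # complex tasks: deep chain-of-thought
    "code": "high",
}

def extra_body_for(task: str) -> dict:
    """Build the extra_body payload for client.chat.completions.create."""
    effort = EFFORT_BY_TASK.get(task, "no_think")
    return {"chat_template_kwargs": {"reasoning_effort": effort}}
```

The result can then be passed as `extra_body=extra_body_for("math")` in the Quickstart call above.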
## Deployment

Hy3 preview has 295B parameters in total. To serve it on 8 GPUs, we recommend H20-3e or other GPUs with larger memory capacity.

### vLLM

Build vLLM from source:
```bash
uv venv --python 3.12 --seed --managed-python
source .venv/bin/activate
git clone https://github.com/vllm-project/vllm.git
cd vllm
uv pip install --editable . --torch-backend=auto
```

Start the vLLM server with MTP enabled:

```bash
vllm serve tencent/Hy3-preview \
    --tensor-parallel-size 8 \
    --speculative-config.method mtp \
    --speculative-config.num_speculative_tokens 1 \
    --tool-call-parser hy_v3 \
    --reasoning-parser hy_v3 \
    --enable-auto-tool-choice \
    --served-model-name hy3-preview
```

### SGLang

Build SGLang from source:
```bash
git clone https://github.com/sgl-project/sglang
cd sglang
pip3 install pip --upgrade
pip3 install "transformers>=5.6.0"
pip3 install -e "python"
```

Launch the SGLang server with MTP enabled:

```bash
python3 -m sglang.launch_server \
    --model tencent/Hy3-preview \
    --tp 8 \
    --tool-call-parser hunyuan \
    --reasoning-parser hunyuan \
    --speculative-num-steps 1 \
    --speculative-eagle-topk 1 \
    --speculative-num-draft-tokens 2 \
    --speculative-algorithm EAGLE \
    --served-model-name hy3-preview
```

## Training

Hy3 preview provides a complete model training pipeline, supporting both full fine-tuning and LoRA fine-tuning, with DeepSpeed ZeRO configurations and LLaMA-Factory integration.

For detailed training documentation, please refer to: [Training Guide](./train/README.md)

## Quantization

We provide [AngelSlim](https://github.com/tencent/AngelSlim), an accessible, comprehensive, and efficient toolkit for large-model compression. It covers common quantization algorithms, low-bit quantization, and speculative sampling for large-scale multimodal models.

## License

Hy3 preview is released under the **Tencent Hy Community License Agreement**. See [LICENSE](./LICENSE) for details.

## Contact Us

If you would like to leave a message for our R&D and product teams, feel free to contact us. You can also reach us via email:

📧 **hunyuan_opensource@tencent.com**

---

<p align="center">
<i>Hy3 preview is developed by the Tencent Hy Team.</i>
</p>
README_CN.md
ADDED
@@ -0,0 +1,259 @@
<p align="left">
<a href="https://huggingface.co/tencent/Hy3-preview">English</a> | 中文
</p>
<br>

<p align="center">
<img src="assets/logo-zh.png" width="400"/> <br>
</p>

<div align="center" style="line-height: 1;">

[](#license)
[](https://huggingface.co/tencent/Hy3-preview)
[](https://modelscope.cn/models/Tencent-Hunyuan/Hy3-preview)
[](https://cnb.cool/ai-models/tencent/Hy3-preview)
[](https://ai.gitcode.com/tencent_hunyuan/Hy3-preview)

</div>

<p align="center">
🖥️ <a href="https://aistudio.tencent.com/"><b>Official Website</b></a> |
💬 <a href="https://github.com/Tencent-Hunyuan/Hy3-preview"><b>GitHub</b></a></p>

---

## Table of Contents

- [Model Introduction](#model-introduction)
- [Highlights](#highlights)
- [Benchmark Results](#benchmark-results)
  - [STEM & Reasoning](#stem--reasoning)
  - [Context Learning & Instruction Following](#context-learning--instruction-following)
  - [Code & Agent](#code--agent)
- [News](#news)
- [Model Links](#model-links)
- [Quickstart](#quickstart)
- [Inference and Deployment](#inference-and-deployment)
  - [vLLM](#vllm)
  - [SGLang](#sglang)
- [Training](#training)
- [Quantization](#quantization)
- [License](#license)
- [Contact Us](#contact-us)

---

## Model Introduction

**Hy3 preview** is a Mixture-of-Experts model developed by the Tencent Hunyuan team that unifies fast and slow thinking, with 295B total parameters, 21B activated parameters, and a 3.8B-parameter MTP layer. Hy3 preview is the first model from our rebuilt post-training pipeline and the most intelligent Hunyuan model to date, delivering substantial gains in complex reasoning, instruction following, context learning, coding, agent capabilities, and inference performance.

| Attribute | Value |
|:---|:---|
| Architecture | Mixture of Experts (MoE) |
| Total parameters | 295B |
| Activated parameters | 21B |
| MTP layer parameters | 3.8B |
| Layers (excluding MTP layer) | 80 |
| MTP layers | 1 |
| Attention heads | 64 (GQA, 8 KV heads, head dim 128) |
| Hidden size | 4096 |
| FFN intermediate size | 13312 |
| Context length | 256K |
| Vocabulary size | 120832 |
| Experts | 192 experts, top-8 activated |
| Supported precision | BF16 |

## Highlights

- **STEM & Reasoning** — Reasoning is the foundation for solving all kinds of problems. The model stands out on demanding STEM reasoning tasks such as FrontierScience-Olympiad and IMOAnswerBench, and achieves excellent results on the latest Tsinghua University Qiuzhen College math PhD qualifying exam (Spring 2026) and the Chinese High School Biology Olympiad (CHSBO 2025), demonstrating strong, generalizable reasoning.

- **Context Learning & Instruction Following** — In real production and everyday scenarios, understanding messy, lengthy contexts and following complex, changing rules is the model's first challenge. Inspired by our diverse business scenarios, we propose CL-bench and CL-bench-Life as a novel way to evaluate context learning, and Hy3 preview significantly improves both context learning and instruction following.

- **Code & Agent** — The area where Hy3 preview improves most. Thanks to rebuilt pre-training and reinforcement learning infrastructure and larger-scale RL training tasks, we rapidly achieved highly competitive results on mainstream coding agent benchmarks such as SWE-bench Verified and Terminal-Bench 2.0, and on mainstream search agent benchmarks such as BrowseComp and WideSearch.

## Benchmark Results

### Base Model Performance

| Category | Benchmark (Metric) | # Shots | Kimi-K2 BASE | DeepSeek-V3 BASE | GLM-4.5 BASE | Hy3 preview-Base |
|---|---|---|---|---|---|---|
| | #ActivatedParams | - | 32B | 37B | 32B | 21B |
| | #TotalParams | - | 1043B | 671B | 355B | 295B |
| **English** | MMLU | 5-shot | **88.24** | 87.68 | 87.73 | 87.42 |
| | MMLU-Pro | 5-shot | **65.98** | 63.98 | 63.67 | 65.76 |
| | MMLU-Redux | 5-shot | **87.18** | 86.81 | 86.56 | 86.86 |
| | ARC-Challenge | 0-shot | **96.66** | 94.65 | 96.32 | 95.99 |
| | DROP | 5-shot | 86.40 | **86.50** | 82.90 | 85.50 |
| | PIQA | 4-shot | **84.93** | 84.22 | 84.71 | 84.39 |
| | SuperGPQA | 5-shot | 51.10 | 46.17 | 49.64 | **51.60** |
| | SimpleQA | 5-shot | **34.37** | 26.15 | 29.26 | 26.47 |
| **Code** | MBPP-plus | 3-shot | **81.35** | 75.47 | 78.05 | 78.71 |
| | CRUXEval-I | 3-shot | 68.01 | 67.79 | 68.51 | **71.19** |
| | CRUXEval-O | 3-shot | 69.62 | **71.00** | 67.75 | 68.38 |
| | LiveCodeBench-v6 | 1-shot | 30.86 | 29.31 | 27.43 | **34.86** |
| **Math** | GSM8K | 4-shot | 93.46 | 88.15 | 90.06 | **95.37** |
| | MATH | 4-shot | 71.20 | 59.37 | 61.00 | **76.28** |
| | CMath | 4-shot | 90.83 | 85.50 | 89.33 | **91.17** |
| **Chinese** | C-Eval | 5-shot | **91.51** | 90.35 | 85.84 | 89.80 |
| | CMMLU | 5-shot | **90.72** | 87.90 | 86.46 | 89.61 |
| | Chinese-simpleQA | 5-shot | **74.58** | 68.72 | 68.49 | 69.73 |
| **Multilingual** | MMMLU | 5-shot | 77.63 | 79.54 | 79.26 | **80.15** |
| | INCLUDE | 5-shot | 75.66 | 77.86 | 76.27 | **78.64** |

### Instruct Model Performance

#### STEM & Reasoning

Reasoning is the foundation for solving all kinds of problems. Hy3 preview stands out on demanding STEM reasoning tasks such as FrontierScience-Olympiad and IMOAnswerBench, and achieves excellent results on the latest Tsinghua University Qiuzhen College math PhD qualifying exam (Spring 2026) and the Chinese High School Biology Olympiad (CHSBO 2025), demonstrating strong, generalizable reasoning.

<p align="center"><img src="assets/bench_stem.jpg" width="800" alt="STEM & Reasoning benchmarks"/></p>

#### Context Learning & Instruction Following

In real production and everyday scenarios, understanding messy, lengthy contexts and following complex, changing rules is the model's first challenge. Inspired by our diverse business scenarios, we propose CL-bench and CL-bench-Life as a novel way to evaluate context learning, and Hy3 preview significantly improves both context learning and instruction following.

<p align="center"><img src="assets/bench_context.jpg" width="800" alt="Context Learning & Instruction Following benchmarks"/></p>

#### Code & Agent

Code and agents are where Hy3 preview improves most. Thanks to rebuilt pre-training and reinforcement learning infrastructure and larger-scale RL training tasks, we rapidly achieved highly competitive results on mainstream coding agent benchmarks such as SWE-bench Verified and Terminal-Bench 2.0, and on mainstream search agent benchmarks such as BrowseComp and WideSearch.

<p align="center"><img src="assets/bench_agent_overview_v3.jpg" width="800" alt="Agent benchmarks overview"/></p>

In the digital world, coding tests a model's ability to execute in a development environment, while search tests its ability to retrieve, filter, and combine information in open information spaces; together they determine whether a model is genuinely usable in complex agent scenarios such as OpenClaw. Hy3 preview performs strongly on evaluations such as ClawEval and WildClawBench, further showing that its agent capabilities are both broad and practical.

<p align="center"><img src="assets/bench_claw_agent.png" width="800" alt="Claw Agent benchmarks"/></p>

Beyond public leaderboards, we built several internal evaluation sets to assess the model in real development scenarios. Whether on the backend engineering task set Hy-Backend, the real-user development workflow benchmark Hy-Vibe Bench, or the high-difficulty software engineering task set Hy-SWE Max, Hy3 preview proves highly competitive.

<p align="center"><img src="assets/bench_claw_agent2.jpg" width="800" alt="Internal benchmarks"/></p>

## News

* **[2026-04-23]** 🔥 We open-sourced the **Hy3 preview** model weights on [Hugging Face](https://huggingface.co/tencent/Hy3-preview), [ModelScope](https://modelscope.cn/models/Tencent-Hunyuan/Hy3-preview), and [GitCode](https://ai.gitcode.com/tencent_hunyuan/Hy3-preview).

## Model Links

| Model Name | Description | Hugging Face | ModelScope | GitCode |
|:---|:---|:---:|:---:|:---:|
| Hy3 preview | Instruct model | 🤗 [Model](https://huggingface.co/tencent/Hy3-preview) | [Model](https://modelscope.cn/models/Tencent-Hunyuan/Hy3-preview) | [Model](https://ai.gitcode.com/tencent_hunyuan/Hy3-preview) |
| Hy3 preview-Base | Pre-trained base model | 🤗 [Model](https://huggingface.co/tencent/Hy3-preview-Base) | [Model](https://modelscope.cn/models/Tencent-Hunyuan/Hy3-preview-Base) | [Model](https://ai.gitcode.com/tencent_hunyuan/Hy3-preview-Base) |

## Quickstart

We recommend deploying a server with [vLLM](#vllm) or [SGLang](#sglang) first, then calling the OpenAI-compatible API:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="hy3-preview",
    messages=[
        {"role": "user", "content": "你好!请简单介绍一下你自己。"},
    ],
    temperature=0.9,
    top_p=1.0,
    # reasoning_effort: "no_think" (default, direct response), "low", "high" (deep chain-of-thought)
    extra_body={"chat_template_kwargs": {"reasoning_effort": "no_think"}},
)
print(response.choices[0].message.content)
```

> **Recommended parameters**: `temperature=0.9`, `top_p=1.0`.
>
> **Reasoning mode**: For complex tasks (math, coding, reasoning), set `reasoning_effort="high"`; for everyday conversation, the default `"no_think"` gives a direct response.

See the [Inference and Deployment](#inference-and-deployment) section below for deployment details.

## Inference and Deployment

Hy3 preview has 295B total parameters. When serving on 8 GPUs, we recommend H20-3e or other GPUs with larger memory capacity.

### vLLM

Build vLLM from source:

```bash
uv venv --python 3.12 --seed --managed-python
source .venv/bin/activate
git clone https://github.com/vllm-project/vllm.git
cd vllm
uv pip install --editable . --torch-backend=auto
```

Start the vLLM server with MTP enabled:

```bash
vllm serve tencent/Hy3-preview \
    --tensor-parallel-size 8 \
    --speculative-config.method mtp \
    --speculative-config.num_speculative_tokens 1 \
    --tool-call-parser hy_v3 \
    --reasoning-parser hy_v3 \
    --enable-auto-tool-choice \
    --served-model-name hy3-preview
```

### SGLang

Build SGLang from source:

```bash
git clone https://github.com/sgl-project/sglang
cd sglang
pip3 install pip --upgrade
pip3 install "transformers>=5.6.0"
pip3 install -e "python"
```

Launch the SGLang server with MTP enabled:

```bash
python3 -m sglang.launch_server \
    --model tencent/Hy3-preview \
    --tp 8 \
    --tool-call-parser hunyuan \
    --reasoning-parser hunyuan \
    --speculative-num-steps 1 \
    --speculative-eagle-topk 1 \
    --speculative-num-draft-tokens 2 \
    --speculative-algorithm EAGLE \
    --served-model-name hy3-preview
```

## Training

Hy3 preview provides a complete model training pipeline, supporting both full fine-tuning and LoRA fine-tuning, with multiple DeepSpeed ZeRO configurations and LLaMA-Factory integration.

For detailed training documentation, see the [Training Guide](./train/README_CN.md).

## Quantization

We provide [AngelSlim](https://github.com/tencent/AngelSlim), an accessible, comprehensive, and efficient toolkit for large-model compression, covering common quantization algorithms, low-bit quantization, and speculative sampling.

## License

Hy3 preview is released under the **Tencent Hy Community License Agreement**. See [LICENSE](./LICENSE) for details.

## Contact Us

For questions or suggestions, feel free to reach us by email:

📧 **hunyuan_opensource@tencent.com**

---

<p align="center">
<i>Hy3 preview is developed by the Tencent Hy Team.</i>
</p>
assets/bench_agent_overview_v3.jpg
ADDED
Git LFS Details

assets/bench_claw_agent.png
ADDED

assets/bench_claw_agent2.jpg
ADDED
Git LFS Details

assets/bench_context.jpg
ADDED
Git LFS Details

assets/bench_stem.jpg
ADDED
Git LFS Details

assets/logo-en.png
ADDED

assets/logo-zh.png
ADDED

chat_template.jinja
ADDED
@@ -0,0 +1,195 @@
{#- ---------- special token variables ---------- -#}
{%- set bos_token = '<|hy_begin▁of▁sentence|>' %}
{%- set pad_token = '<|hy_▁pad▁|>' %}
{%- set user_token = '<|hy_User|>' %}
{%- set assistant_token = '<|hy_Assistant|>' %}
{%- set eos_token = '<|hy_eos|>' %}
{%- set think_begin_token = '<think>' %}
{%- set think_end_token = '</think>' %}
{%- set toolcalls_begin_token = '<tool_calls>' %}
{%- set toolcalls_end_token = '</tool_calls>' %}
{%- set toolcall_begin_token = '<tool_call>' %}
{%- set toolcall_end_token = '</tool_call>' %}
{%- set toolsep_token = '<tool_sep>' %}
{%- set argkey_begin_token = '<arg_key>' %}
{%- set argkey_end_token = '</arg_key>' %}
{%- set argvalue_begin_token = '<arg_value>' %}
{%- set argvalue_end_token = '</arg_value>' %}
{%- set toolresponses_begin_token = '<tool_responses>' %}
{%- set toolresponses_end_token = '</tool_responses>' %}
{%- set toolresponse_begin_token = '<tool_response>' %}
{%- set toolresponse_end_token = '</tool_response>' %}
{%- set reasoning_mode_token = '<|reasoning_mode|>' %}
{#- ---------- hyperparameters variables ---------- -#}
{%- if not add_generation_prompt is defined %}
  {%- set add_generation_prompt = false %}
{%- endif %}
{%- if not interleaved_thinking is defined %}
  {%- set interleaved_thinking = false %}
{%- endif %}
{%- if not tools %}
  {%- set interleaved_thinking = false %}
{%- endif %}
{%- if not is_training is defined %}
  {%- set is_training = false %}
{%- endif %}
{%- if not reasoning_effort is defined or reasoning_effort not in ['high', 'low', 'no_think'] %}
  {%- set reasoning_effort = 'no_think' %}
{%- endif %}

{%- macro visible_text(content) -%}
  {%- if content is string -%}
    {{- content }}
  {%- elif content is iterable and content is not mapping -%}
    {%- for item in content -%}
      {%- if item is mapping and item.type == 'text' -%}
        {{- item.text }}
      {%- elif item is string -%}
        {{- item }}
      {%- endif -%}
    {%- endfor -%}
  {%- elif content is none -%}
    {{- '' }}
  {%- else -%}
    {{- content }}
  {%- endif -%}
{%- endmacro -%}

{%- set ns = namespace(last_user_index=-1) %}
{%- set sp_ns = namespace(system_prompt='', is_first_sp=true) %}
{%- for message in messages %}
  {%- if message['role'] == 'system' %}
    {%- set sp_ns.system_prompt = sp_ns.system_prompt + visible_text(message['content']) %}
  {%- endif %}
  {%- if message['role'] == 'user' %}
    {%- set ns.last_user_index = loop.index0 %}
  {%- endif %}
{%- endfor %}
{%- if reasoning_effort is defined and reasoning_effort is string and reasoning_effort != '' and not tools %}
  {%- set sp_ns.system_prompt = sp_ns.system_prompt + reasoning_mode_token + 'reasoning_effort:' + reasoning_effort %}
{%- endif %}
{{- bos_token }}
{{- sp_ns.system_prompt }}
{%- if tools %}
  {%- if sp_ns.system_prompt != '' %}
    {{- '\n\n# Tools\n\nYou may call one or more functions to assist with the user query.' }}
  {%- else %}
    {{- '# Tools\n\nYou may call one or more functions to assist with the user query.' }}
  {%- endif %}
  {{- '\n\nYou are provided with function signatures within <tools></tools> XML tags:' }}
  {{- '\n<tools>\n' }}
  {%- for tool in tools %}
    {%- if loop.index0 > 0 %}
      {{- '\n' }}
    {%- endif %}
    {{- tool | tojson }}
  {%- endfor %}
  {{- '\n</tools>\n\n' }}
  {{- 'For function call returns, you should first print ' + toolcalls_begin_token + '\n' }}
  {{- 'For each function call, you should return object like:\n' }}
  {{- toolcall_begin_token + '{function-name}' + toolsep_token + '\n' }}
  {{- argkey_begin_token + '{arg-key-1}' + argkey_end_token + '\n' }}
  {{- argvalue_begin_token + '{arg-value-1}' + argvalue_end_token + '\n' }}
  {{- argkey_begin_token + '{arg-key-2}' + argkey_end_token + '\n' }}
  {{- argvalue_begin_token + '{arg-value-2}' + argvalue_end_token + '\n' }}
  {{- '...\n' }}
  {{- toolcall_end_token + '\n' }}
  {%- if reasoning_effort is defined and reasoning_effort is string and reasoning_effort != '' %}
    {{- 'At the end of function call returns, you should print ' + toolcalls_end_token + reasoning_mode_token + 'reasoning_effort:' + reasoning_effort }}
  {%- else %}
    {{- 'At the end of function call returns, you should print ' + toolcalls_end_token }}
  {%- endif %}
{%- endif %}

{%- set prev_ns = namespace(is_tool=false, is_tool_first=true) %}
{%- set last_ns = namespace(last_is_assistant=false) %}
{%- for message in messages %}
  {%- if message['role'] == 'user' %}
    {%- if prev_ns.is_tool %}
      {{- toolresponses_end_token }}
    {%- endif %}
    {{- user_token + visible_text(message['content']) }}
    {%- set prev_ns.is_tool = false %}
  {%- endif %}
  {%- if message['role'] == 'assistant' %}
    {%- if 'reasoning_content' in message and message['reasoning_content'] is string %}
      {%- set rc = message['reasoning_content'] %}
    {%- elif 'reasoning' in message and message['reasoning'] is string %}
      {%- set rc = message['reasoning'] %}
    {%- else %}
      {%- set rc = none %}
    {%- endif %}
    {%- if is_training %}
      {%- if rc is not none %}
        {%- set content = think_begin_token + rc + think_end_token + visible_text(message['content']) %}
      {%- else %}
        {%- set content = think_begin_token + think_end_token + visible_text(message['content']) %}
      {%- endif %}
    {%- else %}
      {%- if interleaved_thinking %}
        {%- if loop.index0 > ns.last_user_index and rc is not none %}
          {%- set content = think_begin_token + rc + think_end_token + visible_text(message['content']) %}
        {%- else %}
          {%- set content = think_begin_token + think_end_token + visible_text(message['content']) %}
        {%- endif %}
      {%- else %}
        {%- set content = think_begin_token + think_end_token + visible_text(message['content']) %}
      {%- endif %}
    {%- endif %}
    {%- if prev_ns.is_tool %}
      {{- toolresponses_end_token }}
    {%- endif %}
    {{- assistant_token }}
    {%- if message['tool_calls'] is defined and message['tool_calls'] %}
      {%- set prev_ns.is_tool_first = true %}
      {{- content }}
      {{- toolcalls_begin_token + '\n' }}
      {%- for tool in message['tool_calls'] %}
        {%- set arguments = tool['function']['arguments'] %}
        {{- toolcall_begin_token + tool['function']['name'] + toolsep_token + '\n' }}
        {%- for key, value in arguments.items() %}
          {{- argkey_begin_token + key + argkey_end_token + '\n' }}
          {%- if value is not string %}
            {%- set value = value | tojson(ensure_ascii=False) %}
          {%- endif %}
          {{- argvalue_begin_token + value + argvalue_end_token + '\n' }}
        {%- endfor %}
        {{- toolcall_end_token + '\n' }}
      {%- endfor %}
      {{- toolcalls_end_token + eos_token }}
    {%- else %}
      {%- if not loop.last or is_training %}
        {{- content + eos_token }}
      {%- else %}
        {{- content }}
      {%- endif %}
    {%- endif %}
    {%- set prev_ns.is_tool = false %}
  {%- endif %}
  {%- if message['role'] == 'tool' %}
    {%- set prev_ns.is_tool = true %}
    {%- if prev_ns.is_tool_first %}
      {{- toolresponses_begin_token + '\n' }}
      {%- set prev_ns.is_tool_first = false %}
    {%- endif %}
    {{- toolresponse_begin_token + '\n' + visible_text(message['content']) + '\n' + toolresponse_end_token + '\n' }}
  {%- endif %}
  {%- if loop.last and message['role'] == 'assistant' %}
    {%- set last_ns.last_is_assistant = true %}
  {%- endif %}
{%- endfor %}
{%- if prev_ns.is_tool %}
  {{- toolresponses_end_token }}
{%- endif %}
{%- if add_generation_prompt %}
  {%- if not last_ns.last_is_assistant %}
    {%- if reasoning_effort is defined and reasoning_effort in ['low', 'high'] %}
      {{- assistant_token + think_begin_token }}
    {%- elif reasoning_effort is defined and reasoning_effort == 'no_think' %}
      {{- assistant_token + think_begin_token + think_end_token }}
    {%- else %}
      {{- assistant_token }}
    {%- endif %}
  {%- endif %}
{%- endif %}
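To make the tail of the template above concrete, here is a small Python sketch that mirrors its `add_generation_prompt` branch. This is a re-implementation for illustration only; the actual rendering is performed by the Jinja template itself.

```python
# Special tokens copied from chat_template.jinja above.
ASSISTANT = "<|hy_Assistant|>"
THINK_BEGIN = "<think>"
THINK_END = "</think>"

def generation_prompt(reasoning_effort):
    """Mirror of the add_generation_prompt branch of chat_template.jinja."""
    if reasoning_effort in ("low", "high"):
        # Thinking modes: open a <think> block for the model to fill in.
        return ASSISTANT + THINK_BEGIN
    if reasoning_effort == "no_think":
        # Direct-response mode: emit an empty, already-closed think block.
        return ASSISTANT + THINK_BEGIN + THINK_END
    return ASSISTANT
```

This explains why `reasoning_effort="no_think"` produces a direct answer: the think block is already closed before the model starts generating.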
config.json
ADDED
@@ -0,0 +1,46 @@
{
  "architectures": [
    "HYV3ForCausalLM"
  ],
  "bos_token_id": 120000,
  "enable_attention_fp32_softmax": false,
  "enable_lm_head_fp32": true,
  "enable_moe_fp32_combine": false,
  "eod_token_id": 120026,
  "eos_token_id": 120025,
  "expert_hidden_dim": 1536,
  "moe_intermediate_size": 1536,
  "first_k_dense_replace": 1,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.006,
  "intermediate_size": 13312,
  "max_position_embeddings": 262144,
  "model_type": "hy_v3",
  "moe_router_enable_expert_bias": true,
  "moe_router_use_sigmoid": true,
  "num_attention_heads": 64,
  "num_experts": 192,
  "num_experts_per_tok": 8,
  "num_hidden_layers": 80,
  "num_key_value_heads": 8,
  "num_shared_experts": 1,
  "output_router_logits": true,
  "pad_token_id": 120002,
  "qk_norm": true,
  "rms_norm_eps": 1e-05,
  "rope_parameters": {
    "rope_theta": 11158840.0,
    "rope_type": "default"
  },
  "route_norm": true,
  "router_scaling_factor": 2.826,
  "sep_token_id": 120007,
  "tie_word_embeddings": false,
  "transformers_version": "5.6.0",
  "use_cache": true,
  "use_grouped_mm": false,
  "vocab_size": 120832,
  "num_nextn_predict_layers": 1
}
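A quick sanity check on the MoE numbers in this config. The values below are copied from the JSON above; the SwiGLU three-matrix assumption for the per-expert FFN size is our own inference from `"hidden_act": "silu"`, not stated in the config.

```python
# Values copied from config.json above.
num_experts = 192
num_experts_per_tok = 8
num_shared_experts = 1
hidden_size = 4096
moe_intermediate_size = 1536

# Experts active per token: top-8 routed plus 1 shared expert.
active_experts = num_experts_per_tok + num_shared_experts

# Fraction of the routed expert pool used per token.
routed_fraction = num_experts_per_tok / num_experts

# Rough per-expert FFN weight count, assuming SwiGLU (gate, up, down projections).
params_per_expert = 3 * hidden_size * moe_intermediate_size
```

So each token touches 9 experts, about 4% of the routed pool, which is consistent with the 295B total / 21B activated split stated in the README.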
generation_config.json
ADDED
@@ -0,0 +1,10 @@
{
  "bos_token_id": 120000,
  "do_sample": true,
  "eos_token_id": 120025,
  "pad_token_id": 120002,
  "temperature": 0.9,
  "top_k": -1,
  "top_p": 1,
  "transformers_version": "5.6.0"
}
model-00001-of-00112.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:90e847b6678d4eca7fb1035a4ed69bf6ae516a8af22f1913b3d1c8c16bb7bb02
size 5360373392
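The checkpoint shards in this commit are Git LFS pointer files rather than the weight data itself. The three-line `version`/`oid`/`size` layout is the public Git LFS pointer format; the small parser below is our own illustrative helper, applied to the pointer shown above.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a key/value dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer content of model-00001-of-00112.safetensors, copied from above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:90e847b6678d4eca7fb1035a4ed69bf6ae516a8af22f1913b3d1c8c16bb7bb02
size 5360373392"""

info = parse_lfs_pointer(pointer)
```

`info["size"]` gives the shard size in bytes (roughly 5.36 GB for this shard), and `info["oid"]` the SHA-256 of the real file stored in LFS.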
model-00002-of-00112.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9c11912601b43eb7504502ce96b1e3e467fe1e79b6262ec710046831b223b878
size 5360373496

model-00003-of-00112.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f52ecc0d02f3437be93acd1f84e041fd3c3526414e26f4f278353f2b6e1cbbd4
size 5360373688

model-00004-of-00112.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3e459c5906fdd3d74c782f808c83954180cd4621c4555090e8682ca84d1e4711
size 5360374032

model-00005-of-00112.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:12f8cbe56038861e940d7255d4ff8f336b62c520fa38bc2d2cb92b5b72c971e3
size 5360373848

model-00006-of-00112.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e960d297ad0f085ddb12d6cd0fe3f8bb1fa3709dc42e6f2cf505480c298e5f51
size 5360373920

model-00007-of-00112.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4515656bd97cce407b0c03115063b9287d916f21b01c1811b4b34c0c4e56621e
size 5360374008

model-00008-of-00112.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c753613d3cfa739b7fc7e67843e21d815202fb8fb8b8f02e949db81322ac6055
size 5360373984

model-00009-of-00112.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c61b6dcf801b2708a457c108d15126e38c8bd8b8337318c62875dec5e6478632
size 5360373880

model-00010-of-00112.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c6ea52bd9b876da21bce317d13c317d9f66f1430245ff876773b03a3f0dfa95e
size 5360373912

model-00011-of-00112.safetensors
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:4b62e6d7bb7993b3d620ecf2a0763789dea0f67d8c91d598121aa52a3a979ef4
|
| 3 |
+
size 5360374032
|
model-00012-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:e6c0e5494404af20850d9da8a1b785c355e5b550f9f5225e3b0058d900a0828c
|
| 3 |
+
size 5360373936
|
model-00013-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:f58b0036daf75e9e00ac5f8e775df57b42f555a12d73a63465673ae9a10a17b8
|
| 3 |
+
size 5360373904
|
model-00014-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:44950c905ea6004f44427bce3e89296883267b101012bc8844a6aa54568d9be4
|
| 3 |
+
size 5358512160
|
model-00015-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:c34b1a5e2503ffeea3ef621d65619b170fc0c7711e83e82781af74898013dc78
|
| 3 |
+
size 5360373448
|
model-00016-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:02e312d724067b16dae51ab2b600b7a3d4f5a7ec7cfc75dac41420e81a26c3b2
|
| 3 |
+
size 5360373808
|
model-00017-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:cccf4c47b080f76d88110f0916474684db79a36ec5a1855f4aa3764ef6c6b852
|
| 3 |
+
size 5360373952
|
model-00018-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:65849b7ca8545808cf7988298b127b4fb0b36ef9388ab82d4bab126f8bf23748
|
| 3 |
+
size 5360374072
|
model-00019-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:f4dcfbbddfa1aa0ad4019277ff99a6615dc066a34ee8c06d746342dbc441a51a
|
| 3 |
+
size 5360373840
|
model-00020-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:3cd78310c23ed7ef6e312b88f22d13c9ad1df66edcb79b574c02dc0f851f3f7c
|
| 3 |
+
size 5360373920
|
model-00021-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:0f4a69b81ef2f7d923c017aaa01aaeed1c3140c666a74f44e264daa0adcda6bc
|
| 3 |
+
size 5360373976
|
model-00022-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:fde0b3f0a6b820302008791b3f7fe4814080a93911c9a9625711cf2c1e427f58
|
| 3 |
+
size 5360374056
|
model-00023-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:ffdb5e74e26e68cae3ff0768014227f84223951af15cb8f68b7f01ee45750ada
|
| 3 |
+
size 5360373832
|
model-00024-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:6fa06a5e7357fb1b8b8710ae7cbfc604712ebdea0750942e50406c8063f2d4fd
|
| 3 |
+
size 5360373920
|
model-00025-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:87e397b688a497b5ce4e817cbd573f906e59ef8c9a8e7aaf2a02ede8a369fff9
|
| 3 |
+
size 5360374000
|
model-00026-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:ae79ba6c96af3d3ac248c90d93eaa9c00415e9947200390c1b666756a733e279
|
| 3 |
+
size 5360374008
|
model-00027-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:f057e8fb8e29d98b7120e5121222d4f7934b6e494ef27938d11145dd657e1cad
|
| 3 |
+
size 5360373864
|
model-00028-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:9a2cfdbfe8b103ef2063b7b77ec4b2fe88ee554231b77a9dc9beb3821d65233d
|
| 3 |
+
size 5368749928
|
model-00029-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:8107f87181761eae6b679d14bf8510f8a065a11423d51a092e590476a364c5d2
|
| 3 |
+
size 5360373520
|
model-00030-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:afc59493fddced04f2886ee24a4162a883d9d13e5def3763381faf65da837bab
|
| 3 |
+
size 5360373832
|
model-00031-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:af1807843453f56257d5876be8ac91bf8cdeb748d442cc61e464a8ede15a0f17
|
| 3 |
+
size 5360373912
|
model-00032-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:fcf127e13a5bf43a7a93fe00a5e5b11a65b8619f1626fd6d9160e336f3a2f0c8
|
| 3 |
+
size 5360374056
|
model-00033-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:22f5dced1b39ab4140b9b1e299fe4562c6d34930219dc92cf398b2b2ade49121
|
| 3 |
+
size 5360373896
|
model-00034-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:7fe694dfe69f2f9fd79d381425e0432e6566fc9d64cdb2af71161e1305fe6824
|
| 3 |
+
size 5360373920
|
model-00035-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:69de4a9a83282a408cb41737a7263f74abbfc676ecc0d0e60ac44dbb9e73d081
|
| 3 |
+
size 5360373928
|
model-00036-of-00112.safetensors
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:60163fc29bec4e72eb15b9cc43cfb89a23b0f342e93ac4c6cf2d1b77a4750f55
|
| 3 |
+
size 5360374064
|