stevenkuang committed on
Commit a9cc306 · verified · 1 Parent(s): 2a9a783

Update README.md

Files changed (1)
  1. README.md +27 -14
README.md CHANGED
@@ -83,6 +83,33 @@ For more experimental results and analysis, please refer to our [report](./HY_MT
83
  ---
84
 
85
  ## Inference and Deployment
86
  ### transformers
87
 
88
  transformers>=5.6.0
@@ -194,20 +221,6 @@ cmake --build build --config Release
194
  ```
195
 
196
 
197
- For 1.8B and 7B, we recommend using the following parameters for inference. Note that our models do not have a default system_prompt.
198
-
199
- ```json
200
-
201
- {
202
- "temperature": 0.7,
203
- "top_p": 0.6,
204
- "top_k": 20,
205
- "repetition_penalty": 1.05,
206
- "max_tokens": 4096
207
- }
208
- ```
209
-
210
-
211
  ## Model Training
212
  Hy-MT2 provides a complete model training pipeline, supporting both full-parameter fine-tuning and LoRA fine-tuning, as well as multiple DeepSpeed ZeRO configurations and LLaMA-Factory integration.
213
 
 
83
  ---
84
 
85
  ## Inference and Deployment
86
+
87
+ For 1.8B and 7B, we recommend using the following parameters for inference. Note that our models do not have a default system_prompt.
88
+
89
+ ```json
90
+
91
+ {
92
+ "temperature": 0.7,
93
+ "top_p": 0.6,
94
+ "top_k": 20,
95
+ "repetition_penalty": 1.05,
96
+ "max_tokens": 4096
97
+ }
98
+ ```
99
+
100
+ For 30B-A3B, we recommend using the following parameters for inference. Note that our models do not have a default system_prompt.
101
+
102
+ ```json
103
+
104
+ {
105
+ "temperature": 0.7,
106
+ "top_p": 1.0,
107
+ "top_k": -1,
108
+ "repetition_penalty": 1.0,
109
+ "max_tokens": 4096
110
+ }
111
+ ```
112
+
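The two recommended parameter sets above can be kept in one lookup table so that the right one is picked per model size. The following is a minimal sketch, not part of the commit: the names `RECOMMENDED_PARAMS`, `recommended_params`, and `to_generate_kwargs` are hypothetical, and the mapping to `generate()`-style keyword names (`max_new_tokens`, `do_sample`, and `top_k = 0` for "no top-k filtering") is an assumption about how `-1` is meant to be read, not something the README states.

```python
# Recommended sampling parameters, copied from the README values above.
_SMALL_MODEL_PARAMS = {
    "temperature": 0.7,
    "top_p": 0.6,
    "top_k": 20,
    "repetition_penalty": 1.05,
    "max_tokens": 4096,
}

RECOMMENDED_PARAMS = {
    "1.8B": _SMALL_MODEL_PARAMS,
    "7B": _SMALL_MODEL_PARAMS,
    "30B-A3B": {
        "temperature": 0.7,
        "top_p": 1.0,
        "top_k": -1,
        "repetition_penalty": 1.0,
        "max_tokens": 4096,
    },
}


def recommended_params(model_size: str) -> dict:
    """Return a copy of the recommended parameters for a model size."""
    try:
        return dict(RECOMMENDED_PARAMS[model_size])
    except KeyError:
        known = ", ".join(RECOMMENDED_PARAMS)
        raise ValueError(f"unknown model size {model_size!r}; expected one of: {known}") from None


def to_generate_kwargs(params: dict) -> dict:
    """Hypothetical helper: rename keys to generate()-style keyword names.

    Assumption: top_k = -1 means "top-k filtering disabled", written here as 0.
    """
    return {
        "do_sample": True,
        "temperature": params["temperature"],
        "top_p": params["top_p"],
        "top_k": params["top_k"] if params["top_k"] > 0 else 0,
        "repetition_penalty": params["repetition_penalty"],
        "max_new_tokens": params["max_tokens"],
    }


if __name__ == "__main__":
    print(to_generate_kwargs(recommended_params("30B-A3B")))
```

Because no default system_prompt exists, a caller would typically pass only the user message alongside these parameters. Check the key names against the inference backend actually in use before relying on this mapping.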
113
  ### transformers
114
 
115
  transformers>=5.6.0
 
221
  ```
222
 
223
 
224
  ## Model Training
225
  Hy-MT2 provides a complete model training pipeline, supporting both full-parameter fine-tuning and LoRA fine-tuning, as well as multiple DeepSpeed ZeRO configurations and LLaMA-Factory integration.
226