stevenkuang committed · Commit 355487d · verified · 1 Parent(s): 1be4dbc

Update README.md

  1. README.md +39 -0
README.md CHANGED
@@ -181,6 +181,45 @@ Launch SGLang server:
  python3 -m sglang.launch_server --model tencent/Hy-MT2-1.8B-FP8 --tp 1
  ```
 
+ ### llama.cpp
+ **❕❕ This GGUF file depends on our STQ kernel, which is available in [PR #22836](https://github.com/ggml-org/llama.cpp/pull/22836).**
+
+ #### Clone llama.cpp
+
+ ```bash
+ git clone https://github.com/ggml-org/llama.cpp.git
+ ```
+
+ #### Enter the llama.cpp folder
+
+ ```bash
+ cd llama.cpp
+ ```
+
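The note at the top of this section ties the GGUF file to the STQ kernel from PR #22836, while a plain `git clone` checks out the default branch, which may not contain that kernel yet. A minimal sketch for fetching a pull request head before building is shown here. It assumes the PR is still unmerged and that the remote is named `origin` (the `git clone` default). The helper name `fetch_pr` and the local branch name are hypothetical and not part of the README.

```bash
# Hypothetical helper: fetch a GitHub pull request head into a local branch
# and switch to it. GitHub publishes each PR head under refs/pull/<number>/head.
# Assumption: run inside the llama.cpp checkout, with a remote named "origin".
fetch_pr() {
  local pr="$1"       # pull request number, e.g. 22836
  local branch="$2"   # local branch name to create (any unused name)
  git fetch origin "pull/${pr}/head:${branch}" && git checkout "${branch}"
}
```

For example, `fetch_pr 22836 stq-kernel` would create and check out a local branch named `stq-kernel`. If the kernel has since been merged into the default branch, this step can be skipped.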
+ #### Build llama.cpp
+
+ ```bash
+ cmake -B build
+ cmake --build build --config Release
+ ```
+
+ #### Run a completion example
+
+ ```bash
+ ./build/bin/llama-completion \
+ --model model.gguf \
+ -p "Translate the following segment into Chinese, without additional explanation:Hello" \
+ --jinja \
+ -ngl 0 \
+ -n 64 -st
+ ```
+
+ #### Run the llama.cpp benchmark
+
+ ```bash
+ ./build/bin/llama-bench -m model_zoo/model.gguf -ngl 0
+ ```
+
 
  ## Model Training
  Hy-MT2 provides a complete model training pipeline, supporting both full-parameter fine-tuning and LoRA fine-tuning, as well as multiple DeepSpeed ZeRO configurations and LLaMA-Factory integration.