Commit 8a03ec0 by hhoh · verified · 1 Parent(s): f1f6b69

Create README.md

Files changed (1)
  1. README.md +44 -0
README.md ADDED
@@ -0,0 +1,44 @@
+ ### Use with transformers
+ First, install transformers. We recommend v4.56.0.
+ ```shell
+ pip install transformers==4.56.0
+ ```
+
+ *Note: to load the FP8 model with transformers, rename the `"ignored_layers"` key in config.json to `"ignore"` and upgrade compressed-tensors to version 0.11.0.*
+
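The key rename described in the note can be scripted. The following is a minimal sketch, not part of the original README: it assumes `config.json` has been downloaded locally and, because the note does not say where the key sits, renames `"ignored_layers"` wherever it appears in the nested JSON. The helper names `rename_key` and `patch_config` are hypothetical.

```python
import json
from pathlib import Path


def rename_key(obj, old, new):
    """Return a copy of obj with every dict key `old` renamed to `new`."""
    if isinstance(obj, dict):
        return {(new if k == old else k): rename_key(v, old, new) for k, v in obj.items()}
    if isinstance(obj, list):
        return [rename_key(item, old, new) for item in obj]
    return obj


def patch_config(config_path):
    """Rewrite a local config.json in place, renaming "ignored_layers" to "ignore"."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    patched = rename_key(config, "ignored_layers", "ignore")
    path.write_text(json.dumps(patched, indent=2, ensure_ascii=False) + "\n", encoding="utf-8")
    return patched
```

Call `patch_config(...)` with the path to your local copy of the FP8 model's `config.json`, then load the model from that local directory.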
+ The following code snippet shows how to load the model with the transformers library and run inference.
+
+ We use tencent/HY-MT1.5-1.8B as an example.
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name_or_path = "tencent/HY-MT1.5-1.8B"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
+ # device_map="auto" (requires accelerate) places the weights on the available device(s);
+ # you may also want to load the model in bfloat16.
+ model = AutoModelForCausalLM.from_pretrained(model_name_or_path, device_map="auto")
+ messages = [
+     {"role": "user", "content": "Translate the following segment into Chinese, without additional explanation.\n\nIt’s on the house."},
+ ]
+ # Render the chat template and tokenize it into a tensor of input IDs.
+ tokenized_chat = tokenizer.apply_chat_template(
+     messages,
+     tokenize=True,
+     add_generation_prompt=False,
+     return_tensors="pt"
+ )
+
+ outputs = model.generate(tokenized_chat.to(model.device), max_new_tokens=2048)
+ output_text = tokenizer.decode(outputs[0])
+ print(output_text)
+ ```
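For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so decoding `outputs[0]` directly includes the prompt. A small sketch of how one might keep only the translation is shown below; the helper name `new_token_ids` is hypothetical and not part of the original README.

```python
def new_token_ids(output_ids, prompt_length):
    """Drop the first `prompt_length` tokens, keeping only the generated ones.

    Works on any sliceable sequence (a list or a 1-D tensor).
    """
    return output_ids[prompt_length:]


# With `tokenizer`, `tokenized_chat` and `outputs` from the snippet above:
# prompt_length = tokenized_chat.shape[-1]
# translation = tokenizer.decode(
#     new_token_ids(outputs[0], prompt_length),
#     skip_special_tokens=True,
# )
```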
+
+ We recommend the following parameters for inference. Note that the model has no default system prompt.
+
+ ```json
+ {
+   "top_k": 20,
+   "top_p": 0.6,
+   "repetition_penalty": 1.05,
+   "temperature": 0.7
+ }
+ ```
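One way to apply these values with transformers is to pass them as keyword arguments to `generate`. The sketch below is an illustration rather than the authors' own code: the helper `build_generation_kwargs` is hypothetical, and `do_sample=True` is added on the assumption that sampling is intended, since `top_k`, `top_p` and `temperature` only take effect when sampling is enabled.

```python
import json

# The recommended parameters, copied from the JSON above.
RECOMMENDED = json.loads("""
{
  "top_k": 20,
  "top_p": 0.6,
  "repetition_penalty": 1.05,
  "temperature": 0.7
}
""")


def build_generation_kwargs(params, max_new_tokens=2048):
    """Turn the recommended parameters into keyword arguments for generate()."""
    kwargs = dict(params)
    kwargs["do_sample"] = True  # assumption: enable sampling so top_k/top_p/temperature apply
    kwargs["max_new_tokens"] = max_new_tokens
    return kwargs


generation_kwargs = build_generation_kwargs(RECOMMENDED)

# With `model` and `tokenized_chat` from the loading snippet:
# outputs = model.generate(tokenized_chat.to(model.device), **generation_kwargs)
```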