SimonTliang committed on
Commit e74c355 · verified · 1 Parent(s): 4000109

chore: update readme

Files changed (1)
  1. README.md +75 -3
README.md CHANGED
@@ -1,3 +1,75 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+
+ # Model Card for Qwen3-32B-LoRA-ECHO-KK-GRPO
+
+ <!-- Provide a quick summary of what the model is/does. -->
+
+ This model is a LoRA fine-tune of Qwen3-32B, trained with the ECHO framework on the KK (Knights and Knaves) dataset.
+ It achieves near-perfect scores on the 2–8 PPL test set, surpassing o4-mini, DeepSeek-R1, and o3-mini-high.
+
+ # Quick start
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "GradientNetwork/Qwen3-32B-LoRA-ECHO-KK-GRPO"
+
+ # load the tokenizer and the model
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+
+ # prepare the model input
+ prompt = "K & K"
+ messages = [
+     {"role": "user", "content": prompt}
+ ]
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True,
+     enable_thinking=True  # Switches between thinking and non-thinking modes. Default is True.
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+ # conduct text completion
+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=32768
+ )
+ output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
+
+ # parse the thinking content
+ try:
+     # rindex: find the last occurrence of token 151668 (</think>)
+     index = len(output_ids) - output_ids[::-1].index(151668)
+ except ValueError:
+     index = 0
+
+ thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
+ content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")
+
+ print("thinking content:", thinking_content)
+ print("content:", content)
+ ```
+
+
+ # Citation
+ ```
+ @misc{xiao2025echodecouplinginferencetraining,
+       title={Echo: Decoupling Inference and Training for Large-Scale RL Alignment on Heterogeneous Swarms},
+       author={Jie Xiao and Changyuan Fan and Qingnan Ren and Alfred Long and Yuchen Zhang and Rymon Yu and Eric Yang and Lynn Ai and Shaoduo Gan},
+       year={2025},
+       eprint={2508.05387},
+       archivePrefix={arXiv},
+       primaryClass={cs.LG},
+       url={https://arxiv.org/abs/2508.05387},
+ }
+ ```
+
+
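
Note on the Quick start snippet in this diff: the step that separates thinking content from the final answer can be checked without loading the 32B model. The sketch below is not part of the commit. It isolates that reverse-index logic in a helper named `split_thinking` (a hypothetical name introduced here for illustration) and runs it on toy token ids. The id 151668 for `</think>` is taken from the README snippet, and the other ids are made-up placeholders.

```python
THINK_END_ID = 151668  # token id for </think>, as given in the README snippet


def split_thinking(output_ids, think_end_id=THINK_END_ID):
    """Split generated ids into (thinking_ids, answer_ids) at the last </think>.

    Hypothetical helper for illustration; it mirrors the try/except block
    in the Quick start code.
    """
    try:
        # Position just past the last occurrence of the end-of-thinking token.
        index = len(output_ids) - output_ids[::-1].index(think_end_id)
    except ValueError:
        # No </think> token present: treat the whole output as the answer.
        index = 0
    return output_ids[:index], output_ids[index:]


# Toy ids standing in for real tokens (placeholders, not real vocabulary ids).
thinking_ids, answer_ids = split_thinking([11, 12, THINK_END_ID, 21, 22])
print(thinking_ids)  # [11, 12, 151668]
print(answer_ids)    # [21, 22]
```

Reversing the list and calling `index` emulates a "last index of" search, so if `</think>` appears more than once, everything up to and including the final occurrence is treated as thinking content. The marker token stays in the thinking slice, and in the README snippet it is expected to be dropped at decode time via `skip_special_tokens=True`.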