lllyx committed on
Commit e59bf75 · verified · 1 Parent(s): 513e04f

Add top navigation badges

Files changed (1)
  1. README.md +18 -1
README.md CHANGED
@@ -24,6 +24,23 @@ base_model_relation: finetune
24
 
25
  # Qwen3-4B-Base-GRPO
26
 
27
  Qwen3-4B-Base-GRPO is a post-RL checkpoint trained with the **verl** framework.
28
  It starts from **Qwen3-4B-Base** and applies GRPO on the **DAPO-Math-17k-Processed** dataset for mathematical reasoning and problem-solving.
29
 
@@ -109,4 +126,4 @@ If you use this model, please consider citing the related paper:
109
  journal={arXiv preprint arXiv:2604.13016},
110
  year={2026}
111
  }
112
- ```
 
24
 
25
  # Qwen3-4B-Base-GRPO
26
 
27
+ <div align="center" style="line-height: 1;">
28
+ <a href="https://arxiv.org/abs/2604.13016" style="margin: 2px;">
29
+ <img alt="Paper" src="https://img.shields.io/badge/paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
30
+ </a>
31
+ <a href="https://github.com/thunlp/OPD" style="margin: 2px;">
32
+ <img alt="Github" src="https://img.shields.io/badge/OPD-000000?style=for-the-badge&logo=github&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
33
+ </a>
34
+ <a href="https://huggingface.co/papers/2604.13016" style="margin: 2px;">
35
+ <img alt="HF Papers" src="https://img.shields.io/badge/HF--Paper-%23FFD14D?style=for-the-badge&logo=huggingface&logoColor=black" style="display: inline-block; vertical-align: middle;"/>
36
+ </a>
37
+ <a href="https://x.com/HBX_hbx/status/2044464414829777354" style="margin: 2px;">
38
+ <img alt="Twitter" src="https://img.shields.io/badge/Twitter-%23000000.svg?style=for-the-badge&logo=x&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
39
+ </a>
40
+ </div>
41
+
42
+ <br>
43
+
44
  Qwen3-4B-Base-GRPO is a post-RL checkpoint trained with the **verl** framework.
45
  It starts from **Qwen3-4B-Base** and applies GRPO on the **DAPO-Math-17k-Processed** dataset for mathematical reasoning and problem-solving.
46
 
 
126
  journal={arXiv preprint arXiv:2604.13016},
127
  year={2026}
128
  }
129
+ ```