abcsk123 committed on
Commit bad8a5c · verified
1 Parent(s): 6fd2ebb

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -27,5 +27,5 @@ The project tracked performance gains and losses across multiple iterations:
  * **SFT v3 (released)**: **0.671 (+6.8%)** — achieved through precise loss calculation and data cleaning.
  * **DPO Merged**: < 0.628 — highlighting the extreme sensitivity of code models to preference data quality.
 
- ⚠️ Status & Roadmap
+ ## ⚠️ Status & Roadmap
  This project is actively under development. Currently, the DPO alignment exhibits performance regression (Pass@1 < 0.628) due to preference data sensitivity. We are investigating advanced filtering and reward modeling to resolve this. Optimized weights will be uploaded as soon as the alignment bottleneck is cleared.