Junkang Wu

junkang0909

https://junkangwu.github.io/

AI & ML interests

LLM alignment

Recent Activity

upvoted a paper 20 days ago

On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation

upvoted a paper 7 months ago

EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning

authored a paper 7 months ago

Aligning Multimodal LLM with Human Preference: A Survey

Organizations

None yet

commented on a paper 7 months ago

Quantile Advantage Estimation for Entropy-Safe Reasoning

Paper • 2509.22611 • Published Sep 26, 2025 • 120

commented on a paper about 1 year ago

RePO: ReLU-based Preference Optimization

Paper • 2503.07426 • Published Mar 10, 2025 • 2