Hyeongjin Kim

madokalif · Index-23227

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States

upvoted a paper 1 day ago

KL for a KL: On-Policy Distillation with Control Variate Baseline

updated a dataset 20 days ago

madokalif/pluralistic-value-conflict-benchmark

Organizations

None yet

upvoted 2 papers 1 day ago

Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States

Paper • 2605.07579 • Published 5 days ago • 12

KL for a KL: On-Policy Distillation with Control Variate Baseline

Paper • 2605.07865 • Published 5 days ago • 14