Zikun Li

zikun-li

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 4 days ago

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Paper • 2604.06628 • Published 7 days ago • 308

upvoted 3 papers 18 days ago

Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

Paper • 2603.24472 • Published 20 days ago • 53

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 151

Flash-KMeans: Fast and Memory-Efficient Exact K-Means

Paper • 2603.09229 • Published Mar 10 • 82

upvoted a paper about 1 month ago

Helios: Real Real-Time Long Video Generation Model

Paper • 2603.04379 • Published Mar 4 • 186

upvoted 6 papers 2 months ago

AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents

Paper • 2602.06855 • Published Feb 6 • 82

Self-Distillation Enables Continual Learning

Paper • 2601.19897 • Published Jan 27 • 27

Group Distributionally Robust Optimization-Driven Reinforcement Learning for LLM Reasoning

Paper • 2601.19280 • Published Jan 27 • 9

Reinforcement Learning via Self-Distillation

Paper • 2601.20802 • Published Jan 28 • 43

Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation

Paper • 2601.20614 • Published Jan 28 • 120

Advancing Open-source World Models

Paper • 2601.20540 • Published Jan 28 • 135

upvoted 9 papers 3 months ago

Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18 • 202

InT: Self-Proposed Interventions Enable Credit Assignment in LLM Reasoning

Paper • 2601.14209 • Published Jan 20 • 6

Your Group-Relative Advantage Is Biased

Paper • 2601.08521 • Published Jan 13 • 158

NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems

Paper • 2601.11004 • Published Jan 16 • 30

Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge

Paper • 2601.08808 • Published Jan 13 • 39

Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning

Paper • 2601.07641 • Published Jan 12 • 48

Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning

Paper • 2601.09667 • Published Jan 14 • 93

Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

Paper • 2601.08763 • Published Jan 13 • 150

ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking

Paper • 2601.06487 • Published Jan 10 • 54