The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping Paper • 2604.11297 • Published 4 days ago • 132
KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance Paper • 2604.12627 • Published 3 days ago • 94
Training Data Efficiency in Multimodal Process Reward Models Paper • 2602.04145 • Published Feb 4 • 79
Thinking with Comics: Enhancing Multimodal Reasoning through Structured Visual Storytelling Paper • 2602.02453 • Published Feb 2 • 36
Language of Thought Shapes Output Diversity in Large Language Models Paper • 2601.11227 • Published Jan 16 • 9
FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data Paper • 2408.06273 • Published Aug 12, 2024 • 10
Language of Thought Shapes Output Diversity in Large Language Models Paper • 2601.11227 • Published Jan 16 • 9
PEAR: Phase Entropy Aware Reward for Efficient Reasoning Paper • 2510.08026 • Published Oct 9, 2025 • 9