End-to-End Autoregressive Image Generation with 1D Semantic Tokenizer Paper • 2605.00503 • Published 7 days ago • 8
DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off Paper • 2604.13902 • Published 23 days ago • 62
DARE: Diffusion Large Language Models Alignment and Reinforcement Executor Paper • 2604.04215 • Published Apr 5 • 21
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 627
ResAdapt: Adaptive Resolution for Efficient Multimodal Reasoning Paper • 2603.28610 • Published Mar 30 • 20