-
Attention Is All You Need
Paper • 1706.03762 • Published • 121 -
Scaling Laws for Neural Language Models
Paper • 2001.08361 • Published • 10 -
Training Compute-Optimal Large Language Models
Paper • 2203.15556 • Published • 11 -
Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT
Paper • 2210.04186 • Published
Collections
Discover the best community collections!
Collections including paper arxiv:2509.04664
-
Reasoning with Sampling: Your Base Model is Smarter Than You Think
Paper • 2510.14901 • Published • 48 -
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?
Paper • 2505.23359 • Published • 38 -
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation
Paper • 2506.02397 • Published • 36 -
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Paper • 2505.24864 • Published • 146
-
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
Paper • 2509.09372 • Published • 254 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 213 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 238 -
Why Language Models Hallucinate
Paper • 2509.04664 • Published • 199
-
Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning
Paper • 2510.03259 • Published • 57 -
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense
Paper • 2510.07242 • Published • 30 -
First Try Matters: Revisiting the Role of Reflection in Reasoning Models
Paper • 2510.08308 • Published • 24 -
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
Paper • 2510.03222 • Published • 76
-
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper • 2505.24726 • Published • 282 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 265 -
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Paper • 2507.01006 • Published • 254 -
A Survey of Context Engineering for Large Language Models
Paper • 2507.13334 • Published • 263
-
Attention Is All You Need
Paper • 1706.03762 • Published • 121 -
Scaling Laws for Neural Language Models
Paper • 2001.08361 • Published • 10 -
Training Compute-Optimal Large Language Models
Paper • 2203.15556 • Published • 11 -
Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT
Paper • 2210.04186 • Published
-
Reasoning with Sampling: Your Base Model is Smarter Than You Think
Paper • 2510.14901 • Published • 48 -
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?
Paper • 2505.23359 • Published • 38 -
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation
Paper • 2506.02397 • Published • 36 -
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Paper • 2505.24864 • Published • 146
-
Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning
Paper • 2510.03259 • Published • 57 -
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense
Paper • 2510.07242 • Published • 30 -
First Try Matters: Revisiting the Role of Reflection in Reasoning Models
Paper • 2510.08308 • Published • 24 -
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
Paper • 2510.03222 • Published • 76
-
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
Paper • 2509.09372 • Published • 254 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 213 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 238 -
Why Language Models Hallucinate
Paper • 2509.04664 • Published • 199
-
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper • 2505.24726 • Published • 282 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 265 -
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Paper • 2507.01006 • Published • 254 -
A Survey of Context Engineering for Large Language Models
Paper • 2507.13334 • Published • 263