Test-Time Compute/Optimal Scaling
Scaling LLM Inference with Optimized Sample Compute Allocation
Paper
• 2410.22480
• Published
Test-time Computing: from System-1 Thinking to System-2 Thinking
Paper
• 2501.02497
• Published • 45
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective
Paper
• 2412.14135
• Published
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
Paper
• 2501.04682
• Published • 99
O1 Replication Journey: A Strategic Progress Report -- Part 1
Paper
• 2410.18982
• Published • 3
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning
Paper
• 2501.06458
• Published • 31
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
Paper
• 2410.03864
• Published • 12
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
Paper
• 2501.18585
• Published • 61
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
Paper
• 2501.13200
• Published • 70
Demystifying Long Chain-of-Thought Reasoning in LLMs
Paper
• 2502.03373
• Published • 58
Inference-Time Scaling for Generalist Reward Modeling
Paper
• 2504.02495
• Published • 58
TTRL: Test-Time Reinforcement Learning
Paper
• 2504.16084
• Published • 122
Scaling Test-time Compute for LLM Agents
Paper
• 2506.12928
• Published • 63