-
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
Paper • 2511.14460 • Published • 22 -
EVOLvE: Evaluating and Optimizing LLMs For Exploration
Paper • 2410.06238 • Published • 1 -
Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach
Paper • 2306.03604 • Published • 1 -
AGILE: A Novel Reinforcement Learning Framework of LLM Agents
Paper • 2405.14751 • Published
Collections
Discover the best community collections!
Collections including paper arxiv:2306.03604
-
Ada-Instruct: Adapting Instruction Generators for Complex Reasoning
Paper • 2310.04484 • Published • 5 -
Diversity of Thought Improves Reasoning Abilities of Large Language Models
Paper • 2310.07088 • Published • 5 -
Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 82 -
Democratizing Reasoning Ability: Tailored Learning from Large Language Model
Paper • 2310.13332 • Published • 16
-
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation
Paper • 2310.15123 • Published • 8 -
Diversity of Thought Improves Reasoning Abilities of Large Language Models
Paper • 2310.07088 • Published • 5 -
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search
Paper • 2310.13227 • Published • 15 -
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Paper • 2310.04406 • Published • 10
-
Moral Foundations of Large Language Models
Paper • 2310.15337 • Published • 1 -
Specific versus General Principles for Constitutional AI
Paper • 2310.13798 • Published • 3 -
Contrastive Prefence Learning: Learning from Human Feedback without RL
Paper • 2310.13639 • Published • 25 -
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Paper • 2309.00267 • Published • 53
-
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation
Paper • 2310.15123 • Published • 8 -
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search
Paper • 2310.13227 • Published • 15 -
LASER: LLM Agent with State-Space Exploration for Web Navigation
Paper • 2309.08172 • Published • 14 -
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Paper • 2310.04406 • Published • 10
-
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation
Paper • 2310.15123 • Published • 8 -
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Paper • 2310.13671 • Published • 19 -
Self-Convinced Prompting: Few-Shot Question Answering with Repeated Introspection
Paper • 2310.05035 • Published • 1 -
Chain-of-Thought Reasoning is a Policy Improvement Operator
Paper • 2309.08589 • Published • 2
-
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
Paper • 2511.14460 • Published • 22 -
EVOLvE: Evaluating and Optimizing LLMs For Exploration
Paper • 2410.06238 • Published • 1 -
Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach
Paper • 2306.03604 • Published • 1 -
AGILE: A Novel Reinforcement Learning Framework of LLM Agents
Paper • 2405.14751 • Published
-
Ada-Instruct: Adapting Instruction Generators for Complex Reasoning
Paper • 2310.04484 • Published • 5 -
Diversity of Thought Improves Reasoning Abilities of Large Language Models
Paper • 2310.07088 • Published • 5 -
Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 82 -
Democratizing Reasoning Ability: Tailored Learning from Large Language Model
Paper • 2310.13332 • Published • 16
-
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation
Paper • 2310.15123 • Published • 8 -
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search
Paper • 2310.13227 • Published • 15 -
LASER: LLM Agent with State-Space Exploration for Web Navigation
Paper • 2309.08172 • Published • 14 -
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Paper • 2310.04406 • Published • 10
-
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation
Paper • 2310.15123 • Published • 8 -
Diversity of Thought Improves Reasoning Abilities of Large Language Models
Paper • 2310.07088 • Published • 5 -
ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search
Paper • 2310.13227 • Published • 15 -
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Paper • 2310.04406 • Published • 10
-
Branch-Solve-Merge Improves Large Language Model Evaluation and Generation
Paper • 2310.15123 • Published • 8 -
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Paper • 2310.13671 • Published • 19 -
Self-Convinced Prompting: Few-Shot Question Answering with Repeated Introspection
Paper • 2310.05035 • Published • 1 -
Chain-of-Thought Reasoning is a Policy Improvement Operator
Paper • 2309.08589 • Published • 2
-
Moral Foundations of Large Language Models
Paper • 2310.15337 • Published • 1 -
Specific versus General Principles for Constitutional AI
Paper • 2310.13798 • Published • 3 -
Contrastive Prefence Learning: Learning from Human Feedback without RL
Paper • 2310.13639 • Published • 25 -
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Paper • 2309.00267 • Published • 53