- Monitored Markov Decision Processes
  Paper • 2402.06819 • Published
- Generalization in Monitored Markov Decision Processes (Mon-MDPs)
  Paper • 2505.08988 • Published
- Bayesian Risk Markov Decision Processes
  Paper • 2106.02558 • Published
- Sotopia-RL: Reward Design for Social Intelligence
  Paper • 2508.03905 • Published • 23
Collections
Collections including paper arxiv:2303.11366
- DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
  Paper • 2506.11763 • Published • 74
- Generative Agents: Interactive Simulacra of Human Behavior
  Paper • 2304.03442 • Published • 15
- Voyager: An Open-Ended Embodied Agent with Large Language Models
  Paper • 2305.16291 • Published • 13
- Reflexion: Language Agents with Verbal Reinforcement Learning
  Paper • 2303.11366 • Published • 7

- Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning
  Paper • 2211.04325 • Published • 1
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 26
- On the Opportunities and Risks of Foundation Models
  Paper • 2108.07258 • Published • 2
- Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
  Paper • 2204.07705 • Published • 2

- Instruction Pre-Training: Language Models are Supervised Multitask Learners
  Paper • 2406.14491 • Published • 96
- Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
  Paper • 2405.21060 • Published • 68
- Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
  Paper • 2405.20541 • Published • 24
- MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
  Paper • 2406.01574 • Published • 54

- TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
  Paper • 2412.14161 • Published • 51
- Training Software Engineering Agents and Verifiers with SWE-Gym
  Paper • 2412.21139 • Published • 26
- OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
  Paper • 2412.19723 • Published • 87
- AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation
  Paper • 2408.00764 • Published • 1

- Reflexion: Language Agents with Verbal Reinforcement Learning
  Paper • 2303.11366 • Published • 7
- Self-Refine: Iterative Refinement with Self-Feedback
  Paper • 2303.17651 • Published • 2
- CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing
  Paper • 2305.11738 • Published • 9
- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 21

- Self-Reflection in LLM Agents: Effects on Problem-Solving Performance
  Paper • 2405.06682 • Published • 3
- Self-Refine: Iterative Refinement with Self-Feedback
  Paper • 2303.17651 • Published • 2
- Rethinking Chain-of-Thought from the Perspective of Self-Training
  Paper • 2412.10827 • Published
- Reflexion: Language Agents with Verbal Reinforcement Learning
  Paper • 2303.11366 • Published • 7

- Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
  Paper • 2304.09842 • Published • 2
- ReAct: Synergizing Reasoning and Acting in Language Models
  Paper • 2210.03629 • Published • 34
- Gorilla: Large Language Model Connected with Massive APIs
  Paper • 2305.15334 • Published • 6
- Reflexion: Language Agents with Verbal Reinforcement Learning
  Paper • 2303.11366 • Published • 7

- Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
  Paper • 2310.04406 • Published • 10
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models
  Paper • 2305.10601 • Published • 15
- Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
  Paper • 2404.02575 • Published • 50
- Voyager: An Open-Ended Embodied Agent with Large Language Models
  Paper • 2305.16291 • Published • 13

- Training Software Engineering Agents and Verifiers with SWE-Gym
  Paper • 2412.21139 • Published • 26
- Evaluating Language Models as Synthetic Data Generators
  Paper • 2412.03679 • Published • 47
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 153
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
  Paper • 2402.03620 • Published • 117