- Monitored Markov Decision Processes
  Paper • 2402.06819 • Published
- Generalization in Monitored Markov Decision Processes (Mon-MDPs)
  Paper • 2505.08988 • Published
- Bayesian Risk Markov Decision Processes
  Paper • 2106.02558 • Published
- Sotopia-RL: Reward Design for Social Intelligence
  Paper • 2508.03905 • Published • 23
Collections
Collections including paper arxiv:2507.03112
- RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents
  Paper • 2507.03112 • Published • 34
- A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
  Paper • 2507.01925 • Published • 39
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens
  Paper • 2506.17218 • Published • 29
- WebSailor: Navigating Super-human Reasoning for Web Agent
  Paper • 2507.02592 • Published • 126

- Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
  Paper • 2407.20798 • Published • 24
- Offline Reinforcement Learning for LLM Multi-Step Reasoning
  Paper • 2412.16145 • Published • 38
- REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
  Paper • 2501.03262 • Published • 104
- SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
  Paper • 2502.18449 • Published • 75

- Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline
  Paper • 2411.12814 • Published • 23
- SegBook: A Simple Baseline and Cookbook for Volumetric Medical Image Segmentation
  Paper • 2411.14525 • Published • 19
- MRGen: Diffusion-based Controllable Data Engine for MRI Segmentation towards Unannotated Modalities
  Paper • 2412.04106 • Published • 5
- PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion
  Paper • 2412.17780 • Published • 5