arxiv:2512.16848

Meta-RL Induces Exploration in Language Agents

Published on Dec 18, 2025 · Submitted by Yulun Jiang on Dec 22, 2025

AI-generated summary

LaMer is a meta-reinforcement learning framework that enables large language model agents to actively explore and adapt through cross-episode training and in-context policy adaptation without gradient updates.

Abstract

Reinforcement learning (RL) has enabled the training of large language model (LLM) agents to interact with the environment and to solve multi-turn long-horizon tasks. However, RL-trained agents often struggle in tasks that require active exploration and fail to adapt efficiently from trial-and-error experience. In this paper, we present LaMer, a general Meta-RL framework that enables LLM agents to actively explore and learn from environment feedback at test time. LaMer consists of two key components: (i) a cross-episode training framework that encourages exploration and long-term reward optimization; and (ii) in-context policy adaptation via reflection, which allows the agent to adapt its policy from task feedback signals without gradient updates. Experiments across diverse environments show that LaMer significantly improves performance over RL baselines, with 11%, 14%, and 19% performance gains on Sokoban, MineSweeper and Webshop, respectively. Moreover, LaMer demonstrates better generalization to more challenging or previously unseen tasks than RL-trained agents. Overall, our results demonstrate that Meta-RL provides a principled approach to induce exploration in language agents, enabling more robust adaptation to novel environments through learned exploration strategies.
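The two components named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the training objective, so the discounted cross-episode return below is an assumed, generic form, and every name here (`cross_episode_returns`, `run_trial`, `toy_episode`, `toy_reflect`, the discount `gamma`) is hypothetical. The sketch only shows how credit for a later success can flow back to an earlier exploratory episode, and how a policy can change across episodes purely through text placed in its context.

```python
from typing import Callable, List, Tuple


def cross_episode_returns(episode_rewards: List[float], gamma: float = 0.9) -> List[float]:
    """Assumed objective: G_n = R_n + gamma * G_{n+1}, taken over episodes of one trial.

    An early episode that earns no reward still receives credit if the
    information it gathered leads to reward in later episodes.
    """
    returns = [0.0] * len(episode_rewards)
    running = 0.0
    for n in reversed(range(len(episode_rewards))):
        running = episode_rewards[n] + gamma * running
        returns[n] = running
    return returns


def run_trial(
    run_episode: Callable[[str], Tuple[float, str]],
    reflect: Callable[[str], str],
    num_episodes: int,
) -> List[float]:
    """Run several episodes on the same task, adapting only through the context.

    No weights are updated: the only thing that changes between episodes is
    the reflection text accumulated in `context`.
    """
    context = ""
    rewards: List[float] = []
    for _ in range(num_episodes):
        reward, feedback = run_episode(context)
        rewards.append(reward)
        context += reflect(feedback) + "\n"
    return rewards


def toy_episode(context: str) -> Tuple[float, str]:
    """Hypothetical stand-in for an LLM agent acting in an environment."""
    if "key is on the left" in context:
        return 1.0, "solved"
    return 0.0, "door is locked; key is on the left"


def toy_reflect(feedback: str) -> str:
    """Hypothetical stand-in for an LLM-written reflection."""
    return "Reflection: " + feedback
```

With `run_trial(toy_episode, toy_reflect, 3)`, the first episode fails but surfaces the hint, and the following episodes succeed. Passing the resulting rewards to `cross_episode_returns` assigns the failed first episode a positive return, which is the sense in which a cross-episode objective can reward exploration.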

Community

Paper author Paper submitter

🌊 LaMer, a general Meta-RL framework that enables LLM agents to explore and learn from the environment feedback at test time.

Glad to see others are researching the area of meta-RL exploration. I have done similar work in this space:

https://arxiv.org/pdf/2508.01287

If you want to collaborate give me a shout.

Paper author

Thank you for sharing your interesting work!

arXiv lens breakdown of this paper 👉 https://arxivlens.com/PaperView/Details/meta-rl-induces-exploration-in-language-agents-7228-6ad15b2c

  • Executive Summary
  • Detailed Breakdown
  • Practical Applications


