---
license: mit
library_name: transformers
pipeline_tag: robotics
tags:
- embodied-ai
- reinforcement-learning
- multimodal-llm
- computer-vision
---
# Explore with Long-term Memory: A Benchmark and Multimodal LLM-based Reinforcement Learning Framework for Embodied Exploration

**CVPR 2026**

## MemoryExplorer
MemoryExplorer is a multimodal large language model (MLLM) framework designed for Long-term Memory Embodied Exploration (LMEE). It unifies an agent's exploratory cognition and decision-making behaviors to promote lifelong learning in complex environments.
The model is fine-tuned with reinforcement learning to encourage active memory querying, using a multi-task reward function that combines action prediction, frontier selection, and memory-based question answering.
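The multi-task reward described above can be sketched as a weighted sum of the three terms named in the card. This is an illustrative sketch only: the weights, the binary/clipped form of each term, and the function name `multi_task_reward` are assumptions, not details from the paper.

```python
def multi_task_reward(action_correct: bool,
                      frontier_gain: float,
                      qa_score: float,
                      w_action: float = 1.0,
                      w_frontier: float = 0.5,
                      w_qa: float = 1.0) -> float:
    """Combine the three reward terms into a single scalar (illustrative weights)."""
    r_action = 1.0 if action_correct else 0.0        # binary action-prediction reward
    r_frontier = max(0.0, min(1.0, frontier_gain))   # frontier-selection gain, clipped to [0, 1]
    r_qa = max(0.0, min(1.0, qa_score))              # memory-based QA accuracy in [0, 1]
    return w_action * r_action + w_frontier * r_frontier + w_qa * r_qa
```

A weighted sum is the common choice for multi-task RL rewards because each term can be tuned independently; the actual term definitions and weights used by MemoryExplorer are specified in the paper.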
## Resources
- Paper: Explore with Long-term Memory: A Benchmark and Multimodal LLM-based Reinforcement Learning Framework for Embodied Exploration
- Project Page: https://wangsen99.github.io/papers/lmee/
- Repository: https://github.com/wangsen99/LMEE
- Benchmark: LMEE-Bench
## Citation

If you find this work useful, please consider citing:
```bibtex
@inproceedings{wang2026explore,
  title={Explore with Long-term Memory: A Benchmark and Multimodal LLM-based Reinforcement Learning Framework for Embodied Exploration},
  author={Wang, Sen and Liu, Bangwei and Gao, Zhenkun and Ma, Lizhuang and Wang, Xuhong and Xie, Yuan and Tan, Xin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}
```