arxiv:2305.01521

Unlocking the Power of Representations in Long-term Novelty-based Exploration

Published on May 2, 2023

AI-generated summary

RECODE, a novelty-based exploration method using clustering for density estimation, combined with masked transformers for multi-step prediction, achieves state-of-the-art performance in 3D exploration tasks and hard Atari games.

Abstract

We introduce Robust Exploration via Clustering-based Online Density Estimation (RECODE), a non-parametric method for novelty-based exploration that estimates visitation counts for clusters of states based on their similarity in a chosen embedding space. By adapting classical clustering to the nonstationary setting of deep RL, RECODE can efficiently track state visitation counts over thousands of episodes. We further propose a novel generalization of the inverse dynamics loss that leverages masked transformer architectures for multi-step prediction; in conjunction with RECODE, it achieves a new state of the art on a suite of challenging 3D exploration tasks in DM-Hard-8. RECODE also sets a new state of the art in hard-exploration Atari games and is the first agent to reach the end screen in "Pitfall!".
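
To make the idea of cluster-based visitation counts concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the abstract does not give RECODE's update rules, so the class name `ClusterCounter`, the parameters `radius`, `max_clusters`, and `decay`, the running-mean center update, and the `1/sqrt(count)` bonus are all assumptions chosen for illustration. It only shows the general pattern of assigning each embedding to a nearby cluster, keeping a count per cluster, and rewarding visits to rarely seen clusters.

```python
import math


class ClusterCounter:
    """Toy online cluster-count estimator (illustrative; not the paper's update rules)."""

    def __init__(self, radius, max_clusters, decay=1.0):
        self.radius = radius              # hypothetical distance threshold for opening a new cluster
        self.max_clusters = max_clusters  # hypothetical memory budget
        self.decay = decay                # hypothetical discount on old counts (1.0 = no forgetting)
        self.centers = []                 # cluster centers, each a list of floats
        self.counts = []                  # one (possibly discounted) visitation count per cluster

    def _nearest(self, x):
        """Return (index, distance) of the center closest to x, or (-1, inf) if empty."""
        best_i, best_d = -1, math.inf
        for i, c in enumerate(self.centers):
            d = math.dist(x, c)
            if d < best_d:
                best_i, best_d = i, d
        return best_i, best_d

    def update(self, x):
        """Record one visit to embedding x and return an intrinsic reward."""
        # Discount existing counts so that stale clusters gradually lose weight.
        self.counts = [self.decay * n for n in self.counts]
        i, d = self._nearest(x)
        if i < 0 or (d > self.radius and len(self.centers) < self.max_clusters):
            # No clusters yet, or x is far from all of them and memory has room:
            # open a new cluster centered on x.
            self.centers.append(list(x))
            self.counts.append(1.0)
            i = len(self.centers) - 1
        else:
            # Otherwise merge x into the nearest cluster: bump its count and
            # move its center toward x with a running-mean step.
            self.counts[i] += 1.0
            step = 1.0 / self.counts[i]
            self.centers[i] = [c + step * (xj - c) for c, xj in zip(self.centers[i], x)]
        # Count-based bonus: rarely visited clusters yield larger rewards.
        return 1.0 / math.sqrt(self.counts[i])
```

In this sketch, calling `update` on an embedding far from all stored centers opens a new cluster and returns the maximum bonus of 1.0, while repeated visits to the same region return progressively smaller bonuses. Setting `decay` below 1.0 is one simple way to let old counts fade when the embedding or the policy changes over time; once the memory is full, far-away points are simply merged into the nearest cluster, which is a simplification.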

