Papers
arxiv:2605.18727

DexHoldem: Playing Texas Hold'em with Dexterous Embodied System

Published on May 18
· Submitted by
Tianzhe
on May 19
Authors:
,
,
,
,
,
,
,
,

Abstract

DexHoldem presents a real-world benchmark for evaluating embodied agents in dexterous manipulation tasks, testing both primitive execution and higher-level perception and decision-making capabilities.

AI-generated summary

Evaluating embodied systems on real dexterous hardware requires more than isolated primitive skills: an agent must perceive a changing tabletop scene, choose a context-appropriate action, execute it with a dexterous hand, and leave the scene usable for later decisions. We introduce DexHoldem, a real-world system-level benchmark built around Texas Hold'em dexterous manipulation with a ShadowHand. DexHoldem provides 1,470 teleoperated demonstrations across 14 Texas Hold'em manipulation primitives, a standardized physical policy benchmark, and an agentic perception benchmark that tests whether agents can recover the structured game state needed for embodied decision making. On primitive execution, π_{0.5} obtains the highest task completion rate (61.2%), while π_{0.5} and π_0 tie on scene-preserving success rate (47.5%). On agentic perception, Opus 4.7 obtains the best strict problem-level accuracy (34.3%), while GPT 5.5 obtains the best average field-wise accuracy (66.8%), exposing a gap between isolated visual sub-capabilities and complete routing-relevant state recovery. Finally, we instantiate the full embodied-agent loop in three case studies, where waiting, recovery dispatches, human-help requests, and repeated primitive execution reveal how perception and policy errors accumulate during closed-loop deployment. DexHoldem therefore evaluates dexterous tabletop execution, agentic perception, and embodied decision routing in a shared physical setting. Project page: https://dexholdem.github.io/Dexholdem/.

Community

Paper submitter

Paper on benchmarking dexterous manipulation policies and embodied agents

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2605.18727
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2605.18727 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2605.18727 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2605.18727 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.