V-Co: A Closer Look at Visual Representation Alignment via Co-Denoising Paper • 2603.16792 • Published Mar 17, 2026
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis Paper • 2503.22420 • Published Mar 28, 2025
ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes Paper • 2304.04321 • Published Apr 9, 2023
Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation Paper • 2211.15402 • Published Nov 28, 2022