TriAttention: Efficient Long Reasoning with Trigonometric KV Compression • Paper • 2604.04921 • Published 9 days ago • 107
ViGoR-Bench: How Far Are Visual Generative Models From Zero-Shot Visual Reasoners? • Paper • 2603.25823 • Published 19 days ago • 43
MolmoPoint: Better Pointing for VLMs with Grounding Tokens • Paper • 2603.28069 • Published 16 days ago • 8
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence • Paper • 2603.07660 • Published Mar 8 • 86
VST • Collection • A comprehensive framework designed to cultivate VLMs with human-like visuospatial abilities. • 6 items • Updated Feb 1 • 6
DrivePI: Spatial-aware 4D MLLM for Unified Autonomous Driving Understanding, Perception, Prediction and Planning • Paper • 2512.12799 • Published Dec 14, 2025 • 12
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds • Paper • 2511.08892 • Published Nov 12, 2025 • 216