Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency Paper • 2501.04931 • Published Jan 9, 2025
From reactive to cognitive: brain-inspired spatial intelligence for embodied agents Paper • 2508.17198 • Published Aug 24, 2025 • 10
SFHand: A Streaming Framework for Language-guided 3D Hand Forecasting and Embodied Manipulation Paper • 2511.18127 • Published Nov 22, 2025 • 1
Can MLLMs Read the Room? A Multimodal Benchmark for Verifying Truthfulness in Multi-Party Social Interactions Paper • 2510.27195 • Published Oct 31, 2025 • 1
Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality? Paper • 2605.22109 • Published 2 days ago • 156
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise Paper • 2602.12783 • Published Feb 13, 2026 • 246