HippoCamp: Benchmarking Contextual Agents on Personal Computers • Paper • 2604.01221 • Published 15 days ago • 29
ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents • Paper • 2507.22827 • Published Jul 30, 2025 • 101
MME-Reasoning: A Comprehensive Benchmark for Logical Reasoning in MLLMs • Paper • 2505.21327 • Published May 27, 2025 • 83