CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 24 days ago • 96
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published Mar 13 • 148
view article Article Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance Dec 9, 2025 • 84
Grounding Computer Use Agents on Human Demonstrations Paper • 2511.07332 • Published Nov 10, 2025 • 107
AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs Paper • 2509.08031 • Published Sep 9, 2025 • 21
WebMMU Collection WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation • 2 items • Updated Sep 16, 2025 • 2
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding Paper • 2502.01341 • Published Feb 3, 2025 • 39
Multimodal foundation world models for generalist embodied agents Paper • 2406.18043 • Published Jun 26, 2024 • 1