ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents • Paper • 2604.11784 • Published 2 days ago • 107
WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation • Paper • 2511.11434 • Published Nov 14, 2025 • 47
SpatialLadder: Progressive Training for Spatial Reasoning in Vision-Language Models • Paper • 2510.08531 • Published Oct 9, 2025 • 12
GSM8K-V: Can Vision Language Models Solve Grade School Math Word Problems in Visual Contexts • Paper • 2509.25160 • Published Sep 29, 2025 • 32
EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering • Paper • 2509.25175 • Published Sep 29, 2025 • 31