AutoGUI-v2: A Comprehensive Multi-Modal GUI Functionality Understanding Benchmark Paper • 2604.24441 • Published 11 days ago • 3
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published 16 days ago • 239
WildDet3D: Scaling Promptable 3D Detection in the Wild Paper • 2604.08626 • Published 29 days ago • 245
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 30 days ago • 323
DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models Paper • 2603.26164 • Published Mar 27 • 364
Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding Paper • 2603.19235 • Published Mar 19 • 95