Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization Paper • 2604.09574 • Published Feb 24 • 28
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 10 days ago • 312