ClawBench: Can AI Agents Complete Everyday Online Tasks? Paper • 2604.08523 • Published 5 days ago • 247
LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published 16 days ago • 137
FaithLens: Detecting and Explaining Faithfulness Hallucination Paper • 2512.20182 • Published Dec 23, 2025 • 9