Do What I Say: A Spoken Prompt Dataset for Instruction-Following • Paper • 2603.09881 • Published Mar 10 • 9
Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions • Paper • 2602.13013 • Published Feb 13 • 54