VideoFlexTok: Flexible-Length Coarse-to-Fine Video Tokenization • Paper • 2604.12887 • Published 6 days ago • 4
WildDet3D: Scaling Promptable 3D Detection in the Wild • Paper • 2604.08626 • Published 11 days ago • 238
Emergent Social Intelligence Risks in Generative Multi-Agent Systems • Paper • 2603.27771 • Published 21 days ago • 52
Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration • Paper • 2603.24800 • Published 25 days ago • 68
The Pulse of Motion: Measuring Physical Frame Rate from Visual Dynamics • Paper • 2603.14375 • Published Mar 15 • 19
ContextBench: A Benchmark for Context Retrieval in Coding Agents • Paper • 2602.05892 • Published Feb 5 • 4
Stroke of Surprise: Progressive Semantic Illusions in Vector Sketching • Paper • 2602.12280 • Published Feb 12 • 34
What does RL improve for Visual Reasoning? A Frankenstein-Style Analysis • Paper • 2602.12395 • Published Feb 12 • 17
Zooming without Zooming: Region-to-Image Distillation for Fine-Grained Multimodal Perception • Paper • 2602.11858 • Published Feb 12 • 62
Synthetic Visual Genome 2: Extracting Large-scale Spatio-Temporal Scene Graphs from Videos • Paper • 2602.23543 • Published Feb 26 • 9
Expert Threshold Routing for Autoregressive Language Modeling with Dynamic Computation Allocation and Load Balancing • Paper • 2603.11535 • Published Mar 12 • 10
Digital Twin AI: Opportunities and Challenges from Large Language Models to World Models • Paper • 2601.01321 • Published Jan 4 • 20
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild • Paper • 2603.17187 • Published Mar 17 • 138
CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era • Paper • 2602.23452 • Published Feb 26 • 17