VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images Paper • 2604.09531 • Published 6 days ago • 8
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding Paper • 2604.05015 • Published 10 days ago • 232
Welcome Gemma 4: Frontier multimodal intelligence on device Article • 14 days ago • 847
FileGram: Grounding Agent Personalization in File-System Behavioral Traces Paper • 2604.04901 • Published 10 days ago • 40
A Simple Baseline for Streaming Video Understanding Paper • 2604.02317 • Published 14 days ago • 72
PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning Paper • 2603.26653 • Published 19 days ago • 18
HippoCamp: Benchmarking Contextual Agents on Personal Computers Paper • 2604.01221 • Published 14 days ago • 29
Insight-V++: Towards Advanced Long-Chain Visual Reasoning with Multimodal Large Language Models Paper • 2603.18118 • Published 28 days ago • 12
LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory Paper • 2603.03269 • Published Mar 3 • 63
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence Paper • 2603.07660 • Published Mar 8 • 86
Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition Paper • 2602.08439 • Published Feb 9 • 28
Back to Basics: Let Denoising Generative Models Denoise Paper • 2511.13720 • Published Nov 17, 2025 • 70
Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark Paper • 2510.13759 • Published Oct 15, 2025 • 11
VChain: Chain-of-Visual-Thought for Reasoning in Video Generation Paper • 2510.05094 • Published Oct 6, 2025 • 38
RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark Paper • 2509.24897 • Published Sep 29, 2025 • 46
On the Theoretical Limitations of Embedding-Based Retrieval Paper • 2508.21038 • Published Aug 28, 2025 • 21
EgoTwin: Dreaming Body and View in First Person Paper • 2508.13013 • Published Aug 18, 2025 • 21
4DNeX: Feed-Forward 4D Generative Modeling Made Easy Paper • 2508.13154 • Published Aug 18, 2025 • 62