OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks Paper • 2604.08539 • Published 7 days ago • 48
MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence Paper • 2512.10863 • Published Dec 11, 2025 • 22
MotionEdit: Benchmarking and Learning Motion-Centric Image Editing Paper • 2512.10284 • Published Dec 11, 2025 • 26
G^2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning Paper • 2511.21688 • Published Nov 26, 2025 • 8
Model Extrapolation Expedites Alignment Collection Better aligned models obtained by model extrapolation (ExPO) • 23 items • Updated Mar 2 • 17