UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards Paper • 2604.14967 • Published 1 day ago • 6
UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards Paper • 2604.14967 • Published 1 day ago • 6
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence Paper • 2602.08683 • Published Feb 9 • 52
DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset Paper • 2601.10305 • Published Jan 15 • 36
DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset Paper • 2601.10305 • Published Jan 15 • 36
Towards Cross-View Point Correspondence in Vision-Language Models Paper • 2512.04686 • Published Dec 4, 2025