- R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization
  Paper • 2503.10615 • Published • 17
- UniGoal: Towards Universal Zero-shot Goal-oriented Navigation
  Paper • 2503.10630 • Published • 6
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 39
- LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL
  Paper • 2503.07536 • Published • 88
Rui Sun PRO
ThreeSR
AI & ML interests
Vision and Language Multimodal Learning, CV, NLP, LLM
Recent Activity
upvoted a paper about 13 hours ago
OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence
upvoted a paper about 13 hours ago
MolmoWeb: Open Visual Web Agent and Open Data for the Open Web
upvoted a paper about 13 hours ago
Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models