OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence Paper • 2602.08683 • Published Feb 9 • 52
RoboOS: A Hierarchical Embodied Framework for Cross-Embodiment and Multi-Agent Collaboration Paper • 2505.03673 • Published May 6, 2025 • 2
Robo-Dopamine: General Process Reward Modeling for High-Precision Robotic Manipulation Paper • 2512.23703 • Published Dec 29, 2025 • 7
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training Paper • 2509.23661 • Published Sep 28, 2025 • 49
LLaVA-OneVision-1.5 Collection https://github.com/EvolvingLMMs-Lab/LLaVA-OneVision-1.5 • 9 items • Updated Oct 21, 2025 • 19
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning Paper • 2503.20752 • Published Mar 26, 2025 • 1
RoboBrain2.0 Collection RoboBrain 2.0: See Better. Think Harder. Do Smarter. • 6 items • Updated Feb 4 • 19
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics Paper • 2506.04308 • Published Jun 4, 2025 • 43