
Rui Sun PRO

ThreeSR

https://threesr.github.io/

AI & ML interests

Vision and Language Multimodal Learning, CV, NLP, LLM

Recent Activity

updated a collection 32 minutes ago

New Papers

upvoted a paper 37 minutes ago

WildDet3D: Scaling Promptable 3D Detection in the Wild

upvoted a paper 1 day ago

OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence



upvoted a paper 37 minutes ago

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published 5 days ago • 204

upvoted 7 papers 1 day ago

Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering

Paper • 2604.08224 • Published 5 days ago • 44

KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation

Paper • 2604.08455 • Published 5 days ago • 41

DMax: Aggressive Parallel Decoding for dLLMs

Paper • 2604.08302 • Published 5 days ago • 45

upvoted 2 papers 4 days ago

OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks

Paper • 2604.08539 • Published 5 days ago • 45

Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning

Paper • 2604.04746 • Published 6 days ago • 66

upvoted a paper 12 days ago

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published 25 days ago • 331

upvoted a paper 14 days ago

CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents

Paper • 2603.24440 • Published 19 days ago • 96

upvoted a paper 28 days ago

MoKus: Leveraging Cross-Modal Knowledge Transfer for Knowledge-Aware Concept Customization

Paper • 2603.12743 • Published Mar 13 • 3

upvoted a paper about 2 months ago

UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding

Paper • 2307.00862 • Published Jul 3, 2023 • 1

upvoted a paper 2 months ago

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

Paper • 2601.18491 • Published Jan 26 • 125

upvoted 2 papers 3 months ago

Aligning Agentic World Models via Knowledgeable Experience Learning

Paper • 2601.13247 • Published Jan 19 • 15

GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization

Paper • 2511.15705 • Published Nov 19, 2025 • 98

upvoted an article 3 months ago

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

Article • Jun 3, 2025 • 343

upvoted a paper 4 months ago

RELIC: Interactive Video World Model with Long-Horizon Memory

Paper • 2512.04040 • Published Dec 3, 2025 • 24

upvoted a paper 5 months ago

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published Nov 24, 2025 • 63
