Collections
Collections including paper arxiv:2602.00919
- The Trinity of Consistency as a Defining Principle for General World Models
  Paper • 2602.23152 • Published • 201
- From Blind Spots to Gains: Diagnostic-Driven Iterative Training for Large Multimodal Models
  Paper • 2602.22859 • Published • 151
- OmniGAIA: Towards Native Omni-Modal AI Agents
  Paper • 2602.22897 • Published • 53
- Imagination Helps Visual Reasoning, But Not Yet in Latent Space
  Paper • 2602.22766 • Published • 44

- ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands
  Paper • 2512.24965 • Published • 43
- VLingNav: Embodied Navigation with Adaptive Reasoning and Visual-Assisted Linguistic Memory
  Paper • 2601.08665 • Published • 8
- HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning
  Paper • 2507.00833 • Published • 1
- IGen: Scalable Data Generation for Robot Learning from Open-World Images
  Paper • 2512.01773 • Published • 1

- GLM-5: from Vibe Coding to Agentic Engineering
  Paper • 2602.15763 • Published • 144
- Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning
  Paper • 2602.07845 • Published • 71
- LLaDA2.1: Speeding Up Text Diffusion via Token Editing
  Paper • 2602.08676 • Published • 70
- MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents
  Paper • 2602.02474 • Published • 62

- Green-VLA: Staged Vision-Language-Action Model for Generalist Robots
  Paper • 2602.00919 • Published • 323
- SberRoboticsCenter/GreenVLA-2b-base
  Robotics • Updated • 8
- SberRoboticsCenter/GreenVLA-5b-base-stride-1
  Robotics • Updated • 19
- SberRoboticsCenter/GreenVLA-5b-base-stride-4
  Robotics • Updated • 10

- InterPrior: Scaling Generative Control for Physics-Based Human-Object Interactions
  Paper • 2602.06035 • Published • 23
- PaperBanana: Automating Academic Illustration for AI Scientists
  Paper • 2601.23265 • Published • 223
- Green-VLA: Staged Vision-Language-Action Model for Generalist Robots
  Paper • 2602.00919 • Published • 323
- Learning Humanoid End-Effector Control for Open-Vocabulary Visual Loco-Manipulation
  Paper • 2602.16705 • Published • 26

- THINKSAFE: Self-Generated Safety Alignment for Reasoning Models
  Paper • 2601.23143 • Published • 39
- PaperBanana: Automating Academic Illustration for AI Scientists
  Paper • 2601.23265 • Published • 223
- Agentic Reasoning for Large Language Models
  Paper • 2601.12538 • Published • 204
- BabyVision: Visual Reasoning Beyond Language
  Paper • 2601.06521 • Published • 201

- FireGNN: Neuro-Symbolic Graph Neural Networks with Trainable Fuzzy Rules for Interpretable Medical Image Classification
  Paper • 2509.10510 • Published
- From Pixels to Words -- Towards Native Vision-Language Primitives at Scale
  Paper • 2510.14979 • Published • 69
- Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
  Paper • 2511.22699 • Published • 245
- Self-Supervised Prompt Optimization
  Paper • 2502.06855 • Published • 18