Sapiens 2 by Meta
#1
by emailfrom - opened
Fun
Sapiens 2
A computer vision model from Meta
Focused only on understanding humans in images and video
What it does
Detects body pose (arms, legs, joints)
Segments body parts (skin, clothing)
Estimates depth (how far each body part is from the camera)
Builds detailed spatial maps of a person
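To make the list above concrete, here is a rough sketch of how the three kinds of output (pose, segmentation, depth) could be combined into a simple per-part spatial summary. Everything in it is an assumption for illustration: the HumanOutputs container, the mean_depth_per_part helper, the label names, and the use of metres are all hypothetical and are not the Sapiens 2 API or its real output format.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class HumanOutputs:
    """Hypothetical container for per-image outputs (not a real Sapiens type)."""
    keypoints: Dict[str, Tuple[int, int]]  # joint name -> (row, col) pixel
    part_mask: List[List[str]]             # per-pixel body-part label
    depth: List[List[float]]               # per-pixel distance (assumed metres)


def mean_depth_per_part(out: HumanOutputs) -> Dict[str, float]:
    """Average the depth map over each segmented part, skipping background."""
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for mask_row, depth_row in zip(out.part_mask, out.depth):
        for label, d in zip(mask_row, depth_row):
            if label != "background":
                sums[label] += d
                counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def keypoint_depth(out: HumanOutputs, joint: str) -> float:
    """Look up the depth value at a joint's pixel location."""
    row, col = out.keypoints[joint]
    return out.depth[row][col]


# Toy 2x2 "image" with made-up values, only to show the shapes involved.
toy = HumanOutputs(
    keypoints={"left_wrist": (0, 1)},
    part_mask=[["background", "skin"], ["clothing", "clothing"]],
    depth=[[5.0, 2.0], [2.5, 3.5]],
)

print(mean_depth_per_part(toy))
print(keypoint_depth(toy, "left_wrist"))
```

The point is only that pose gives you named pixel locations, segmentation gives you a label per pixel, and depth gives you a distance per pixel, so lining them up on the same pixel grid is what yields a "spatial map" of the person.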
What makes it different
Trained mainly on human data
Works at very high resolution
One model can handle multiple human-related tasks
What it is not
Not a chatbot
Not a text model
Not for general image recognition (cars, everyday objects)
Where it is used
AR and VR (avatars, body tracking)
Games and animation
Robotics and motion analysis
Simple idea
It is a “human understanding” vision model, not a language model.