ViTPose Transformers ⚡ Detect and visualize human poses in images and videos • Running on Zero • MCP • Featured • 223
Article Universal Image Segmentation with Mask2Former and OneFormer +1 Jan 19, 2023 • 15
Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation Paper • 2507.22886 • Published Jul 30, 2025 • 10
jonatasgrosman/wav2vec2-large-xlsr-53-english Automatic Speech Recognition • 0.3B • Updated Mar 25, 2023 • 47k • 478
openai/clip-vit-large-patch14 Zero-Shot Image Classification • 0.4B • Updated Sep 15, 2023 • 29.3M • 1.99k