Vision Models (Image & Video) - an adarshzolekar Collection

Updated Jan 23

Purpose: text-to-image generation, image classification, object detection, and segmentation.

stabilityai/stable-diffusion-xl-base-1.0

Text-to-Image • Updated Oct 30, 2023 • 2M downloads • 7.65k likes

rupeshs/LCM-runwayml-stable-diffusion-v1-5

Text-to-Image • Updated Nov 12, 2023 • 145 downloads • 30 likes

openai/clip-vit-base-patch32

Zero-Shot Image Classification • Updated Feb 29, 2024 • 20.2M downloads • 917 likes

facebook/detr-resnet-50

Object Detection • 41.6M params • Updated Apr 10, 2024 • 223k downloads • 946 likes

google/vit-base-patch16-224

Image Classification • 86.6M params • Updated Sep 5, 2023 • 4.68M downloads • 953 likes
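
As a rough illustration of how the non-generative models in this collection might be loaded, the sketch below maps each model ID to the task label shown on its card, written as a `transformers` pipeline task string. The names `MODEL_TASKS` and `load_pipeline` are hypothetical helpers, not part of the collection or of any library. The two text-to-image models are left out on the assumption that they are loaded through `diffusers` instead.

```python
# Hypothetical mapping from collection entries to transformers pipeline tasks.
# Only the three non-generative models are covered; the text-to-image entries
# are assumed to be loaded through the diffusers library and are omitted here.
MODEL_TASKS = {
    "openai/clip-vit-base-patch32": "zero-shot-image-classification",
    "facebook/detr-resnet-50": "object-detection",
    "google/vit-base-patch16-224": "image-classification",
}


def load_pipeline(model_id: str):
    """Build a transformers pipeline for one of the models listed above.

    Raises KeyError if the model ID is not in MODEL_TASKS.
    """
    task = MODEL_TASKS[model_id]
    # Imported lazily so the mapping can be inspected without transformers installed.
    from transformers import pipeline

    return pipeline(task=task, model=model_id)
```

Calling `load_pipeline("google/vit-base-patch16-224")` would download the model weights on first use, so it assumes network access and an installed `transformers` with a backend such as PyTorch. Some models may need extra packages.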