adarshzolekar's Collections

Vision Models (Image & Video)

Updated Jan 23

Purpose: Text-to-image, image classification, detection, segmentation.

Upvote
1

  • stabilityai/stable-diffusion-xl-base-1.0

    Text-to-Image • Updated Oct 30, 2023 • 2M downloads • 7.65k likes

  • rupeshs/LCM-runwayml-stable-diffusion-v1-5

    Text-to-Image • Updated Nov 12, 2023 • 145 downloads • 30 likes

  • openai/clip-vit-base-patch32

    Zero-Shot Image Classification • Updated Feb 29, 2024 • 20.2M downloads • 917 likes

  • facebook/detr-resnet-50

    Object Detection • 41.6M params • Updated Apr 10, 2024 • 223k downloads • 946 likes

  • google/vit-base-patch16-224

    Image Classification • 86.6M params • Updated Sep 5, 2023 • 4.68M downloads • 953 likes
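
As a rough illustration of how the models listed above might be loaded, the sketch below maps each model ID to the task label shown in the listing and picks a loader accordingly. The names `COLLECTION`, `PIPELINE_TASKS`, `pipeline_task_for`, and `load` are hypothetical helpers, not part of the collection. The sketch assumes the classification and detection checkpoints work with the `transformers` `pipeline` function and that the two text-to-image checkpoints can be loaded with `diffusers`' `DiffusionPipeline`; neither assumption is stated on this page.

```python
# Model IDs and task labels copied from the collection listing.
COLLECTION = {
    "stabilityai/stable-diffusion-xl-base-1.0": "Text-to-Image",
    "rupeshs/LCM-runwayml-stable-diffusion-v1-5": "Text-to-Image",
    "openai/clip-vit-base-patch32": "Zero-Shot Image Classification",
    "facebook/detr-resnet-50": "Object Detection",
    "google/vit-base-patch16-224": "Image Classification",
}

# Assumption: these Hub display labels correspond to these
# transformers pipeline task names.
PIPELINE_TASKS = {
    "Zero-Shot Image Classification": "zero-shot-image-classification",
    "Object Detection": "object-detection",
    "Image Classification": "image-classification",
}


def pipeline_task_for(model_id):
    """Return the transformers pipeline task for a model in the collection,
    or None when the model is a text-to-image checkpoint."""
    label = COLLECTION[model_id]
    return PIPELINE_TASKS.get(label)


def load(model_id):
    """Load a model from the collection (downloads weights from the Hub).

    Imports are deferred so the lookup helpers above work even when
    transformers or diffusers is not installed.
    """
    task = pipeline_task_for(model_id)
    if task is None:
        # Assumption: text-to-image checkpoints are diffusers-compatible.
        from diffusers import DiffusionPipeline
        return DiffusionPipeline.from_pretrained(model_id)
    from transformers import pipeline
    return pipeline(task, model=model_id)
```

Calling `load("google/vit-base-patch16-224")` would download the model weights, so it is left uncalled here; `pipeline_task_for` can be used on its own to check which loader a given entry would use.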