segmond's Collections

vision models

updated Oct 1, 2025

  • bartowski/UI-TARS-7B-DPO-GGUF

    Image-Text-to-Text • 8B • Updated Jan 23, 2025 • 6.04k • 9

  • bartowski/UI-TARS-72B-SFT-GGUF

    Image-Text-to-Text • 73B • Updated Jan 24, 2025 • 2.3k

  • bartowski/UI-TARS-7B-SFT-GGUF

    Image-Text-to-Text • 8B • Updated Jan 24, 2025 • 2.57k • 3

  • bartowski/UI-TARS-72B-DPO-GGUF

    Image-Text-to-Text • 73B • Updated Jan 23, 2025 • 2.04k • 3

  • bartowski/allenai_olmOCR-7B-0225-preview-GGUF

    Image-Text-to-Text • 8B • Updated Feb 25, 2025 • 1.41k • 7

  • microsoft/Phi-4-multimodal-instruct

    Automatic Speech Recognition • 6B • Updated Dec 10, 2025 • 329k • 1.58k

  • ggml-org/ultravox-v0_5-llama-3_2-1b-GGUF

    Audio-Text-to-Text • 1B • Updated May 25, 2025 • 3.49k • 6

  • mradermacher/Qwen2-Audio-7B-Instruct-GGUF

    Audio-Text-to-Text • 8B • Updated Jul 31, 2025 • 548

  • city96/FLUX.1-dev-gguf

    Text-to-Image • 12B • Updated Aug 18, 2024 • 107k • 1.3k

  • openbmb/MiniCPM-V-4_5

    Image-Text-to-Text • 9B • Updated Mar 10 • 139k • 1.08k

  • Qwen/Qwen-Image-Edit

    Image-to-Image • Updated Aug 25, 2025 • 76.2k • 2.37k