
Qwen2.5-Omni Inference Endpoint

This repository contains code for deploying the Qwen2.5-Omni-0.5B model to Hugging Face Inference Endpoints for use with the Indoor Scenes dataset.

Overview

The LLaVA-OneVision implementation with Qwen2.5-Omni provides multimodal capabilities for:

  • Image captioning
  • Audio recognition
  • Video understanding
  • Test-time scaling

Deployment Instructions

  1. Set up your Hugging Face account:

    • Ensure you have a Hugging Face account with a valid API token
    • Use huggingface-cli login to authenticate
  2. Create and push to a Hugging Face repository:

    huggingface-cli repo create YOUR_USERNAME/my-qwen-omni-endpoint --type model
    git init
    git add .
    git commit -m "Initial commit"
    git remote add origin https://huggingface.co/YOUR_USERNAME/my-qwen-omni-endpoint
    git push -u origin main
    
  3. Deploy to Inference Endpoints:

    • Go to your repository on Hugging Face
    • Navigate to "Settings" > "Inference Endpoints"
    • Create a new endpoint
    • Select appropriate hardware (a GPU with at least 16 GB of memory is recommended)
    • Deploy!

Using the Endpoint

Text-only example:

{
  "conversation": [
    {"role": "user", "content": "Tell me about yourself."}
  ]
}

Image example:

{
  "conversation": [
    {
      "role": "user", 
      "content": "What do you see in this image?",
      "images": ["https://example.com/image.jpg"]
    }
  ]
}
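The payloads above can be sent to the endpoint with a short Python client. A minimal sketch, assuming a standard Inference Endpoints setup: the endpoint URL is a placeholder you must replace with the URL shown on your endpoint's page, and the token is read from the HF_TOKEN environment variable.

```python
import os
import requests

# Placeholder -- replace with the URL shown on your endpoint's page.
ENDPOINT_URL = "https://YOUR-ENDPOINT.endpoints.huggingface.cloud"

def build_payload(text, images=None):
    """Build a request body in the conversation format shown above."""
    message = {"role": "user", "content": text}
    if images:
        message["images"] = images
    return {"conversation": [message]}

def query(payload, token=None):
    """POST a payload to the endpoint and return the JSON response."""
    token = token or os.environ["HF_TOKEN"]
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    response = requests.post(ENDPOINT_URL, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()

# Text-only request:
# query(build_payload("Tell me about yourself."))
# Image request:
# query(build_payload("What do you see in this image?",
#                     images=["https://example.com/image.jpg"]))
```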

For MIT Indoor Scenes Dataset

This endpoint is specifically designed to work with the MIT Indoor Scenes dataset (Quattoni & Torralba, CVPR 2009). The model can be used to generate captions for indoor scene images to evaluate captioning performance.
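A captioning run over the dataset could look like the sketch below. The prompt text and the assumption that scene images are available as URLs are illustrative, not part of the repository; the endpoint URL is a placeholder.

```python
import os
import requests

# Placeholder -- replace with your deployed endpoint's URL.
ENDPOINT_URL = "https://YOUR-ENDPOINT.endpoints.huggingface.cloud"
CAPTION_PROMPT = "Describe this indoor scene in one sentence."

def make_caption_payload(image_url):
    """Build a captioning request for a single scene image."""
    return {"conversation": [{
        "role": "user",
        "content": CAPTION_PROMPT,
        "images": [image_url],
    }]}

def caption_images(image_urls, token=None):
    """Caption each image URL via the endpoint; returns {url: response}."""
    headers = {"Authorization": f"Bearer {token or os.environ['HF_TOKEN']}"}
    results = {}
    for url in image_urls:
        resp = requests.post(ENDPOINT_URL, headers=headers,
                             json=make_caption_payload(url))
        resp.raise_for_status()
        results[url] = resp.json()
    return results
```

The generated captions can then be scored against reference captions with whatever metric your evaluation uses.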

Testing Test-Time Scaling

The implementation supports test-time scaling through the standard inference interface, allowing for:

  • Budget scaling/forcing
  • Beam search integration
  • Various performance metrics
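As a sketch of how such settings could be attached to a request: the field names below ("parameters", "num_beams", "max_new_tokens") are assumptions about what the handler forwards to generation, not a documented interface of this repository.

```python
def build_scaled_payload(text, num_beams=4, max_new_tokens=128):
    """Attach generation settings for test-time scaling.

    The "parameters" block and its field names are assumptions about
    what the endpoint handler passes through to the model's generation
    call; check the handler code for the actual interface.
    """
    return {
        "conversation": [{"role": "user", "content": text}],
        "parameters": {
            "num_beams": num_beams,            # beam search width
            "max_new_tokens": max_new_tokens,  # token budget
        },
    }
```

Varying the token budget and beam width per request is what lets you trade latency for output quality at inference time.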