Cannot take image input in vllm

#4
by rumcs - opened

Hi, thank you so much for providing this model! Recently I ran it on my dual 3090 PC using vllm and enjoyed it very much. However, it seems that it cannot read images as input, even though I added "--limit-mm-per-prompt.image 1" in the vllm configuration.
ChatGPT told me that this AWQ-INT4 model is language-only and does not support image input, but I am not sure. Do you have any comments / suggestions? Thank you!

cyankiwi org

What vllm command are you using, and what request format are you sending to the vllm API endpoint? Could you test the following:

curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
                "model": "",
                "messages": [
                        {
                                "role": "user",
                                "content": [
                                        {
                                                "type": "text",
                                                "text": "Describe this image in very clear detail"
                                        },
                                        {
                                                "type": "image_url",
                                                "image_url": {
                                                        "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                                                }
                                        }
                                ]
                        }
                ]
        }'
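If curl is awkward, the same request body can be built in Python. This is just a sketch mirroring the curl example above: it only constructs the JSON payload, and the endpoint URL and empty "model" field are taken from that example (fill in your served model name as needed).

```python
import json

# Build the same multimodal chat request as the curl example above.
# The empty "model" field and the image URL come from that example.
payload = {
    "model": "",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe this image in very clear detail",
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                    },
                },
            ],
        }
    ],
}

# Serialize to the JSON body that would be POSTed to
# http://localhost:8000/v1/chat/completions with
# Content-Type: application/json (e.g. via requests.post).
body = json.dumps(payload)
```

If the server rejects this with an error about multimodal inputs, that points at the serve-time configuration rather than the request format.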
