VideoLLaMA2
Media understanding
Generate detailed image descriptions and highlight objects
A unified multimodal understanding and generation model.
Generate answers to questions about images
Interact with a chatbot that understands text and images
Ask questions about images and get answers
ViLT VQA with FlanT5 and Translations
Interact with images and text using Visual ChatGPT
Try xAI's Grok 2 vision model