microsoft/Phi-4-multimodal-instruct
Automatic Speech Recognition • 6B • 331k • 1.58k
Magma-8B model for UI Agents
Chat with Kimi-VL: respond to text, images, video, PDFs
OmniGen2: Unified Image Understanding and Generation
THUDM/GLM-4.1V-9B-Thinking Demo