docling-project/SmolDocling-256M-preview
Image-Text-to-Text • 50.4k downloads • 1.61k likes
Generate a talking-head video from a face image and audio
Generate a virtual try‑on image of a person wearing a garment
Blazingly Fast and Embarrassingly Simple Song Generation