baidu/Qianfan-OCR
Image-Text-to-Text • 5B • Updated • 44.8k • 1.14k
Qianfan-VL model series. The models are primarily domain-enhanced vision-language models that target enterprise-level multimodal understanding scenarios.
Domain-Enhanced Universal Vision-Language Models