VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors
Paper • 2604.02486 • Published • 10
Low-resource, Multilingual, Cross-lingual Natural Language Processing; Data- and Compute-efficient Deep Learning for NLP