Multimodal Autoregressive Pre-training of Large Vision Encoders
Paper • 2411.14402 • Published • 47
timm compatible AIM-v2 (https://huggingface.co/papers/2411.14402) image encoder weights from https://huggingface.co/apple/aimv2-large-patch14-224