Collection: Meta VL-JEPA - Vision-Language Prediction Models • Meta VL-JEPA: Vision-Language Joint Embedding Predictive Architecture for video understanding • 6 items • Updated Jan 16 • 8
nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL-BF16 • Image-Text-to-Text • 13B • Updated Dec 2, 2025 • 114k • 82