MutiModal_Dataset
WildVision/wildvision-chat • 45.2k • 947 • 20
lmms-lab/LLaVA-Video-178K • 1.63M • 37.5k • 189
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
Paper • 2409.04429
JefferyZhan/Language-prompted-Localization-Dataset • 15 • 4
mlfoundations/MINT-1T-HTML • 623M • 97.3k • 94
DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding
Paper • 2411.14347 • 16
Salesforce/blip3-grounding-50m • 52.4M • 387 • 28
Intelligent-Internet/II-Thought-RL-v0 • 342k • 439 • 54
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning
Paper • 2504.11456 • 12