VST Collection A comprehensive framework designed to cultivate VLMs with human-like visuospatial abilities. • 6 items • Updated Feb 1 • 6
Cosmos-Predict2 Collection ⚠️ This collection is archived. See https://huggingface.co/collections/nvidia/cosmos-predict25 • 10 items • Updated 4 days ago • 36
Cosmos World Foundation Model Platform for Physical AI Paper • 2501.03575 • Published Jan 7, 2025 • 82
Physical AI Collection Collection of open, commercial-grade datasets for physical AI developers • 29 items • Updated 4 days ago • 143
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28, 2025 • 125
PixMo Collection A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 9 items • Updated Mar 2 • 89
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 129
Theia Collection Distilling Diverse Vision Foundation Models for Robot Learning • 6 items • Updated Sep 30, 2024 • 9
Article Metric and Relative Monocular Depth Estimation: An Overview. Fine-Tuning Depth Anything V2 • Jul 10, 2024 • 93
3D-VLA: A 3D Vision-Language-Action Generative World Model Paper • 2403.09631 • Published Mar 14, 2024 • 12
Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 4 days ago • 64
OpenResearcher: Unleashing AI for Accelerated Scientific Research Paper • 2408.06941 • Published Aug 13, 2024 • 32
Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks Paper • 2408.03615 • Published Aug 7, 2024 • 31
Achieving Human Level Competitive Robot Table Tennis Paper • 2408.03906 • Published Aug 7, 2024 • 28