PGC Psychiatric GWAS Summary Statistics Collection ~1 billion rows of genome-wide association study (GWAS) NOTE: We are in the process of transferring these datasets to the Psychiatric Genomics Consortiu • 12 items • Updated 3 days ago • 86
BioNeMo - Design Collection NVIDIA BioNeMo Models for Design • 12 items • Updated 2 days ago • 8
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model Paper • 2510.14528 • Published Oct 16, 2025 • 124
MediPhi Collection A collection of SLMs based on Phi3.5-mini-instruct adapted to clinical natural language processing tasks: https://arxiv.org/abs/2505.10717 • 10 items • Updated Oct 1, 2025 • 25
Article NVIDIA Releases Improved Pretraining Dataset: Preserves High Value Math & Code, and Augments with Multi-Lingual • Aug 18, 2025 • 4
Article 📢 NVIDIA Releases Nemotron-CC-Math Pre-Training Dataset: A High-Quality, Web-Scale Math Corpus for Pretraining Large Language Models • Aug 18, 2025 • 5
Article Supercharge Edge AI With High‑Accuracy Reasoning Using NVIDIA Nemotron Nano 2 9B • Aug 18, 2025 • 32
Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper • 2508.03680 • Published Aug 5, 2025 • 140
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20, 2024 • 183
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience Paper • 2508.04700 • Published Aug 6, 2025 • 52