Nemotron-Pre-Training-Datasets Collection Large-scale pre-training datasets used in the Nemotron family of models. • 12 items • Updated about 7 hours ago • 137
Sutra Pedagogical Datasets Collection High-quality synthetic educational datasets designed for LLM pretraining with structured pedagogical content across 9 knowledge domains. • 7 items • Updated 29 days ago • 4