Dataset Mix for Pre-Training SLMs
open-thoughts/OpenThoughts-114k • 228k rows • 155k downloads • 832 likes
open-r1/OpenThoughts-114k-math • 89.1k rows • 682 downloads • 91 likes
[name missing] • 52.5B rows • 632k downloads • 2.76k likes
FreedomIntelligence/medical-o1-reasoning-SFT • 90.1k rows • 7.74k downloads • 1.09k likes
[name missing] • 860k rows • 37.8k downloads • 564 likes
[name missing] • 188k rows • 12 downloads
[name missing] • 30.4k rows • 3 downloads
ChayanM/MIMIC-Impression-Dataset • 292k rows • 33 downloads • 2 likes
[name missing] • 432 downloads • 8 likes
[name missing] • 2.21M rows • 11.2k downloads • 99 likes
EleutherAI/SmolLM2-135M-10B • 10.1M rows • 2.48k downloads • 1 like