HenriLD's Collections
Dataset Mix for Pre-Training SLMs

updated Mar 25, 2025

  • open-thoughts/OpenThoughts-114k

    Updated Aug 31, 2025 • 228k rows • 155k downloads • 832 likes

  • open-r1/OpenThoughts-114k-math

    Updated Jan 30, 2025 • 89.1k rows • 682 downloads • 91 likes

  • HuggingFaceFW/fineweb

    Updated Jul 11, 2025 • 52.5B rows • 632k downloads • 2.76k likes

  • FreedomIntelligence/medical-o1-reasoning-SFT

    Updated Apr 22, 2025 • 90.1k rows • 7.74k downloads • 1.09k likes

  • AI-MO/NuminaMath-CoT

    Updated Nov 25, 2024 • 860k rows • 37.8k downloads • 564 likes

  • dmariko/init_data

    Updated Jul 10, 2024 • 188k rows • 12 downloads

  • HenriLD/FDA_Docs

    Updated Feb 12, 2025 • 30.4k rows • 3 downloads

  • ChayanM/MIMIC-Impression-Dataset

    Updated Apr 28, 2024 • 292k rows • 33 downloads • 2 likes

  • allenai/cord19

    Updated Nov 3, 2022 • 432 downloads • 8 likes

  • MedRAG/pubmed

    Updated Feb 27, 2024 • 2.21M rows • 11.2k downloads • 99 likes

  • EleutherAI/SmolLM2-135M-10B

    Updated Apr 15, 2025 • 10.1M rows • 2.48k downloads • 1 like