36 145

Andrew W

Andre3000

AI & ML interests

None yet

Organizations

Fifth Wind

Collections 3

Research
  • Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents

    Paper • 2509.06917 • Published Sep 8, 2025 • 44
Pre training
  • Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

    Paper • 2401.16380 • Published Jan 29, 2024 • 53
  • OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration

    Paper • 2602.05400 • Published Feb 5 • 352
  • The Pile: An 800GB Dataset of Diverse Text for Language Modeling

    Paper • 2101.00027 • Published Dec 31, 2020 • 10

models 0

None public yet

datasets 0

None public yet