
garyzhang

xiaoniqiu
  • garyzhang99

AI & ML interests

LLM, Agents

Organizations

Data-Juicer

authored 5 papers 2 months ago

The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective

Paper • 2407.08583 • Published Jul 11, 2024 • 13

Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models

Paper • 2505.17826 • Published May 23, 2025 • 10

Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models

Paper • 2501.14755 • Published Dec 23, 2024

Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRPO and Its Friends

Paper • 2509.24203 • Published Sep 29, 2025 • 8

On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models

Paper • 2602.03392 • Published Feb 3 • 59
authored a paper 8 months ago

On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting

Paper • 2508.11408 • Published Aug 15, 2025 • 8