
kongqi

AI & ML interests

None yet

Organizations

None yet

Collections 3

rl
  • MANSA: Learning Fast and Slow in Multi-Agent Systems

    Paper • 2302.05910 • Published Feb 12, 2023
Llm
  • AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

    Paper • 2509.08755 • Published Sep 10, 2025 • 56
  • The Majority is not always right: RL training for solution aggregation

    Paper • 2509.06870 • Published Sep 8, 2025 • 15
  • Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

    Paper • 2509.07980 • Published Sep 9, 2025 • 105
  • Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning

    Paper • 2509.03646 • Published Sep 3, 2025 • 33

models 0

None public yet

datasets 0

None public yet