uavleeva's Collections
Multitask RLVR using GRPO (HSE Project)

Updated Feb 9

  • uavleeva/grpo_mixed_run_004

    Updated Feb 8

  • uavleeva/grpo_mixed_run_001

    Updated Feb 8

  • uavleeva/grpo_math_run_level3_all_rewards_001

    Updated Feb 8

  • uavleeva/grpo_sql_run_004

    Updated Feb 8

  • uavleeva/grpo_code_run_001

    Updated Feb 7

  • uavleeva/grpo_sql_run_005

    Updated Feb 8

  • uavleeva/grpo_sql_run_002

    Updated Feb 8

  • uavleeva/grpo_mixed_run_002

    Updated Feb 8

  • uavleeva/grpo_code_run_002

    Updated Feb 8

  • uavleeva/grpo_merged_math_sql_code_ties_001

    Text Generation • Updated Feb 8 • 1

  • uavleeva/grpo_merged_math_sql_code_linear_001

    Text Generation • Updated Feb 8

  • uavleeva/grpo_math_run_level3_accformat_001

    Updated Feb 7

  • open-r1/Big-Math-RL-Verified-Processed

    Viewer • Updated Apr 11, 2025 • 1M • 584 • 26

  • uavleeva/text2sql_synthetics

    Viewer • Updated Feb 6 • 100 • 8

  • open-r1/verifiable-coding-problems-python

    Viewer • Updated Mar 3, 2025 • 35.7k • 379 • 12