RLVR with Noisy Data
uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-clean-epoch-3 • 8B • Updated Jan 30 • 32
uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-clean-epoch-4 • 8B • Updated Jan 30 • 6
uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-noise-0.1-epoch-3 • 8B • Updated Jan 30
uiuc-kang-lab/Qwen2.5-Math-7B-GRPO-noise-0.2-epoch-3 • 8B • Updated Jan 30

RL Generalizability
uiuc-kang-lab/R1-Distill-Qwen-1.5B-math-dapo • 2B • Updated Nov 15, 2025
uiuc-kang-lab/R1-Distill-Qwen-1.5B-math-epoch-12-6 • 2B • Updated Nov 16, 2025 • 1
uiuc-kang-lab/R1-Distill-Qwen-1.5B-math-epoch-10-6 • 2B • Updated Nov 16, 2025
uiuc-kang-lab/R1-Distill-Qwen-1.5B-math-epoch-8-6 • 2B • Updated Nov 16, 2025