Chenlu123/grpo_warmup_graftTrue_qwen2_5_math_1_5b_guru_n16_bz64_mini_bz64_global_step_80 Updated 6 days ago
Adaptive Layerwise Perturbation: Unifying Off-Policy Corrections for LLM RL Paper • 2603.19470 • Published 25 days ago • 3
Chenlu123/teacher_Qwen3-4B_dapo-math-17k_n8_prompt_bsz_128_mini_bsz_32_step460 2B • Updated 24 days ago • 13
Chenlu123/teacher_Qwen3-4B_dapo-math-17k_n8_prompt_bsz_128_mini_bsz_32_step440 2B • Updated 24 days ago • 14
Chenlu123/teacher_Qwen3-4B_dapo-math-17k_n8_prompt_bsz_128_mini_bsz_32_step420 2B • Updated 24 days ago • 14
Chenlu123/teacher_Qwen3-4B_dapo-math-17k_n8_prompt_bsz_128_mini_bsz_32_step400 2B • Updated 24 days ago • 14
Chenlu123/perturb_23_null_sequence_inistd1e-4_clip_0.5_3.0_c10.0_lr5e-4_qwen2-1_5b Updated 24 days ago
Chenlu123/perturb_23_null_sequence_inistd1e-4_clip_0.5_3.0_c10.0_lr5e-4_qwen2-1_5b_step880 2B • Updated Mar 5
Chenlu123/perturb_23_null_sequence_inistd1e-4_clip_0.5_3.0_c10.0_lr5e-4_qwen2-1_5b_step860 2B • Updated Mar 5 • 1
Chenlu123/perturb_23_null_sequence_inistd1e-4_clip_0.5_3.0_c10.0_lr5e-4_qwen2-1_5b_step840 2B • Updated Mar 5 • 1
Chenlu123/perturb_23_null_sequence_inistd1e-4_clip_0.5_3.0_c10.0_lr5e-4_qwen2-1_5b_step820 2B • Updated Mar 5 • 1