TAUR-dev/M-skillfactory_sft_countdown_3arg_qrepeat1_reflections3_formats0C.-C.-C-IC.-CC-sft • 2B
TAUR-dev/M-0910__qrepeat3_ref3_3args_grpo-rl • 2B
TAUR-dev/M-0910__qrepeat1_ref5_alltask_grpo-rl
TAUR-dev/M-0910__qrepeat1_ref3_alltask_grpo-rl
TAUR-dev/M-sft_exp_zayne-sft • 2B
TAUR-dev/M-0909__0epoch_3args_grpo-rl • 2B
TAUR-dev/M-0910__qrepeat3_ref5_3args_grpo-rl
TAUR-dev/M-0910__qrepeat1_ref5_3args_grpo-rl
TAUR-dev/M-0910__qrepeat1_ref3_3args_grpo-rl
TAUR-dev/M-0909__0epoch_alltask_grpo-rl
TAUR-dev/M-0909__0epoch_complex0.1_alltask_grpo-rl
TAUR-dev/M-skillfactory_sft_combinedtasks_promptvariants_qrepeat1_reflections5-sft • 2B
TAUR-dev/M-skillfactory_sft_combinedtasks_promptvariants_qrepeat1_reflections3-sft • 2B
TAUR-dev/M-skillfactory_sft_countdown_3arg_promptvariants_qrepeat1_reflections3-sft • 2B
TAUR-dev/M-skillfactory_sft_countdown_3arg_promptvariants_qrepeat3_reflections5-sft • 2B
TAUR-dev/M-skillfactory_sft_countdown_3arg_promptvariants_qrepeat1_reflections5-sft • 2B
TAUR-dev/M-skillfactory_sft_countdown_3arg_promptvariants_qrepeat3_reflections3-sft • 2B
TAUR-dev/M-0903_rl_reflect__1e_complex0.3_3args__grpo_minibs32_lr1e-6_rollout16-rl • 2B
TAUR-dev/M-0903_rl_reflect__0epoch_3args__grpo_minibs32_lr1e-6_rollout16-rl • 2B
TAUR-dev/M-0903_rl_reflect__2b_complex0.3_3and4args__grpo_minibs32_lr1e-6_rollout16-rl
TAUR-dev/M-0903_rl_reflect__3b_complex0.3_alltask__grpo_minibs32_lr1e-6_rollout16-rl
TAUR-dev/M-0903_rl_reflect__1a_3args__grpo_minibs32_lr1e-6_rollout16-rl • 2B
TAUR-dev/M-0903_rl_reflect__0epoch_3and4args__grpo_minibs32_lr1e-6_rollout16-rl • 2B
TAUR-dev/M-0903_rl_reflect__0epoch_alltask__grpo_minibs32_lr1e-6_rollout16_32GPU-rl
TAUR-dev/M-0903_rl_reflect__2b_3and4args__grpo_minibs32_lr1e-6_rollout16-rl
TAUR-dev/M-0903_rl_reflect__2a_3and4args__grpo_minibs32_lr1e-6_rollout16-rl
TAUR-dev/M-0903_rl_reflect__3b_alltask__grpo_minibs32_lr1e-6_rollout16-rl
TAUR-dev/M-0903_rl_reflect__3a_alltask__grpo_minibs32_lr1e-6_rollout16-rl