TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoDiv-RL-letter_countdown_4o__v1
• Updated • 304 • 2
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoDiv-RL-letter_countdown_4o-eval_rl
• Updated • 300 • 3
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_InstOnly-RL
• Updated • 11.5k • 2
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoReflects-RL-gsm8k__v1
• Updated • 1.32k • 3
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoReflects-RL-gsm8k-eval_rl
• Updated • 1.32k • 3
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_3args_Random-RL
• Updated • 11.5k • 3
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoDiv-RL-letter_countdown_5o__v1
• Updated • 304 • 3
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoDiv-RL-letter_countdown_5o-eval_rl
• Updated • 300 • 2
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoReflects-RL-commonsenseQA__v1
• Updated • 1.23k • 3
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoReflects-RL-commonsenseQA-eval_rl
• Updated • 1.22k • 2
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoDiv-RL-acronym_4o__v1
• Updated • 200 • 2
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoDiv-RL-acronym_4o-eval_rl
• Updated • 197 • 3
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoDiv-RL-acronym_5o__v1
• Updated • 148 • 3
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoDiv-RL-acronym_5o-eval_rl
• Updated • 144 • 3
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoDiv-RL-longmult_5dig__v1
• Updated • 1k • 3
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoDiv-RL-longmult_5dig-eval_rl
• Updated • 1k • 3
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoDiv-RL-longmult_4dig__v1
• Updated • 1k • 3
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoDiv-RL-longmult_4dig-eval_rl
• Updated • 1k • 2
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoReflects-RL-countdown_6arg__v1
• Updated • 1k • 3
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoReflects-RL-countdown_6arg-eval_rl
• Updated • 1k • 3
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoDiv-RL-longmult_3dig__v1
• Updated • 1k • 2
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoDiv-RL-longmult_3dig-eval_rl
• Updated • 1k • 3
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoDiv-RL-longmult_2dig__v1
• Updated • 1k • 2
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoDiv-RL-longmult_2dig-eval_rl
• Updated • 1k • 3
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoDiv-RL-gsm8k__v1
• Updated • 1.32k • 3
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoDiv-RL-gsm8k-eval_rl
• Updated • 1.32k • 3
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoDiv-RL-commonsenseQA__v1
• Updated • 1.23k • 3
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoDiv-RL-commonsenseQA-eval_rl
• Updated • 1.22k • 2
TAUR-dev/D-ExpTracker__FinEval_16k_fulleval_3args_NoDiv-RL-countdown_6arg__v1
• Updated • 1k • 3
TAUR-dev/D-EVAL__standard_eval_v3__FinEval_16k_fulleval_3args_NoDiv-RL-countdown_6arg-eval_rl
• Updated • 1k • 3