Deep Data Research Benchmark
weiliu
thinkwee
AI & ML interests
LLM reasoning, agents
Organizations
None yet
NOVEReason
General Reasoning datasets for training the NOVER model
- NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning
  Paper • 2505.16022 • Published • 4
- thinkwee/NOVEReason_2k
  Viewer • Updated • 24.3k • 49 • 1
- thinkwee/NOVEReason_5k
  Viewer • Updated • 36.3k • 57 • 1
- thinkwee/NOVEReason_full
  Viewer • Updated • 1.7M • 70 • 1
DDRBench
Deep Data Research Benchmark
- Hunt Instead of Wait: Evaluating Deep Data Research on Large Language Models
  Paper • 2602.02039 • Published • 5
- DDR Bench
  Running • 3 • Deep Data Research Benchmark
- thinkwee/DDRBench_10K
  Viewer • Updated • 3.16M • 98
- thinkwee/DDRBench_10K_trajectory
  Viewer • Updated • 50.9k • 31
NOVER • 1
NOVER-series models for general reasoning