rvienne/layton-eval
All layton-eval-related datasets
Note: Dataset containing the layton-eval riddles
Note: Dataset containing everything needed to compute the PPI-based benchmark score
Note: Final benchmark results for several frontier models