SNU Thunder-LLM Korean Benchmark Suite - a thunder-research-group Collection

SNU Thunder-LLM Korean Benchmark Suite

updated 3 days ago

thunder-research-group/SNU_Ko-LAMBADA

Viewer • Updated Jun 13, 2025 • 2.26k • 128
thunder-research-group/SNU_Ko-WinoGrande

Viewer • Updated Jun 13, 2025 • 1.27k • 46
thunder-research-group/SNU_Ko-ARC

Viewer • Updated Jun 13, 2025 • 3.54k • 11
thunder-research-group/SNU_Ko-GSM8K

Viewer • Updated Oct 16, 2025 • 1.32k • 35 • 1
thunder-research-group/SNU_Ko-IFEval

Viewer • Updated Jun 13, 2025 • 841 • 233
thunder-research-group/SNU_Ko-EQ-Bench

Viewer • Updated Jun 13, 2025 • 171 • 25
skt/kobest_v1

Viewer • Updated Mar 28, 2024 • 23.4k • 3.33k • 54

Note: We use the test split of the hellaswag subset for evaluation.
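
The note on skt/kobest_v1 can be read as a subset plus split selection. A minimal sketch of that selection, assuming the Hugging Face `datasets` library is used for loading: the subset name `hellaswag` and the split `test` come from the note, while the dictionary and helper names are hypothetical and only for illustration.

```python
# Hypothetical helper names; only the repo id, subset ("hellaswag") and
# split ("test") are taken from the collection note.
KOBEST_HELLASWAG = {
    "path": "skt/kobest_v1",  # dataset repository listed in the collection
    "name": "hellaswag",      # subset (configuration) named in the note
    "split": "test",          # split named in the note
}


def load_kobest_hellaswag_test():
    """Load the evaluation split (needs the `datasets` package and network access)."""
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(**KOBEST_HELLASWAG)
```

Calling `load_kobest_hellaswag_test()` downloads the data on first use. Keeping the selection in one dictionary makes it easy to log exactly which subset and split an evaluation run used.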
HAERAE-HUB/KMMLU

Viewer • Updated Mar 5, 2024 • 244k • 8.49k • 97
HYU-NLP/KR-HumanEval

Viewer • Updated Jun 3, 2025 • 328 • 28

Note: We use v1 for evaluation.
LGCNS/KorQuAD_2.0

Viewer • Updated Aug 7, 2025 • 93.7k • 180 • 2
thunder-research-group/SNU_Ko-MuSR

Viewer • Updated Nov 24, 2025 • 750 • 22
thunder-research-group/SNU_Thunder-KoNUBench

Viewer • Updated about 14 hours ago • 4.78k • 25 • 1