---
title: RPC-Bench Leaderboard
emoji: πŸ“Š
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.44.1
python_version: 3.12
app_file: app.py
pinned: false
license: mit
---

<p align="center">
    🌐 <a href="https://rpc-bench.github.io/" target="_blank">Project Page</a> β€’
    πŸ’» <a href="https://github.com/RPC-Bench/PRC-Bench" target="_blank">GitHub</a> β€’
    πŸ“– <a href="https://arxiv.org/abs/2601.14289" target="_blank">Paper</a> β€’
    πŸ€— <a href="https://huggingface.co" target="_blank">Hugging Face</a> β€’
    🧭 <a href="https://community.modelscope.cn/" target="_blank">ModelScope</a>
</p>

# RPC-Bench Leaderboard

RPC-Bench is a benchmark for research paper comprehension. This Space provides two functions, sketched below:

- a public leaderboard of published submissions
- a submission form for uploading new evaluation files
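
A minimal sketch of how these two functions can sit in a single Gradio app. The handler names (`load_leaderboard`, `handle_upload`) are illustrative, not the actual `app.py` API:

```python
import os

import gradio as gr
import pandas as pd

def load_leaderboard() -> pd.DataFrame:
    # Seed rows come from leaderboard_seed.csv (see below).
    return pd.read_csv("leaderboard_seed.csv")

def handle_upload(filepath: str) -> str:
    # Placeholder: the real app would store the file and queue evaluation.
    return f"Received {os.path.basename(filepath)}; evaluation pending."

with gr.Blocks(title="RPC-Bench Leaderboard") as demo:
    with gr.Tab("Leaderboard"):
        gr.Dataframe(value=load_leaderboard())
    with gr.Tab("Submit"):
        upload = gr.File(label="JSONL predictions")
        status = gr.Textbox(label="Status")
        upload.upload(handle_upload, inputs=upload, outputs=status)

if __name__ == "__main__":
    demo.launch()
```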

## Expected repository layout

The Space is designed to work with a separate submission dataset repository.

```text
space/
β”œβ”€β”€ app.py
β”œβ”€β”€ constants.py
β”œβ”€β”€ eval.py
β”œβ”€β”€ requirements.txt
└── benchmark/
    β”œβ”€β”€ dev.json
    └── test.json
```

If `benchmark/dev.json` and `benchmark/test.json` are not bundled with the Space repo, point the app at them by setting `RPC_BENCH_GOLD_DIR` or `RPC_BENCH_GOLD_PATH` via Space secrets or variables.
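
As a sketch, the lookup could fall back in this order. The resolution logic and the `resolve_gold_path` name are assumptions, not the actual `eval.py` code:

```python
import os

def resolve_gold_path(split: str) -> str:
    """Return the gold file for a split ("dev" or "test")."""
    candidates = []
    # Assumption: an explicit file path override wins if set.
    if os.environ.get("RPC_BENCH_GOLD_PATH"):
        candidates.append(os.environ["RPC_BENCH_GOLD_PATH"])
    # Otherwise look in the override directory, else the bundled benchmark/.
    gold_dir = os.environ.get("RPC_BENCH_GOLD_DIR", "benchmark")
    candidates.append(os.path.join(gold_dir, f"{split}.json"))
    for path in candidates:
        if os.path.isfile(path):
            return path
    raise FileNotFoundError(
        f"No gold file for split '{split}'; "
        "set RPC_BENCH_GOLD_DIR or RPC_BENCH_GOLD_PATH."
    )
```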

The static leaderboard seed is stored in `leaderboard_seed.csv`. `index.html` is only used locally to generate that CSV and should not be uploaded to the Space repository.

## Submission format

Uploaded files must be JSONL, with one answer object per line:

```json
{"id":"...", "part_idx":1, "question":"...", "gen_answer":"...", "category":"..."}
```
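
A local sanity check you can run before uploading. The required keys mirror the example record above; the helper itself is not part of the Space code:

```python
import json

REQUIRED_KEYS = {"id", "part_idx", "question", "gen_answer", "category"}

def validate_submission(path: str) -> list[dict]:
    """Parse a JSONL submission, raising on the first malformed line."""
    records = []
    with open(path, encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # tolerate trailing blank lines
            record = json.loads(line)  # raises on malformed JSON
            missing = REQUIRED_KEYS - record.keys()
            if missing:
                raise ValueError(f"line {lineno}: missing keys {sorted(missing)}")
            records.append(record)
    return records
```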

## Environment variables

- `HF_TOKEN`: token used to clone and push the submission repository
- `SUBMISSION_REPO_ID`: dataset repo that stores leaderboard results
- `RPC_BENCH_GOLD_DIR`: optional; directory containing `dev.json` and `test.json`
- `OPENAI_API_KEY`: optional; required only if the Space should run LLM-based judging inline
- `OPENAI_BASE_URL`: optional; base URL for OpenAI-compatible endpoints

The Space can still accept uploads when the judge variables are missing, but evaluation will be marked as pending.
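
One way the judge configuration might be probed, sketched under the assumption that the Space uses the `openai` Python client; `make_judge_client` is a hypothetical name:

```python
import os

from openai import OpenAI

def make_judge_client() -> OpenAI | None:
    """Return an OpenAI-compatible client, or None if judging is unconfigured."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        return None  # uploads are still accepted; evaluation stays pending
    return OpenAI(
        api_key=api_key,
        base_url=os.environ.get("OPENAI_BASE_URL"),  # optional override
    )
```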