---
title: RPC-Bench Leaderboard
emoji: 📊
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.44.1
python_version: 3.12
app_file: app.py
pinned: false
license: mit
---

🌐 Project Page • 💻 GitHub • 📖 Paper • 🤗 Hugging Face • 🧭 ModelScope

# RPC-Bench Leaderboard

RPC-Bench is a benchmark for research paper comprehension. This Space provides two functions:

- a public leaderboard for published submissions
- a submission entry for uploading new evaluation files

## Expected repository layout

The Space is designed to work with a separate submission dataset repository.

```text
space/
├── app.py
├── constants.py
├── eval.py
├── requirements.txt
└── benchmark/
    ├── dev.json
    └── test.json
```

If `benchmark/dev.json` and `benchmark/test.json` are not bundled in the Space repo, set `RPC_BENCH_GOLD_DIR` or `RPC_BENCH_GOLD_PATH` through Space secrets / variables.

The static leaderboard seed is stored in `leaderboard_seed.csv`. `index.html` is only used locally to generate that CSV and should not be uploaded to the Space repository.

## Submission format

Uploaded files should be JSONL with one answer per line:

```json
{"id":"...", "part_idx":1, "question":"...", "gen_answer":"...", "category":"..."}
```

## Required environment variables

- `HF_TOKEN`: token for cloning and pushing the submission repository
- `SUBMISSION_REPO_ID`: dataset repo used to store leaderboard results
- `RPC_BENCH_GOLD_DIR`: optional directory containing `dev.json` and `test.json`
- `OPENAI_API_KEY`: optional; required if you want the Space to run LLM-based judging inline
- `OPENAI_BASE_URL`: optional, for OpenAI-compatible endpoints

The Space can still accept uploads when the judge variables are missing, but evaluation will be marked as pending.
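
For reference, the sketch below shows one way the gold files could be resolved from the layout and variables described above. It is not the Space's actual code: the helper name `resolve_gold_paths`, the precedence between the two variables, and treating both of them as directories are assumptions made for illustration.

```python
import os
from pathlib import Path


def resolve_gold_paths() -> dict[str, Path]:
    """Locate dev.json and test.json, preferring the directory named by the env vars.

    Hypothetical helper; both RPC_BENCH_GOLD_DIR and RPC_BENCH_GOLD_PATH are
    treated as directories here for illustration.
    """
    gold_dir = os.environ.get("RPC_BENCH_GOLD_DIR") or os.environ.get("RPC_BENCH_GOLD_PATH")
    base = Path(gold_dir) if gold_dir else Path(__file__).parent / "benchmark"
    paths = {split: base / f"{split}.json" for split in ("dev", "test")}
    missing = [str(p) for p in paths.values() if not p.exists()]
    if missing:
        raise FileNotFoundError(f"gold files not found: {missing}")
    return paths
```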
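
Before uploading, you can sanity-check a submission file locally. The sketch below assumes only the field names shown in the submission format example above; it is not part of the Space's evaluation code.

```python
import json

# Fields expected in each line of a submission file (from the example above).
REQUIRED_KEYS = {"id", "part_idx", "question", "gen_answer", "category"}


def check_submission(path: str) -> None:
    """Raise ValueError if any non-empty line is not valid JSON or misses a required key."""
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"line {lineno}: not valid JSON ({exc})") from exc
            missing = REQUIRED_KEYS - record.keys()
            if missing:
                raise ValueError(f"line {lineno}: missing keys {sorted(missing)}")


if __name__ == "__main__":
    check_submission("my_submission.jsonl")
    print("submission file looks well-formed")
```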
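
The pending behaviour can be thought of as a simple guard on the judge variables. The following is a hypothetical sketch, assuming the `openai` Python client is used for the LLM judge; the helper name `make_judge_client` is not part of the Space's code.

```python
import os

from openai import OpenAI  # optional dependency; only needed for inline judging


def make_judge_client() -> OpenAI | None:
    """Return an OpenAI-compatible client if the judge variables are set, else None.

    When None is returned, a submission would be accepted but its evaluation
    marked as pending instead of being judged inline.
    """
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        return None
    return OpenAI(
        api_key=api_key,
        base_url=os.environ.get("OPENAI_BASE_URL"),  # None falls back to the default endpoint
    )
```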