---
title: RPC-Bench Leaderboard
emoji: πŸ“Š
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.44.1
python_version: 3.12
app_file: app.py
pinned: false
license: mit
---
<p align="center">
🌐 <a href="https://rpc-bench.github.io/" target="_blank">Project Page</a> β€’
πŸ’» <a href="https://github.com/RPC-Bench/PRC-Bench" target="_blank">GitHub</a> β€’
πŸ“– <a href="https://arxiv.org/abs/2601.14289" target="_blank">Paper</a> β€’
πŸ€— <a href="https://huggingface.co" target="_blank">Hugging Face</a> β€’
🧭 <a href="https://community.modelscope.cn/" target="_blank">ModelScope</a>
</p>
# RPC-Bench Leaderboard
RPC-Bench is a benchmark for research paper comprehension. This Space provides two functions:
- a public leaderboard for published submissions
- a submission entry for uploading new evaluation files
## Expected repository layout
The Space is designed to work with a separate submission dataset repository.
```text
space/
β”œβ”€β”€ app.py
β”œβ”€β”€ constants.py
β”œβ”€β”€ eval.py
β”œβ”€β”€ requirements.txt
└── benchmark/
    β”œβ”€β”€ dev.json
    └── test.json
```
If `benchmark/dev.json` and `benchmark/test.json` are not bundled in the Space repo, point `RPC_BENCH_GOLD_DIR` or `RPC_BENCH_GOLD_PATH` at them via the Space's secrets or variables.
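A minimal sketch of how the app might resolve the gold files, assuming the environment variables named above and the `benchmark/` fallback from the layout (the helper name is hypothetical, not the actual `app.py` API):

```python
import os
from pathlib import Path

def resolve_gold_dir() -> Path:
    """Return the directory holding dev.json / test.json.

    Checks RPC_BENCH_GOLD_DIR, then RPC_BENCH_GOLD_PATH (hypothetical
    precedence), and falls back to the bundled benchmark/ directory.
    """
    for var in ("RPC_BENCH_GOLD_DIR", "RPC_BENCH_GOLD_PATH"):
        value = os.environ.get(var)
        if value:
            return Path(value)
    return Path("benchmark")
```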
The static leaderboard seed is stored in `leaderboard_seed.csv`. `index.html` is only used locally to generate that CSV and should not be uploaded to the Space repository.
## Submission format
Uploaded files should be JSONL with one answer per line:
```json
{"id":"...", "part_idx":1, "question":"...", "gen_answer":"...", "category":"..."}
```
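A quick way to sanity-check a submission before uploading is to parse each line and verify the keys shown above. This validator is a sketch, not the Space's actual checker; the required key set is taken from the example record:

```python
import json

# Keys from the submission format above.
REQUIRED_KEYS = {"id", "part_idx", "question", "gen_answer", "category"}

def validate_jsonl(text: str) -> list[dict]:
    """Parse one JSON answer per line; raise ValueError on missing keys."""
    records = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            raise ValueError(f"line {lineno}: missing keys {sorted(missing)}")
        records.append(record)
    return records
```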
## Required environment variables
- `HF_TOKEN`: token for cloning and pushing the submission repository
- `SUBMISSION_REPO_ID`: dataset repo used to store leaderboard results
- `RPC_BENCH_GOLD_DIR`: optional directory containing `dev.json` and `test.json`
- `OPENAI_API_KEY`: optional; required only if the Space should run LLM-based judging inline
- `OPENAI_BASE_URL`: optional, for OpenAI-compatible endpoints
The Space can still accept uploads when the judge variables are missing, but evaluation will be marked as pending.
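The accept-but-defer behavior could be implemented with a check like the following (a hypothetical helper, assuming `OPENAI_API_KEY` alone gates inline judging as described above):

```python
import os

def judging_mode() -> str:
    """Return 'inline' when an LLM judge is configured, else 'pending'.

    Uploads are accepted either way; 'pending' submissions wait until
    the judge variables are set.
    """
    return "inline" if os.environ.get("OPENAI_API_KEY") else "pending"
```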