LiveCodeBench-Pro Version

#43
by beyoung - opened

May I ask which version of LiveCodeBench-Pro you used? 24? 25? 25Q2? All of them?

Nanbeige LLM Lab org
This comment has been hidden

As far as I know, 25Q3 does not have official test samples. See https://huggingface.co/datasets/QAQAQAQAQ/LiveCodeBench-Pro-Testcase

But 25Q2 does.

Nanbeige LLM Lab org

Sorry for the confusion. I was mistaken earlier — the correct version is 25Q2, not 25Q3.

Do you run 4 or 8 repeats to get the average or pass@1?

I cannot reproduce the LiveCodeBench-Pro 25Q2 Medium score. Can you offer any suggestions, such as tricks or prompts?

Nanbeige LLM Lab org

Sorry for the delayed response. For LiveCodeBench-Pro, we only run the evaluation once to obtain the final score. For reproducibility, could you share your configuration?
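To illustrate why a single run may differ from a score averaged over 4 or 8 repeats, here is a minimal sketch of averaging pass@1 across independent runs. The function names and the per-problem results are hypothetical and are not taken from the LiveCodeBench-Pro harness or from our evaluation.

```python
from statistics import mean, pstdev


def pass_at_1(results):
    """Fraction of problems solved in one run (results: list of booleans)."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(results) / len(results)


def summarize_runs(runs):
    """Return (mean, population std) of pass@1 over several independent runs."""
    scores = [pass_at_1(r) for r in runs]
    return mean(scores), pstdev(scores)


# Hypothetical data: 4 repeats over the same 4 problems.
runs = [
    [True, False, False, True],
    [True, True, False, True],
    [False, False, False, True],
    [True, False, False, True],
]

avg, spread = summarize_runs(runs)
print("per-run pass@1:", [pass_at_1(r) for r in runs])
print(f"mean pass@1 = {avg:.3f}, std = {spread:.3f}")
```

With sampling enabled, per-run scores on a small split can vary noticeably, so comparing a single run against an average should allow for that spread.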
