Cannot reproduce AIME_2025 result on the FP8 checkpoint
#7
by chenjiel - opened
Hi, Qwen team,
Could you share instructions for reproducing the AIME_2025 result with the FP8 checkpoint? In our benchmarking, the FP8 checkpoint's AIME_2025 score is much lower than the published score for the BF16 checkpoint.