Training and evaluation datasets for R-HORIZON: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?