mike-ravkine
posted an update Mar 3
gpt-oss-120b has held on to the ReasonScape crown since its release on Aug 5, 2025 - 7 months in the LLM space is *impressive*.

With the release of Qwen-3.5, the king has been dethroned by not one but two models: the mid-sized dense Qwen/Qwen3.5-27B and the large-MoE Qwen/Qwen3.5-122B-A10B-FP8.

The old king is dead - long live the new king 👑

Note that these rankings are based on r12, the 3rd iteration of the ReasonScape evaluation, with 27k prompts across 12 task domains. Compared to the previous m12x ranking, this evaluation fixes a slew of test bugs, refines the task set to add table-extraction, and lifts the context ceiling to 16k, so these rankings differ quite a bit from the previous m12x Leaderboard (which has an 8k context limit).