gpt-oss-120b claimed score of 68.4 on aider polyglot pass_rate_2

#14

by dagb - opened Sep 30, 2025

Meanwhile, the listed score of gpt-oss-120b (high) on aider polyglot remains at 41.8%.

But it is claimed that this score is incorrect:
https://www.reddit.com/r/LocalLLaMA/comments/1ndjxdt/comment/ndiash3/?context=1
https://github.com/Aider-AI/aider/pull/4444 (Merge conflict.)

At the same time:
The improved score may or may not rise further if the issue that forced the Unsloth DeepSeek 3.1 quants to be reissued is fixed here as well. (Some llama.cpp code or behavior made it necessary to requant DS 3.1.)

I was hoping to learn whether the Unsloth team has at least evaluated whether this would be worth the effort. I find the size/performance ratio of gpt-oss-120b to be the best there is, and it fits my 4x 3090s like a glove.
