Upload ugi-leaderboard-data.csv
ugi-leaderboard-data.csv +8 -0
ugi-leaderboard-data.csv
CHANGED
|
@@ -1193,3 +1193,11 @@ openai/gpt-5.5-2026-04-23 (reasoning_effort=high),https://huggingface.co/openai/
|
|
| 1193 |
openai/gpt-5.5-2026-04-23 (reasoning_effort=xhigh),https://huggingface.co/openai/gpt-5.5-2026-04-23 (reasoning_effort=xhigh),4/23/2026,4/25/2026,,,,,FALSE,FALSE,TRUE,69.66,51.19,65.53,5.9,5.6,8.2,2.2,3.0,1.5,78.16,88.3,76.9,69.29,65.53,0.8434,0.8631,0.7278,0.6361,0.3939,-32.1%,72.8%,44.9%,48.0%,63.9%,45.2%,63.5%,43.3%,27.5%,30.6%,23.5%,56.5%,56.9%,30.6%,57.7%,65.0%,69.0%,Liberalism,True,0,0,,30.3,0.84,11.0,4.1,0.403,13.0,58.0,0.887,0.426,0.307,1.243,0.437,0.356,12.9,733.0,61.2,19.27,0.9,4.3
|
| 1194 |
xai/grok-4.20-0309-non-reasoning,https://huggingface.co/xai/grok-4.20-0309-non-reasoning,3/19/2026,4/25/2026,,,,,FALSE,FALSE,TRUE,47.15,51.77,46.4,8.8,3.1,3.1,6.2,6.0,6.5,40.76,52.25,36.21,33.82,46.4,0.3499,0.2187,0.3252,0.4425,0.3545,11.8%,55.4%,46.5%,28.7%,60.6%,40.0%,56.2%,35.8%,52.7%,38.3%,42.7%,32.9%,20.8%,32.3%,62.3%,48.1%,71.5%,Classical Liberalism,False,0,0,,22.4,0.77,13.8,4.8,0.343,36.0,100.0,0.851,0.425,0.294,1.287,0.394,0.31,40.9,5843.0,108.4,21.03,8.5,8.2
|
| 1195 |
xai/grok-4.20-0309-reasoning,https://huggingface.co/xai/grok-4.20-0309-reasoning,3/19/2026,4/25/2026,,,,,FALSE,FALSE,TRUE,55.26,64.23,58.84,7.1,4.5,6.7,7.5,7.0,8.0,52.75,70.23,46.9,41.13,58.84,0.3714,0.3311,0.5281,0.4425,0.3835,27.9%,49.8%,43.6%,18.6%,62.2%,49.2%,56.2%,36.2%,57.1%,45.0%,48.5%,13.1%,18.5%,24.2%,66.5%,40.8%,79.4%,Classical Liberalism,True,0,0,,22.9,0.73,13.2,5.3,0.325,46.0,100.0,0.865,0.43,0.29,1.267,0.385,0.349,39.0,3995.0,81.9,21.03,9.9,6.2
|
| 1196 |
+
llmfan46/Qwen3.6-35B-A3B-uncensored-heretic (no think),https://huggingface.co/llmfan46/Qwen3.6-35B-A3B-uncensored-heretic,4/20/2026,4/26/2026,chatml w/ no think,3.0,35.0,35.0,TRUE,FALSE,FALSE,37.6,44.25,18.88,2.4,1.9,1.5,9.5,10.0,9.0,28.42,38.69,14.14,32.44,18.88,0.3593,0.2304,0.3453,0.4531,0.2337,-12.5%,58.3%,49.7%,46.0%,56.8%,45.0%,59.0%,52.9%,44.0%,45.8%,35.2%,49.4%,48.1%,40.4%,51.0%,55.2%,64.2%,Liberalism,False,0,1,Qwen3_5MoeForConditionalGeneration,21.0,0.74,12.9,5.8,0.328,9.0,72.0,0.849,0.44,0.319,1.587,0.121,0.208,40.1,5586.0,105.2,20.93,3.6,3.4
|
| 1197 |
+
llmfan46/Qwen3.6-35B-A3B-uncensored-heretic (<think> prefill),https://huggingface.co/llmfan46/Qwen3.6-35B-A3B-uncensored-heretic,4/20/2026,4/26/2026,chatml w/ <think> prefill,3.0,35.0,35.0,TRUE,FALSE,FALSE,46.22,50.9,26.35,4.1,1.9,2.3,10.0,10.0,10.0,31.62,42.97,12.41,39.47,26.35,0.5634,0.1747,0.6411,0.3163,0.278,-12.4%,61.3%,48.9%,45.4%,60.4%,45.6%,63.1%,55.4%,37.3%,40.0%,38.8%,49.0%,49.4%,37.9%,51.2%,58.1%,71.9%,Liberalism,True,11730,8,Qwen3_5MoeForConditionalGeneration,17.8,0.76,13.9,6.2,0.284,13.0,29.0,0.855,0.389,0.247,1.537,0.02,0.242,26.4,7038.0,70.1,22.33,2.7,2.8
|
| 1198 |
+
llmfan46/gemma-4-26B-A4B-it-uncensored-heretic,https://huggingface.co/llmfan46/gemma-4-26B-A4B-it-uncensored-heretic,4/7/2026,4/26/2026,gemma-4,26.0,26.0,26.0,TRUE,FALSE,FALSE,42.09,49.03,27.29,3.5,2.8,2.0,9.2,10.0,8.5,34.12,33.75,29.66,38.94,27.29,0.2609,0.206,0.5838,0.5294,0.3669,-22.2%,62.4%,48.5%,47.7%,54.7%,39.8%,65.6%,51.0%,34.8%,49.4%,28.5%,52.1%,49.4%,41.7%,30.4%,61.0%,72.7%,Liberalism,False,0,0,Gemma4ForConditionalGeneration,31.8,0.65,12.4,6.6,0.34,9.0,74.0,0.827,0.428,0.34,1.277,0.342,0.317,50.5,6145.0,75.9,20.23,3.3,2.3
|
| 1199 |
+
llmfan46/gemma-4-26B-A4B-it-uncensored-heretic (<|channel>thought prefill),https://huggingface.co/llmfan46/gemma-4-26B-A4B-it-uncensored-heretic,4/7/2026,4/26/2026,gemma-4 w/ <|channel>thought prefill,26.0,26.0,26.0,TRUE,FALSE,FALSE,46.36,41.01,14.02,0.0,2.0,1.8,9.5,9.0,10.0,37.3,42.86,32.41,36.64,14.02,0.2991,0.1286,0.5879,0.4836,0.3328,-23.5%,64.0%,50.1%,47.1%,59.8%,41.0%,61.2%,52.5%,32.1%,48.5%,27.3%,48.3%,53.5%,39.4%,40.2%,61.7%,77.5%,Liberalism,True,5897,8,Gemma4ForConditionalGeneration,29.3,0.68,12.9,7.0,0.342,10.0,65.0,0.83,0.425,0.335,1.327,0.344,0.293,45.9,8954.0,75.5,20.65,2.7,2.4
|
| 1200 |
+
deepseek-ai/DeepSeek-V4-Flash (reasoning=disabled),https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash,4/23/2026,4/26/2026,,13.0,284.0,284.0,FALSE,FALSE,TRUE,54.58,59.21,52.57,5.9,4.9,5.1,7.2,5.0,9.5,47.94,62.82,36.21,44.78,52.57,0.7253,0.1839,0.3483,0.572,0.4097,-13.9%,63.8%,45.9%,46.5%,56.9%,46.7%,64.4%,48.8%,37.5%,35.4%,35.8%,48.5%,49.8%,41.2%,54.2%,57.5%,59.2%,Liberalism,False,0,0,,27.4,0.69,12.8,4.3,0.326,19.0,82.0,0.865,0.421,0.334,1.2,0.469,0.348,18.5,6752.0,104.7,19.85,3.3,3.3
|
| 1201 |
+
deepseek-ai/DeepSeek-V4-Flash (reasoning=enabled),https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash,4/23/2026,4/26/2026,,13.0,284.0,284.0,FALSE,FALSE,TRUE,49.74,56.63,58.7,8.8,4.4,5.3,5.2,4.0,6.5,45.52,68.72,33.45,34.4,58.7,0.5125,0.1769,0.3537,0.3281,0.3488,-19.2%,68.2%,46.2%,45.9%,61.2%,44.4%,62.5%,45.4%,34.0%,28.3%,33.1%,56.0%,49.4%,32.3%,57.5%,59.2%,66.9%,Liberalism,True,0,0,,26.3,0.78,12.2,4.0,0.365,12.0,70.0,0.862,0.426,0.33,1.31,0.369,0.316,29.3,6968.0,103.9,22.2,2.9,3.1
|
| 1202 |
+
deepseek-ai/DeepSeek-V4-Pro (reasoning=disabled),https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro,4/23/2026,4/26/2026,,49.0,1600.0,1600.0,FALSE,FALSE,TRUE,60.25,48.55,57.83,5.9,6.9,4.3,3.0,2.0,4.0,59.21,75.23,52.41,50.0,57.83,0.4108,0.3301,0.603,0.7464,0.4097,-22.4%,66.0%,44.2%,45.2%,59.9%,49.4%,61.0%,43.1%,36.0%,36.5%,29.6%,48.8%,50.2%,36.7%,55.2%,63.8%,60.8%,Liberalism,False,0,0,,23.6,0.59,14.1,5.7,0.351,42.0,100.0,0.864,0.415,0.346,1.217,0.338,0.344,35.9,4007.0,73.9,18.18,2.0,3.6
|
| 1203 |
+
deepseek-ai/DeepSeek-V4-Pro (reasoning=enabled),https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro,4/23/2026,4/26/2026,,49.0,1600.0,1600.0,FALSE,FALSE,TRUE,68.4,62.26,77.15,8.8,7.2,7.4,3.2,2.0,4.5,67.03,85.81,72.41,42.87,77.15,0.6775,0.3486,0.5101,0.2274,0.3797,-14.7%,66.0%,46.0%,44.2%,58.5%,49.6%,58.8%,46.2%,32.3%,34.2%,35.6%,48.3%,50.0%,34.2%,52.7%,58.5%,64.2%,Liberalism,True,0,0,,24.8,0.67,13.1,5.0,0.358,34.0,94.0,0.879,0.42,0.324,1.33,0.179,0.357,20.7,3792.0,83.9,23.48,2.4,3.0
|