DontPlanToEnd committed
Commit 9709080 · verified · 1 parent: 0021202

Upload ugi-leaderboard-data.csv

Files changed (1)
  1. ugi-leaderboard-data.csv +2 -2
ugi-leaderboard-data.csv CHANGED
@@ -1202,7 +1202,7 @@ deepseek-ai/DeepSeek-V4-Flash (reasoning=enabled),https://huggingface.co/deepsee
  deepseek-ai/DeepSeek-V4-Pro (reasoning=disabled),https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro,4/23/2026,4/26/2026,,49.0,1600.0,1600.0,FALSE,FALSE,TRUE,60.25,48.55,57.83,5.9,6.9,4.3,3.0,2.0,4.0,59.21,75.23,52.41,50.0,57.83,0.4108,0.3301,0.603,0.7464,0.4097,-22.4%,66.0%,44.2%,45.2%,59.9%,49.4%,61.0%,43.1%,36.0%,36.5%,29.6%,48.8%,50.2%,36.7%,55.2%,63.8%,60.8%,Liberalism,False,0,0,,23.6,0.59,14.1,5.7,0.351,42.0,100.0,0.864,0.415,0.346,1.217,0.338,0.344,35.9,4007.0,73.9,18.18,2.0,3.6
  deepseek-ai/DeepSeek-V4-Pro (reasoning=enabled),https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro,4/23/2026,4/26/2026,,49.0,1600.0,1600.0,FALSE,FALSE,TRUE,68.4,62.26,77.15,8.8,7.2,7.4,3.2,2.0,4.5,67.03,85.81,72.41,42.87,77.15,0.6775,0.3486,0.5101,0.2274,0.3797,-14.7%,66.0%,46.0%,44.2%,58.5%,49.6%,58.8%,46.2%,32.3%,34.2%,35.6%,48.3%,50.0%,34.2%,52.7%,58.5%,64.2%,Liberalism,True,0,0,,24.8,0.67,13.1,5.0,0.358,34.0,94.0,0.879,0.42,0.324,1.33,0.179,0.357,20.7,3792.0,83.9,23.48,2.4,3.0
  xai/grok-4.3,,4/30/2026,5/2/2026,,,,,FALSE,FALSE,TRUE,57.74,59.41,65.37,10.0,4.6,6.2,4.8,5.0,4.5,53.75,70.52,53.45,37.28,65.37,0.4317,0.2163,0.5785,0.295,0.3426,20.7%,54.2%,43.9%,22.6%,61.0%,51.2%,56.0%,39.0%,48.3%,42.5%,46.5%,17.9%,23.3%,26.5%,61.2%,44.8%,76.9%,Classical Liberalism,True,0,0,,36.4,0.72,12.7,7.2,0.338,21.0,82.0,0.897,0.455,0.286,1.35,0.397,0.344,34.4,5897.0,76.5,22.58,8.0,3.4
- tencent/Hy3-preview (reasoning=disabled),https://huggingface.co/tencent/Hy3-preview,4/22/2026,5/2/2026,,,,,FALSE,FALSE,TRUE,41.55,45.07,46.35,5.3,4.4,4.4,4.2,4.0,4.5,37.5,47.9,28.62,35.98,46.35,0.4058,0.2596,0.5575,0.2964,0.2798,-24.1%,69.9%,44.3%,44.2%,64.7%,43.1%,62.3%,38.3%,33.3%,29.8%,27.3%,51.0%,50.0%,31.5%,66.0%,62.9%,65.2%,Liberalism,False,0,0,,36.0,0.73,11.8,5.5,0.345,32.0,100.0,0.864,0.436,0.342,1.53,0.045,0.251,36.3,5027.0,78.7,22.57,3.3,5.6
- tencent/Hy3-preview (reasoning=enabled),https://huggingface.co/tencent/Hy3-preview,4/22/2026,5/2/2026,,,,,FALSE,FALSE,TRUE,51.12,40.09,53.88,7.1,4.4,5.3,1.2,1.0,1.5,50.11,72.49,39.66,38.17,53.88,0.5686,0.158,0.6354,0.2308,0.3156,-24.2%,68.4%,46.0%,44.1%,65.9%,42.9%,60.4%,41.2%,31.5%,37.3%,26.0%,51.2%,52.1%,29.0%,70.4%,60.6%,66.7%,Liberalism,True,0,0,,25.8,0.67,12.1,5.9,0.327,18.0,74.0,0.883,0.443,0.308,1.373,0.24,0.275,26.1,7627.0,70.6,23.43,2.4,3.4
+ tencent/Hy3-preview (reasoning=disabled),https://huggingface.co/tencent/Hy3-preview,4/22/2026,5/2/2026,,21.0,295.0,295.0,FALSE,FALSE,TRUE,41.55,45.07,46.35,5.3,4.4,4.4,4.2,4.0,4.5,37.5,47.9,28.62,35.98,46.35,0.4058,0.2596,0.5575,0.2964,0.2798,-24.1%,69.9%,44.3%,44.2%,64.7%,43.1%,62.3%,38.3%,33.3%,29.8%,27.3%,51.0%,50.0%,31.5%,66.0%,62.9%,65.2%,Liberalism,False,0,0,,36.0,0.73,11.8,5.5,0.345,32.0,100.0,0.864,0.436,0.342,1.53,0.045,0.251,36.3,5027.0,78.7,22.57,3.3,5.6
+ tencent/Hy3-preview (reasoning=enabled),https://huggingface.co/tencent/Hy3-preview,4/22/2026,5/2/2026,,21.0,295.0,295.0,FALSE,FALSE,TRUE,51.12,40.09,53.88,7.1,4.4,5.3,1.2,1.0,1.5,50.11,72.49,39.66,38.17,53.88,0.5686,0.158,0.6354,0.2308,0.3156,-24.2%,68.4%,46.0%,44.1%,65.9%,42.9%,60.4%,41.2%,31.5%,37.3%,26.0%,51.2%,52.1%,29.0%,70.4%,60.6%,66.7%,Liberalism,True,0,0,,25.8,0.67,12.1,5.9,0.327,18.0,74.0,0.883,0.443,0.308,1.373,0.24,0.275,26.1,7627.0,70.6,23.43,2.4,3.4
  Qwen/Qwen3.6-27B (<think> prefill),https://huggingface.co/Qwen/Qwen3.6-27B,4/22/2026,5/2/2026,chatml w/ <think> prefill,27.0,27.0,27.0,FALSE,FALSE,TRUE,42.47,27.15,26.98,4.7,1.2,2.9,2.8,4.0,1.5,33.16,43.52,17.24,38.72,26.98,0.3473,0.302,0.5662,0.3619,0.3585,-20.0%,64.8%,45.9%,44.2%,62.2%,43.3%,67.1%,48.1%,35.0%,39.2%,31.5%,48.5%,49.8%,34.2%,61.7%,62.3%,62.5%,Liberalism,True,12387,6,Qwen3_5ForConditionalGeneration,30.6,0.73,12.0,6.1,0.305,24.0,75.0,0.869,0.406,0.288,1.36,0.171,0.336,41.1,4370.0,77.8,21.83,2.0,5.3
  Qwen/Qwen3.6-27B (no think),https://huggingface.co/Qwen/Qwen3.6-27B,4/22/2026,5/3/2026,chatml w/ no think,27.0,27.0,27.0,FALSE,FALSE,TRUE,38.83,19.32,17.73,1.8,1.0,2.7,2.2,3.0,1.5,30.67,38.4,12.76,40.84,17.73,0.401,0.1551,0.5169,0.6611,0.308,-24.4%,70.1%,47.4%,42.6%,63.5%,39.6%,64.8%,46.7%,30.0%,34.0%,25.6%,48.8%,51.9%,27.3%,57.5%,61.9%,71.0%,Liberalism,False,0,0,Qwen3_5ForConditionalGeneration,19.4,0.74,12.8,5.8,0.347,9.0,71.0,0.843,0.433,0.324,1.39,0.22,0.268,36.7,7739.0,83.1,19.03,2.1,3.2
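
Because each row is one very long line, the two changed rows are hard to compare by eye. The sketch below is one way to list exactly which comma-separated fields differ between the removed and added versions of a row. The helper names `parse_row` and `changed_fields` are hypothetical, and the sample strings are only the leading fields of the `tencent/Hy3-preview (reasoning=disabled)` row from this hunk. The CSV header is not part of the hunk, so fields are reported by zero-based index rather than by column name.

```python
import csv
import io


def parse_row(line):
    """Parse a single CSV line into a list of field strings."""
    return next(csv.reader(io.StringIO(line)))


def changed_fields(old_line, new_line):
    """Return (index, old_value, new_value) for every field that differs."""
    old, new = parse_row(old_line), parse_row(new_line)
    if len(old) != len(new):
        raise ValueError("rows have different field counts")
    return [(i, a, b) for i, (a, b) in enumerate(zip(old, new)) if a != b]


# Leading fields of the removed ("-") and added ("+") rows from the hunk above.
OLD_ROW = ("tencent/Hy3-preview (reasoning=disabled),"
           "https://huggingface.co/tencent/Hy3-preview,"
           "4/22/2026,5/2/2026,,,,,FALSE")
NEW_ROW = ("tencent/Hy3-preview (reasoning=disabled),"
           "https://huggingface.co/tencent/Hy3-preview,"
           "4/22/2026,5/2/2026,,21.0,295.0,295.0,FALSE")

for index, before, after in changed_fields(OLD_ROW, NEW_ROW):
    print(f"field {index}: {before!r} -> {after!r}")
```

Run on these prefixes, the sketch reports fields 5, 6, and 7 changing from empty to `21.0`, `295.0`, and `295.0`. These are the same three positions that hold `49.0,1600.0,1600.0` in the DeepSeek-V4-Pro rows.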