Upload ugi-leaderboard-data.csv
- ugi-leaderboard-data.csv +9 -0
ugi-leaderboard-data.csv
CHANGED
|
@@ -501,3 +501,12 @@ Sicarius-Prototyping/Impish_Longtail_12B,https://huggingface.co/Sicarius-Prototy
|
|
| 501 |
tiiuae/Falcon-H1-0.5B-Instruct,https://huggingface.co/tiiuae/Falcon-H1-0.5B-Instruct,5/1/2025,10/17/2025,chatml,0.5,0.5,0.5,FALSE,FALSE,TRUE,14.55,15.86,0.0,0.5,0.7,5.2,5.0,5.5,9.26,13.44,5.17,9.17,4.4,0.0217,0.1392,0.0241,0.1623,0.1114,-4.2%,53.1%,52.0%,50.2%,56.1%,47.7%,54.2%,57.9%,45.0%,51.0%,44.8%,46.9%,50.0%,53.8%,54.6%,54.2%,59.4%,Centrism,False,0,0,FalconH1ForCausalLM,29.4,0.65,12.9,6.4,0.36,40.0,81.0,0.832,0.501,0.314,1.919,0.234,0.082,201.4,8421.0,292.0,24.6,3.3,1.2
|
| 502 |
tiiuae/Falcon-H1-1.5B-Instruct,https://huggingface.co/tiiuae/Falcon-H1-1.5B-Instruct,5/1/2025,10/17/2025,chatml,1.5,1.5,1.5,FALSE,FALSE,TRUE,21.01,12.02,1.2,0.8,0.3,2.8,3.0,2.5,8.65,10.54,5.17,10.22,7.19,0.0371,0.1612,0.1213,0.0612,0.1303,-17.9%,63.1%,49.7%,46.7%,54.9%,46.7%,51.5%,47.1%,38.8%,38.1%,34.0%,52.7%,41.7%,45.8%,39.8%,58.1%,66.9%,Liberalism,False,0,0,FalconH1ForCausalLM,37.5,0.62,12.8,8.4,0.321,23.0,75.0,0.911,0.502,0.299,1.647,0.344,0.071,152.8,7508.0,164.5,27.77,1.3,3.0
|
| 503 |
Qwen/Qwen3-4B-Thinking-2507,https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507,8/5/2025,10/18/2025,chatml,4.0,4.0,4.0,FALSE,FALSE,TRUE,10.87,20.78,3.5,1.4,1.1,2.8,4.0,1.5,16.16,27.48,3.45,17.57,18.68,0.112,0.1349,0.3982,0.0624,0.1708,-28.1%,67.8%,49.3%,47.2%,59.2%,48.1%,54.8%,50.8%,31.9%,35.2%,29.6%,56.5%,50.6%,34.6%,58.5%,65.2%,54.0%,Liberalism,True,6202,0,Qwen3ForCausalLM,25.5,0.66,17.8,8.1,0.279,64.0,99.0,0.819,0.394,0.259,1.553,0.295,0.103,84.5,8631.0,97.5,27.7,1.2,2.3
|
| 504 |
+
tiiuae/Falcon-H1-1.5B-Deep-Instruct,https://huggingface.co/tiiuae/Falcon-H1-1.5B-Deep-Instruct,5/1/2025,10/18/2025,chatml,1.5,1.5,1.5,False,False,True,19.86,14.09,0.0,1.3,0.5,3.8,5.0,2.5,11.11,16.7,5.17,11.45,6.77,0.0668,0.1594,0.0598,0.0637,0.223,-27.4%,56.3%,44.7%,55.1%,61.1%,51.5%,59.8%,45.4%,43.5%,57.1%,31.7%,55.2%,64.8%,45.4%,47.1%,67.5%,67.7%,Liberalism,False,0,0,FalconH1ForCausalLM,39.5,0.66,12.1,7.8,0.313,20.0,51.0,0.911,0.507,0.294,1.493,0.333,0.187,112.2,7574.0,213.0,27.63,3.4,5.4
|
| 505 |
+
tiiuae/Falcon-H1-3B-Instruct,https://huggingface.co/tiiuae/Falcon-H1-3B-Instruct,5/1/2025,10/18/2025,chatml,3.0,3.0,3.0,False,False,True,22.79,15.08,0.6,0.4,0.8,4.5,5.0,4.0,10.91,12.58,5.17,14.98,5.73,0.0502,0.1479,0.155,0.1511,0.2447,-19.9%,62.4%,46.5%,53.3%,55.2%,55.6%,60.4%,55.6%,33.3%,45.2%,34.4%,51.9%,59.4%,48.8%,37.3%,62.3%,66.0%,Liberalism,False,0,0,FalconH1ForCausalLM,38.4,0.69,12.6,7.9,0.326,18.0,46.0,0.909,0.498,0.276,1.48,0.291,0.216,130.5,8034.0,149.6,24.83,1.6,2.2
|
| 506 |
+
Qwen/Qwen2.5-72B-Instruct,https://huggingface.co/Qwen/Qwen2.5-72B-Instruct,9/16/2024,10/18/2025,chatml,72.0,72.0,72.0,False,False,True,31.52,27.42,2.4,2.7,3.1,2.8,5.0,0.5,21.81,32.85,6.21,26.37,27.4,0.1919,0.2118,0.1552,0.439,0.3208,-16.8%,64.2%,43.6%,42.6%,61.2%,48.3%,64.6%,43.8%,34.8%,38.8%,33.8%,45.8%,43.3%,38.8%,54.8%,63.1%,65.8%,Liberalism,False,0,0,Qwen2ForCausalLM,33.9,0.82,12.8,6.2,0.363,21.0,69.0,0.892,0.483,0.307,1.29,0.486,0.256,61.6,6004.0,149.5,21.07,1.7,1.2
|
| 507 |
+
Qwen/Qwen2.5-VL-72B-Instruct,https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct,1/27/2025,10/18/2025,chatml,72.0,72.0,72.0,False,False,True,35.59,34.05,4.7,1.5,2.5,5.8,5.0,6.5,25.33,36.48,8.97,30.53,26.72,0.2197,0.2038,0.3215,0.4674,0.3143,-21.8%,66.9%,47.0%,44.3%,62.4%,44.8%,60.6%,46.5%,36.2%,34.6%,28.3%,49.8%,46.0%,37.1%,60.0%,61.2%,65.8%,Liberalism,False,0,2,Qwen2_5_VLForConditionalGeneration,32.3,0.81,13.7,7.6,0.321,29.0,98.0,0.892,0.477,0.316,1.32,0.406,0.259,56.5,6202.0,109.0,20.8,3.2,0.8
|
| 508 |
+
KaraKaraWitch/GoldDiamondGold-L33-70b,https://huggingface.co/KaraKaraWitch/GoldDiamondGold-L33-70b,8/7/2025,10/18/2025,llama-3,70.0,70.0,70.0,True,True,False,46.05,46.28,3.5,3.8,5.0,6.2,6.0,6.5,35.91,47.19,33.45,27.1,41.21,0.2802,0.1947,0.1526,0.439,0.2884,-13.8%,64.2%,49.2%,43.4%,61.5%,39.0%,61.7%,48.1%,39.4%,34.8%,33.1%,42.7%,46.9%,40.6%,58.1%,61.0%,65.2%,Liberalism,False,0,0,LlamaForCausalLM,44.1,0.94,13.4,6.1,0.355,29.0,80.0,0.895,0.486,0.279,1.44,0.416,0.304,48.1,6442.0,150.5,21.07,7.9,7.2
|
| 509 |
+
Qwen/Qwen2.5-VL-32B-Instruct,https://huggingface.co/Qwen/Qwen2.5-VL-32B-Instruct,3/25/2025,10/18/2025,chatml,32.0,32.0,32.0,False,False,True,21.69,17.28,2.4,0.8,2.6,1.5,3.0,0.0,16.02,22.23,3.45,22.39,17.99,0.3319,0.1626,0.1115,0.334,0.1795,-16.8%,64.4%,48.6%,44.4%,59.5%,40.8%,63.3%,50.0%,41.2%,34.0%,31.7%,45.2%,47.1%,40.8%,57.3%,61.7%,59.6%,Liberalism,False,0,0,Qwen2_5_VLForConditionalGeneration,31.1,0.95,15.0,7.0,0.316,64.0,99.0,0.882,0.451,0.275,1.57,0.338,0.139,42.6,7456.0,169.8,22.13,2.5,3.3
|
| 510 |
+
Qwen/Qwen2.5-VL-3B-Instruct,https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct,1/26/2025,10/18/2025,chatml,3.0,3.0,3.0,False,False,True,20.43,16.87,1.8,0.6,0.5,4.2,5.0,3.5,9.33,11.92,6.21,9.86,8.85,0.0764,0.1005,0.0746,0.0505,0.1912,-12.6%,59.5%,49.3%,49.4%,44.4%,51.7%,50.4%,50.0%,46.0%,53.8%,21.7%,47.3%,55.8%,45.2%,37.7%,43.3%,52.3%,Centrism,False,0,0,Qwen2_5_VLForConditionalGeneration,27.5,0.8,13.3,7.4,0.301,20.0,32.0,0.885,0.503,0.291,1.443,0.645,0.129,104.3,10779.0,196.7,28.4,1.2,2.1
|
| 511 |
+
CrucibleLab/M3.2-24B-Loki-V1.3,https://huggingface.co/CrucibleLab/M3.2-24B-Loki-V1.3,8/3/2025,10/18/2025,mistral V7-Tekken,24.0,24.0,24.0,True,False,False,33.4,38.5,2.9,1.3,4.7,7.0,7.0,7.0,28.78,38.91,22.76,24.69,28.66,0.0849,0.205,0.2457,0.4181,0.2806,-24.2%,63.8%,42.9%,44.9%,60.1%,47.7%,66.5%,42.9%,38.5%,40.8%,29.4%,53.1%,49.6%,31.9%,53.8%,62.5%,64.0%,Liberalism,False,0,0,MistralForCausalLM,46.2,0.85,11.8,5.7,0.373,17.0,55.0,0.895,0.502,0.315,1.42,0.329,0.257,98.5,6170.0,123.5,21.27,2.0,2.7
|
| 512 |
+
open-r1/OlympicCoder-32B,https://huggingface.co/open-r1/OlympicCoder-32B,3/12/2025,10/18/2025,chatml w/ <think> prefill,32.0,32.0,32.0,True,False,False,-27.69,33.42,2.9,1.4,1.6,8.0,9.0,7.0,17.42,29.62,9.66,12.98,18.86,0.1976,0.1191,0.101,0.1081,0.1234,-31.7%,66.7%,46.6%,50.8%,61.1%,45.8%,66.5%,52.1%,31.5%,44.8%,23.8%,55.0%,55.6%,41.7%,53.3%,66.2%,63.8%,Liberalism,True,12378,6,Qwen2ForCausalLM,28.2,0.75,12.5,6.4,0.316,598.0,100.0,0.852,0.415,0.365,1.57,0.644,0.062,60.4,9490.0,176.1,25.92,1.3,3.0
|