leaderboard/backend/data_loader.py

Commit History

fix(ui): render structured benchmark details correctly
7749d9c

Alyafeai committed on

UI polish: table options grouping, column visibility redesign, model size fixes
bcff379

LeenAlQadi committed on

added filters and fixed adaptive filtered average
f20c7d0

LeenAlQadi committed on

fix to show bert score for each sample
8cbc289

Alyafeai committed on

Align FannOrFlop detail summary with results f1 and return full benchmark-detail rows
5af9331

Alyafeai committed on

fix issue with model's type emoji
633c620

Alyafeai committed on

Add Expand and Collapse button for long samples in the detail section
129be1e

Alyafeai committed on

Enhance downloading of the details repo
b50cc9d

Alyafeai committed on

Fix score mismatch between details and table
03ab794

Alyafeai committed on

Prioritize extracted_json over raw_response
d243d1c

Alyafeai committed on

Enhance Details for benchmarks without sub-tasks
c61f1c0

Alyafeai committed on

New formats for details, add fannorflop benchmark
86f7358

Alyafeai committed on

fix issue with multi-option answers and with samples that don't have a binary score
53dfe4f

Alyafeai committed on

fix details
f8be5d4

Alyafeai committed on

fix issue with some parquet files having different formats
034f762

Alyafeai committed on

fix requirements
5e5f8d3

Alyafeai committed on

adding details
3725eb1

Alyafeai committed on

fix get_model_size
163662f

Basma Boussaha committed on

change the way we read results by specifying the source type (the prefix of the filename); separate the tasks into different lists based on the source
c7488db

Alyafeai committed on

fix some of the statuses
8eddc6c

Alyafeai committed on

fix requirements.txt
747d6d8

Alyafeai committed on

feat: add average column and improve file loading logic
510e4b3

Alyafeai committed on

Add ArabLegalQA and MedArabicQA benchmarks + default missing results to 0
6cb3a47

Alyafeai committed on

download_datasets
800eeca

Alyafeai committed on

push first code
178c53e

Alyafeai committed on