LeenAlQadi committed on
Commit
cbeb01c
·
1 Parent(s): ba42b92

Revert "update about.html"

This reverts commit ba42b925bcd21bbde2a17ed80060a6cc196f06cf.

Files changed (1)
  1. frontend/about.html +1 -1
frontend/about.html CHANGED
@@ -43,7 +43,7 @@
  </div>
  <div class="prose dark:prose-invert max-w-none text-slate-600 dark:text-slate-300 leading-relaxed">
  <p class="mb-4">
- QIMMA قمّة (Summit in Arabic) is a quality-assured Arabic LLM evaluation leaderboard built on 14 carefully chosen benchmarks spanning STEM, legal reasoning, medical knowledge, poetry, cultural understanding, and code generation. QIMMA includes over 52,000 quality-validated samples across multiple-choice, generative, and code evaluation tracks. Over 99% of QIMMA's content is native Arabic, ensuring authentic linguistic and cultural assessment rather than relying on translated materials.
+ QIMMA قمّة (Summit in Arabic) is a quality-assured Arabic LLM evaluation leaderboard built on 13 carefully chosen benchmarks spanning STEM, legal reasoning, medical knowledge, poetry, cultural understanding, and code generation. QIMMA includes over 52,000 quality-validated samples across multiple-choice, generative, and code evaluation tracks. Over 99% of QIMMA's content is native Arabic, ensuring authentic linguistic and cultural assessment rather than relying on translated materials.
  </p>
  <p>
  QIMMA was constructed through a systematic benchmark curation process: candidate benchmarks were assessed using a multi-model quality validation pipeline that identified issues in the samples, including false, missing or invalid gold answers, textual encoding problems and many more. Only clean, validated samples made it into the final leaderboard. This process also revealed that quality problems are more pervasive across existing Arabic benchmarks than previously documented.