clemsail committed on
Commit b3cc62b · verified · 1 Parent(s): 8b6a122

docs: add base vs tuned bench comparison

  1. README.md +17 -0
README.md CHANGED
@@ -111,3 +111,20 @@ LoRA weights: **cc-by-sa-4.0** — see License chain table above for derivation
  ## Related
 
  See the full [Ailiance-fr LoRA collection](https://huggingface.co/Ailiance-fr).
+
+
+ ## Bench comparison (2026-05-11)
+
+ ### Base model (Apertus-70B-Instruct-2509) capability
+
+ | Task | Score | Notes |
+ |---|---:|---|
+ | ARC-Easy acc / acc_norm | **0.81 / 0.77** | W3 lm-eval-harness BF16 |
+ | GSM8K-CoT | TIMEOUT (1800s budget) | base 70B BF16 too slow for CoT |
+ | MMLU-Pro Computer Science | TIMEOUT | |
+
+ ### This LoRA (tuned) — bench PENDING
+
+ Production usage: served via gateway alias `ailiance-apertus-<domain>` on
+ <https://www.ailiance.fr> through the Apertus multi-LoRA hot-swap server
+ (Studio :9322, 1 base + 10 LoRA dynamic swap, ~40GB VRAM).