docs: add upstream base model official evaluations

77f0fd6 verified 2 days ago

6 kB

license: apache-2.0
base_model: swiss-ai/Apertus-70B-Instruct-2509
library_name: peft
tags:
  - mlx
  - lora
  - peft
  - ailiance
  - apertus
  - math
language:
  - en
  - fr
pipeline_tag: text-generation

Ailiance — Apertus-70B-Instruct math LoRA

LoRA adapter fine-tuned on swiss-ai/Apertus-70B-Instruct-2509 for math tasks.

Maintained by Ailiance — French AI org publishing EU AI Act aligned LoRA adapters and datasets.

Quick start (MLX)

from mlx_lm import load, generate

model, tokenizer = load(
    "swiss-ai/Apertus-70B-Instruct-2509",
    adapter_path="Ailiance-fr/apertus-math-lora",
)

print(generate(model, tokenizer, prompt="..."))

Training

Hyperparameter	Value
Base model	`swiss-ai/Apertus-70B-Instruct-2509`
Method	LoRA via `mlx-lm`
Rank	16
Scale	2.0
Alpha	32
Max seq length	1024
Iterations	500
Optimizer	Adam, LR 1e-5
Hardware	Apple M3 Ultra 512 GB

Training data lineage

Derived from the internal eu-kiki / mascarade curation. All upstream samples are synthetic, permissively-licensed, or generated from Apache-2.0 base resources. See the Ailiance-fr catalog for related cards.

Benchmark roadmap

This LoRA has not yet been evaluated through electron-bench (the current pipeline supports gemma-4-E4B base only). Training was completed with the standard mlx-lm LoRA trainer (rank 16, alpha 32, scale 2.0, AdamW LR 1e-5, 500 iters) — full hyperparameters are in the Training table above.

Planned evaluations:

Perplexity on the validation split of the training data
Functional benchmark on apertus-specific tasks
Comparison vs base swiss-ai/Apertus-70B-Instruct-2509

Track progress: ailiance-bench issues.

For reference benchmarks on the gemma-4-E4B base, see the base-vs-LoRA matrix.

License chain

Component	License
Base model (`swiss-ai/Apertus-70B-Instruct-2509`)	apache-2.0
Training data (internal Ailiance curation (synthetic + permissive sources))	apache-2.0
LoRA adapter (this repo)	apache-2.0

All upstream components are Apache 2.0 / MIT — LoRA inherits permissive terms.

EU AI Act compliance

Article 53(1)(c): training data licenses preserved (per-dataset cards declare upstream licenses).
Article 53(1)(d): training data summary — see upstream dataset cards on Ailiance-fr.
GPAI Code of Practice (July 2025): base swiss-ai/Apertus-70B-Instruct-2509 released under apache-2.0.
No web scraping by Ailiance, no licensed data, no PII.
Upstream Stack Exchange content (where applicable) is CC-BY-SA-4.0 and propagates to this adapter.

License

LoRA weights: apache-2.0 — see License chain table above for derivation rationale.

Citation

@misc{ailiance_apertus_math_2026,
  author    = {Ailiance},
  title     = {Ailiance — Apertus-70B-Instruct math LoRA},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/Ailiance-fr/apertus-math-lora}
}

See the full Ailiance-fr LoRA collection.

Bench comparison (2026-05-11)

Base model (Apertus-70B-Instruct-2509) capability

Task	Score	Notes
ARC-Easy acc / acc_norm	0.81 / 0.77	W3 lm-eval-harness BF16
GSM8K-CoT	TIMEOUT (1800s budget)	base 70B BF16 too slow for CoT
MMLU-Pro Computer Science	TIMEOUT

This LoRA (tuned) — bench PENDING

Production usage: served via gateway alias ailiance-apertus-<domain> on https://www.ailiance.fr through the Apertus multi-LoRA hot-swap server (Studio :9322, 1 base + 10 LoRA dynamic swap, ~40GB VRAM).

Upstream base model — official evaluations

This LoRA fine-tunes swiss-ai/Apertus-70B-Instruct-2509, the EU-sovereign open-source LLM released by the Swiss AI Initiative. Below are the official scores reported in the Apertus Tech Report on a suite of multilingual reasoning benchmarks.

Model	Avg	ARC	HellaSwag	WinoGrande	XNLI	XCOPA	PIQA
Apertus-70B (this base)	67.5	70.6	64.0	73.3	45.3	69.8	81.9
Apertus-8B	65.8	72.7	59.8	70.6	45.2	66.5	79.8
Llama3.1-70B	67.3	74.4	56.5	79.4	44.3	66.7	82.3
Qwen2.5-72B	69.8	76.2	67.5	78.0	46.9	68.2	82.0
OLMo2-32B	67.7	76.2	66.7	78.6	42.9	60.1	82.1
EuroLLM-9B	62.8	67.9	57.9	68.8	41.5	61.1	79.6

Many additional benchmark evaluations (pretraining/post-training phases, multilingual in ~100 languages, long-context) are in Section 5 of the Apertus Tech Report.

Source: official Apertus-70B-Instruct-2509 model card.

Reading these alongside this LoRA: Apertus-70B is EU AI Act-compliant (Apertus_EU_Code_of_Practice.pdf, Apertus_EU_Public_Summary.pdf included in upstream weights). This LoRA inherits that compliance plus the general-capability floor shown above, then adds domain specialization.

Ailiance-fr
/

apertus-math-lora