clemsail committed on
Commit
a4bfb80
·
verified ·
1 Parent(s): 70fec39

docs: PST-verbatim model card v0.4.3 (Commission template 2025-07-24)

Files changed (1)
  1. README.md +9 -8
README.md CHANGED
@@ -5,6 +5,7 @@ tags:
5
  - lora
6
  - peft
7
  - mlx
 
8
  - eu-kiki
9
  - eu-ai-act
10
  - art-52
@@ -18,7 +19,7 @@ library_name: peft
18
 
19
  # eu-kiki-devstral-cpp-lora
20
 
21
- LoRA adapter for **mistralai/Devstral-Small-2-24B-Instruct-2512**, part of the [eu-kiki](https://github.com/L-electron-Rare/eu-kiki) project. Live demo: https://ml.saillant.cc.
22
 
23
  > **EU AI Act compliance.** This card follows the **European Commission's
24
  > *Template for the Public Summary of Training Content* for general-purpose
@@ -36,7 +37,7 @@ LoRA adapter for **mistralai/Devstral-Small-2-24B-Instruct-2512**, part of the [
36
 
37
  | Field | Value |
38
  |---|---|
39
- | **Provider name and contact details** | L'Électron Rare (Saillant Clément) — `clemsail` on Hugging Face — Issues: https://github.com/L-electron-Rare/eu-kiki/issues |
40
  | **Authorised representative name and contact details** | Not applicable — provider is established within the European Union (France). |
41
 
42
  ## 1.2. Model identification
@@ -57,7 +58,7 @@ LoRA adapter for **mistralai/Devstral-Small-2-24B-Instruct-2512**, part of the [
57
  | **Approximate size in alternative units** | ≈ 0.6 M tokens (2 850 rows × ≈ 200 tokens/row). |
58
  | **Latest date of data acquisition / collection for model training** | 10/2025 (last commit on scraped repos). The model is **not** continuously trained on new data after this date. |
59
  | **Linguistic characteristics of the overall training data** | English (technical instruction language). No other natural languages. |
60
- | **Other relevant characteristics / additional comments** | LoRA fine-tune (rank 16, alpha 32, dropout 0.05); only attention projections (`q_proj`, `k_proj`, `v_proj`, `o_proj`) are trained. Per-record `_provenance` (source, SPDX licence, `record_idx`, `access_date`) attached at the system level (see [`docs/eu-ai-act-transparency.md`](https://github.com/L-electron-Rare/eu-kiki/blob/main/docs/eu-ai-act-transparency.md) §4.4). Tokenizer: inherited from the base model. |
61
 
62
  ---
63
 
@@ -141,8 +142,8 @@ _(N/A — no other data sources used.)_
141
 
142
  - **Public HF datasets (§2.1):** all carry permissive open licences (Apache-2.0, MIT, CC-BY-*, BSD); SPDX matrix verified per-source. The licences explicitly authorise instructional / model-training use for the rows actually selected.
143
  - **Web-scraped sources (§2.3):** prior to collection the provider verified `robots.txt`, `<meta name="robots" content="noai">`, `ai.txt`, and TDM-Reservation HTTP headers. Any source returning a reservation under Article 4(3) of Directive (EU) 2019/790 was excluded from collection. Scraping was limited to authoritative vendor-controlled repositories (ESP-IDF, STM32Cube, Arduino, KiCad symbols/footprints) operating under permissive licences.
144
- - **Vendor PDF datasheets (§2.2.2 where present):** processed under the EU DSM Directive Article 4 TDM exception. SHA-256 manifests and per-source legal-basis records are published in [`docs/pdf-compliance-report.md`](https://github.com/L-electron-Rare/eu-kiki/blob/main/docs/pdf-compliance-report.md).
145
- - **Public copyright policy (Art. 53(1)(c)):** [`docs/eu-ai-act-transparency.md`](https://github.com/L-electron-Rare/eu-kiki/blob/main/docs/eu-ai-act-transparency.md). Removal requests are handled via the issue tracker on the source repository; the provider commits to remove disputed content within 30 days and re-train on the next release cycle.
146
 
147
  ## 3.2. Removal of illegal content
148
 
@@ -166,8 +167,8 @@ _(N/A — no other data sources used.)_
166
  **HumanEval** (custom Studio scorer; EvalPlus extra-tests not run — Linux-only sandbox): base 87.20 → +cpp 85.98 = **−1.22 pts**. For rigorous HumanEval+ Δ, sample re-scoring on Linux is required (samples preserved at `eval/results/2026-05-04/devstral-cpp-fused-humanevalplus/`).
167
 
168
  Full bench results, methodology, env.json, and rerun.sh per measurement:
169
- [`eval/results/SUMMARY.md`](https://github.com/L-electron-Rare/eu-kiki/blob/main/eval/results/SUMMARY.md) ·
170
- [`MODEL_CARD.md`](https://github.com/L-electron-Rare/eu-kiki/blob/main/MODEL_CARD.md).
171
 
172
  ---
173
 
@@ -214,7 +215,7 @@ python -m mlx_lm fuse \
214
  title = {eu-kiki: EU-sovereign multi-model LLM serving with HF-traceable LoRA adapters},
215
  author = {Saillant, Clément},
216
  year = {2026},
217
- url = {https://github.com/L-electron-Rare/eu-kiki},
218
  note = {Live demo: https://ml.saillant.cc}
219
  }
220
  ```
 
5
  - lora
6
  - peft
7
  - mlx
8
+ - ailiance
9
  - eu-kiki
10
  - eu-ai-act
11
  - art-52
 
19
 
20
  # eu-kiki-devstral-cpp-lora
21
 
22
+ LoRA adapter for **mistralai/Devstral-Small-2-24B-Instruct-2512**, part of the [ailiance](https://github.com/L-electron-Rare/ailiance) project. Live demo: https://ml.saillant.cc.
23
 
24
  > **EU AI Act compliance.** This card follows the **European Commission's
25
  > *Template for the Public Summary of Training Content* for general-purpose
 
37
 
38
  | Field | Value |
39
  |---|---|
40
+ | **Provider name and contact details** | L'Électron Rare (Saillant Clément) — `clemsail` on Hugging Face — Issues: https://github.com/L-electron-Rare/ailiance/issues |
41
  | **Authorised representative name and contact details** | Not applicable — provider is established within the European Union (France). |
42
 
43
  ## 1.2. Model identification
 
58
  | **Approximate size in alternative units** | ≈ 0.6 M tokens (2 850 rows × ≈ 200 tokens/row). |
59
  | **Latest date of data acquisition / collection for model training** | 10/2025 (last commit on scraped repos). The model is **not** continuously trained on new data after this date. |
60
  | **Linguistic characteristics of the overall training data** | English (technical instruction language). No other natural languages. |
61
+ | **Other relevant characteristics / additional comments** | LoRA fine-tune (rank 16, alpha 32, dropout 0.05); only attention projections (`q_proj`, `k_proj`, `v_proj`, `o_proj`) are trained. Per-record `_provenance` (source, SPDX licence, `record_idx`, `access_date`) attached at the system level (see [`docs/eu-ai-act-transparency.md`](https://github.com/L-electron-Rare/ailiance/blob/main/docs/eu-ai-act-transparency.md) §4.4). Tokenizer: inherited from the base model. |
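
The figures in the table rows above can be checked with a few lines of arithmetic. This is a minimal sketch, not the project's training code: the `LoraSettings` class and `estimate_tokens` helper are hypothetical names introduced here, and the `alpha / rank` scaling is the standard LoRA convention, which the card does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container for the hyperparameters quoted in the card."""
    rank: int = 16
    alpha: int = 32
    dropout: float = 0.05
    # Only the attention projections are trained, per the card.
    target_modules: tuple = ("q_proj", "k_proj", "v_proj", "o_proj")

    @property
    def scaling(self) -> float:
        # Assumption: standard LoRA scales the low-rank update by alpha / rank.
        return self.alpha / self.rank


def estimate_tokens(rows: int, tokens_per_row: int) -> int:
    """Rough corpus size: row count times the average tokens per row."""
    return rows * tokens_per_row


settings = LoraSettings()
total = estimate_tokens(2850, 200)
print(settings.scaling)  # 2.0
print(total)             # 570000, i.e. roughly 0.6 M tokens
```

With rank 16 and alpha 32 the update is scaled by 2.0, and 2 850 rows at about 200 tokens each give 570 000 tokens, consistent with the "≈ 0.6 M" figure.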
62
 
63
  ---
64
 
 
142
 
143
  - **Public HF datasets (§2.1):** all carry permissive open licences (Apache-2.0, MIT, CC-BY-*, BSD); SPDX matrix verified per-source. The licences explicitly authorise instructional / model-training use for the rows actually selected.
144
  - **Web-scraped sources (§2.3):** prior to collection the provider verified `robots.txt`, `<meta name="robots" content="noai">`, `ai.txt`, and TDM-Reservation HTTP headers. Any source returning a reservation under Article 4(3) of Directive (EU) 2019/790 was excluded from collection. Scraping was limited to authoritative vendor-controlled repositories (ESP-IDF, STM32Cube, Arduino, KiCad symbols/footprints) operating under permissive licences.
145
+ - **Vendor PDF datasheets (§2.2.2 where present):** processed under the EU DSM Directive Article 4 TDM exception. SHA-256 manifests and per-source legal-basis records are published in [`docs/pdf-compliance-report.md`](https://github.com/L-electron-Rare/ailiance/blob/main/docs/pdf-compliance-report.md).
146
+ - **Public copyright policy (Art. 53(1)(c)):** [`docs/eu-ai-act-transparency.md`](https://github.com/L-electron-Rare/ailiance/blob/main/docs/eu-ai-act-transparency.md). Removal requests are handled via the issue tracker on the source repository; the provider commits to remove disputed content within 30 days and re-train on the next release cycle.
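
The opt-out checks listed for web-scraped sources could look roughly like the sketch below. It is an illustration under stated assumptions, not the provider's actual pipeline: the function names are hypothetical, the header name `tdm-reservation` with value `1` is assumed from the TDM Reservation Protocol, and the `robots.txt` and `ai.txt` checks are left out to keep the example free of network access.

```python
from html.parser import HTMLParser
from typing import Mapping


def has_tdm_reservation(headers: Mapping[str, str]) -> bool:
    """Return True if the response headers signal a TDM rights reservation.

    Assumption: the header is named `tdm-reservation` and the value "1"
    means rights are reserved (TDM Reservation Protocol convention).
    """
    for name, value in headers.items():
        if name.lower() == "tdm-reservation":
            return value.strip() == "1"
    return False


class _RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noai_meta(html: str) -> bool:
    """Return True if the page carries a robots meta tag with `noai`."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    return "noai" in parser.directives


def is_excluded(headers: Mapping[str, str], html: str) -> bool:
    """A source is skipped if either opt-out signal is present."""
    return has_tdm_reservation(headers) or has_noai_meta(html)
```

Keeping each signal in its own pure function makes it straightforward to log which reservation caused a source to be excluded.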
147
 
148
  ## 3.2. Removal of illegal content
149
 
 
167
  **HumanEval** (custom Studio scorer; EvalPlus extra-tests not run — Linux-only sandbox): base 87.20 → +cpp 85.98 = **−1.22 pts**. For rigorous HumanEval+ Δ, sample re-scoring on Linux is required (samples preserved at `eval/results/2026-05-04/devstral-cpp-fused-humanevalplus/`).
168
 
169
  Full bench results, methodology, env.json, and rerun.sh per measurement:
170
+ [`eval/results/SUMMARY.md`](https://github.com/L-electron-Rare/ailiance/blob/main/eval/results/SUMMARY.md) ·
171
+ [`MODEL_CARD.md`](https://github.com/L-electron-Rare/ailiance/blob/main/MODEL_CARD.md).
172
 
173
  ---
174
 
 
215
  title = {eu-kiki: EU-sovereign multi-model LLM serving with HF-traceable LoRA adapters},
216
  author = {Saillant, Clément},
217
  year = {2026},
218
+ url = {https://github.com/L-electron-Rare/ailiance},
219
  note = {Live demo: https://ml.saillant.cc}
220
  }
221
  ```