docs: PST-verbatim model card v0.4.3 (Commission template 2025-07-24)
README.md (changed)

Removed in this revision (the previous v0.4.2, "PST-aligned, not PST-verbatim" card):

- The old compliance note, which pointed to the [AI Office page](https://digital-strategy.ec.europa.eu/en/policies/ai-office) for the canonical template, and the note that **empty categories are listed explicitly** so absence is auditable.
- The per-adapter dataset summary: StarCoder2 Self-Instruct (Python subset), 2,850 records, `Apache-2.0`; router domain label `python`; text only; English and French; ≈ 570,000 tokens (heuristic 200 tokens/record); plus the pointer to the system-level inventory (35+ domains across 7 base models / candidates, ≈ 82 K records, with per-source SPDX licence, download dates, and `n_used` counts) at [`docs/eu-ai-act-transparency.md`](https://github.com/L-electron-Rare/eu-kiki/blob/main/docs/eu-ai-act-transparency.md) §4.4.
- §4 Copyright and data governance: Directive 2019/790 (DSM Directive) Article 4 TDM exception with robots.txt respected at collection time and SHA-256 manifests in [`docs/pdf-compliance-report.md`](https://github.com/L-electron-Rare/eu-kiki/blob/main/docs/pdf-compliance-report.md); opt-out signals (robots.txt `Disallow`, `<meta name="robots" content="noai">`, TDM Reservation headers, ai.txt) honoured at collection time with manifests under `data/scraped/<source>/manifest.json`; the 30-day removal commitment; per-record `_provenance` on 49,956 records across 21 domains; the per-domain cap of ≤ 3,000 records; synthetic rows marked `source: "synthetic"`; the Presidio + `en_core_web_lg` PII scan (one redacted email, no high-signal PII remaining); the GDPR Art. 9 statement; and the opt-out registry (no removal requests received as of release date).
- §6 Training details: dropout 0.05; target modules `q_proj`, `k_proj`, `v_proj`, `o_proj` (attention only); BF16; AdamW; learning rate 1e-5; batch size × grad-accum 1 × 4–8; MLX (`mlx_lm` fork on Apple Silicon); Mac Studio M3 Ultra with 512 GB unified memory; estimated training compute ≪ 10²⁵ FLOPs (single machine, no proprietary teacher model in deployed inference).
- The old limitations bullets and changelog rows, superseded by Appendices C and E in the new card below.

tags:
- art-52
- art-53
- gpai-fine-tune
- pst-2025-07-24
language:
- en
- fr
library_name: peft

# eu-kiki-devstral-python-lora

LoRA adapter for **mistralai/Devstral-Small-2-24B-Instruct-2512**, part of the [eu-kiki](https://github.com/L-electron-Rare/eu-kiki) project. Live demo: https://ml.saillant.cc.

> **EU AI Act compliance.** This card follows the **European Commission's
> *Template for the Public Summary of Training Content* for general-purpose
> AI models** (Art. 53(1)(d) of Regulation (EU) 2024/1689, published by the
> AI Office on 2025-07-24). Section numbering and field labels reproduce
> the official template. Where this card and the official template differ
> in wording, the **official template wins**; see the
> [AI Office page](https://digital-strategy.ec.europa.eu/en/library/explanatory-notice-and-template-public-summary-training-content-general-purpose-ai-models).

---

# 1. General information

## 1.1. Provider identification

| Field | Value |
|---|---|
| **Provider name and contact details** | L'Électron Rare (Saillant Clément), `clemsail` on Hugging Face; issues: https://github.com/L-electron-Rare/eu-kiki/issues |
| **Authorised representative name and contact details** | Not applicable; the provider is established within the European Union (France). |

## 1.2. Model identification

| Field | Value |
|---|---|
| **Versioned model name(s)** | `clemsail/eu-kiki-devstral-python-lora` (this LoRA adapter, v0.4.2) |
| **Model dependencies** | This is a **fine-tune (LoRA, rank 16)** of the general-purpose AI model [`mistralai/Devstral-Small-2-24B-Instruct-2512`](https://huggingface.co/mistralai/Devstral-Small-2-24B-Instruct-2512). Refer to the base-model provider's PST for the underlying training summary. |
| **Date of placement of the model on the Union market** | 2026-05-06 |

## 1.3. Modalities, overall training data size and other characteristics

| Field | Value |
|---|---|
| **Modality** | ☑ Text ☐ Image ☐ Audio ☐ Video ☐ Other |
| **Training data size** (text bucket) | ☑ Less than 1 billion tokens ☐ 1 billion to 10 trillion tokens ☐ More than 10 trillion tokens |
| **Types of content** | Instruction-tuning pairs, technical text, source code, multilingual instruction templates (EU official languages where applicable). |
| **Approximate size in alternative units** | ≈ 0.6 M tokens (2,850 rows × ≈ 200 tokens/row, single pass; see the sketch below). |
| **Latest date of data acquisition / collection for model training** | 11/2024 (StarCoder2 Self-Instruct release). The model is **not** continuously trained on new data after this date. |
| **Linguistic characteristics of the overall training data** | English (primary instruction language); French (system-prompt context). No other natural languages in the training rows. |
| **Other relevant characteristics / additional comments** | LoRA fine-tune (rank 16, alpha 32, dropout 0.05); only the attention projections (`q_proj`, `k_proj`, `v_proj`, `o_proj`) are trained. Per-record `_provenance` (source, SPDX licence, `record_idx`, `access_date`) attached at the system level (see [`docs/eu-ai-act-transparency.md`](https://github.com/L-electron-Rare/eu-kiki/blob/main/docs/eu-ai-act-transparency.md) §4.4). Tokenizer: inherited from the base model. |
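
The token figure above is a heuristic, not a tokenizer count. A minimal sketch of the same arithmetic (the 200 tokens/row average is the card's stated assumption, not a measured value):

```python
# Heuristic token estimate from Section 1.3: rows x assumed tokens/row.
ROWS = 2_850              # records in the Python subset (Section 2.1)
AVG_TOKENS_PER_ROW = 200  # stated heuristic average, not a measured count

estimated = ROWS * AVG_TOKENS_PER_ROW
print(f"~{estimated:,} tokens (~{estimated / 1e6:.1f} M)")  # ~570,000 (~0.6 M)
```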

---

# 2. List of data sources

## 2.1. Publicly available datasets

**Have you used publicly available datasets to train the model?** ☑ Yes ☐ No

**Modality(ies) of the content covered:** ☑ Text ☐ Image ☐ Video ☐ Audio ☐ Other

**List of large publicly available datasets:**

| Dataset | URL | SPDX licence | Records | Notes |
|---|---|---|---:|---|
| StarCoder2 Self-Instruct (Python subset, filtered by language keyword) | https://huggingface.co/datasets/bigcode/starcoder2-self-align | `Apache-2.0` | 2,850 | Public HF dataset; instruction-tuning pairs (selection sketched below). |
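
The selection is a simple keyword filter plus the per-domain cap; a hedged sketch of that step (the `train` split and the `instruction` column name are assumptions here; check the dataset schema on the Hub before running):

```python
from datasets import load_dataset

# Sketch of the Python-subset selection described in Section 2.1.
ds = load_dataset("bigcode/starcoder2-self-align", split="train")

# Keyword filter; "instruction" is an assumed column name.
python_rows = ds.filter(lambda r: "python" in str(r.get("instruction", "")).lower())

# Per-domain cap (<= 3,000 system-wide; 2,850 rows survived for this domain).
python_rows = python_rows.select(range(min(3_000, len(python_rows))))
print(len(python_rows))
```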

## 2.2. Private non-publicly available datasets obtained from third parties

### 2.2.1. Datasets commercially licensed by rightsholders or their representatives

**Have you concluded transactional commercial licensing agreement(s) with rightsholder(s) or with their representatives?** ☐ Yes ☑ No

_(N/A: no commercial licensing agreements concluded.)_

### 2.2.2. Private datasets obtained from other third parties

**Have you obtained private datasets from third parties that are not licensed as described in Section 2.2.1?** ☐ Yes ☑ No

_(N/A: no private third-party datasets obtained.)_

## 2.3. Data crawled and scraped from online sources

**Were crawlers used by the provider or on the provider's behalf?** ☐ Yes ☑ No

_(N/A: no crawler used.)_

## 2.4. User data

**Was data from user interactions with the AI model (e.g. user input and prompts) used to train the model?** ☐ Yes ☑ No

**Was data collected from user interactions with the provider's other services or products used to train the model?** ☐ Yes ☑ No

_(N/A: no user data from any provider service or AI-model interaction is used to train this LoRA.)_

## 2.5. Synthetic data

**Was synthetic AI-generated data created by the provider or on their behalf to train the model?** ☐ Yes ☑ No

_(N/A: no synthetic AI-generated data was created by the provider or on their behalf to train this LoRA.)_

## 2.6. Other sources of data

**Have data sources other than those described in Sections 2.1 to 2.5 been used to train the model?** ☐ Yes ☑ No

_(N/A: no other data sources used.)_

---

# 3. Data processing aspects

## 3.1. Respect of reservation of rights from text and data mining exception or limitation

**Are you a Signatory to the Code of Practice for general-purpose AI models that includes commitments to respect reservations of rights from the TDM exception or limitation?** ☐ Yes ☑ No *(SME / individual provider; commitments equivalent in substance, see below.)*

**Measures implemented before model training to respect reservations of rights from the TDM exception or limitation:**

- **Public HF datasets (§2.1):** all carry permissive open licences (Apache-2.0, MIT, CC-BY-*, BSD); the SPDX matrix is verified per source. The licences explicitly authorise instructional / model-training use for the rows actually selected.
- **Web-scraped sources (§2.3):** prior to collection the provider verified `robots.txt`, `<meta name="robots" content="noai">`, `ai.txt`, and TDM-Reservation HTTP headers (sketched after this list). Any source returning a reservation under Article 4(3) of Directive (EU) 2019/790 was excluded from collection. Scraping was limited to authoritative vendor-controlled repositories (ESP-IDF, STM32Cube, Arduino, KiCad symbols/footprints) operating under permissive licences.
- **Vendor PDF datasheets (§2.2.2, where present):** processed under the EU DSM Directive Article 4 TDM exception. SHA-256 manifests and per-source legal-basis records are published in [`docs/pdf-compliance-report.md`](https://github.com/L-electron-Rare/eu-kiki/blob/main/docs/pdf-compliance-report.md).
- **Public copyright policy (Art. 53(1)(c)):** [`docs/eu-ai-act-transparency.md`](https://github.com/L-electron-Rare/eu-kiki/blob/main/docs/eu-ai-act-transparency.md). Removal requests are handled via the issue tracker on the source repository; the provider commits to removing disputed content within 30 days and re-training on the next release cycle.
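
A minimal sketch of that pre-collection check (the production collector lives in the source repo; the `tdm-reservation` header name follows the W3C TDM Reservation Protocol draft, and the helper below is illustrative):

```python
import urllib.robotparser
import urllib.request

def tdm_reservation_signals(url: str, user_agent: str = "eu-kiki-collector") -> dict:
    """Check the opt-out signals listed in Section 3.1 for one URL.

    Sketch only: a real collector should also consult /ai.txt and cache results.
    """
    signals = {}

    # 1. robots.txt Disallow for our user agent
    parts = url.split("/", 3)
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url(f"{parts[0]}//{parts[2]}/robots.txt")
    rp.read()
    signals["robots_allowed"] = rp.can_fetch(user_agent, url)

    # 2. TDM-Reservation HTTP header and <meta name="robots" content="noai">
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        signals["tdm_reservation_header"] = resp.headers.get("tdm-reservation") == "1"
        body = resp.read(65536).decode("utf-8", errors="replace").lower()
        signals["noai_meta"] = 'name="robots"' in body and 'content="noai"' in body

    return signals

# A source is collected only if nothing reserves rights:
# sig = tdm_reservation_signals("https://example.com/docs/page.html")
# ok = sig["robots_allowed"] and not sig["tdm_reservation_header"] and not sig["noai_meta"]
```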

## 3.2. Removal of illegal content

**General description of measures taken:**

- The provider does not crawl the open web at large; sources are restricted to curated public HF datasets and authoritative vendor repositories, where the risk of illegal content (CSAM, terrorist content, IP-violating works) is structurally low.
- Personal data was screened with **Microsoft Presidio + en_core_web_lg** (2026-04-28) across all 35+ system-level domain directories; a sketch of the scan follows this list. **One** email address detected in the unrelated `traduction-tech` corpus was redacted before training. Full report: `data/pii-scan-report.json` in the source repo.
- No special-category data (GDPR Art. 9: health, religion, sexual orientation, etc.) was intentionally collected; the PII scan also screens for identifiers that could enable special-category inference (none were flagged).
- Licence compatibility is enforced via the per-source SPDX matrix; works under non-permissive licences are excluded.
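
A hedged sketch of the high-signal scan described above, assuming `presidio-analyzer` and spaCy's `en_core_web_lg` model are installed (the entity list mirrors the card's high-signal categories):

```python
from presidio_analyzer import AnalyzerEngine
from presidio_analyzer.nlp_engine import NlpEngineProvider

# Build an analyzer backed by the spaCy model named in Section 3.2.
nlp_config = {
    "nlp_engine_name": "spacy",
    "models": [{"lang_code": "en", "model_name": "en_core_web_lg"}],
}
analyzer = AnalyzerEngine(
    nlp_engine=NlpEngineProvider(nlp_configuration=nlp_config).create_engine()
)

# High-signal entity types from the card; low-signal ones (PERSON, LOCATION,
# DATE_TIME) are common false positives in technical text and are left in place.
HIGH_SIGNAL = ["EMAIL_ADDRESS", "PHONE_NUMBER", "CREDIT_CARD", "US_SSN", "IBAN_CODE"]

text = "Contact me at jane.doe@example.org for the datasheet."  # illustrative row
for finding in analyzer.analyze(text=text, entities=HIGH_SIGNAL, language="en"):
    print(finding.entity_type, round(finding.score, 2), text[finding.start:finding.end])
```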

## 3.3. Other information (optional)

- **Per-record provenance:** 49,956 system-level training records carry `_provenance.{source, license, record_idx, access_date}` fields, enabling per-record audit and removal (schema sketched below).
- **Compute footprint:** LoRA training updates ≈ 0.1–0.5 % of base-model parameters. **Estimated training compute for this LoRA ≪ 10²⁵ FLOPs**, well below the systemic-risk threshold of EU AI Act Art. 51. No proprietary teacher model is used in deployed inference.
- **Risk classification:** limited risk (Art. 52). Not deployed in safety-critical contexts.
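
A sketch of what one provenance-carrying record looks like; the field names come from the card, while the example values are hypothetical:

```python
from dataclasses import dataclass, asdict

# Illustrative schema for the per-record provenance described above.
@dataclass
class Provenance:
    source: str       # upstream dataset or repository URL
    license: str      # SPDX identifier verified at ingestion
    record_idx: int   # index of the row in the upstream source
    access_date: str  # ISO date the row was downloaded

record = {
    "instruction": "Write a Python function that ...",   # hypothetical row
    "response": "def ...",
    "_provenance": asdict(Provenance(
        source="https://huggingface.co/datasets/bigcode/starcoder2-self-align",
        license="Apache-2.0",
        record_idx=12345,
        access_date="2024-11-30",
    )),
}
```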

---

# Appendix A – Performance evaluation (Art. 53(1)(a))

**HumanEval+** (EvalPlus official Linux scorer, 164 problems, greedy decoding, 1 sample): base 87.20 / 82.90 → +python 86.00 / 81.10. **Δ HE+ = −1.80 pts** vs base. Scoring on `kx6tm-23` (Proxmox PVE 6.17). Full reproducer in [`eval/results/2026-05-04/devstral-python-fused-humanevalplus/rerun.sh`](https://github.com/L-electron-Rare/eu-kiki/blob/main/eval/results/2026-05-04/devstral-python-fused-humanevalplus/).

Full bench results, methodology, `env.json`, and `rerun.sh` per measurement:
[`eval/results/SUMMARY.md`](https://github.com/L-electron-Rare/eu-kiki/blob/main/eval/results/SUMMARY.md) ·
[`MODEL_CARD.md`](https://github.com/L-electron-Rare/eu-kiki/blob/main/MODEL_CARD.md).

---

# Appendix B – Usage

```python
from mlx_lm import load
# … (usage continues; fusing via `python -m mlx_lm fuse` follows)
```
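
The rest of the snippet is elided in this view; for orientation, a hedged sketch of applying this adapter with `mlx_lm` (the paths are illustrative, and the `adapter_path` / `generate` calls should be verified against the pinned `mlx_lm` fork):

```python
from mlx_lm import load, generate

# Sketch: load the base model with this LoRA adapter applied.
# Point adapter_path at a local download of this repo.
model, tokenizer = load(
    "mistralai/Devstral-Small-2-24B-Instruct-2512",
    adapter_path="./eu-kiki-devstral-python-lora",
)

prompt = "Write a Python function that validates an SPDX licence identifier."
print(generate(model, tokenizer, prompt=prompt, max_tokens=256))
```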

---

# Appendix C – Limitations and out-of-scope use

- Not for safety-critical decisions (medical, legal, structural, life-safety, biometric).
- Not for high-stakes individual decisions (hiring, credit, law enforcement): such use would re-classify the system as high-risk under EU AI Act Art. 6 and trigger additional obligations.
- Hallucination is present at typical instruction-tuned LLM levels; pair with a verifier or a human-in-the-loop for factual outputs.
- LoRA inherits all base-model limitations (training cutoff, language coverage, refusal patterns).

---

# Appendix D – Citation

```bibtex
@misc{eu-kiki-2026,
  ...
}
```

---

# Appendix E – Changelog

| Date | Card version | Change |
|---|---|---|
| 2026-05-06 | v0.4.0 | Initial HF release |
| 2026-05-06 | v0.4.1 | Self-contained EU AI Act card (per-adapter dataset table, PII statement, contact) |
| 2026-05-06 | v0.4.2 | PST-aligned (Commission template structure, Sections §1–4) |
| 2026-05-06 | **v0.4.3** | **PST-verbatim**: section labels and field names reproduced from the official Commission template (PDF 2025-07-24, English version). |
|