---
library_name: gliner2
license: apache-2.0
base_model: fastino/gliner2-privacy-filter-PII-multi
pipeline_tag: token-classification
tags:
- token-classification
- gliner2
- gliner
- onnx
- rust
- pii
- ner
- privacy
- redaction
- information-extraction
- span-extraction
- iobinding
language:
- en
- fr
- es
- de
- it
- pt
- nl
---
# GLiNER2 Privacy-Filter PII Multi (ONNX Fragmented & IOBinding)
This repository contains the **ONNX-exported weights** of [`fastino/gliner2-privacy-filter-PII-multi`](https://huggingface.co/fastino/gliner2-privacy-filter-PII-multi),
the multilingual **PII detection model** built on GLiNER2 by Fastino AI.
The model is exported in a **fragmented format** (encoder, token_gather, span_rep, schema_gather, count_pred_argmax, count_lstm_fixed, scorer, classifier) for direct compatibility with [gliner2-rs](https://github.com/SemplificaAI/gliner2-rs), the official **Zero-Python Native Rust inference engine** for GLiNER2.
It supports detection of **42 PII entity types** across **7 languages** (EN, FR, ES, DE, IT, PT, NL).
---
## 🆕 V2 Zero-Copy IOBinding Models
Like the [`gliner2-multi-v1-onnx`](https://huggingface.co/SemplificaAI/gliner2-multi-v1-onnx) base release, this repo ships the **V2 fused IOBinding** variant. The `Gather`, `ArgMax`, and `MatMul` operations are fused directly into the ONNX graphs so that intermediate tensors **never leave GPU/NPU VRAM**. This avoids round-trips over the PCIe bus and cuts inference latency by ~30% on discrete GPUs.
## 📂 Available Variants
| Variant | Use case | Notes |
|---|---|---|
| **`fp16_v2`** *(recommended)* | NVIDIA CUDA · AMD ROCm · Apple CoreML · Qualcomm QNN | Zero-Copy VRAM (IOBinding), full FP16 IO, fused ops |
| **`fp32_v2`** | CPU (AVX2 / XNNPACK / ARM NEON) | High precision V2 fusions for CPU |
| **`fp16`** *(standard)* | Legacy compatible, all EPs | FP32 IO (CoreML-compatible), slower on CUDA due to PCIe round-trips |
| **`fp32`** *(standard)* | Universal fallback | Legacy Float32 |
Each variant ships 8 fragments:
```
encoder_{precision}.onnx ~530–1060 MB
token_gather_{precision}.onnx ~ <1 MB
span_rep_{precision}.onnx ~32–63 MB
schema_gather_{precision}.onnx ~ <1 MB
count_pred_argmax_{precision}.onnx ~2–5 MB
count_lstm_fixed_{precision}.onnx ~20–41 MB
scorer_{precision}.onnx ~ <1 MB
classifier_{precision}.onnx ~2–5 MB
```
Total: **~590 MB (FP16)** or **~1.17 GB (FP32)** per variant.
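The listing above follows a `{fragment}_{precision}.onnx` pattern. The following is a minimal Python sketch, assuming that pattern holds for every fragment, that expands it into the 8 file names. The helper name `fragment_files` is hypothetical, and the suffix is passed through verbatim because the examples elsewhere in this card use `fp16_iobinding` as the suffix.

```python
# Fragment names as listed in this model card, in the order given there.
FRAGMENTS = (
    "encoder",
    "token_gather",
    "span_rep",
    "schema_gather",
    "count_pred_argmax",
    "count_lstm_fixed",
    "scorer",
    "classifier",
)


def fragment_files(suffix: str) -> list[str]:
    """Expand the `{fragment}_{suffix}.onnx` pattern for all 8 fragments.

    `suffix` is whatever follows the fragment name in the repository
    (for example "fp16" or "fp32"); it is not validated here.
    """
    return [f"{name}_{suffix}.onnx" for name in FRAGMENTS]


if __name__ == "__main__":
    for filename in fragment_files("fp16"):
        print(filename)
```

Check the actual file names in the repository before relying on a particular suffix.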
---
## 🎯 Supported PII Labels (42 types)
### Person / Names (6 labels)
`person`, `full_name`, `first_name`, `middle_name`, `last_name`, `date_of_birth`
### Contact / Address (8 labels)
`email`, `phone_number`, `address`, `street_address`, `city`, `state_or_region`, `postal_code`, `country`
### Government / Tax IDs (7 labels)
`government_id`, `national_id_number`, `passport_number`, `drivers_license_number`, `license_number`, `tax_id`, `tax_number`
### Banking / Payment (8 labels)
`bank_account`, `account_number`, `routing_number`, `iban`, `payment_card`, `card_number`, `card_expiry`, `card_cvv`
### Digital Identity (4 labels)
`username`, `ip_address`, `account_id`, `sensitive_account_id`
### Secrets / Credentials (5 labels)
`password`, `secret`, `api_key`, `access_token`, `recovery_code`
### Sensitive Dates (4 labels)
`sensitive_date`, `document_date`, `expiration_date`, `transaction_date`
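When building a schema programmatically, it can help to keep the labels above grouped in one place. The sketch below copies the label strings from this section; the group keys (`"person"`, `"contact"`, and so on) and the helper `labels_for` are hypothetical names chosen for illustration.

```python
# Label strings copied from the groups listed above; group keys are illustrative.
PII_LABELS = {
    "person": ["person", "full_name", "first_name", "middle_name",
               "last_name", "date_of_birth"],
    "contact": ["email", "phone_number", "address", "street_address",
                "city", "state_or_region", "postal_code", "country"],
    "government": ["government_id", "national_id_number", "passport_number",
                   "drivers_license_number", "license_number", "tax_id",
                   "tax_number"],
    "banking": ["bank_account", "account_number", "routing_number", "iban",
                "payment_card", "card_number", "card_expiry", "card_cvv"],
    "digital": ["username", "ip_address", "account_id", "sensitive_account_id"],
    "secrets": ["password", "secret", "api_key", "access_token", "recovery_code"],
    "dates": ["sensitive_date", "document_date", "expiration_date",
              "transaction_date"],
}

# Flat list of every label, in the order the groups appear.
ALL_LABELS = [label for group in PII_LABELS.values() for label in group]


def labels_for(*groups: str) -> list[str]:
    """Return the labels of the requested groups, preserving group order."""
    return [label for group in groups for label in PII_LABELS[group]]


if __name__ == "__main__":
    print(len(ALL_LABELS))
    print(labels_for("secrets", "digital"))
```

Requesting only the groups relevant to a document type keeps the schema small, for example `labels_for("contact", "banking")` for invoices.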
---
## 🚀 Usage in Rust (`gliner2-rs`)
```rust
use gliner2_inference::{Gliner2Engine, ModelType, SchemaTask};

// Auto-downloads the V2 FP16 fragments from this HuggingFace repo
// and switches to the high-performance IOBinding engine.
let engine = Gliner2Engine::from_pretrained(
    "SemplificaAI/gliner2-privacy-filter-PII-multi",
    Some("fp16_v2"),
    ModelType::HuggingFace,
)?;

let text = "Please contact Maria Jensen at maria.jensen@example.dk or +45 20 12 34 56.";
let tasks = vec![
    SchemaTask::Entities(vec![
        "person".into(), "email".into(), "phone_number".into(),
    ])
];
let (entities, _, _) = engine.extract(text, &tasks)?;
```
Requires **`gliner2-rs >= 0.4.1`** for automatic V2 detection / IOBinding routing.
## 🐍 Usage in Python (`onnxruntime`)
Run the 8-fragment pipeline manually (no Python `gliner2` dependency needed):
```python
import onnxruntime as ort

# Per fragment (example for the encoder, CUDA backend)
encoder = ort.InferenceSession(
    "encoder_fp16_iobinding.onnx",
    providers=["CUDAExecutionProvider"],
)

# ...load the other 7 fragments analogously...
# Chain them via IOBinding (see validate_onnx_v2.py for a full reference impl)
```
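The chaining step mentioned in the comment above can be sketched with the `onnxruntime` IOBinding API (`io_binding`, `bind_cpu_input`, `bind_ortvalue_input`, `bind_output`, `run_with_iobinding`). This is an illustration under assumptions, not the reference implementation: the helper name `run_bound` is hypothetical, and the fragments' real input and output tensor names are not stated in this card, so read them from `session.get_inputs()` and `session.get_outputs()`.

```python
def run_bound(session, host_inputs=None, device_inputs=None,
              device="cuda", device_id=0):
    """Run one fragment through IOBinding and return its outputs as OrtValues.

    session:       an onnxruntime.InferenceSession for one fragment
    host_inputs:   {input_name: numpy array} copied in from host memory
    device_inputs: {input_name: OrtValue} already resident on the device,
                   typically the outputs of the previous fragment
    """
    binding = session.io_binding()
    for name, array in (host_inputs or {}).items():
        binding.bind_cpu_input(name, array)
    for name, ort_value in (device_inputs or {}).items():
        binding.bind_ortvalue_input(name, ort_value)

    output_names = [out.name for out in session.get_outputs()]
    for name in output_names:
        # Let onnxruntime allocate each output on the target device,
        # so the result can feed the next fragment without a host copy.
        binding.bind_output(name, device, device_id)

    session.run_with_iobinding(binding)
    return dict(zip(output_names, binding.get_outputs()))


# Hypothetical usage; "<...>" marks tensor names that must be looked up:
#   enc = run_bound(encoder, host_inputs={"<ids_input>": ids})
#   tok = run_bound(token_gather, device_inputs={"<hidden_input>": enc["<hidden_output>"]})
```

Only the final outputs need to be copied back to the host, for example with `OrtValue.numpy()`.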
For a simpler entry point you can keep using the original PyTorch model via the `gliner2` Python package on `fastino/gliner2-privacy-filter-PII-multi`; this ONNX repo is optimised for **production deployment without Python**.
---
## 🛠 Pipeline Wiring (IOBinding chain)
```
encoder_fp16_iobinding.onnx
   │
   ├─ token_gather_fp16_iobinding.onnx
   │     └─ span_rep_fp16_iobinding.onnx
   │
   └─ schema_gather_fp16_iobinding.onnx
         ├─ count_pred_argmax_fp16_iobinding.onnx → pred_count (int64)
         └─ count_lstm_fixed_fp16_iobinding.onnx
               └─ scorer_fp16_iobinding.onnx → entity_scores

classifier_fp16_iobinding.onnx (only for classification tasks)
```
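The wiring above can also be written down as a dependency graph, which gives a valid execution order for the fragments. The sketch below encodes only the edges drawn in the diagram; any extra inputs a fragment may take are not represented, and `classifier` is left out because the diagram does not show its upstream. The names `UPSTREAM` and `execution_order` are hypothetical.

```python
from graphlib import TopologicalSorter

# fragment -> fragments whose outputs it consumes, as drawn in the diagram.
UPSTREAM = {
    "encoder": set(),
    "token_gather": {"encoder"},
    "span_rep": {"token_gather"},
    "schema_gather": {"encoder"},
    "count_pred_argmax": {"schema_gather"},
    "count_lstm_fixed": {"schema_gather"},
    "scorer": {"count_lstm_fixed"},
}


def execution_order(graph=UPSTREAM):
    """Return the fragments in an order where every producer runs first."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for step, fragment in enumerate(execution_order(), start=1):
        print(step, fragment)
```

The two branches under the encoder are independent in this graph, so a runtime could schedule them concurrently.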
---
## ⚙️ Technical Notes
- **opset 17** (ONNX 1.14+) for maximum execution-provider compatibility.
- `count_lstm_fixed` exports the GRU **unrolled to 20 fixed steps** at tracing time → compatible with execution providers that don't support dynamic loops (Apple CoreML, Qualcomm QNN).
- `scorer` uses **fused Reshape + MatMul + Transpose** instead of `Einsum` for compatibility with QNN/CoreML FP16.
- **INT8 not supported**: the DeBERTa-v3 disentangled-attention activations contain extreme outliers that saturate 8-bit ranges (the same limitation called out by the GLiNER2 maintainers). FP16 remains the optimal compression target.
- **Encoder size**: ~1.06 GB FP32 → ~530 MB FP16. Larger than the multi-v1 base because of the wider classification head (42 PII labels) and per-language fine-tuning.
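The `scorer` note above replaces `Einsum` with Reshape + MatMul + Transpose. The NumPy check below illustrates why such a rewrite is exact. It assumes an einsum of the form `lkd,pd->plk` (span representations against schema embeddings); the scorer's real equation and tensor shapes are not stated in this card, so every name and shape here is illustrative.

```python
import numpy as np


def scores_einsum(span_rep, schema_emb):
    """Dot product over the hidden axis: (L, K, D) x (P, D) -> (P, L, K)."""
    return np.einsum("lkd,pd->plk", span_rep, schema_emb)


def scores_matmul(span_rep, schema_emb):
    """Same result using only Reshape, Transpose and MatMul."""
    L, K, D = span_rep.shape
    P = schema_emb.shape[0]
    flat_spans = span_rep.reshape(L * K, D)   # Reshape: merge the span axes
    scores = flat_spans @ schema_emb.T        # Transpose + MatMul -> (L*K, P)
    return scores.T.reshape(P, L, K)          # Transpose + Reshape -> (P, L, K)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spans = rng.standard_normal((5, 3, 4))    # L=5 tokens, K=3 widths, D=4
    schema = rng.standard_normal((2, 4))      # P=2 schema entries
    print(np.allclose(scores_einsum(spans, schema), scores_matmul(spans, schema)))
```

Both functions compute the same dot products; only the operators used to express them differ, which is what matters for execution providers without `Einsum` support.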
## 🪪 License
Apache 2.0 — same as the upstream model.
## 🙏 Acknowledgements
- Upstream model: [`fastino/gliner2-privacy-filter-PII-multi`](https://huggingface.co/fastino/gliner2-privacy-filter-PII-multi) by Fastino AI.
- GLiNER2 paper: Zaratiana et al., *GLiNER2: Schema-Driven Multi-Task Learning for Structured Information Extraction*, EMNLP 2025.
- ONNX fragmentation + IOBinding strategy: Semplifica s.r.l., as used in [`gliner2-multi-v1-onnx`](https://huggingface.co/SemplificaAI/gliner2-multi-v1-onnx).
## 📚 Citation
```bibtex
@misc{fastino2026gliner2pii,
title = {GLiNER2-PII: Multilingual PII Extraction via Synthetic Fine-Tuning},
author = {{Fastino AI Team}},
year = {2026},
url = {https://huggingface.co/fastino/gliner2-privacy-filter-PII-multi}
}
@inproceedings{zaratiana-etal-2025-gliner2,
title = {GLiNER2: Schema-Driven Multi-Task Learning for Structured Information Extraction},
author = {Zaratiana, Urchade and Pasternak, Gil and Boyd, Oliver and Hurn-Maloney, George and Lewis, Ash},
booktitle = {Proceedings of EMNLP 2025: System Demonstrations},
year = {2025}
}
```