Upload README.md with huggingface_hub
README.md
ADDED
@@ -0,0 +1,97 @@
---
language:
- en
- fr
- it
- es
- de
- pt
tags:
- gliner
- ner
- information-extraction
- onnx
- rust
pipeline_tag: token-classification
library_name: gliner2-rs
---

# GLiNER2 Multi V1 (ONNX)

This repository contains the ONNX-exported weights for the official [fastino/gliner2-multi-v1](https://huggingface.co/fastino/gliner2-multi-v1) model.

The model has been exported, split into fragments, and optimized for native use in **Rust** with the [SemplificaAI/gliner2-rs](https://github.com/SemplificaAI/gliner2-rs) inference engine, powered by ONNX Runtime.

## Model Formats Available
To work around ONNX static-graph limitations with GLiNER2's dynamic routing, the model is split into five fragments (`encoder`, `span_rep`, `count_pred`, `count_lstm`, `classifier`).

- **`fp16/`**: Half-precision ONNX weights (~580 MB total). Highly recommended for edge devices, NPUs (Qualcomm Snapdragon X Elite, Apple Neural Engine), and GPUs.
- **`fp32/`**: Full-precision ONNX weights (~1.2 GB total). Recommended for standard CPU execution where FP16 is not natively accelerated.

## 🚀 Usage with Rust (`gliner2-rs`)

This model is designed for the zero-Python Rust inference engine, which lets you run complex multi-task NLP pipelines with native performance and hardware acceleration.

### 1. Installation
Add the engine to your `Cargo.toml`:
```toml
[dependencies]
gliner2_inference = { git = "https://github.com/SemplificaAI/gliner2-rs" }
ort = { version = "2.0.0-rc.9", features = ["cuda", "half"] } # Or enable the features for the Execution Providers you need
anyhow = "1" # Needed for the `anyhow::Result` return type in the inference example
```

### 2. Download Weights
Download the contents of the `fp16/` folder to a local directory, for example `./models/gliner2_multi_v1_fp16/`.
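
Before loading the engine, it can help to confirm that all five fragments landed in the download directory. The following is a minimal sketch using only the standard library. It assumes each fragment is stored as `<name>.onnx`, which this README does not state, so adjust the file names to match the actual contents of `fp16/`. The `missing_fragments` helper is hypothetical and not part of `gliner2-rs`.

```rust
use std::path::Path;

/// Fragment names listed under "Model Formats Available".
const FRAGMENTS: [&str; 5] = ["encoder", "span_rep", "count_pred", "count_lstm", "classifier"];

/// Returns the fragment names that have no matching file in `dir`.
/// ASSUMPTION: each fragment is stored as `<name>.onnx`; the real file
/// names in the repository may differ.
fn missing_fragments(dir: &Path) -> Vec<&'static str> {
    FRAGMENTS
        .iter()
        .copied()
        .filter(|name| !dir.join(format!("{name}.onnx")).is_file())
        .collect()
}

fn main() {
    let dir = Path::new("./models/gliner2_multi_v1_fp16");
    let missing = missing_fragments(dir);
    if missing.is_empty() {
        println!("All five fragments found in {}", dir.display());
    } else {
        eprintln!("Missing fragments in {}: {:?}", dir.display(), missing);
    }
}
```

Running this check first turns an incomplete download into a readable message instead of a session-loading error later on.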

### 3. Rust Inference Example
```rust
use gliner2_inference::{Gliner2Engine, Gliner2Config, SchemaTask, ModelType};

fn main() -> anyhow::Result<()> {
    // 1. Initialize ONNX Runtime with desired Execution Providers (CPU, CUDA, QNN, CoreML, etc.)
    ort::init().with_name("GLiNER2_Engine").commit()?;

    // 2. Configure engine pointing to the downloaded FP16 fragments
    let config = Gliner2Config {
        models_dir: "./models/gliner2_multi_v1_fp16".to_string(),
        max_width: 8, // Max tokens per span
        model_type: ModelType::HuggingFace, // Automatically routes tensors correctly
    };

    // 3. Load Session
    let engine = Gliner2Engine::new(config)?;

    let text = "Apple Inc. announced its quarterly earnings report on January 15, 2024, showing a revenue of $119.6 billion.";

    // 4. Define dynamic Schema Tasks
    let tasks = vec![
        SchemaTask::Entities(vec![
            "person_name".to_string(),
            "organization_name".to_string(),
            "date".to_string(),
            "amount".to_string(),
        ]),
    ];

    // 5. Extract features in a single forward pass
    //    (relations and classifications are unused here because only an entity task is defined)
    let (entities, _relations, _classifications) = engine.extract(text, &tasks)?;

    for entity in entities {
        println!("Found: {} (Label: {} - Score: {:.2}%)", entity.text, entity.label, entity.score * 100.0);
    }

    Ok(())
}
```
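
To make the `max_width` setting more concrete, here is a small standalone sketch of span enumeration as span-based extractors generally do it: every contiguous run of up to `max_width` tokens becomes a candidate span. This illustrates the general idea only and is not taken from the `gliner2-rs` internals. The `candidate_spans` function and the pre-split token list are assumptions for the example, since the real engine uses its own tokenizer.

```rust
/// Enumerates candidate spans as inclusive (start, end) token indices,
/// keeping only spans that cover at most `max_width` tokens.
/// ASSUMPTION: hypothetical helper for illustration, not the gliner2-rs API.
fn candidate_spans(num_tokens: usize, max_width: usize) -> Vec<(usize, usize)> {
    let mut spans = Vec::new();
    for start in 0..num_tokens {
        // Exclusive upper bound for `end`, clipped at the end of the text.
        let limit = (start + max_width).min(num_tokens);
        for end in start..limit {
            spans.push((start, end));
        }
    }
    spans
}

fn main() {
    // Simplified, pre-split tokens; a real tokenizer would split differently.
    let tokens = ["Apple", "Inc.", "announced", "its", "quarterly", "earnings"];
    let spans = candidate_spans(tokens.len(), 2);
    for (start, end) in &spans {
        println!("[{start}, {end}] {}", tokens[*start..=*end].join(" "));
    }
}
```

The number of candidates grows roughly as the token count times `max_width`, so a larger `max_width` allows longer entities at the cost of more spans to score.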

## Supported Execution Providers
Thanks to the fragmented ONNX structure, `gliner2-rs` can route the computation to specialized hardware automatically:
- **Qualcomm NPU** (`QNNExecutionProvider`)
- **Apple Silicon** (`CoreMLExecutionProvider`)
- **Intel / AMD AI** (`OpenVINOExecutionProvider`)
- **NVIDIA GPU** (`CUDAExecutionProvider`)
- **ARM64 CPU** (`XNNPACKExecutionProvider`)
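
If you prefer to decide the provider order yourself, a preference table keyed on the build target is one option. The sketch below uses only the standard library and returns provider names as strings. The platform-to-provider mapping is an illustrative assumption and not the routing logic of `gliner2-rs`, and `preferred_providers` is a hypothetical helper. `CPUExecutionProvider` is appended as the fallback because it is ONNX Runtime's default provider.

```rust
/// Returns execution provider names in order of preference for a platform.
/// ASSUMPTION: this mapping is an example policy, not gliner2-rs behavior.
fn preferred_providers(os: &str, arch: &str) -> Vec<&'static str> {
    let mut providers = match (os, arch) {
        ("macos", "aarch64") | ("ios", "aarch64") => vec!["CoreMLExecutionProvider"],
        ("windows", "aarch64") | ("android", "aarch64") => {
            vec!["QNNExecutionProvider", "XNNPACKExecutionProvider"]
        }
        (_, "aarch64") => vec!["XNNPACKExecutionProvider"],
        (_, "x86_64") => vec!["CUDAExecutionProvider", "OpenVINOExecutionProvider"],
        _ => Vec::new(),
    };
    // The default CPU provider is always available as a last resort.
    providers.push("CPUExecutionProvider");
    providers
}

fn main() {
    // OS and ARCH describe the target this binary was compiled for.
    let providers = preferred_providers(std::env::consts::OS, std::env::consts::ARCH);
    println!("Preferred providers: {providers:?}");
}
```

A provider listed here may still be unavailable at runtime (for example, no CUDA driver installed), so the order should be treated as a wish list with the CPU provider as the guaranteed fallback.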

## Acknowledgments
Original model architecture and weights by [Urchade / Fastino](https://huggingface.co/fastino/gliner2-multi-v1).

ONNX export pipeline and native Rust engine by [Semplifica s.r.l.](https://semplifica.ai)