dariofinardi committed on
Commit 8c9d860 · verified · 1 Parent(s): d76d746

Upload README.md with huggingface_hub

  1. README.md +97 -0
README.md ADDED
@@ -0,0 +1,97 @@
+ ---
+ language:
+ - en
+ - fr
+ - it
+ - es
+ - de
+ - pt
+ tags:
+ - gliner
+ - ner
+ - information-extraction
+ - onnx
+ - rust
+ pipeline_tag: token-classification
+ library_name: gliner2-rs
+ ---
+
+ # GLiNER2 Multi V1 (ONNX)
+
+ This repository contains the ONNX-exported weights for the official [fastino/gliner2-multi-v1](https://huggingface.co/fastino/gliner2-multi-v1) model.
+
+ The model has been exported, split into fragments, and optimized for native use in **Rust** with the [SemplificaAI/gliner2-rs](https://github.com/SemplificaAI/gliner2-rs) inference engine, which is powered by ONNX Runtime.
+
+ ## Model Formats Available
+ To work around ONNX static-graph limitations with GLiNER2's dynamic routing, the model is split into 5 fragments (`encoder`, `span_rep`, `count_pred`, `count_lstm`, `classifier`).
+
+ - **`fp16/`**: Half-precision ONNX weights (~580 MB total). Recommended for edge devices, NPUs (Qualcomm Snapdragon X Elite / Apple Neural Engine), and GPUs.
+ - **`fp32/`**: Full-precision ONNX weights (~1.2 GB total). Recommended for standard CPU execution where FP16 is not natively accelerated.
+
+ ## 🚀 Usage with Rust (`gliner2-rs`)
+
+ This model is designed for the Python-free Rust inference engine, which runs multi-task NLP pipelines natively with hardware acceleration.
+
+ ### 1. Installation
+ Add the engine to your `Cargo.toml`:
+ ```toml
+ [dependencies]
+ gliner2_inference = { git = "https://github.com/SemplificaAI/gliner2-rs" }
+ ort = { version = "2.0.0-rc.9", features = ["cuda", "half"] } # Or specific Execution Providers
+ anyhow = "1" # Used for error handling in the example below
+ ```
+
+ ### 2. Download Weights
+ Download the contents of the `fp16/` folder to a local directory, for example `./models/gliner2_multi_v1_fp16/`.
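
Before loading the engine, it can help to confirm that all five fragments were actually downloaded. The following is a minimal sketch using only the standard library. The fragment names come from the list above, but the `<name>.onnx` file naming and the `missing_fragments` helper are assumptions made for illustration and are not part of `gliner2-rs`; adjust them to the file names actually present in `fp16/`.

```rust
use std::path::{Path, PathBuf};

/// Fragment names listed in this README. The `.onnx` file naming used below
/// is an assumption for illustration; check the real file names in `fp16/`.
const FRAGMENTS: [&str; 5] = ["encoder", "span_rep", "count_pred", "count_lstm", "classifier"];

/// Returns the expected fragment paths that are missing from `models_dir`.
/// An empty result means every expected fragment file was found.
pub fn missing_fragments(models_dir: &Path) -> Vec<PathBuf> {
    FRAGMENTS
        .iter()
        .map(|name| models_dir.join(format!("{name}.onnx")))
        .filter(|path| !path.is_file())
        .collect()
}
```

Calling `missing_fragments(Path::new("./models/gliner2_multi_v1_fp16"))` and checking that the result is empty gives an early, readable failure instead of an error at session load time.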
+
+ ### 3. Rust Inference Example
+ ```rust
+ use gliner2_inference::{Gliner2Engine, Gliner2Config, SchemaTask, ModelType};
+
+ fn main() -> anyhow::Result<()> {
+     // 1. Initialize ONNX Runtime with desired Execution Providers (CPU, CUDA, QNN, CoreML, etc.)
+     ort::init().with_name("GLiNER2_Engine").commit()?;
+
+     // 2. Configure the engine, pointing to the downloaded FP16 fragments
+     let config = Gliner2Config {
+         models_dir: "./models/gliner2_multi_v1_fp16".to_string(),
+         max_width: 8, // Max tokens per span
+         model_type: ModelType::HuggingFace, // Automatically routes tensors correctly
+     };
+
+     // 3. Load the ONNX sessions
+     let engine = Gliner2Engine::new(config)?;
+
+     let text = "Apple Inc. announced its quarterly earnings report on January 15, 2024, showing a revenue of $119.6 billion.";
+
+     // 4. Define dynamic Schema Tasks
+     let tasks = vec![
+         SchemaTask::Entities(vec![
+             "person_name".to_string(),
+             "organization_name".to_string(),
+             "date".to_string(),
+             "amount".to_string(),
+         ]),
+     ];
+
+     // 5. Extract features in a single forward pass
+     //    (relations and classifications are unused here because only an entity task is defined)
+     let (entities, _relations, _classifications) = engine.extract(text, &tasks)?;
+
+     for entity in entities {
+         println!("Found: {} (Label: {} - Score: {:.2}%)", entity.text, entity.label, entity.score * 100.0);
+     }
+
+     Ok(())
+ }
+ ```
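
The `max_width` setting above caps the number of tokens a candidate span may contain. To give a feel for what that means, here is a standalone sketch of span enumeration as it is commonly done in GLiNER-style models. The `candidate_spans` function is hypothetical and written for illustration only; it is not taken from `gliner2-rs`, whose internal span handling may differ.

```rust
/// Enumerates candidate spans as `(start, end)` token indices (end inclusive),
/// keeping only spans that contain at most `max_width` tokens.
/// Hypothetical helper for illustration; not part of `gliner2-rs`.
pub fn candidate_spans(num_tokens: usize, max_width: usize) -> Vec<(usize, usize)> {
    let mut spans = Vec::new();
    for start in 0..num_tokens {
        // Exclusive upper bound for `end`: limited by the width cap and by
        // the end of the token sequence.
        let last = (start + max_width).min(num_tokens);
        for end in start..last {
            spans.push((start, end));
        }
    }
    spans
}
```

Under this scheme a 20-token input with `max_width = 8` yields 132 candidate spans, so a larger `max_width` allows longer entities at the cost of more spans to score.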
+
+ ## Supported Execution Providers
+ Thanks to the fragmented ONNX structure, `gliner2-rs` can route computation to specialized hardware automatically:
+ - **Qualcomm NPU** (`QNNExecutionProvider`)
+ - **Apple Silicon** (`CoreMLExecutionProvider`)
+ - **Intel / AMD AI** (`OpenVINOExecutionProvider`)
+ - **NVIDIA GPU** (`CUDAExecutionProvider`)
+ - **ARM64 CPU** (`XNNPACKExecutionProvider`)
+
+ ## Acknowledgments
+ Original model architecture and weights by [Urchade / Fastino](https://huggingface.co/fastino/gliner2-multi-v1).
+ ONNX export pipeline and Rust native engine by [Semplifica s.r.l.](https://semplifica.ai).