Extract entities, classify text, parse structured data, and extract relations, all in one efficient model.
GLiNER2 unifies Named Entity Recognition, Text Classification, Structured Data Extraction, and Relation Extraction into a single 205M-parameter model. It runs efficiently on CPU and needs no complex pipelines or external API dependencies.
Fine-tune via Pioneer. Additional documentation via Pioneer docs. Join discussions on Discord and Reddit.
```bash
pip install gliner2
```
```python
from gliner2 import GLiNER2

# Load the model
extractor = GLiNER2.from_pretrained("fastino/gliner2-large-v1")

# Extract entities with descriptions for higher precision
text = "Patient received 400mg ibuprofen for severe headache at 2 PM."
result = extractor.extract_entities(
    text,
    {
        "medication": "Names of drugs, medications, or pharmaceutical substances",
        "dosage": "Specific amounts like '400mg', '2 tablets', or '5ml'",
        "symptom": "Medical symptoms, conditions, or patient complaints",
        "time": "Time references like '2 PM', 'morning', or 'after lunch'"
    }
)
print(result)
# Output: {'entities': {'medication': ['ibuprofen'], 'dosage': ['400mg'], 'symptom': ['severe headache'], 'time': ['2 PM']}}
```
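The entity result groups extracted spans by label. As a minimal sketch, assuming the output shape printed above, the dictionary can be flattened into `(label, span)` pairs for downstream code. `flatten_entities` is a hypothetical helper for illustration and is not part of gliner2; the sample is copied from the expected output, so no model call is made.

```python
from typing import Dict, List, Tuple


def flatten_entities(result: Dict[str, Dict[str, List[str]]]) -> List[Tuple[str, str]]:
    """Turn {'entities': {label: [span, ...]}} into a list of (label, span) pairs.

    Hypothetical helper, not a gliner2 function.
    """
    pairs = []
    for label, spans in result.get("entities", {}).items():
        for span in spans:
            pairs.append((label, span))
    return pairs


# Sample copied from the expected output above.
sample_entities = {
    "entities": {
        "medication": ["ibuprofen"],
        "dosage": ["400mg"],
        "symptom": ["severe headache"],
        "time": ["2 PM"],
    }
}

for label, span in flatten_entities(sample_entities):
    print(f"{label}: {span}")
```

A flat list of pairs is often easier to write to a table or CSV than the nested dictionary.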
```python
# Single-label classification
result = extractor.classify_text(
    "This laptop has amazing performance but terrible battery life!",
    {"sentiment": ["positive", "negative", "neutral"]}
)
print(result)
# Output: {'sentiment': 'negative'}

# Multi-label classification
result = extractor.classify_text(
    "Great camera quality, decent performance, but poor battery life.",
    {
        "aspects": {
            "labels": ["camera", "performance", "battery", "display", "price"],
            "multi_label": True,
            "cls_threshold": 0.4
        }
    }
)
print(result)
# Output: {'aspects': ['camera', 'performance', 'battery']}
```
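In the outputs above, a single-label task yields a string while a multi-label task yields a list. Code that consumes both may want one uniform type. The sketch below assumes exactly those two shapes; `labels_as_list` is a hypothetical helper, not a gliner2 function.

```python
from typing import Dict, List, Union


def labels_as_list(result: Dict[str, Union[str, List[str]]], task: str) -> List[str]:
    """Return the predicted labels for `task` as a list.

    A string (single-label output) becomes a one-element list; a list
    (multi-label output) is copied; a missing task gives an empty list.
    """
    value = result.get(task, [])
    if isinstance(value, str):
        return [value]
    return list(value)


# Samples copied from the expected outputs above.
single = {"sentiment": "negative"}
multi = {"aspects": ["camera", "performance", "battery"]}

print(labels_as_list(single, "sentiment"))
print(labels_as_list(multi, "aspects"))
```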
```python
# Financial document processing
text = """
Transaction Report: Goldman Sachs processed a $2.5M equity trade for Tesla Inc.
on March 15, 2024. Commission: $1,250. Status: Completed.
"""
result = extractor.extract_json(
    text,
    {
        "transaction": [
            "broker::str::Financial institution or brokerage firm",
            "amount::str::Transaction amount with currency",
            "security::str::Stock, bond, or financial instrument",
            "date::str::Transaction date",
            "commission::str::Fees or commission charged",
            "status::str::Transaction status",
            "type::[equity|bond|option|future|forex]::str::Type of financial instrument"
        ]
    }
)
print(result)
# Output: {
#     'transaction': [{
#         'broker': 'Goldman Sachs',
#         'amount': '$2.5M',
#         'security': 'Tesla Inc.',
#         'date': 'March 15, 2024',
#         'commission': '$1,250',
#         'status': 'Completed',
#         'type': 'equity'
#     }]
# }
```
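The field strings passed to `extract_json` pack several pieces into one value separated by `::`: a field name, an optional bracketed list of choices, a data type, and a description. The sketch below shows one way to read those strings, inferred only from the examples above. It is not gliner2's own parser, and the library may accept other orderings or defaults. `FieldSpec` and `parse_field_spec` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FieldSpec:
    name: str
    dtype: Optional[str] = None
    choices: List[str] = field(default_factory=list)
    description: Optional[str] = None


def parse_field_spec(spec: str) -> FieldSpec:
    """Split 'name::[a|b]::dtype::description' into its parts.

    Illustrative only: assumes dtype is 'str' or 'list' (the two values
    that appear in this document) and that '::' never occurs inside
    the description.
    """
    name, *parts = spec.split("::")
    parsed = FieldSpec(name=name)
    for part in parts:
        if part.startswith("[") and part.endswith("]"):
            parsed.choices = part[1:-1].split("|")
        elif part in ("str", "list"):
            parsed.dtype = part
        else:
            parsed.description = part
    return parsed


type_spec = parse_field_spec(
    "type::[equity|bond|option|future|forex]::str::Type of financial instrument"
)
broker_spec = parse_field_spec("broker::str::Financial institution or brokerage firm")
print(type_spec)
print(broker_spec)
```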
```python
# Comprehensive legal contract analysis
contract_text = """
Service Agreement between TechCorp LLC and DataSystems Inc., effective January 1, 2024.
Monthly fee: $15,000. Contract term: 24 months with automatic renewal.
Termination clause: 30-day written notice required.
"""
schema = (extractor.create_schema()
    .entities(["company", "date", "duration", "fee"])
    .classification("contract_type", ["service", "employment", "nda", "partnership"])
    .structure("contract_terms")
        .field("parties", dtype="list")
        .field("effective_date", dtype="str")
        .field("monthly_fee", dtype="str")
        .field("term_length", dtype="str")
        .field("renewal", dtype="str", choices=["automatic", "manual", "none"])
        .field("termination_notice", dtype="str")
)
results = extractor.extract(contract_text, schema)
print(results)
# Output: {
#     'entities': {
#         'company': ['TechCorp LLC', 'DataSystems Inc.'],
#         'date': ['January 1, 2024'],
#         'duration': ['24 months'],
#         'fee': ['$15,000']
#     },
#     'contract_type': 'service',
#     'contract_terms': [{
#         'parties': ['TechCorp LLC', 'DataSystems Inc.'],
#         'effective_date': 'January 1, 2024',
#         'monthly_fee': '$15,000',
#         'term_length': '24 months',
#         'renewal': 'automatic',
#         'termination_notice': '30-day written notice'
#     }]
# }
```
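Because `renewal` is declared with `choices`, a reasonable sanity check before using the combined result is to confirm that each extracted value is one of the declared options. The sketch below assumes the output shape printed above and makes no claim about whether gliner2 enforces choices itself. `check_choice` is a hypothetical helper, and the sample is copied from the expected output.

```python
from typing import Any, Dict, Iterable, List

# Allowed values copied from the schema's `choices` argument above.
ALLOWED_RENEWAL = {"automatic", "manual", "none"}


def check_choice(
    records: Iterable[Dict[str, Any]], field_name: str, allowed: set
) -> List[str]:
    """Return a human-readable problem for each record whose `field_name`
    value is present but not in `allowed`. An empty list means no problems."""
    problems = []
    for index, record in enumerate(records):
        value = record.get(field_name)
        if value is not None and value not in allowed:
            problems.append(
                f"record {index}: {field_name}={value!r} not in {sorted(allowed)}"
            )
    return problems


# Structured part of the expected output above.
sample_results = {
    "contract_type": "service",
    "contract_terms": [{
        "parties": ["TechCorp LLC", "DataSystems Inc."],
        "effective_date": "January 1, 2024",
        "monthly_fee": "$15,000",
        "term_length": "24 months",
        "renewal": "automatic",
        "termination_notice": "30-day written notice",
    }],
}

issues = check_choice(sample_results["contract_terms"], "renewal", ALLOWED_RENEWAL)
print(issues or "all renewal values are valid")
```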
This large model provides:
The large model excels in:
If you use this model in your research, please cite:
```bibtex
@misc{zaratiana2025gliner2efficientmultitaskinformation,
  title={GLiNER2: An Efficient Multi-Task Information Extraction System with Schema-Driven Interface},
  author={Urchade Zaratiana and Gil Pasternak and Oliver Boyd and George Hurn-Maloney and Ash Lewis},
  year={2025},
  eprint={2507.18546},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2507.18546},
}
```
This project is licensed under the Apache License 2.0.