Symio-ai/legal-entity-resolver
Model Description
Legal Entity Resolver identifies, disambiguates, and links entities across legal documents. It resolves party name variations (e.g., "ABC Corp", "ABC Corporation", "A.B.C. Corp., Inc."), identifies alter ego relationships, maps corporate hierarchies, and links individuals to their various roles (officer, director, agent, trustee).
This resolution step is critical for correct party identification in multi-party litigation and for corporate veil-piercing analysis.
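To make the name-variation problem concrete, here is a minimal rule-based sketch that maps the three example spellings above to one comparison key. It is not the model's method (the model is a fine-tuned token classifier); `SUFFIXES` and `normalize_name` are hypothetical names introduced only for illustration, and the suffix table is an assumption.

```python
import re

# Hypothetical suffix table: maps common corporate designators to one spelling.
SUFFIXES = {
    "corporation": "corp",
    "corp": "corp",
    "incorporated": "inc",
    "inc": "inc",
    "company": "co",
    "co": "co",
    "limited": "ltd",
    "ltd": "ltd",
    "llc": "llc",
}
DESIGNATORS = set(SUFFIXES.values())


def normalize_name(name: str) -> str:
    """Return a crude comparison key for a party name."""
    text = name.lower()
    # Collapse dotted initialisms such as "a.b.c." into "abc".
    text = re.sub(
        r"\b(?:[a-z]\.){2,}",
        lambda m: m.group(0).replace(".", ""),
        text,
    )
    # Replace remaining punctuation with spaces.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = [SUFFIXES.get(tok, tok) for tok in text.split()]
    # Strip trailing designators so "ABC Corp" and "ABC Corp Inc" share a key.
    while len(tokens) > 1 and tokens[-1] in DESIGNATORS:
        tokens.pop()
    return " ".join(tokens)


keys = {normalize_name(n) for n in ["ABC Corp", "ABC Corporation", "A.B.C. Corp., Inc."]}
```

A key like this deliberately ignores the entity form, so "ABC Corp" and "ABC LLC" would collide. That is the kind of false merge the benchmark criteria penalize, and it is why context-aware resolution is needed beyond string rules.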
Intended Use
- Primary: Resolve entity references across case documents to maintain consistent party identification
- Secondary: Identify corporate relationships for alter ego and veil-piercing analysis
- Integration: Entity resolution layer used across all GLACIER pipeline stages
Task Type
token-classification -- Named entity recognition with entity linking and coreference resolution
Base Model
microsoft/deberta-v3-large -- Strong NLI and token classification for entity disambiguation
Training Data
| Source | Records | Description |
|---|---|---|
| State Corporate Registrations | ~10M entities | Secretary of State registries for all 50 states (e.g., FL Sunbiz, MS SOS) |
| Court Party Records | ~20M parties | CourtListener party names across all cases |
| SEC EDGAR | ~1M filings | Corporate disclosure documents |
| Manual Entity Pairs | ~500K pairs | Expert-labeled same-entity / different-entity pairs |
| Corporate Hierarchy Data | ~2M relationships | Parent-subsidiary-affiliate relationships |
Entity Types
- PERSON -- Natural person (with name variations)
- CORPORATION -- Corporate entity (with DBA, trade names)
- LLC -- Limited liability company
- PARTNERSHIP -- General or limited partnership
- TRUST -- Trust entity
- GOVERNMENT -- Government agency or official (in official capacity)
- ROLE -- Individual's role (officer, director, agent, registered agent)
- ALTER_EGO -- Entity identified as alter ego of another
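For a token-classification head, entity types like these are usually expanded into per-token tags. The sketch below assumes a BIO tagging scheme, which this card does not state; `build_label_maps` is a hypothetical helper, and the resulting `label2id` / `id2label` dictionaries follow the shape commonly passed to a token-classification model config.

```python
# Entity types as listed in this card.
ENTITY_TYPES = [
    "PERSON",
    "CORPORATION",
    "LLC",
    "PARTNERSHIP",
    "TRUST",
    "GOVERNMENT",
    "ROLE",
    "ALTER_EGO",
]


def build_label_maps(entity_types):
    """Expand entity types into BIO tags (assumed scheme) and index them."""
    labels = ["O"]  # "outside any entity"
    for entity_type in entity_types:
        labels.append(f"B-{entity_type}")  # first token of a mention
        labels.append(f"I-{entity_type}")  # continuation token
    label2id = {label: idx for idx, label in enumerate(labels)}
    id2label = {idx: label for label, idx in label2id.items()}
    return label2id, id2label


label2id, id2label = build_label_maps(ENTITY_TYPES)
```

With eight entity types this yields 17 labels (one `O` plus a `B-`/`I-` pair per type).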
Resolution Output
{
  "resolved_entities": [
    {
      "canonical_name": "ABC Corporation, Inc.",
      "variations_found": ["ABC Corp", "A.B.C. Corp., Inc.", "ABC"],
      "type": "CORPORATION",
      "state_registration": "FL Sunbiz #P12345678",
      "officers": ["John Smith (CEO)", "Jane Doe (Secretary)"],
      "related_entities": [
        {"name": "XYZ LLC", "relationship": "subsidiary", "ownership": "100%"}
      ],
      "alter_ego_indicators": ["shared address", "commingled funds", "undercapitalized"]
    }
  ]
}
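A downstream consumer will typically want to look up the canonical name for any mention. The following sketch builds such a lookup from the output format shown above; `build_variation_index` is a hypothetical helper, the case-insensitive matching is an assumption, and the sample payload is a trimmed copy of the card's example.

```python
import json


def build_variation_index(payload: dict) -> dict:
    """Map each known spelling (case-folded) to its canonical name."""
    index = {}
    for entity in payload.get("resolved_entities", []):
        canonical = entity["canonical_name"]
        # The canonical name resolves to itself, alongside every variation.
        for name in [canonical, *entity.get("variations_found", [])]:
            index[name.casefold()] = canonical
    return index


sample = json.loads("""
{
  "resolved_entities": [
    {
      "canonical_name": "ABC Corporation, Inc.",
      "variations_found": ["ABC Corp", "A.B.C. Corp., Inc.", "ABC"],
      "type": "CORPORATION"
    }
  ]
}
""")

index = build_variation_index(sample)
print(index["abc corp"])  # prints: ABC Corporation, Inc.
```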
Benchmark Criteria (90%+ Target)
| Metric | Target | Description |
|---|---|---|
| Entity Detection F1 | >= 94% | Correctly identify all entity mentions |
| Resolution Accuracy | >= 90% | Correctly link same-entity mentions |
| Corporate Hierarchy | >= 85% | Correctly identify parent-subsidiary relationships |
| Name Variation Matching | >= 92% | Match entities despite name variations |
| False Merge Rate | <= 3% | Must not incorrectly merge different entities |
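The card does not define how these metrics are computed. As one plausible reading, the sketch below scores pairwise same-entity decisions: resolution accuracy is the share of pairs labeled correctly, and false merge rate is the share of truly different pairs that were predicted as the same entity. Both definitions and the `pair_metrics` helper are assumptions, not the project's evaluation code.

```python
def pair_metrics(gold, pred):
    """Score pairwise decisions; True means 'same entity'.

    gold and pred are equal-length sequences of booleans, one per
    candidate pair. The metric definitions here are assumptions.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    pairs = list(zip(gold, pred))
    correct = sum(1 for g, p in pairs if g == p)
    # Pairs that are truly different entities.
    different = [(g, p) for g, p in pairs if not g]
    # A false merge: truly different, but predicted as the same entity.
    false_merges = sum(1 for _, p in different if p)
    return {
        "resolution_accuracy": correct / len(pairs) if pairs else 0.0,
        "false_merge_rate": false_merges / len(different) if different else 0.0,
    }


metrics = pair_metrics(
    gold=[True, True, False, False],
    pred=[True, False, True, False],
)
```

Reporting the false merge rate separately from accuracy matters because merging two distinct parties is usually more harmful in a filing than failing to link two spellings of the same one.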
GLACIER Pipeline Integration
STAGE 1 (Classify) --> entity-resolver identifies all parties in the case
STAGE 2 (Research) --> resolver disambiguates party names in research results
STAGE 4 (Draft) --> resolver ensures consistent party naming in filings
STAGE 5 (WDC #2) --> verify all party references are consistent and correct
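The consistency check in the later stages could look something like the sketch below, which flags party references in a draft that are either unknown or not in canonical form. The function name, the issue format, and the case-folded lookup table are all hypothetical; the actual GLACIER interfaces are not described in this card.

```python
def check_party_references(references, index):
    """Return (reference, issue) tuples for references needing attention.

    index maps case-folded spellings to canonical names (assumed shape).
    """
    issues = []
    for ref in references:
        canonical = index.get(ref.casefold())
        if canonical is None:
            # The resolver has never seen this spelling.
            issues.append((ref, "unresolved"))
        elif canonical != ref:
            # Known entity, but the draft uses a non-canonical spelling.
            issues.append((ref, f"use canonical name: {canonical}"))
    return issues


index = {
    "abc corp": "ABC Corporation, Inc.",
    "abc corporation, inc.": "ABC Corporation, Inc.",
}
issues = check_party_references(
    ["ABC Corporation, Inc.", "ABC Corp", "Acme Ltd"],
    index,
)
```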
Training Configuration
- Epochs: 8
- Learning rate: 2e-5
- Batch size: 16
- Max sequence length: 512
- Hardware: AWS SageMaker ml.g5.4xlarge
Limitations
- Extremely common names (e.g., "John Smith") are harder to disambiguate without context
- Dissolved or historical entities may not appear in current registration databases
- International entities have limited coverage
- Informal entity references (nicknames, abbreviations) may not resolve without context
- Does not perform skip-tracing or locate individuals (see people-finder MCP for that)
Version History
| Version | Date | Notes |
|---|---|---|
| v0.1 | 2026-04-10 | Initial model card, repo created |