Token Classification
English
legal
entity-resolution
glacier-pipeline
symio

Symio-ai/legal-entity-resolver

Model Description

Legal Entity Resolver identifies, disambiguates, and links entities across legal documents. It resolves party name variations (e.g., "ABC Corp", "ABC Corporation", "A.B.C. Corp., Inc."), identifies alter ego relationships, maps corporate hierarchies, and links individuals to their various roles (officer, director, agent, trustee).

This is critical for correct party identification in multi-party litigation and for corporate veil-piercing analysis.
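
To make the name-variation problem concrete, here is a minimal rule-based sketch that maps the example variations above to a shared lookup key. It is not the model's method (the model is a fine-tuned token classifier); the `SUFFIXES` and `LEGAL_FORMS` tables and both function names are hypothetical and exist only for illustration.

```python
import re

# Hypothetical abbreviation table, for illustration only.
SUFFIXES = {"corp": "corporation", "inc": "incorporated", "co": "company", "ltd": "limited"}
# Hypothetical set of legal-form words dropped when building a blocking key.
LEGAL_FORMS = {"corporation", "incorporated", "company", "limited", "llc"}

def normalize_name(name: str) -> str:
    """Lowercase, collapse dotted initials, strip punctuation, expand suffix abbreviations."""
    text = name.lower()
    # Collapse dotted initials such as "a.b.c." into "abc".
    text = re.sub(r"\b(?:[a-z]\.){2,}", lambda m: m.group(0).replace(".", ""), text)
    # Replace any remaining punctuation with spaces.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return " ".join(SUFFIXES.get(tok, tok) for tok in text.split())

def blocking_key(name: str) -> str:
    """Drop legal-form words so that likely variations share one candidate key."""
    return " ".join(t for t in normalize_name(name).split() if t not in LEGAL_FORMS)

for variant in ["ABC Corp", "ABC Corporation", "A.B.C. Corp., Inc."]:
    print(variant, "->", blocking_key(variant))
```

A key like this is only suitable for generating candidate matches: it would also group "ABC Corp" with an unrelated "ABC LLC", which is exactly the kind of false merge a downstream disambiguation step has to reject.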

Intended Use

  • Primary: Resolve entity references across case documents to maintain consistent party identification
  • Secondary: Identify corporate relationships for alter ego and veil-piercing analysis
  • Integration: Entity resolution layer used across all GLACIER pipeline stages

Task Type

token-classification -- Named entity recognition with entity linking and coreference resolution

Base Model

microsoft/deberta-v3-large -- Strong NLI and token-classification performance, used here as the basis for entity disambiguation

Training Data

| Source | Records | Description |
|---|---|---|
| State Corporate Registrations | ~10M entities | FL Sunbiz, MS SOS, all 50 states |
| Court Party Records | ~20M parties | CourtListener party names across all cases |
| SEC EDGAR | ~1M filings | Corporate disclosure documents |
| Manual Entity Pairs | ~500K pairs | Expert-labeled same-entity / different-entity pairs |
| Corporate Hierarchy Data | ~2M relationships | Parent-subsidiary-affiliate relationships |

Entity Types

  • PERSON -- Natural person (with name variations)
  • CORPORATION -- Corporate entity (with DBA, trade names)
  • LLC -- Limited liability company
  • PARTNERSHIP -- General or limited partnership
  • TRUST -- Trust entity
  • GOVERNMENT -- Government agency or official (in official capacity)
  • ROLE -- Individual's role (officer, director, agent, registered agent)
  • ALTER_EGO -- Entity identified as alter ego of another
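
For token classification, entity types like these are usually expanded into per-token tags. The sketch below assumes a BIO tagging scheme, which the card does not confirm; the function name and the resulting label ids are illustrative only.

```python
# Entity types as listed in this card.
ENTITY_TYPES = [
    "PERSON", "CORPORATION", "LLC", "PARTNERSHIP",
    "TRUST", "GOVERNMENT", "ROLE", "ALTER_EGO",
]

def build_label_maps(entity_types):
    """Build label<->id maps under an assumed BIO scheme (O, B-X, I-X)."""
    labels = ["O"]
    for entity_type in entity_types:
        labels.extend([f"B-{entity_type}", f"I-{entity_type}"])
    label2id = {label: idx for idx, label in enumerate(labels)}
    id2label = {idx: label for label, idx in label2id.items()}
    return label2id, id2label

label2id, id2label = build_label_maps(ENTITY_TYPES)
print(len(label2id), "labels")
```

Maps of this shape are what a token-classification head typically takes as its label configuration; the actual label set of the released checkpoint may differ.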

Resolution Output

{
  "resolved_entities": [
    {
      "canonical_name": "ABC Corporation, Inc.",
      "variations_found": ["ABC Corp", "A.B.C. Corp., Inc.", "ABC"],
      "type": "CORPORATION",
      "state_registration": "FL Sunbiz #P12345678",
      "officers": ["John Smith (CEO)", "Jane Doe (Secretary)"],
      "related_entities": [
        {"name": "XYZ LLC", "relationship": "subsidiary", "ownership": "100%"}
      ],
      "alter_ego_indicators": ["shared address", "commingled funds", "undercapitalized"]
    }
  ]
}
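
As a hedged illustration of consuming this output, the sketch below parses the sample payload above and builds a lookup from every observed variation to its canonical name. The `build_alias_index` helper is hypothetical and not part of any published API for this model.

```python
import json

# Sample payload copied from the Resolution Output example in this card.
SAMPLE = """
{
  "resolved_entities": [
    {
      "canonical_name": "ABC Corporation, Inc.",
      "variations_found": ["ABC Corp", "A.B.C. Corp., Inc.", "ABC"],
      "type": "CORPORATION",
      "state_registration": "FL Sunbiz #P12345678",
      "officers": ["John Smith (CEO)", "Jane Doe (Secretary)"],
      "related_entities": [
        {"name": "XYZ LLC", "relationship": "subsidiary", "ownership": "100%"}
      ],
      "alter_ego_indicators": ["shared address", "commingled funds", "undercapitalized"]
    }
  ]
}
"""

def build_alias_index(payload: dict) -> dict:
    """Map each case-folded name variation to its canonical entity name."""
    index = {}
    for entity in payload.get("resolved_entities", []):
        canonical = entity["canonical_name"]
        for alias in [canonical, *entity.get("variations_found", [])]:
            index[alias.casefold()] = canonical
    return index

alias_index = build_alias_index(json.loads(SAMPLE))
print(alias_index["abc corp"])
```

An index like this lets a drafting step replace any observed variation with the canonical name before a filing is assembled.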

Benchmark Criteria (90%+ Target)

| Metric | Target | Description |
|---|---|---|
| Entity Detection F1 | >= 94% | Correctly identify all entity mentions |
| Resolution Accuracy | >= 90% | Correctly link same-entity mentions |
| Corporate Hierarchy | >= 85% | Correctly identify parent-subsidiary relationships |
| Name Variation Matching | >= 92% | Match entities despite name variations |
| False Merge Rate | <= 3% | Must not incorrectly merge different entities |
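
The card does not define how these metrics are computed. As one possible reading, the sketch below scores Resolution Accuracy and False Merge Rate over mention pairs: accuracy is the share of pairs whose same/different decision matches the gold labels, and the false merge rate is the share of predicted merges that join different gold entities. These definitions, the function names, and the toy data are assumptions.

```python
from itertools import combinations

# Targets taken from the benchmark table in this card.
TARGETS = {"resolution_accuracy": 0.90, "false_merge_rate": 0.03}

def pairwise_rates(gold: dict, pred: dict) -> dict:
    """gold and pred map mention id -> cluster id; returns assumed pairwise metrics."""
    total = correct = merged = false_merged = 0
    for a, b in combinations(sorted(gold), 2):
        same_gold = gold[a] == gold[b]
        same_pred = pred[a] == pred[b]
        total += 1
        correct += (same_gold == same_pred)
        if same_pred:
            merged += 1
            false_merged += (not same_gold)
    return {
        "resolution_accuracy": correct / total if total else 0.0,
        "false_merge_rate": false_merged / merged if merged else 0.0,
    }

def meets_targets(rates: dict) -> bool:
    """Check the two rates against the card's thresholds."""
    return (rates["resolution_accuracy"] >= TARGETS["resolution_accuracy"]
            and rates["false_merge_rate"] <= TARGETS["false_merge_rate"])

# Toy example: mention m3 is wrongly merged into the cluster of m1 and m2.
gold = {"m1": "A", "m2": "A", "m3": "B", "m4": "B"}
pred = {"m1": "X", "m2": "X", "m3": "X", "m4": "Y"}
rates = pairwise_rates(gold, pred)
print(rates, meets_targets(rates))
```

Tracking the false merge rate separately from accuracy matters here because a wrongly merged pair of parties is typically costlier in litigation than a missed link.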

GLACIER Pipeline Integration

STAGE 1 (Classify) --> entity-resolver identifies all parties in the case
STAGE 2 (Research) --> resolver disambiguates party names in research results
STAGE 4 (Draft) --> resolver ensures consistent party naming in filings
STAGE 5 (WDC #2) --> verify all party references are consistent and correct

Training Configuration

  • Epochs: 8
  • Learning rate: 2e-5
  • Batch size: 16
  • Max sequence length: 512
  • Hardware: AWS SageMaker ml.g5.4xlarge
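
The hyperparameters above can be captured as a plain configuration mapping. This is a sketch only: the key names are chosen for illustration, and the step calculation assumes a single device with no gradient accumulation, neither of which the card states.

```python
import math

# Values from the Training Configuration list; key names are illustrative.
TRAINING_CONFIG = {
    "num_train_epochs": 8,
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "max_seq_length": 512,
}

def total_optimizer_steps(num_examples: int, config: dict = TRAINING_CONFIG) -> int:
    """Estimate optimizer steps, assuming one device and no gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / config["per_device_train_batch_size"])
    return steps_per_epoch * config["num_train_epochs"]

print(total_optimizer_steps(1600))
```

An estimate like this is mainly useful for sizing a learning-rate schedule or warmup period before launching a run.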

Limitations

  • Extremely common names (e.g., "John Smith") are harder to disambiguate without context
  • Dissolved or historical entities may not appear in current registration databases
  • International entities have limited coverage
  • Informal entity references (nicknames, abbreviations) may not resolve without context
  • Does not perform skip-tracing or locate individuals (see people-finder MCP for that)

Version History

| Version | Date | Notes |
|---|---|---|
| v0.1 | 2026-04-10 | Initial model card, repo created |