---
license: mit
language:
- en
tags:
- legal
- glacier
- distillation
- sequence-classification
pipeline_tag: text-classification
datasets:
- glacier-legal/legal-distillation-data
base_model: nlpaueb/legal-bert-base-uncased
---

# GLACIER glacier-document-classifier

**Distilled legal AI model** for the [GLACIER pipeline](https://github.com/OrionDevPartners/glacier-legal-mcp) (Gated Legal Analysis, Citation Intelligence, Evidence Routing).

## Model Description

This model is distilled from Claude Opus 4.6 (via AWS Bedrock) into a lightweight transformer for fast, local inference. It performs **legal document type classification (complaint, motion, brief, etc.)** as part of the GLACIER 6-stage legal document production pipeline.

- **Base model:** [nlpaueb/legal-bert-base-uncased](https://huggingface.co/nlpaueb/legal-bert-base-uncased)
- **Task:** sequence-classification
- **Labels:** 12 classes
- **Max length:** 512 tokens

## Labels

- `complaint`
- `answer`
- `motion`
- `brief`
- `order`
- `opinion`
- `notice`
- `subpoena`
- `affidavit`
- `demand_letter`
- `bar_complaint`
- `other`

## Usage

```python
from glacier_distill.inference import GlacierPipeline

# Load the GLACIER inference wrapper and classify a single document.
pipeline = GlacierPipeline()
result = pipeline.classify_document("your legal text here")
print(result)
```

Or use directly with transformers:

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="glacier-legal/glacier-document-classifier",
)

# Truncate long documents to the model's 512-token maximum length.
result = classifier("your legal text here", truncation=True, max_length=512)
print(result)
```

## Training

- **Teacher:** Claude Opus 4.6 (AWS Bedrock)
- **Method:** Knowledge distillation (Hinton et al., 2015) with temperature=4.0, alpha=0.7
- **Data:** CourtListener case law + synthetic labeled examples
- **Framework:** HuggingFace Transformers + custom DistillationLoss

## GLACIER Pipeline

This model is used in Stage 1 of the GLACIER pipeline:

```
Stage 1: QUERY    -> jurisdiction-router + document-classifier
Stage 2: RESEARCH -> legal-ner (entity extraction)
Stage 3: WDC #1   -> (full model review)
Stage 4: DRAFT    -> legal-ner + citation-classifier
Stage 5: WDC #2   -> hallucination-detector + citation-classifier
Stage 6: FINAL    -> (human review)
```

## Limitations

- Distilled models are optimized for US legal text (federal + state)
- Not a substitute for full model review in GLACIER Stages 3/5
- Citation hallucination detection is a pre-filter, not a replacement for external verification
- Jurisdiction coverage: Florida, Mississippi, Federal (primary); other states (limited)

## License

MIT. Part of the GLACIER Legal AI Framework by Orion Dev Partners, LLC.
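
## Appendix: Distillation Loss Sketch

The Training section lists knowledge distillation (Hinton et al., 2015) with temperature=4.0 and alpha=0.7 but does not show the custom `DistillationLoss`. The following is a minimal, standard-library sketch of a Hinton-style loss for a single example, not the project's actual implementation. Two things are assumptions: that alpha weights the soft (teacher) term and `1 - alpha` the hard-label term, and that the teacher's output has already been converted into per-class scores (`teacher_logits` is a hypothetical input; the card does not describe how teacher scores are obtained).

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, hard_label,
                      temperature=4.0, alpha=0.7):
    """Hinton-style distillation loss for one example.

    Assumption: alpha weights the soft (teacher) term and 1 - alpha the
    hard-label cross-entropy; the model card does not state which way round.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    # Ordinary cross-entropy against the hard label at temperature 1.
    ce = -math.log(softmax(student_logits)[hard_label])
    # The T^2 factor keeps the soft-term gradients on a scale comparable
    # to the hard-label term as the temperature grows.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Hypothetical 3-class example (the real model has 12 labels).
student = [2.0, 0.5, -1.0]
teacher = [3.0, 0.2, -2.0]
loss = distillation_loss(student, teacher, hard_label=0)
```

With a high temperature such as 4.0, the teacher distribution is flattened, so the student also learns from the relative scores of the non-target classes (for example, that a `motion` resembles a `brief` more than a `subpoena`) rather than only from the top label.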