BrainboxAI/law-il-E2B-safetensors

Training-Ready Safetensors Version of BrainboxAI Law IL E2B

This repository hosts the merged safetensors version of the BrainboxAI Law IL model: a Gemma 4 E2B model fine-tuned on 17,613 Israeli legal instruction pairs (Hebrew + English).

Unlike the GGUF variant, this format is:

  • Trainable - use as a base for continued fine-tuning (LoRA, QLoRA, full)
  • Transformers-compatible - load directly with AutoModelForCausalLM
  • vLLM / TGI-ready - deploy for high-throughput inference
  • Convertible - source format for GGUF, AWQ, GPTQ, EXL2

For local inference (Ollama / llama.cpp / LM Studio), use the GGUF sibling: BrainboxAI/law-il-E2B


Model Details

| Attribute | Value |
|---|---|
| Base Model | unsloth/gemma-4-E2B-it |
| Architecture | Gemma4ForConditionalGeneration (text + vision + audio) |
| Parameters | ~2B |
| Precision | BF16 |
| Context Length | 131,072 tokens |
| Languages | Hebrew, English |
| Format | Safetensors |
| Training Dataset | BrainboxAI/legal-training-il |
| License | Apache 2.0 |

Primary Use Cases

1. Direct Inference

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "BrainboxAI/law-il-E2B-safetensors"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},  # see below
    {"role": "user", "content": "מה הזכויות שלי בנושא פיצויי פיטורים?"},
]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
# temperature only takes effect with sampling enabled
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.3)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

2. Continued Fine-Tuning (LoRA)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="BrainboxAI/law-il-E2B-safetensors",
    max_seq_length=4096,
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model, r=16, lora_alpha=16, lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_rslora=True,
)
# Train on your new legal dataset...
```

3. vLLM Deployment

```shell
vllm serve BrainboxAI/law-il-E2B-safetensors \
  --max-model-len 8192 --dtype bfloat16 --served-model-name brainbox-law
```

4. Quantization

```shell
# GGUF (for llama.cpp / Ollama); run from a llama.cpp checkout
huggingface-cli download BrainboxAI/law-il-E2B-safetensors --local-dir law-il-E2B
python convert_hf_to_gguf.py law-il-E2B --outfile law-il-E2B.gguf
```

For 4-bit GPU quantization (AWQ, GPTQ), use the AutoAWQ or GPTQModel Python APIs; these libraries expose quantization through Python rather than a single CLI command.

Recommended System Prompt

DEFINITIONS:
  role: BrainboxAI Legal Assistant - an AI specialist trained by BrainboxAI (founded by Netanel Elyasi) for Israeli law Q&A, court ruling analysis, citizens' rights, and contract interpretation. Bilingual Hebrew + English.
  success: Provide accurate, source-grounded legal information in the user's language, with clear disclaimers that the output is informational and not legal advice.
  scope_in:
    - Israeli law (civil, criminal, labor, family, administrative, constitutional)
    - Citizens' rights under Israeli law
    - Contract clause interpretation
    - Court ruling analysis and summarization
  scope_out:
    - Legal advice tied to specific real cases or persons
    - Predictions of court outcomes
    - Foreign (non-Israeli) law unless explicitly asked
    - Content that facilitates illegal activity

PREMISES:
  - Input may be a legal question, statute citation, court ruling text, or contract clause.
  - Input language may be Hebrew, English, or mixed.
  - Israeli citations stay in canonical form (e.g. ע"א 1234/20, חוק יסוד: כבוד האדם וחירותו).
  - Training cutoff: 2025.

REQUIREMENTS:
  1. Match the user's input language.
  2. Cite statutes and rulings in canonical Israeli form.
  3. Every substantive claim traces to a statute, regulation, or ruling.
  4. Use plain language unless technical legal Hebrew is requested.
  5. Always end with the disclaimer: "זהו מידע כללי ואינו מהווה ייעוץ משפטי" (HE) or "This is general information and not legal advice" (EN).
  6. Never fabricate statute numbers, ruling citations, or case facts.
  7. For contract clauses: identify type, obligations, risks.
  8. For rights Q&A: structure as eligibility, claim process, authority, references.
  9. Decline out-of-scope requests and redirect to in-scope tasks.

EDGE_CASES:
  - Empty input -> Ask a clarifying question.
  - Specific-case legal advice -> Offer general principles only + strong disclaimer.
  - Conflicting sources -> Present both, note hierarchy (constitutional > statute > regulation).
  - Third language input -> Respond in English, note fallback.
  - Non-Israeli jurisdiction -> Clarify scope, offer Israeli perspective only.

OUTPUT_FORMAT:
  format: Markdown. Lists for enumerations, numbered steps for procedures.
  default_structure: |
    **הנושא / Topic:** <topic>
    **תשובה / Answer:** <answer>
    **מקורות / Sources:**
    - <citation>
    **הערה:** זהו מידע כללי ואינו מהווה ייעוץ משפטי.
  language: Match user's input.
  length: Short Q 100-250 / Analysis 300-700 words.

VERIFICATION:
  - Response in user's language?
  - Citations in canonical Israeli form?
  - Every substantive claim sourced?
  - Disclaimer present?
  - No fabricated citations or facts?
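Parts of the REQUIREMENTS and VERIFICATION blocks can also be enforced in application code rather than left entirely to the model. A minimal post-processing sketch for the mandatory disclaimer (REQUIREMENT 5); the helper names are ours, not part of the model:

```python
# Disclaimer strings from REQUIREMENT 5 of the system prompt.
HE_DISCLAIMER = "זהו מידע כללי ואינו מהווה ייעוץ משפטי"
EN_DISCLAIMER = "This is general information and not legal advice"

def contains_hebrew(text: str) -> bool:
    """True if the text contains any character in the Hebrew Unicode block."""
    return any("\u0590" <= ch <= "\u05ff" for ch in text)

def ensure_disclaimer(response: str) -> str:
    """Append the mandatory disclaimer when the model omits it,
    matching the language of the response."""
    if HE_DISCLAIMER in response or EN_DISCLAIMER in response:
        return response
    suffix = HE_DISCLAIMER if contains_hebrew(response) else EN_DISCLAIMER
    return response.rstrip() + "\n\n" + suffix + "."
```

A guard like this is a backstop, not a replacement for the prompt: it catches the formatting requirement but not the substantive ones (sourcing, canonical citations).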

Training Details

  • Method: QLoRA (4-bit quantized base + LoRA adapters), merged back to safetensors
  • Framework: Unsloth 2026.x
  • Dataset: 17,613 bilingual legal instruction pairs
  • Composition:
    • 7,960 Israeli court rulings
    • 2,353 Kol-Zchut rights articles
    • 300 Open Law Book statutes
    • 7,000 CUAD-based contract clauses
  • Language split: ~60% Hebrew, ~40% English

Full dataset: BrainboxAI/legal-training-il
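The composition figures are internally consistent with the stated dataset size; a quick arithmetic check (the dictionary simply restates the bullet list above):

```python
# Dataset composition as listed in Training Details.
composition = {
    "court_rulings": 7960,
    "kol_zchut_articles": 2353,
    "open_law_book_statutes": 300,
    "cuad_contract_clauses": 7000,
}
total = sum(composition.values())
print(total)  # 17613 (matches the stated dataset size)
```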


Hardware Requirements

| Use Case | Minimum | Recommended |
|---|---|---|
| Inference (BF16) | 6 GB VRAM | 10 GB VRAM |
| Inference (4-bit) | 3 GB VRAM | 6 GB VRAM |
| Continued LoRA training | 12 GB VRAM + 24 GB RAM | 16 GB VRAM |
| Full fine-tune | 24 GB VRAM | 40 GB VRAM (A100) |
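The BF16 minimum tracks the raw weight footprint: roughly parameter count times bytes per parameter, plus activation and KV-cache overhead. A back-of-the-envelope sketch (a rule of thumb, not a measurement):

```python
def weight_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return n_params * bytes_per_param / 1024**3

# ~2B parameters
print(round(weight_gb(2e9, 2.0), 1))  # BF16 (2 bytes/param): ~3.7 GB of weights
print(round(weight_gb(2e9, 0.5), 1))  # 4-bit (0.5 bytes/param): ~0.9 GB of weights
# Actual VRAM use adds activations, KV cache, and framework overhead,
# hence the larger minimums in the table above.
```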

Sibling Repositories

| Repo | Purpose |
|---|---|
| BrainboxAI/law-il-E2B | GGUF for local inference |
| BrainboxAI/law-il-E2B-safetensors | This repo: training-ready safetensors |
| BrainboxAI/legal-training-il | Training dataset |

Limitations & Ethical Considerations

  • Not a licensed lawyer. Informational only; consult a licensed Israeli attorney for case-specific guidance.
  • Training cutoff. Coverage through 2025. Newer rulings or legislation may not appear.
  • Citation verification. Always cross-check citations with official sources (Nevo, Supreme Court website, Kol-Zchut).
  • Hebrew variance. Archaic legal Hebrew can occasionally confuse the model.
  • Dual-use. Deploy with acceptable-use policies and human review for any adversarial use case.

Citation

```bibtex
@misc{brainboxai_law_il_e2b_safetensors_2026,
  author    = {Elyasi, Netanel and BrainboxAI},
  title     = {BrainboxAI Law IL E2B (safetensors): A Hebrew-First Israeli Legal LLM},
  year      = {2026},
  url       = {https://huggingface.co/BrainboxAI/law-il-E2B-safetensors},
  publisher = {Hugging Face}
}
```

About BrainboxAI

BrainboxAI is an Israeli AI agency founded by Netanel Elyasi, specializing in:

  • Custom LLM training (Hebrew-native and bilingual models)
  • AI automation and agentic workflows
  • Cybersecurity AI products (scanning, triage, reporting)
  • Enterprise AI deployment (on-premise, privacy-first)


Contact: via Hugging Face or BrainboxAI.


Trained 2x faster with Unsloth.
