BrainboxAI/law-il-E2B-safetensors

Training-Ready Safetensors Version of BrainboxAI Law IL E2B

This repository hosts the merged safetensors version of the BrainboxAI Law IL model: a Gemma 4 E2B model fine-tuned on 17,613 Israeli legal instruction pairs (Hebrew + English).

Unlike the GGUF variant, this format is:

  • Trainable - use as a base for continued fine-tuning (LoRA, QLoRA, full)
  • Transformers-compatible - load directly with AutoModelForCausalLM
  • vLLM / TGI-ready - deploy for high-throughput inference
  • Convertible - source format for GGUF, AWQ, GPTQ, EXL2

For local inference (Ollama / llama.cpp / LM Studio), use the GGUF sibling: BrainboxAI/law-il-E2B


Model Details

| Attribute | Value |
|---|---|
| Base Model | unsloth/gemma-4-E2B-it |
| Architecture | Gemma4ForConditionalGeneration (text + vision + audio) |
| Parameters | ~2B |
| Precision | BF16 |
| Context Length | 131,072 tokens |
| Languages | Hebrew, English |
| Format | Safetensors |
| Training Dataset | BrainboxAI/legal-training-il |
| License | Apache 2.0 |

Primary Use Cases

1. Direct Inference

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "BrainboxAI/law-il-E2B-safetensors"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},  # see below
    {"role": "user", "content": "מה הזכויות שלי בנושא פיצויי פיטורים?"},
]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
# temperature only takes effect with sampling enabled
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.3)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

2. Continued Fine-Tuning (LoRA)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="BrainboxAI/law-il-E2B-safetensors",
    max_seq_length=4096,
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model, r=16, lora_alpha=16, lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_rslora=True,
)
# Train on your new legal dataset...
```

3. vLLM Deployment

```shell
vllm serve BrainboxAI/law-il-E2B-safetensors \
  --max-model-len 8192 --dtype bfloat16 --served-model-name brainbox-law
```

4. Quantization

```shell
# GGUF (for llama.cpp / Ollama); run from a llama.cpp checkout
huggingface-cli download BrainboxAI/law-il-E2B-safetensors --local-dir law-il-E2B
python convert_hf_to_gguf.py law-il-E2B --outfile law-il-E2B.gguf
```

For 4-bit GPU quantization (AWQ, GPTQ), use the AutoAWQ or GPTQModel Python APIs; these libraries expose quantization through Python rather than a single CLI command.

Recommended System Prompt

DEFINITIONS:
  role: BrainboxAI Legal Assistant - an AI specialist trained by BrainboxAI (founded by Netanel Elyasi) for Israeli law Q&A, court ruling analysis, citizens' rights, and contract interpretation. Bilingual Hebrew + English.
  success: Provide accurate, source-grounded legal information in the user's language, with clear disclaimers that the output is informational and not legal advice.
  scope_in:
    - Israeli law (civil, criminal, labor, family, administrative, constitutional)
    - Citizens' rights under Israeli law
    - Contract clause interpretation
    - Court ruling analysis and summarization
  scope_out:
    - Legal advice tied to specific real cases or persons
    - Predictions of court outcomes
    - Foreign (non-Israeli) law unless explicitly asked
    - Content that facilitates illegal activity

PREMISES:
  - Input may be a legal question, statute citation, court ruling text, or contract clause.
  - Input language may be Hebrew, English, or mixed.
  - Israeli citations stay in canonical form (e.g. ע"א 1234/20, חוק יסוד: כבוד האדם וחירותו).
  - Training cutoff: 2025.

REQUIREMENTS:
  1. Match the user's input language.
  2. Cite statutes and rulings in canonical Israeli form.
  3. Every substantive claim traces to a statute, regulation, or ruling.
  4. Use plain language unless technical legal Hebrew is requested.
  5. Always end with the disclaimer: "זהו מידע כללי ואינו מהווה ייעוץ משפטי" (HE) or "This is general information and not legal advice" (EN).
  6. Never fabricate statute numbers, ruling citations, or case facts.
  7. For contract clauses: identify type, obligations, risks.
  8. For rights Q&A: structure as eligibility, claim process, authority, references.
  9. Decline out-of-scope requests and redirect to in-scope tasks.

EDGE_CASES:
  - Empty input -> Ask a clarifying question.
  - Specific-case legal advice -> Offer general principles only + strong disclaimer.
  - Conflicting sources -> Present both, note hierarchy (constitutional > statute > regulation).
  - Third language input -> Respond in English, note fallback.
  - Non-Israeli jurisdiction -> Clarify scope, offer Israeli perspective only.

OUTPUT_FORMAT:
  format: Markdown. Lists for enumerations, numbered steps for procedures.
  default_structure: |
    **הנושא / Topic:** <topic>
    **תשובה / Answer:** <answer>
    **מקורות / Sources:**
    - <citation>
    **הערה:** זהו מידע כללי ואינו מהווה ייעוץ משפטי.
  language: Match user's input.
  length: Short Q 100-250 / Analysis 300-700 words.

VERIFICATION:
  - Response in user's language?
  - Citations in canonical Israeli form?
  - Every substantive claim sourced?
  - Disclaimer present?
  - No fabricated citations or facts?
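Parts of the REQUIREMENTS and VERIFICATION blocks can also be enforced in application code rather than left entirely to the model. A minimal post-processing sketch for the mandatory disclaimer (REQUIREMENT 5); the helper names are ours, not part of the model:

```python
# Disclaimer strings from REQUIREMENT 5 of the system prompt.
HE_DISCLAIMER = "זהו מידע כללי ואינו מהווה ייעוץ משפטי"
EN_DISCLAIMER = "This is general information and not legal advice"

def contains_hebrew(text: str) -> bool:
    """True if the text contains any character in the Hebrew Unicode block."""
    return any("\u0590" <= ch <= "\u05ff" for ch in text)

def ensure_disclaimer(response: str) -> str:
    """Append the mandatory disclaimer when the model omits it,
    matching the language of the response."""
    if HE_DISCLAIMER in response or EN_DISCLAIMER in response:
        return response
    suffix = HE_DISCLAIMER if contains_hebrew(response) else EN_DISCLAIMER
    return response.rstrip() + "\n\n" + suffix + "."
```

A guard like this is a backstop, not a replacement for the prompt: it catches the formatting requirement but not the substantive ones (sourcing, canonical citations).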

Training Details

  • Method: QLoRA (4-bit quantized base + LoRA adapters), merged back to safetensors
  • Framework: Unsloth 2026.x
  • Dataset: 17,613 bilingual legal instruction pairs
  • Composition:
    • 7,960 Israeli court rulings
    • 2,353 Kol-Zchut rights articles
    • 300 Open Law Book statutes
    • 7,000 CUAD-based contract clauses
  • Language split: ~60% Hebrew, ~40% English

Full dataset: BrainboxAI/legal-training-il
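The composition figures are internally consistent with the stated dataset size; a quick arithmetic check (the dictionary simply restates the bullet list above):

```python
# Dataset composition as listed in Training Details.
composition = {
    "court_rulings": 7960,
    "kol_zchut_articles": 2353,
    "open_law_book_statutes": 300,
    "cuad_contract_clauses": 7000,
}
total = sum(composition.values())
print(total)  # 17613 (matches the stated dataset size)
```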


Hardware Requirements

| Use Case | Minimum | Recommended |
|---|---|---|
| Inference (BF16) | 6 GB VRAM | 10 GB VRAM |
| Inference (4-bit) | 3 GB VRAM | 6 GB VRAM |
| Continued LoRA training | 12 GB VRAM + 24 GB RAM | 16 GB VRAM |
| Full fine-tune | 24 GB VRAM | 40 GB VRAM (A100) |
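The BF16 minimum tracks the raw weight footprint: roughly parameter count times bytes per parameter, plus activation and KV-cache overhead. A back-of-the-envelope sketch (a rule of thumb, not a measurement):

```python
def weight_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return n_params * bytes_per_param / 1024**3

# ~2B parameters
print(round(weight_gb(2e9, 2.0), 1))  # BF16 (2 bytes/param): ~3.7 GB of weights
print(round(weight_gb(2e9, 0.5), 1))  # 4-bit (0.5 bytes/param): ~0.9 GB of weights
# Actual VRAM use adds activations, KV cache, and framework overhead,
# hence the larger minimums in the table above.
```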

Sibling Repositories

| Repo | Purpose |
|---|---|
| BrainboxAI/law-il-E2B | GGUF for local inference |
| BrainboxAI/law-il-E2B-safetensors | This repo: training-ready safetensors |
| BrainboxAI/legal-training-il | Training dataset |

Limitations & Ethical Considerations

  • Not a licensed lawyer. Informational only; consult a licensed Israeli attorney for case-specific guidance.
  • Training cutoff. Coverage through 2025. Newer rulings or legislation may not appear.
  • Citation verification. Always cross-check citations with official sources (Nevo, Supreme Court website, Kol-Zchut).
  • Hebrew variance. Archaic legal Hebrew can occasionally confuse the model.
  • Dual-use. Deploy with acceptable-use policies and human review for any adversarial use case.

Citation

```bibtex
@misc{brainboxai_law_il_e2b_safetensors_2026,
  author    = {Elyasi, Netanel and BrainboxAI},
  title     = {BrainboxAI Law IL E2B (safetensors): A Hebrew-First Israeli Legal LLM},
  year      = {2026},
  url       = {https://huggingface.co/BrainboxAI/law-il-E2B-safetensors},
  publisher = {Hugging Face}
}
```

About BrainboxAI

BrainboxAI is an Israeli AI agency founded by Netanel Elyasi, specializing in:

  • Custom LLM training (Hebrew-native and bilingual models)
  • AI automation and agentic workflows
  • Cybersecurity AI products (scanning, triage, reporting)
  • Enterprise AI deployment (on-premise, privacy-first)


Contact: via Hugging Face or BrainboxAI.


Trained 2x faster with Unsloth.
