# BrainboxAI/law-il-E2B-safetensors

**Training-Ready Safetensors Version of BrainboxAI Law IL E2B**
This repository hosts the merged safetensors version of the BrainboxAI Law IL model - a Gemma 4 E2B model fine-tuned on 17,613 Israeli legal instruction pairs (Hebrew + English).
Unlike the GGUF variant, this format is:
- Trainable - use as a base for continued fine-tuning (LoRA, QLoRA, full)
- Transformers-compatible - load directly with `AutoModelForCausalLM`
- vLLM / TGI-ready - deploy for high-throughput inference
- Convertible - source format for GGUF, AWQ, GPTQ, EXL2
For local inference (Ollama / llama.cpp / LM Studio), use the GGUF sibling: BrainboxAI/law-il-E2B
## Model Details
| Attribute | Value |
|---|---|
| Base Model | unsloth/gemma-4-E2B-it |
| Architecture | Gemma4ForConditionalGeneration (text + vision + audio) |
| Parameters | ~2B |
| Precision | BF16 |
| Context Length | 131,072 tokens |
| Languages | Hebrew, English |
| Format | Safetensors |
| Training Dataset | BrainboxAI/legal-training-il |
| License | Apache 2.0 |
## Primary Use Cases

### 1. Direct Inference
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "BrainboxAI/law-il-E2B-safetensors"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},  # see "Recommended System Prompt" below
    {"role": "user", "content": "מה הזכויות שלי בנושא פיצויי פיטורים?"},
]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

# do_sample=True is required for temperature to take effect
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.3)
# Decode only the newly generated tokens, not the echoed prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
### 2. Continued Fine-Tuning (LoRA)
```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="BrainboxAI/law-il-E2B-safetensors",
    max_seq_length=4096,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model, r=16, lora_alpha=16, lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    use_rslora=True,
)
# Train on your new legal dataset...
```
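Most supervised fine-tuning trainers (e.g. TRL's `SFTTrainer`) consume chat-formatted examples. A minimal sketch of reshaping instruction/output pairs into that shape - the field names `"instruction"` and `"output"` are illustrative, not necessarily the actual schema of the training dataset:

```python
# Sketch: convert instruction/output pairs into the chat-message format
# consumed by tokenizer.apply_chat_template. Field names are illustrative --
# adapt them to your dataset's actual schema.
def to_chat(example: dict, system_prompt: str) -> dict:
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": example["instruction"]},
            {"role": "assistant", "content": example["output"]},
        ]
    }

sample = {"instruction": "Summarize this ruling.", "output": "The court held..."}
formatted = to_chat(sample, "You are a legal assistant.")
```

Map this function over the dataset (e.g. with `datasets.Dataset.map`) before handing it to the trainer.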
### 3. vLLM Deployment
```bash
vllm serve BrainboxAI/law-il-E2B-safetensors \
  --max-model-len 8192 --dtype bfloat16 --served-model-name brainbox-law
```
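`vllm serve` exposes an OpenAI-compatible API (on port 8000 by default). A minimal stdlib-only client sketch - the model name `"brainbox-law"` matches the `--served-model-name` flag above, and the URL assumes a local deployment:

```python
import json
import urllib.request

def build_payload(question: str) -> dict:
    # "brainbox-law" must match the --served-model-name flag
    return {
        "model": "brainbox-law",
        "messages": [{"role": "user", "content": question}],
        "max_tokens": 512,
        "temperature": 0.3,
    }

def ask(question: str, url: str = "http://localhost:8000/v1/chat/completions") -> str:
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(question)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

payload = build_payload("What is severance pay under Israeli law?")
```

The official `openai` Python client works equally well against this endpoint; the raw-HTTP version just avoids an extra dependency.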
### 4. Quantization
```bash
# GGUF (for llama.cpp / Ollama) - convert_hf_to_gguf.py expects a local
# model directory, so download the repo first (e.g. huggingface-cli download)
python convert_hf_to_gguf.py BrainboxAI/law-il-E2B-safetensors

# AWQ (4-bit GPU)
python -m awq.quantize --model BrainboxAI/law-il-E2B-safetensors
```
## Recommended System Prompt
```
DEFINITIONS:
role: BrainboxAI Legal Assistant - an AI specialist trained by BrainboxAI (founded by Netanel Elyasi) for Israeli law Q&A, court ruling analysis, citizens' rights, and contract interpretation. Bilingual Hebrew + English.
success: Provide accurate, source-grounded legal information in the user's language, with clear disclaimers that the output is informational and not legal advice.
scope_in:
- Israeli law (civil, criminal, labor, family, administrative, constitutional)
- Citizens' rights under Israeli law
- Contract clause interpretation
- Court ruling analysis and summarization
scope_out:
- Legal advice tied to specific real cases or persons
- Predictions of court outcomes
- Foreign (non-Israeli) law unless explicitly asked
- Content that facilitates illegal activity

PREMISES:
- Input may be a legal question, statute citation, court ruling text, or contract clause.
- Input language may be Hebrew, English, or mixed.
- Israeli citations stay in canonical form (e.g. ע"א 1234/20, חוק יסוד: כבוד האדם וחירותו).
- Training cutoff: 2025.

REQUIREMENTS:
1. Match the user's input language.
2. Cite statutes and rulings in canonical Israeli form.
3. Every substantive claim traces to a statute, regulation, or ruling.
4. Use plain language unless technical legal Hebrew is requested.
5. Always end with the disclaimer: "זהו מידע כללי ואינו מהווה ייעוץ משפטי" (HE) or "This is general information and not legal advice" (EN).
6. Never fabricate statute numbers, ruling citations, or case facts.
7. For contract clauses: identify type, obligations, risks.
8. For rights Q&A: structure as eligibility, claim process, authority, references.
9. Decline out-of-scope requests and redirect to in-scope tasks.

EDGE_CASES:
- Empty input -> Ask a clarifying question.
- Specific-case legal advice -> Offer general principles only + strong disclaimer.
- Conflicting sources -> Present both, note hierarchy (constitutional > statute > regulation).
- Third language input -> Respond in English, note fallback.
- Non-Israeli jurisdiction -> Clarify scope, offer Israeli perspective only.

OUTPUT_FORMAT:
format: Markdown. Lists for enumerations, numbered steps for procedures.
default_structure: |
  **הנושא / Topic:** <topic>
  **תשובה / Answer:** <answer>
  **מקורות / Sources:**
  - <citation>
  **הערה:** זהו מידע כללי ואינו מהווה ייעוץ משפטי.
language: Match user's input.
length: Short Q 100-250 / Analysis 300-700 words.

VERIFICATION:
- Response in user's language?
- Citations in canonical Israeli form?
- Every substantive claim sourced?
- Disclaimer present?
- No fabricated citations or facts?
```
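The direct-inference snippet above references a `SYSTEM_PROMPT` variable; one minimal way to wire it up is to store the prompt as a constant (the excerpt below is abridged - paste the full prompt text in practice):

```python
# Store the recommended system prompt as a constant. Abridged excerpt here;
# use the full prompt text from the section above in practice.
SYSTEM_PROMPT = """\
DEFINITIONS:
role: BrainboxAI Legal Assistant ...
REQUIREMENTS:
5. Always end with the disclaimer ...
"""

def build_messages(user_query: str) -> list[dict]:
    """Build the chat-template input with the system prompt prepended."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_query},
    ]

msgs = build_messages("What are my severance pay rights?")
```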
## Training Details
- Method: QLoRA (4-bit quantized base + LoRA adapters), merged back to safetensors
- Framework: Unsloth 2026.x
- Dataset: 17,613 bilingual legal instruction pairs
- Composition:
- 7,960 Israeli court rulings
- 2,353 Kol-Zchut rights articles
- 300 Open Law Book statutes
- 7,000 CUAD-based contract clauses
- Language split: ~60% Hebrew, ~40% English
Full dataset: BrainboxAI/legal-training-il
## Hardware Requirements
| Use Case | Minimum | Recommended |
|---|---|---|
| Inference (BF16) | 6 GB VRAM | 10 GB VRAM |
| Inference (4-bit) | 3 GB VRAM | 6 GB VRAM |
| Continued LoRA training | 12 GB VRAM + 24 GB RAM | 16 GB VRAM |
| Full fine-tune | 24 GB VRAM | 40 GB VRAM (A100) |
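The figures above follow from parameter count times bytes per weight, plus headroom for activations and the KV cache. A rough back-of-envelope sketch (the ~2B parameter count is from the Model Details table; everything beyond weight storage is extra):

```python
# Rough VRAM estimate for model weights alone: params x bytes per weight.
# Activations, KV cache, and (for training) optimizer state add on top,
# which is why the table's recommended figures include headroom.
PARAMS = 2e9  # ~2B parameters

def weight_gb(bytes_per_param: float) -> float:
    return PARAMS * bytes_per_param / 1024**3

bf16 = weight_gb(2.0)   # BF16: 2 bytes/param   -> ~3.7 GB of weights
int4 = weight_gb(0.5)   # 4-bit: 0.5 bytes/param -> ~0.9 GB of weights
```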
## Sibling Repositories
| Repo | Purpose |
|---|---|
| BrainboxAI/law-il-E2B | GGUF for local inference |
| BrainboxAI/law-il-E2B-safetensors | This repo - training-ready safetensors |
| BrainboxAI/legal-training-il | Training dataset |
## Limitations & Ethical Considerations
- Not a licensed lawyer. Informational only; consult a licensed Israeli attorney for case-specific guidance.
- Training cutoff. Coverage through 2025. Newer rulings or legislation may not appear.
- Citation verification. Always cross-check citations with official sources (Nevo, Supreme Court website, Kol-Zchut).
- Hebrew variance. Archaic legal Hebrew can occasionally confuse the model.
- Dual-use. Deploy with acceptable-use policies and human review for any adversarial use case.
## Citation
```bibtex
@misc{brainboxai_law_il_e2b_safetensors_2026,
  author    = {Elyasi, Netanel and BrainboxAI},
  title     = {BrainboxAI Law IL E2B (safetensors): A Hebrew-First Israeli Legal LLM},
  year      = {2026},
  url       = {https://huggingface.co/BrainboxAI/law-il-E2B-safetensors},
  publisher = {Hugging Face}
}
```
## About BrainboxAI
BrainboxAI is an Israeli AI agency founded by Netanel Elyasi, specializing in:
- Custom LLM training (Hebrew-native and bilingual models)
- AI automation and agentic workflows
- Cybersecurity AI products (scanning, triage, reporting)
- Enterprise AI deployment (on-premise, privacy-first)
Related models and datasets:
- BrainboxAI/law-il-E2B - GGUF version
- BrainboxAI/legal-training-il - Training dataset
- BrainboxAI/cyber-analyst-4B - Cyber analyst
- BrainboxAI/brainboxai_cyber_train - Cyber dataset
Contact: via Hugging Face or BrainboxAI.
Trained 2x faster with Unsloth.