# gemma-4-E2B-it-saber – GGUF
GGUF quantizations of DJLougen/gemma-4-E2B-it-saber, a surgically modified version of google/gemma-4-E2B-it produced with SABER (Spectral Analysis-Based Entanglement Resolution), which removes safety-refusal behavior while preserving model capability.
## Quantizations
| File | Quant | Size | Description |
|---|---|---|---|
| gemma-4-E2B-it-saber-BF16.gguf | BF16 | 8.7 GB | Full precision (bfloat16) |
| gemma-4-E2B-it-saber-Q8_0.gguf | Q8_0 | 4.7 GB | Near-lossless |
| gemma-4-E2B-it-saber-Q6_K.gguf | Q6_K | 3.6 GB | Very high quality |
| gemma-4-E2B-it-saber-Q5_K_M.gguf | Q5_K_M | 3.4 GB | High quality |
| gemma-4-E2B-it-saber-Q5_K_S.gguf | Q5_K_S | 3.4 GB | High quality, smaller |
| gemma-4-E2B-it-saber-Q4_K_M.gguf | Q4_K_M | 3.2 GB | Good quality, recommended |
| gemma-4-E2B-it-saber-Q4_K_S.gguf | Q4_K_S | 3.2 GB | Good quality, smaller |
| gemma-4-E2B-it-saber-IQ4_XS.gguf | IQ4_XS | 3.1 GB | Good quality, compact |
| gemma-4-E2B-it-saber-Q3_K_L.gguf | Q3_K_L | 3.1 GB | Lower quality |
| gemma-4-E2B-it-saber-Q3_K_M.gguf | Q3_K_M | 3.0 GB | Low quality |
| gemma-4-E2B-it-saber-IQ3_M.gguf | IQ3_M | 3.0 GB | Medium-low quality (i-quant) |
| gemma-4-E2B-it-saber-Q3_K_S.gguf | Q3_K_S | 2.9 GB | Low quality, small |
| gemma-4-E2B-it-saber-Q2_K.gguf | Q2_K | 2.8 GB | Very low quality |
| gemma-4-E2B-it-saber-mmproj-BF16.gguf | BF16 | 942 MB | Vision/audio encoder (multimodal projector) |
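As a rough guide when picking a file, each quant can be compared against the BF16 baseline. A minimal sketch, with the sizes hardcoded from the table above:

```python
# On-disk sizes (GB) from the quantization table above.
SIZES_GB = {
    "BF16": 8.7, "Q8_0": 4.7, "Q6_K": 3.6, "Q5_K_M": 3.4,
    "Q5_K_S": 3.4, "Q4_K_M": 3.2, "Q4_K_S": 3.2, "IQ4_XS": 3.1,
    "Q3_K_L": 3.1, "Q3_K_M": 3.0, "IQ3_M": 3.0, "Q3_K_S": 2.9, "Q2_K": 2.8,
}

def relative_size(quant: str) -> float:
    """Fraction of the BF16 file size that a given quant occupies."""
    return round(SIZES_GB[quant] / SIZES_GB["BF16"], 2)

print(relative_size("Q4_K_M"))  # 0.37 -- the recommended quant is ~37% of BF16
```

Note that file size is only a proxy for memory use at inference time; KV-cache and context length add overhead on top of the weights.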
## Multimodal Usage

This is a multimodal model. To use vision/audio features with llama.cpp, you need both a text-model GGUF and the mmproj file, loaded via the multimodal CLI (`llama-mtmd-cli`):

```shell
llama-mtmd-cli -m gemma-4-E2B-it-saber-Q4_K_M.gguf \
  --mmproj gemma-4-E2B-it-saber-mmproj-BF16.gguf \
  --image photo.jpg \
  -p "Describe this image"
```
## Method Overview

SABER identifies and ablates the refusal circuit in an LLM through a multi-stage process:

- **Probing**: extract activation profiles from both harmful and harmless inputs across all transformer layers
- **Spectral Analysis**: decompose activation differences into individual refusal directions
- **Entanglement Quantification**: measure the overlap between the refusal and capability subspaces
- **Targeted Ablation**: remove only pure-refusal components
- **Iterative Refinement**: re-probe after each pass to catch hydra effects (refusal re-emerging along new directions after ablation)
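The ablation step is in the spirit of directional refusal ablation: given a unit "refusal direction" `r`, project it out of a weight matrix so its outputs carry no component along `r`. A minimal NumPy sketch of that projection only; it does not reproduce SABER's probing data, spectral decomposition, or entanglement metric:

```python
import numpy as np

def ablate_direction(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Remove the component of W's outputs along direction r.

    W: (d_out, d_in) weight matrix; r: (d_out,) direction vector.
    Returns (I - r r^T) @ W, i.e. W with the r-component projected out.
    """
    r = r / np.linalg.norm(r)          # normalize to a unit vector
    return W - np.outer(r, r @ W)      # (I - r r^T) W

# Toy check: after ablation, W's columns have no component along r.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))
r = rng.standard_normal(4)
W_abl = ablate_direction(W, r)
print(np.allclose((r / np.linalg.norm(r)) @ W_abl, 0.0))  # True
```

SABER's stated contribution over this plain projection is the entanglement check: it ablates only directions whose overlap with capability subspaces is low, which is why perplexity is preserved.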
## Results
| Model | Refusal Rate | Perplexity |
|---|---|---|
| google/gemma-4-E2B-it (baseline) | 100% | 498 |
| SABER-refined (this model) | 0% | 450 (-9.6%) |
## Warning
This model will comply with any request, including harmful ones. It is intended solely for research into alignment, safety, and model behavior.