# gemma-4-E2B-it-saber – GGUF
GGUF quantizations of DJLougen/gemma-4-E2B-it-saber, a surgically modified version of google/gemma-4-E2B-it produced with SABER (Spectral Analysis-Based Entanglement Resolution), which removes safety-refusal behavior while preserving model capability.
## Quantizations
| File | Quant | Size | Description |
|---|---|---|---|
| gemma-4-E2B-it-saber-BF16.gguf | BF16 | 8.7 GB | Full precision (bfloat16) |
| gemma-4-E2B-it-saber-Q8_0.gguf | Q8_0 | 4.7 GB | Near-lossless |
| gemma-4-E2B-it-saber-Q6_K.gguf | Q6_K | 3.6 GB | Very high quality |
| gemma-4-E2B-it-saber-Q5_K_M.gguf | Q5_K_M | 3.4 GB | High quality |
| gemma-4-E2B-it-saber-Q5_K_S.gguf | Q5_K_S | 3.4 GB | High quality, smaller |
| gemma-4-E2B-it-saber-Q4_K_M.gguf | Q4_K_M | 3.2 GB | Good quality, recommended |
| gemma-4-E2B-it-saber-Q4_K_S.gguf | Q4_K_S | 3.2 GB | Good quality, smaller |
| gemma-4-E2B-it-saber-IQ4_XS.gguf | IQ4_XS | 3.1 GB | Good quality, compact |
| gemma-4-E2B-it-saber-Q3_K_L.gguf | Q3_K_L | 3.1 GB | Lower quality |
| gemma-4-E2B-it-saber-Q3_K_M.gguf | Q3_K_M | 3.0 GB | Low quality |
| gemma-4-E2B-it-saber-IQ3_M.gguf | IQ3_M | 3.0 GB | Medium-low quality (i-quant) |
| gemma-4-E2B-it-saber-Q3_K_S.gguf | Q3_K_S | 2.9 GB | Low quality, small |
| gemma-4-E2B-it-saber-Q2_K.gguf | Q2_K | 2.8 GB | Very low quality |
| gemma-4-E2B-it-saber-mmproj-BF16.gguf | BF16 | 942 MB | Vision/audio encoder (multimodal projector) |
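As a rough guide when picking a file, each quant can be compared against the BF16 baseline. A minimal sketch, with the sizes hardcoded from the table above:

```python
# On-disk sizes (GB) from the quantization table above.
SIZES_GB = {
    "BF16": 8.7, "Q8_0": 4.7, "Q6_K": 3.6, "Q5_K_M": 3.4,
    "Q5_K_S": 3.4, "Q4_K_M": 3.2, "Q4_K_S": 3.2, "IQ4_XS": 3.1,
    "Q3_K_L": 3.1, "Q3_K_M": 3.0, "IQ3_M": 3.0, "Q3_K_S": 2.9, "Q2_K": 2.8,
}

def relative_size(quant: str) -> float:
    """Fraction of the BF16 file size that a given quant occupies."""
    return round(SIZES_GB[quant] / SIZES_GB["BF16"], 2)

print(relative_size("Q4_K_M"))  # 0.37 -- the recommended quant is ~37% of BF16
```

Note that file size is only a proxy for memory use at inference time; KV-cache and context length add overhead on top of the weights.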
## Multimodal Usage

This is a multimodal model. To use vision/audio features with llama.cpp, you need both a text-model GGUF and the mmproj file, loaded via the multimodal CLI (`llama-mtmd-cli`):

```shell
llama-mtmd-cli -m gemma-4-E2B-it-saber-Q4_K_M.gguf \
  --mmproj gemma-4-E2B-it-saber-mmproj-BF16.gguf \
  --image photo.jpg \
  -p "Describe this image"
```
## Method Overview

SABER identifies and ablates the refusal circuit in an LLM through a multi-stage process:

- **Probing**: extract activation profiles from both harmful and harmless inputs across all transformer layers
- **Spectral Analysis**: decompose activation differences into individual refusal directions
- **Entanglement Quantification**: measure the overlap between the refusal and capability subspaces
- **Targeted Ablation**: remove only pure-refusal components
- **Iterative Refinement**: re-probe after each pass to catch hydra effects (refusal re-emerging along new directions after ablation)
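The ablation step is in the spirit of directional refusal ablation: given a unit "refusal direction" `r`, project it out of a weight matrix so its outputs carry no component along `r`. A minimal NumPy sketch of that projection only; it does not reproduce SABER's probing data, spectral decomposition, or entanglement metric:

```python
import numpy as np

def ablate_direction(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Remove the component of W's outputs along direction r.

    W: (d_out, d_in) weight matrix; r: (d_out,) direction vector.
    Returns (I - r r^T) @ W, i.e. W with the r-component projected out.
    """
    r = r / np.linalg.norm(r)          # normalize to a unit vector
    return W - np.outer(r, r @ W)      # (I - r r^T) W

# Toy check: after ablation, W's columns have no component along r.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))
r = rng.standard_normal(4)
W_abl = ablate_direction(W, r)
print(np.allclose((r / np.linalg.norm(r)) @ W_abl, 0.0))  # True
```

SABER's stated contribution over this plain projection is the entanglement check: it ablates only directions whose overlap with capability subspaces is low, which is why perplexity is preserved.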
## Results
| Model | Refusal Rate | Perplexity |
|---|---|---|
| google/gemma-4-E2B-it (baseline) | 100% | 498 |
| SABER-refined (this model) | 0% | 450 (-9.6%) |
## Warning
This model will comply with any request, including harmful ones. It is intended solely for research into alignment, safety, and model behavior.