# Provn Gemma 4 E2B Q4_K_M
This repository contains the GGUF Layer 3 semantic classifier used by Provn. It is a fine-tuned Gemma 4 derivative for binary leak classification on code snippets, emitting one of two labels: `leak` or `clean`.
## Training
Fine-tuned using Unsloth on the LeakBench dataset for binary secret and IP leak classification.
- Base model: Gemma 4 E2B (`google/gemma-4-e2b-it`)
- Fine-tuning framework: Unsloth
- Task: binary classification into `leak`/`clean`
- Dataset: LeakBench (code snippets containing secrets, API keys, system prompts, and clean code)
- Quantization: Q4_K_M GGUF via llama.cpp for on-device inference
Layer 3 is designed to handle the ambiguous 0.4–0.8 confidence band that regex and AST layers cannot resolve deterministically. The model was optimized for high recall over precision to minimize missed leaks.
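The confidence-band routing described above can be sketched as follows. This is an illustrative sketch, not Provn's actual API; the function and threshold names are invented for the example:

```python
# Illustrative sketch of confidence-band routing (not Provn's actual API).
# Detections scored by Layers 1/2 are resolved deterministically outside
# the 0.4-0.8 band; only the ambiguous middle is escalated to Layer 3.

AMBIGUOUS_LOW = 0.4
AMBIGUOUS_HIGH = 0.8

def route(confidence: float) -> str:
    """Decide how a detection with the given confidence is handled."""
    if confidence < AMBIGUOUS_LOW:
        return "clean"            # low confidence: resolved deterministically
    if confidence > AMBIGUOUS_HIGH:
        return "leak"             # high confidence: flagged deterministically
    return "escalate-to-layer-3"  # ambiguous: ask the semantic classifier

print(route(0.2), route(0.9), route(0.55))
```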
## Benchmarks
Evaluated on the LeakBench dataset:
| Metric | Score |
|---|---|
| Recall | 97.0% |
| False Positive Rate | 1.2% |
| p50 latency | ≤ 30ms |
| p95 latency | ≤ 50ms |
| LLM inference (Layer 3) | < 800ms |
Layer 3 only activates for ambiguous detections (confidence 0.4–0.8). High-confidence cases from Layer 1/2 skip it entirely, keeping average latency well under 50ms.
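The latency claim can be sanity-checked with back-of-envelope arithmetic. The ~2% escalation rate below is an illustrative assumption, not a published Provn figure; the other numbers come from the benchmark table:

```python
# Back-of-envelope average latency for the two-speed pipeline.
# The escalation rate is an assumption for illustration only.

FAST_PATH_MS = 30.0    # p50 for the deterministic layers (benchmark table)
LAYER3_MS = 800.0      # worst-case Layer 3 inference budget
ambiguous_rate = 0.02  # assumed fraction of detections escalated to Layer 3

avg_ms = (1 - ambiguous_rate) * FAST_PATH_MS \
    + ambiguous_rate * (FAST_PATH_MS + LAYER3_MS)
print(f"{avg_ms:.1f} ms")  # stays under 50 ms at this escalation rate
```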
## Architecture
Provn runs three detection layers in sequence:
| Layer | Method | Latency |
|---|---|---|
| 1a | Regex (30+ Gitleaks rules + NFKC normalization) | < 5ms |
| 1b | Shannon entropy analysis | < 5ms |
| 2 | Tree-sitter AST taint tracking | < 50ms |
| 3 | This model: Gemma 4 E2B (on-device, optional) | < 800ms |
This model is only invoked when Layers 1 and 2 return ambiguous results, making the overall system fast while still catching semantic leaks that deterministic rules miss.
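Layer 1b's entropy check can be illustrated with a minimal Shannon-entropy function. The example strings and the informal 4-bit cutoff are for illustration only; Provn's tuned thresholds are not published here:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character, per Shannon's formula."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Random-looking secrets (API keys) score much higher per character
# than prose or ordinary identifiers, which is what Layer 1b exploits.
key = "sk-9fK2xQ7vLp0TzWb8RmN4"   # fabricated example, not a real key
word = "configuration_manager"
print(shannon_entropy(key) > 4.0, shannon_entropy(word) < 4.0)
```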
## Intended use
Use this model locally with Provn as the optional Layer 3 semantic classifier for ambiguous detections. No data leaves your machine; inference runs entirely on-device via llama.cpp.
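As a sketch of what an on-device call could look like: llama.cpp's server exposes an HTTP `/completion` endpoint that accepts a JSON body. The prompt wording and decoding parameters below are assumptions for illustration; Provn's actual Layer 3 prompt format is internal to it:

```python
# Sketch of a request body for a llama.cpp-compatible /completion endpoint.
# The prompt text and parameter choices are illustrative assumptions, not
# Provn's real Layer 3 request format.

def build_classification_request(snippet: str) -> dict:
    prompt = (
        "Classify the following code snippet as 'leak' if it exposes a "
        "secret, API key, or proprietary prompt, otherwise 'clean'.\n\n"
        f"{snippet}\n\nAnswer:"
    )
    return {
        "prompt": prompt,
        "n_predict": 4,      # a one-word label needs only a few tokens
        "temperature": 0.0,  # deterministic decoding for classification
    }

req = build_classification_request('api_key = "example-placeholder"')
print(sorted(req.keys()))
```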
## Download location for Provn
Place the GGUF file at:
- macOS/Linux: `~/.provn/models/provn-gemma4-e2b-q4km.gguf`
- Windows: `%USERPROFILE%\.provn\models\provn-gemma4-e2b-q4km.gguf`
## Run with Provn
Start your llama.cpp-compatible server on 127.0.0.1:8080 with this GGUF, then run:
```shell
provn server status
```
Enable in `provn.yml`:

```yaml
layers:
  semantic:
    enabled: true
    model: provn-gemma4-e2b-q4km.gguf
    endpoint: http://localhost:8080
    timeout_ms: 2000
```
## Gemma terms
This model is a derivative of Gemma and is distributed subject to the Gemma Terms of Use and Gemma Prohibited Use Policy.
- Gemma Terms of Use: https://ai.google.dev/gemma/terms
- Gemma Prohibited Use Policy: https://ai.google.dev/gemma/prohibited_use_policy
## Modification notice
This repository contains modified / fine-tuned model artifacts created for Provn.