Provn Gemma 4 E2B Q4_K_M

This repository contains the GGUF Layer 3 semantic classifier used by Provn. It is a fine-tuned Gemma 4 derivative for binary leak classification on code snippets:

  • leak
  • clean

Training

Fine-tuned using Unsloth on the LeakBench dataset for binary secret and IP leak classification.

  • Base model: Gemma 4 E2B (google/gemma-4-e2b-it)
  • Fine-tuning framework: Unsloth
  • Task: Binary classification (leak / clean)
  • Dataset: LeakBench (code snippets containing secrets, API keys, system prompts, and clean code)
  • Quantization: Q4_K_M GGUF via llama.cpp for on-device inference

Layer 3 is designed to handle the ambiguous 0.4–0.8 confidence band that regex and AST layers cannot resolve deterministically. The model was optimized for high recall over precision to minimize missed leaks.
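The confidence-band routing described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, threshold constants, and return values are hypothetical, not Provn's actual API.

```python
# Sketch of confidence-band routing (illustrative names, not Provn's API).
# Layers 1/2 produce a confidence score; only the ambiguous 0.4-0.8 band
# is escalated to the Layer 3 semantic classifier.

AMBIGUOUS_LOW, AMBIGUOUS_HIGH = 0.4, 0.8

def route(confidence: float) -> str:
    """Decide how a detection with the given confidence is resolved."""
    if confidence < AMBIGUOUS_LOW:
        return "clean"            # low confidence: resolved without the LLM
    if confidence > AMBIGUOUS_HIGH:
        return "leak"             # high confidence: flagged directly by layers 1/2
    return "escalate-to-layer-3"  # ambiguous band: semantic classifier decides
```

Because only the middle band reaches `route`'s escalation branch, the expensive model call stays off the hot path for clear-cut detections.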

Benchmarks

Evaluated on the LeakBench dataset:

Metric                     Score
Recall                     97.0%
False Positive Rate        1.2%
p50 latency                ≤ 30ms
p95 latency                ≤ 50ms
LLM inference (Layer 3)    < 800ms

Layer 3 only activates for ambiguous detections (confidence 0.4–0.8). High-confidence cases from Layer 1/2 skip it entirely, keeping average latency well under 50ms.
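For reference, the recall and false positive rate reported above follow the standard confusion-matrix definitions. The sketch below is generic (it is not LeakBench's evaluation harness), and the example counts are illustrative values consistent with the headline figures.

```python
def recall(tp: int, fn: int) -> float:
    # Fraction of actual leaks the classifier caught: TP / (TP + FN).
    return tp / (tp + fn)

def false_positive_rate(fp: int, tn: int) -> float:
    # Fraction of clean snippets incorrectly flagged: FP / (FP + TN).
    return fp / (fp + tn)

# Illustrative counts: catching 970 of 1000 leaks and flagging 12 of
# 1000 clean snippets reproduces the reported 97.0% recall and 1.2% FPR.
```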

Architecture

Provn runs three detection layers in sequence:

Layer   Method                                             Latency
1a      Regex (30+ Gitleaks rules + NFKC normalization)    < 5ms
1b      Shannon entropy analysis                           < 5ms
2       Tree-sitter AST taint tracking                     < 50ms
3       This model: Gemma 4 E2B (on-device, optional)      < 800ms

This model is only invoked when Layers 1 and 2 return ambiguous results, making the overall system fast while still catching semantic leaks that deterministic rules miss.

Intended use

Use this model locally with Provn as the optional Layer 3 semantic classifier for ambiguous detections. No data leaves your machine; inference runs entirely on-device via llama.cpp.
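A minimal client sketch, assuming a llama.cpp server exposing its `/completion` endpoint on the default port. The prompt template and label parsing here are illustrative assumptions, not Provn's internals or the template used during fine-tuning.

```python
import json
import urllib.request

ENDPOINT = "http://127.0.0.1:8080/completion"  # llama.cpp server default port

def build_prompt(snippet: str) -> str:
    # Illustrative prompt shape; the actual fine-tuning template is not
    # documented in this card.
    return (
        "Classify the following code snippet as 'leak' or 'clean'.\n\n"
        f"{snippet}\n\nAnswer:"
    )

def parse_label(completion_text: str) -> str:
    # Normalize the model's free-text completion to one of the two labels.
    text = completion_text.strip().lower()
    return "leak" if "leak" in text else "clean"

def classify(snippet: str, timeout: float = 2.0) -> str:
    # POST to llama.cpp's /completion endpoint; the response JSON carries
    # the generated text in its "content" field.
    payload = json.dumps({
        "prompt": build_prompt(snippet),
        "n_predict": 4,
        "temperature": 0.0,
    }).encode()
    req = urllib.request.Request(
        ENDPOINT, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_label(json.load(resp)["content"])
```

Greedy decoding (`temperature: 0.0`) and a small `n_predict` keep the call deterministic and fast, which matters inside a sub-second latency budget.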

Download location for Provn

Place the GGUF file at:

  • macOS/Linux: ~/.provn/models/provn-gemma4-e2b-q4km.gguf
  • Windows: %USERPROFILE%\.provn\models\provn-gemma4-e2b-q4km.gguf
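Both locations resolve to the same path relative to the user's home directory, so a single expression covers them. The helper name below is hypothetical, shown only to illustrate the expected layout.

```python
from pathlib import Path

def provn_model_path(name: str = "provn-gemma4-e2b-q4km.gguf") -> Path:
    # Hypothetical helper: Path.home() resolves to ~ on macOS/Linux and
    # %USERPROFILE% on Windows, matching both locations listed above.
    return Path.home() / ".provn" / "models" / name
```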

Run with Provn

Start a llama.cpp-compatible server on 127.0.0.1:8080 with this GGUF loaded, then verify the connection:

provn server status

Enable in provn.yml:

layers:
  semantic:
    enabled: true
    model: provn-gemma4-e2b-q4km.gguf
    endpoint: http://localhost:8080
    timeout_ms: 2000
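A small sketch of sanity-checking these settings after loading provn.yml. It operates on a plain dict (so it needs no YAML library), and the checks are illustrative assumptions, not Provn's actual validation logic; the required keys mirror the block above.

```python
REQUIRED_KEYS = {"enabled", "model", "endpoint", "timeout_ms"}

def validate_semantic_config(cfg: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - cfg.keys())]
    if not str(cfg.get("model", "")).endswith(".gguf"):
        problems.append("model should point at a .gguf file")
    if not str(cfg.get("endpoint", "")).startswith("http"):
        problems.append("endpoint should be an http(s) URL")
    timeout = cfg.get("timeout_ms")
    if not isinstance(timeout, int) or timeout <= 0:
        problems.append("timeout_ms should be a positive integer")
    return problems
```

Returning a list of human-readable problems (rather than raising on the first failure) lets a CLI report every misconfiguration in one pass.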

Gemma terms

This model is a derivative of Gemma and is distributed subject to the Gemma Terms of Use and Gemma Prohibited Use Policy.

Modification notice

This repository contains modified / fine-tuned model artifacts created for Provn.

Model card metadata

  • License: gemma
  • Base model: google/gemma-4-e2b-it
  • Model type: gemma4
  • Tags: gguf, gemma, provn, security, code, llama-cpp, fine-tuned, unsloth
  • Language: en
  • Pipeline tag: text-classification
  • Library: gguf

