Provn Gemma 4 E2B Q4_K_M

This repository contains the GGUF Layer 3 semantic classifier used by Provn. It is a fine-tuned Gemma 4 derivative for binary leak classification on code snippets:

  • leak
  • clean

Training

Fine-tuned using Unsloth on the LeakBench dataset for binary secret and IP leak classification.

  • Base model: Gemma 4 E2B (google/gemma-4-e2b-it)
  • Fine-tuning framework: Unsloth
  • Task: Binary classification (leak / clean)
  • Dataset: LeakBench (code snippets containing secrets, API keys, system prompts, and clean code)
  • Quantization: Q4_K_M GGUF via llama.cpp for on-device inference

Layer 3 is designed to handle the ambiguous 0.4–0.8 confidence band that regex and AST layers cannot resolve deterministically. The model was optimized for high recall over precision to minimize missed leaks.
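The confidence-band routing described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, threshold constants, and return values are hypothetical, not Provn's actual API.

```python
# Sketch of confidence-band routing (illustrative names, not Provn's API).
# Layers 1/2 produce a confidence score; only the ambiguous 0.4-0.8 band
# is escalated to the Layer 3 semantic classifier.

AMBIGUOUS_LOW, AMBIGUOUS_HIGH = 0.4, 0.8

def route(confidence: float) -> str:
    """Decide how a detection with the given confidence is resolved."""
    if confidence < AMBIGUOUS_LOW:
        return "clean"            # low confidence: resolved without the LLM
    if confidence > AMBIGUOUS_HIGH:
        return "leak"             # high confidence: flagged directly by layers 1/2
    return "escalate-to-layer-3"  # ambiguous band: semantic classifier decides
```

Because only the middle band reaches `route`'s escalation branch, the expensive model call stays off the hot path for clear-cut detections.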

Benchmarks

Evaluated on the LeakBench dataset:

Metric                     Score
Recall                     97.0%
False Positive Rate        1.2%
p50 latency                ≤ 30ms
p95 latency                ≤ 50ms
LLM inference (Layer 3)    < 800ms

Layer 3 only activates for ambiguous detections (confidence 0.4–0.8). High-confidence cases from Layer 1/2 skip it entirely, keeping average latency well under 50ms.
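For reference, the recall and false positive rate reported above follow the standard confusion-matrix definitions. The sketch below is generic (it is not LeakBench's evaluation harness), and the example counts are illustrative values consistent with the headline figures.

```python
def recall(tp: int, fn: int) -> float:
    # Fraction of actual leaks the classifier caught: TP / (TP + FN).
    return tp / (tp + fn)

def false_positive_rate(fp: int, tn: int) -> float:
    # Fraction of clean snippets incorrectly flagged: FP / (FP + TN).
    return fp / (fp + tn)

# Illustrative counts: catching 970 of 1000 leaks and flagging 12 of
# 1000 clean snippets reproduces the reported 97.0% recall and 1.2% FPR.
```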

Architecture

Provn runs three detection layers in sequence:

Layer   Method                                             Latency
1a      Regex (30+ Gitleaks rules + NFKC normalization)    < 5ms
1b      Shannon entropy analysis                           < 5ms
2       Tree-sitter AST taint tracking                     < 50ms
3       This model: Gemma 4 E2B (on-device, optional)      < 800ms

This model is only invoked when Layers 1 and 2 return ambiguous results, making the overall system fast while still catching semantic leaks that deterministic rules miss.

Intended use

Use this model locally with Provn as the optional Layer 3 semantic classifier for ambiguous detections. No data leaves your machine; inference runs entirely on-device via llama.cpp.
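A minimal client sketch, assuming a llama.cpp server exposing its `/completion` endpoint on the default port. The prompt template and label parsing here are illustrative assumptions, not Provn's internals or the template used during fine-tuning.

```python
import json
import urllib.request

ENDPOINT = "http://127.0.0.1:8080/completion"  # llama.cpp server default port

def build_prompt(snippet: str) -> str:
    # Illustrative prompt shape; the actual fine-tuning template is not
    # documented in this card.
    return (
        "Classify the following code snippet as 'leak' or 'clean'.\n\n"
        f"{snippet}\n\nAnswer:"
    )

def parse_label(completion_text: str) -> str:
    # Normalize the model's free-text completion to one of the two labels.
    text = completion_text.strip().lower()
    return "leak" if "leak" in text else "clean"

def classify(snippet: str, timeout: float = 2.0) -> str:
    # POST to llama.cpp's /completion endpoint; the response JSON carries
    # the generated text in its "content" field.
    payload = json.dumps({
        "prompt": build_prompt(snippet),
        "n_predict": 4,
        "temperature": 0.0,
    }).encode()
    req = urllib.request.Request(
        ENDPOINT, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_label(json.load(resp)["content"])
```

Greedy decoding (`temperature: 0.0`) and a small `n_predict` keep the call deterministic and fast, which matters inside a sub-second latency budget.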

Download location for Provn

Place the GGUF file at:

  • macOS/Linux: ~/.provn/models/provn-gemma4-e2b-q4km.gguf
  • Windows: %USERPROFILE%\.provn\models\provn-gemma4-e2b-q4km.gguf
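Both locations resolve to the same path relative to the user's home directory, so a single expression covers them. The helper name below is hypothetical, shown only to illustrate the expected layout.

```python
from pathlib import Path

def provn_model_path(name: str = "provn-gemma4-e2b-q4km.gguf") -> Path:
    # Hypothetical helper: Path.home() resolves to ~ on macOS/Linux and
    # %USERPROFILE% on Windows, matching both locations listed above.
    return Path.home() / ".provn" / "models" / name
```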

Run with Provn

Start a llama.cpp-compatible server on 127.0.0.1:8080 with this GGUF loaded, then verify the connection:

provn server status

Enable in provn.yml:

layers:
  semantic:
    enabled: true
    model: provn-gemma4-e2b-q4km.gguf
    endpoint: http://localhost:8080
    timeout_ms: 2000
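A small sketch of sanity-checking these settings after loading provn.yml. It operates on a plain dict (so it needs no YAML library), and the checks are illustrative assumptions, not Provn's actual validation logic; the required keys mirror the block above.

```python
REQUIRED_KEYS = {"enabled", "model", "endpoint", "timeout_ms"}

def validate_semantic_config(cfg: dict) -> list[str]:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - cfg.keys())]
    if not str(cfg.get("model", "")).endswith(".gguf"):
        problems.append("model should point at a .gguf file")
    if not str(cfg.get("endpoint", "")).startswith("http"):
        problems.append("endpoint should be an http(s) URL")
    timeout = cfg.get("timeout_ms")
    if not isinstance(timeout, int) or timeout <= 0:
        problems.append("timeout_ms should be a positive integer")
    return problems
```

Returning a list of human-readable problems (rather than raising on the first failure) lets a CLI report every misconfiguration in one pass.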

Gemma terms

This model is a derivative of Gemma and is distributed subject to the Gemma Terms of Use and Gemma Prohibited Use Policy.

Modification notice

This repository contains modified / fine-tuned model artifacts created for Provn.

Model card metadata

  • License: gemma
  • Base model: google/gemma-4-e2b-it
  • Model type: gemma4
  • Tags: gguf, gemma, provn, security, code, llama-cpp, fine-tuned, unsloth
  • Language: en
  • Pipeline tag: text-classification
  • Library: gguf

