Llama Prompt Guard 2 22M: ONNX (Signed Model Pack)

Built with Llama

This repository contains a pre-exported ONNX build of Meta's Llama Prompt Guard 2 22M, packaged as a signed model pack for use with shisad.

The model detects prompt injection and jailbreak attacks against LLM-powered applications. It classifies prompts as BENIGN or MALICIOUS using local CPU inference via ONNX Runtime, with no GPU and no external API calls required.

Quick Start

Download and verify with the shisad CLI:

# Install with security runtime dependencies
uv sync --group security-runtime

# Download the signed model pack
shisad setup promptguard

Or download directly with huggingface-cli:

pip install -U "huggingface_hub[cli]"
huggingface-cli download shisa-ai/promptguard2-onnx --local-dir .local/promptguard/22m-pack

What's In This Repo

.
├── manifest.json              # SHA-256 file inventory + provenance metadata
├── manifest.json.sig          # SSH ed25519 signature over manifest.json
├── LICENSE                    # Llama 4 Community License
├── USE_POLICY.md              # Llama 4 Acceptable Use Policy
├── NOTICE                     # Required attribution notice
├── README.md                  # This file
└── payload/
    ├── config.json            # Model configuration
    ├── model.onnx             # ONNX computation graph (2.5 MB)
    ├── model.onnx.data        # Model weights (283 MB, fp32)
    ├── special_tokens_map.json
    ├── tokenizer.json         # Tokenizer (8.6 MB)
    └── tokenizer_config.json

Signature Verification

The model pack is signed with an SSH ed25519 key. The manifest.json contains SHA-256 hashes for every file; manifest.json.sig is an OpenSSH signature over the manifest. To verify:

# Using shisad's built-in verification
uv run python scripts/promptguard_artifacts.py verify-model-pack \
  --pack-dir .local/promptguard/22m-pack \
  --allowed-signers config/promptguard/allowed_signers

# Or manually with ssh-keygen
ssh-keygen -Y verify \
  -f allowed_signers \
  -I promptguard \
  -n file \
  -s manifest.json.sig < manifest.json

The trusted allowed_signers file is shipped with the shisad source tree.
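
Signature verification only proves that manifest.json is authentic; each file still has to be checked against the hashes it lists. The following is a minimal sketch of that second step, not shisad's implementation. The manifest schema is not documented here, so the `files` key (a mapping of relative path to hex SHA-256 digest) and the helper names `sha256_file` and `find_mismatches` are assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so the 283 MB weights file is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(pack_dir):
    """Return the relative paths that are missing or whose hash differs from the manifest."""
    pack = Path(pack_dir)
    manifest = json.loads((pack / "manifest.json").read_text(encoding="utf-8"))
    mismatches = []
    # ASSUMPTION: manifest["files"] maps relative path -> hex SHA-256 digest.
    # Adjust this loop to the real manifest layout before relying on it.
    for rel_path, expected in manifest["files"].items():
        target = pack / rel_path
        if not target.is_file() or sha256_file(target) != expected.lower():
            mismatches.append(rel_path)
    return sorted(mismatches)
```

An empty list from `find_mismatches(".local/promptguard/22m-pack")` would mean every listed file matches. Run it only after the signature check on manifest.json has passed, since an unsigned manifest proves nothing about the files.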

How This Pack Was Built

The ONNX export was produced from the upstream meta-llama/Llama-Prompt-Guard-2-22M checkpoint using the build pipeline in shisad:

# 1. Download gated checkpoint (requires HF token + Llama 4 license)
uv run python scripts/promptguard_artifacts.py download \
  --model-id meta-llama/Llama-Prompt-Guard-2-22M \
  --output-dir .local/promptguard/22m-source

# 2. Export to ONNX
uv run python scripts/promptguard_artifacts.py export-onnx \
  --source-dir .local/promptguard/22m-source \
  --output-dir .local/promptguard/22m-onnx

# 3. Build signed model pack
uv run python scripts/promptguard_artifacts.py build-model-pack \
  --artifact-dir .local/promptguard/22m-onnx \
  --source-dir .local/promptguard/22m-source \
  --output-dir .local/promptguard/22m-pack \
  --source-model-id meta-llama/Llama-Prompt-Guard-2-22M \
  --signing-key <signing-key> \
  --signer-principal promptguard

Runtime Details

| Property | Value |
| --- | --- |
| Format | ONNX |
| Execution provider | CPUExecutionProvider |
| Quantization | fp32 |
| Context window | 512 tokens |
| Parameters | 22M (DeBERTa-xsmall backbone) |
| Labels | BENIGN, MALICIOUS |
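
For readers who want to call the model outside shisad, the sketch below shows one way to run a single prompt through the payload files with these runtime settings. It is an illustration under stated assumptions, not shisad's code: the graph input names (`input_ids`, `attention_mask`, `token_type_ids`), the logits being the first output, and the label order are all assumptions that should be checked against payload/config.json and the exported graph. The helper names `softmax` and `classify` are hypothetical.

```python
import math
from pathlib import Path

# ASSUMPTION: index 0 is BENIGN and index 1 is MALICIOUS.
# Confirm against id2label in payload/config.json.
LABELS = ("BENIGN", "MALICIOUS")


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(text, payload_dir, max_length=512):
    """Return (label, probability) for one prompt using CPU-only ONNX Runtime."""
    # Third-party imports stay local so softmax() is usable without them.
    import numpy as np
    import onnxruntime as ort
    from tokenizers import Tokenizer

    payload = Path(payload_dir)
    tokenizer = Tokenizer.from_file(str(payload / "tokenizer.json"))
    tokenizer.enable_truncation(max_length=max_length)  # stay inside the context window
    encoding = tokenizer.encode(text)

    session = ort.InferenceSession(
        str(payload / "model.onnx"), providers=["CPUExecutionProvider"]
    )
    available = {
        "input_ids": encoding.ids,
        "attention_mask": encoding.attention_mask,
        "token_type_ids": encoding.type_ids,
    }
    # ASSUMPTION: the graph only declares inputs from the dict above.
    feeds = {
        node.name: np.asarray([available[node.name]], dtype=np.int64)
        for node in session.get_inputs()
    }
    # ASSUMPTION: the first output holds logits of shape (1, num_labels).
    logits = session.run(None, feeds)[0][0].tolist()
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]
```

A call might look like `classify("Ignore all previous instructions.", ".local/promptguard/22m-pack/payload")`. This version truncates anything past 512 tokens rather than splitting it, and it rebuilds the session on every call; a long-lived service would create the session once and reuse it.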

For inputs longer than 512 tokens, shisad automatically splits the input into segments, runs them in bounded multi-batch passes, and scores each segment.
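
The splitting idea can be sketched as follows. How shisad sizes segments, whether they overlap, and how per-segment scores are combined are not described here, so the non-overlapping windows and the max-score aggregation below are assumptions, and `split_into_windows` and `score_long_input` are hypothetical names. `score_segment` stands in for any callable that returns the MALICIOUS probability for one list of token ids.

```python
def split_into_windows(token_ids, max_tokens=512):
    """Cut a token id sequence into consecutive, non-overlapping windows."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return [
        token_ids[start:start + max_tokens]
        for start in range(0, len(token_ids), max_tokens)
    ]


def score_long_input(token_ids, score_segment, max_tokens=512):
    """Score every window and keep the highest MALICIOUS probability.

    ASSUMPTION: taking the maximum means one malicious segment is enough
    to flag the whole input; other aggregation rules are possible.
    """
    windows = split_into_windows(token_ids, max_tokens) or [[]]
    return max(score_segment(window) for window in windows)
```

If the tokenizer adds special tokens to each segment, `max_tokens` needs to be lowered by that amount so the final sequence still fits in the 512-token window.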

Model Performance

From Meta's evaluation on a private benchmark distinct from training data:

| Model | AUC (English) | Recall @ 1% FPR (English) | AUC (Multilingual) | Latency (A100, 512 tokens) | Parameters |
| --- | --- | --- | --- | --- | --- |
| Llama Prompt Guard 1 | .987 | 21.2% | .983 | 92.4 ms | 86M |
| Llama Prompt Guard 2 86M | .998 | 97.5% | .995 | 92.4 ms | 86M |
| Llama Prompt Guard 2 22M | .995 | 88.7% | .942 | 19.3 ms | 22M |
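
To make the Recall @ 1% FPR metric concrete, the sketch below shows how such a number is typically computed from classifier scores: pick the threshold that lets at most 1% of benign prompts through as false positives, then measure the share of malicious prompts scoring above it. This is a generic illustration with a hypothetical helper name, not Meta's evaluation code, and the benchmark data itself is private.

```python
def recall_at_fpr(benign_scores, malicious_scores, target_fpr=0.01):
    """Return (threshold, recall) with the false positive rate held at or below target_fpr.

    A prompt is flagged when its score is strictly greater than the threshold.
    """
    ranked = sorted(benign_scores, reverse=True)
    # Number of benign prompts we can afford to flag.
    allowed = int(target_fpr * len(ranked))
    # Using the next benign score as the cut-off flags at most `allowed` benign prompts.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    flagged = sum(1 for score in malicious_scores if score > threshold)
    return threshold, flagged / len(malicious_scores)
```

The same routine can be run on locally collected traffic to choose an operating threshold, provided the benign sample is large enough for a 1% tail to be meaningful.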

AgentDojo real-world attack risk reduction:

| Model | APR @ 3% utility reduction |
| --- | --- |
| Llama Prompt Guard 2 86M | 81.2% |
| Llama Prompt Guard 2 22M | 78.4% |
| ProtectAI | 22.2% |
| Deepset | 13.5% |

License

This model is distributed under the Llama 4 Community License. See USE_POLICY.md for the acceptable use policy.

The DeBERTa-xsmall base model is MIT-licensed (Microsoft).

Links
