Llama Prompt Guard 2 22M - ONNX (Signed Model Pack)
Built with Llama
This repository contains a pre-exported ONNX build of Meta's Llama Prompt Guard 2 22M, packaged as a signed model pack for use with shisad.
The model detects prompt injection and jailbreak attacks against LLM-powered
applications. It classifies prompts as BENIGN or MALICIOUS using local CPU
inference via ONNX Runtime, with no GPU and no external API calls required.
Quick Start
Download and verify with the shisad CLI:
# Install with security runtime dependencies
uv sync --group security-runtime
# Download the signed model pack
shisad setup promptguard
Or download directly with huggingface-cli:
# The huggingface-cli command is provided by the huggingface_hub package
pip install -U "huggingface_hub[cli]"
huggingface-cli download shisa-ai/promptguard2-onnx --local-dir .local/promptguard/22m-pack
What's In This Repo
.
├── manifest.json                # SHA-256 file inventory + provenance metadata
├── manifest.json.sig            # SSH ed25519 signature over manifest.json
├── LICENSE                      # Llama 4 Community License
├── USE_POLICY.md                # Llama 4 Acceptable Use Policy
├── NOTICE                       # Required attribution notice
├── README.md                    # This file
└── payload/
    ├── config.json              # Model configuration
    ├── model.onnx               # ONNX computation graph (2.5 MB)
    ├── model.onnx.data          # Model weights (283 MB, fp32)
    ├── special_tokens_map.json
    ├── tokenizer.json           # Tokenizer (8.6 MB)
    └── tokenizer_config.json
Signature Verification
The model pack is signed with an SSH ed25519 key. The manifest.json contains
SHA-256 hashes for every file; manifest.json.sig is an OpenSSH signature over
the manifest. To verify:
# Using shisad's built-in verification
uv run python scripts/promptguard_artifacts.py verify-model-pack \
  --pack-dir .local/promptguard/22m-pack \
  --allowed-signers config/promptguard/allowed_signers
# Or manually with ssh-keygen
ssh-keygen -Y verify \
  -f allowed_signers \
  -I promptguard \
  -n file \
  -s manifest.json.sig < manifest.json
The trusted allowed_signers file ships with the shisad source tree, at the
config/promptguard/allowed_signers path passed to --allowed-signers above.
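The two layers of the check (a signature over the manifest, then SHA-256 hashes for every file) can be sketched in Python as follows. This is only an illustration, not shisad's implementation. The manifest schema used here, a top-level "files" list of objects with "path" and "sha256" keys, is an assumption, and the function names are hypothetical. The ssh-keygen arguments are the ones from the manual command above. File hashes are only meaningful once the manifest signature has been verified.

```python
import hashlib
import json
import subprocess
from pathlib import Path


def verify_signature(pack_dir: Path, allowed_signers: Path) -> bool:
    """Run the same ssh-keygen check as the manual command, feeding manifest.json on stdin."""
    with (pack_dir / "manifest.json").open("rb") as stdin:
        result = subprocess.run(
            [
                "ssh-keygen", "-Y", "verify",
                "-f", str(allowed_signers),
                "-I", "promptguard",
                "-n", "file",
                "-s", str(pack_dir / "manifest.json.sig"),
            ],
            stdin=stdin,
            capture_output=True,
        )
    return result.returncode == 0


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so the 283 MB weights file is never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(pack_dir: Path) -> list[str]:
    """Return manifest paths whose file is missing or whose hash differs."""
    # ASSUMPTION: manifest.json looks like {"files": [{"path": ..., "sha256": ...}]}.
    # Check the real manifest and adjust the keys if its layout differs.
    manifest = json.loads((pack_dir / "manifest.json").read_text(encoding="utf-8"))
    bad = []
    for entry in manifest["files"]:
        target = pack_dir / entry["path"]
        if not target.is_file() or sha256_file(target) != entry["sha256"]:
            bad.append(entry["path"])
    return bad
```

A caller would typically refuse to load the model unless verify_signature returns True and find_mismatches returns an empty list.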
How This Pack Was Built
The ONNX export was produced from the upstream
meta-llama/Llama-Prompt-Guard-2-22M checkpoint using the build pipeline in
shisad:
# 1. Download gated checkpoint (requires HF token + Llama 4 license)
uv run python scripts/promptguard_artifacts.py download \
  --model-id meta-llama/Llama-Prompt-Guard-2-22M \
  --output-dir .local/promptguard/22m-source
# 2. Export to ONNX
uv run python scripts/promptguard_artifacts.py export-onnx \
  --source-dir .local/promptguard/22m-source \
  --output-dir .local/promptguard/22m-onnx
# 3. Build signed model pack
uv run python scripts/promptguard_artifacts.py build-model-pack \
  --artifact-dir .local/promptguard/22m-onnx \
  --source-dir .local/promptguard/22m-source \
  --output-dir .local/promptguard/22m-pack \
  --source-model-id meta-llama/Llama-Prompt-Guard-2-22M \
  --signing-key <signing-key> \
  --signer-principal promptguard
Runtime Details
| Property | Value |
|---|---|
| Format | ONNX |
| Execution provider | CPUExecutionProvider |
| Quantization | fp32 |
| Context window | 512 tokens |
| Parameters | 22M (DeBERTa-xsmall backbone) |
| Labels | BENIGN, MALICIOUS |
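For readers who want to call the model outside shisad, here is a minimal sketch of local CPU inference that matches the table above. It is not shisad's code and rests on several assumptions: the onnxruntime, tokenizers, and numpy packages are installed; the graph's inputs are drawn from input_ids, attention_mask, and token_type_ids; the first output holds logits of shape (1, num_labels); and config.json carries a standard id2label mapping. The classify function and its default path are illustrative only.

```python
import json
import math
from pathlib import Path


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(text, pack_dir=".local/promptguard/22m-pack"):
    """Return (label, probability) for one prompt, truncated to 512 tokens."""
    # Third-party imports are local so softmax() stays usable without them.
    import numpy as np
    import onnxruntime as ort
    from tokenizers import Tokenizer

    payload = Path(pack_dir) / "payload"
    tokenizer = Tokenizer.from_file(str(payload / "tokenizer.json"))
    tokenizer.enable_truncation(max_length=512)  # context window from the table

    # model.onnx.data must sit next to model.onnx so the weights can be resolved.
    # A real service would build the session once and reuse it across calls.
    session = ort.InferenceSession(
        str(payload / "model.onnx"), providers=["CPUExecutionProvider"]
    )

    enc = tokenizer.encode(text)
    # ASSUMPTION: the exported graph only asks for inputs from this set.
    candidates = {
        "input_ids": enc.ids,
        "attention_mask": enc.attention_mask,
        "token_type_ids": enc.type_ids,
    }
    feeds = {
        spec.name: np.array([candidates[spec.name]], dtype=np.int64)
        for spec in session.get_inputs()
    }

    # ASSUMPTION: output 0 is the logits tensor with shape (1, num_labels).
    logits = session.run(None, feeds)[0][0].tolist()
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)

    # ASSUMPTION: config.json has an id2label dict keyed by stringified indices.
    id2label = json.loads((payload / "config.json").read_text(encoding="utf-8"))["id2label"]
    return id2label[str(best)], probs[best]
```

If config.json carries generic names such as LABEL_0 and LABEL_1 rather than BENIGN and MALICIOUS, the caller has to map them explicitly after confirming the order against the upstream model card.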
For inputs longer than 512 tokens, shisad automatically splits the input into segments that each fit the context window, runs them in bounded multi-batch passes, and scores each segment.
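One way to picture the long-input handling is the sketch below. It is an assumption-laden illustration, not shisad's actual strategy: it uses non-overlapping windows, reserves two positions per segment for special tokens, and reports the maximum segment score so that an injection anywhere in the input flags the whole input. The score_segment callable is a hypothetical stand-in for a model call that returns the malicious probability of one segment.

```python
from typing import Callable, List, Sequence

MAX_TOKENS = 512  # context window from the Runtime Details table


def split_into_segments(
    token_ids: Sequence[int],
    max_tokens: int = MAX_TOKENS,
    reserved: int = 2,
) -> List[List[int]]:
    """Split token ids into consecutive segments that fit the context window.

    `reserved` leaves room for special tokens added around each segment
    (ASSUMPTION: two, one at the start and one at the end).
    """
    size = max_tokens - reserved
    if size <= 0:
        raise ValueError("max_tokens must be larger than reserved")
    segments = [list(token_ids[i:i + size]) for i in range(0, len(token_ids), size)]
    return segments or [[]]  # an empty input still yields one (empty) segment


def score_long_input(
    token_ids: Sequence[int],
    score_segment: Callable[[List[int]], float],
) -> float:
    """Score every segment and keep the worst case (ASSUMPTION: max aggregation)."""
    return max(score_segment(segment) for segment in split_into_segments(token_ids))
```

Overlapping windows would reduce the chance of an attack string being cut in half at a segment boundary, at the cost of more inference passes.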
Model Performance
From Meta's evaluation on a private benchmark distinct from training data:
| Model | AUC (English) | Recall @ 1% FPR (English) | AUC (Multilingual) | Latency (A100, 512 tokens) | Parameters |
|---|---|---|---|---|---|
| Llama Prompt Guard 1 | .987 | 21.2% | .983 | 92.4 ms | 86M |
| Llama Prompt Guard 2 86M | .998 | 97.5% | .995 | 92.4 ms | 86M |
| Llama Prompt Guard 2 22M | .995 | 88.7% | .942 | 19.3 ms | 22M |
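To make the "Recall @ 1% FPR" column concrete, the sketch below shows how such a figure is usually computed from raw classifier scores: pick the lowest threshold that flags at most 1% of benign prompts, then measure the share of malicious prompts above it. Meta's benchmark data and exact procedure are not public, so the function and its inputs are purely illustrative.

```python
import math


def recall_at_fpr(benign_scores, malicious_scores, max_fpr=0.01):
    """Return (recall, threshold), flagging a prompt when score > threshold."""
    # Number of benign prompts we are allowed to flag by mistake.
    allowed = math.floor(max_fpr * len(benign_scores) + 1e-9)
    ranked = sorted(benign_scores, reverse=True)
    # Using the (allowed + 1)-th highest benign score as the threshold means
    # at most `allowed` benign prompts score strictly above it.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    flagged = sum(1 for score in malicious_scores if score > threshold)
    return flagged / len(malicious_scores), threshold
```

A model with high AUC can still have low recall at a strict false-positive budget, which is why the table reports both columns.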
AgentDojo real-world attack risk reduction:
| Model | APR @ 3% utility reduction |
|---|---|
| Llama Prompt Guard 2 86M | 81.2% |
| Llama Prompt Guard 2 22M | 78.4% |
| ProtectAI | 22.2% |
| Deepset | 13.5% |
License
This model is distributed under the Llama 4 Community License. See USE_POLICY.md for the acceptable use policy.
The DeBERTa-xsmall base model is MIT-licensed (Microsoft).
Links
- shisad: github.com/shisa-ai/shisad
- Source model: meta-llama/Llama-Prompt-Guard-2-22M
- Prompt Guard cookbook: llama-cookbook/prompt_guard
- Report vulnerabilities: PurpleLlama