AANA PIIMB Policy Baseline
This repository documents a zero-parameter AANA policy baseline submitted to the PIIMB PII Masking Benchmark. It is not a trained transformer checkpoint; it is an explicit, verifier-grounded privacy architecture with five components:
f_theta: deterministic PII span proposal detectors.
E_phi: offset validity, PII-pattern, and overlap verifiers.
R: local evidence spans from benchmark text only.
Pi_psi: edge trimming and overlap merge correction.
G: publish only PIIMB-compatible public score files and bounded claims.
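The components above can be sketched as a small pipeline. This is only an illustrative sketch, not the baseline's actual code: the two regexes, the trimmed character set, and every function name (`propose`, `trim`, `merge`, `verify`, `mask_spans`) are hypothetical stand-ins chosen to show how proposal, correction, and verification might fit together.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive

# Hypothetical detectors: two illustrative regexes, not the baseline's real rules.
DETECTORS = [
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),  # email-like strings
    re.compile(r"\+?\d[\d\s().-]{7,}\d"),         # phone-like digit runs
]


def propose(text: str) -> List[Span]:
    """f_theta: deterministic span proposals from every detector."""
    return [m.span() for rx in DETECTORS for m in rx.finditer(text)]


def trim(text: str, span: Span) -> Span:
    """Pi_psi, step 1: strip whitespace and punctuation from span edges."""
    start, end = span
    junk = " \t\n.,;:"  # assumed edge characters, for illustration only
    while start < end and text[start] in junk:
        start += 1
    while end > start and text[end - 1] in junk:
        end -= 1
    return start, end


def merge(spans: List[Span]) -> List[Span]:
    """Pi_psi, step 2: merge spans that overlap."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def verify(text: str, spans: List[Span]) -> bool:
    """E_phi: spans are in bounds, non-empty, sorted, and non-overlapping."""
    prev_end = 0
    for start, end in spans:
        if not (prev_end <= start < end <= len(text)):
            return False
        prev_end = end
    return True


def mask_spans(text: str) -> List[Span]:
    """Propose, correct, then verify; only verified spans are returned."""
    spans = [trim(text, s) for s in propose(text)]
    spans = merge([s for s in spans if s[0] < s[1]])
    if not verify(text, spans):
        raise ValueError("span verification failed")
    return spans
```

Calling `mask_spans("Mail jane.doe@example.com or call +1 (555) 010-9999.")` yields two spans covering the email-like and phone-like substrings. No ground truth is consulted at any point, which mirrors the ground-truth-free prediction described for this run.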
PIIMB Result
Run date: 2026-05-07
Dataset: piimb/pii-masking-benchmark
Dataset revision: df8299e90ff053fa6fd1d3678f6693a454f4ecc0
Subset: sentences
Metric package/schema: PIIMB 0.2.0
Average masking F2: 0.5195345497
Per-source masking F2:
| Source dataset | F2 |
|---|---|
| ai4privacy/pii-masking-openpii-1m | 0.4607297256 |
| gretelai/gretel-pii-masking-en-v1 | 0.5577027155 |
| nvidia/Nemotron-PII | 0.5905765837 |
| piimb/privy | 0.4691291740 |
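The reported average is consistent with an unweighted mean of the four per-source scores. That relationship is inferred from the numbers themselves, not from PIIMB documentation, so treat the following as a sanity check and not a description of the metric package's internals.

```python
from statistics import mean

# Per-source masking F2 scores copied from the table above.
per_source_f2 = {
    "ai4privacy/pii-masking-openpii-1m": 0.4607297256,
    "gretelai/gretel-pii-masking-en-v1": 0.5577027155,
    "nvidia/Nemotron-PII": 0.5905765837,
    "piimb/privy": 0.4691291740,
}

# Assumption: the average is a plain (unweighted) mean over source datasets.
average_f2 = mean(per_source_f2.values())
print(f"{average_f2:.10f}")  # 0.5195345497
```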
The run uses PIIMB's official test split and metric implementation. Prediction is ground-truth-free; ground truth is used only by PIIMB's metric code after spans are generated.
Scope
This is a bounded benchmark submission for comparing an explicit AANA masking policy architecture against PIIMB. It does not claim state-of-the-art performance, guaranteed PII removal, or production readiness for regulated workflows.
Scores on strict or type-aware NER metrics are expected to be lower, because this baseline optimizes for broad masking coverage and offset correctness and does not try to reproduce each source dataset's exact entity taxonomy.
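To illustrate why a coverage-first policy suits an F2 metric, the sketch below uses the standard F-beta formula with beta = 2, which weights recall more heavily than precision. The precision and recall values are made up for illustration and are not results from this run; the unit PIIMB counts over (characters, tokens, or spans) is not stated here and is left out of the sketch.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score; beta > 1 favors recall over precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical operating points (not measured values):
broad = f_beta(precision=0.4, recall=0.8)   # over-masks, misses little
narrow = f_beta(precision=0.8, recall=0.4)  # masks precisely, misses more

print(f"broad coverage F2:  {broad:.4f}")
print(f"narrow coverage F2: {narrow:.4f}")
```

Under F1 the two operating points would score the same, while under F2 the broad-coverage point scores higher.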
Reproduction
The runner is maintained in the AANA repository as scripts/aana_piimb_eval.py.
It writes PIIMB-compatible result files under:
results/mindbomber__aana-piimb-policy-baseline__policy-v1/