AANA PIIMB Policy Baseline

This repository documents a zero-parameter AANA policy baseline submitted to the PIIMB PII Masking Benchmark. It is not a trained transformer checkpoint but an explicit, verifier-grounded privacy architecture with five components:

  • f_theta: deterministic PII span proposal detectors.
  • E_phi: offset validity, PII-pattern, and overlap verifiers.
  • R: local evidence spans from benchmark text only.
  • Pi_psi: edge trimming and overlap merge correction.
  • G: publish only PIIMB-compatible public score files and bounded claims.
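
As a rough illustration of how the first four components could fit together, here is a minimal Python sketch. It is not the baseline's actual code: the names Span, DETECTORS, propose, verify, trim, merge and mask_spans are hypothetical, and the two regular expressions are illustrative stand-ins for the real detectors, which this card does not list. The sketch reads only the input text, in the spirit of R, and leaves the publishing gate G out.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


# f_theta (assumed): deterministic detectors. These regexes are illustrative only.
DETECTORS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

EDGE_CHARS = " \t\n.,;:"


def propose(text):
    """f_theta: propose candidate spans from every detector."""
    return [
        Span(m.start(), m.end(), label)
        for label, pattern in DETECTORS.items()
        for m in pattern.finditer(text)
    ]


def trim(text, span):
    """Pi_psi, step 1: strip whitespace and punctuation from span edges."""
    start, end = span.start, span.end
    while start < end and text[start] in EDGE_CHARS:
        start += 1
    while end > start and text[end - 1] in EDGE_CHARS:
        end -= 1
    return Span(start, end, span.label)


def verify(text, span):
    """E_phi: check offset validity, then re-check the PII pattern."""
    if not (0 <= span.start < span.end <= len(text)):
        return False
    return DETECTORS[span.label].fullmatch(text[span.start:span.end]) is not None


def merge(spans):
    """Pi_psi, step 2: merge overlapping spans, keeping the first label."""
    merged = []
    for span in sorted(spans, key=lambda s: (s.start, s.end)):
        if merged and span.start < merged[-1].end:
            last = merged[-1]
            merged[-1] = Span(last.start, max(last.end, span.end), last.label)
        else:
            merged.append(span)
    return merged


def mask_spans(text):
    """Propose, trim, verify, and merge; uses no ground truth."""
    candidates = [trim(text, s) for s in propose(text)]
    return merge([s for s in candidates if verify(text, s)])


text = "Mail jane.doe@example.com or call +1 (555) 010-9999."
spans = mask_spans(text)
for s in spans:
    print(s.label, repr(text[s.start:s.end]))
```

On the sample sentence this should yield two spans, one covering the email address and one covering the phone number. Verification runs after trimming so that a span whose edges were cut is re-checked against its pattern before it is merged.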

PIIMB Result

Run date: 2026-05-07

Dataset: piimb/pii-masking-benchmark

Dataset revision: df8299e90ff053fa6fd1d3678f6693a454f4ecc0

Subset: sentences

Metric package/schema: PIIMB 0.2.0

Average masking F2: 0.5195345497

Per-source masking F2:

Source dataset                       F2
ai4privacy/pii-masking-openpii-1m    0.4607297256
gretelai/gretel-pii-masking-en-v1    0.5577027155
nvidia/Nemotron-PII                  0.5905765837
piimb/privy                          0.4691291740
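
The reported average is consistent with an unweighted mean of the four per-source scores. The short check below reproduces it from the numbers in this card; that PIIMB defines its average as a macro average over sources is an assumption inferred from the arithmetic, not something stated here.

```python
from statistics import fmean

# Per-source masking F2 values copied from the results above.
per_source_f2 = {
    "ai4privacy/pii-masking-openpii-1m": 0.4607297256,
    "gretelai/gretel-pii-masking-en-v1": 0.5577027155,
    "nvidia/Nemotron-PII": 0.5905765837,
    "piimb/privy": 0.4691291740,
}

# Assumption: the average is an unweighted (macro) mean over sources.
average_f2 = fmean(per_source_f2.values())
print(f"{average_f2:.10f}")  # 0.5195345497
```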

The run uses PIIMB's official test split and metric implementation. Prediction is ground-truth-free; ground truth is used only by PIIMB's metric code after spans are generated.

Scope

This is a bounded benchmark submission for comparing an explicit AANA masking policy architecture against PIIMB. It does not claim state-of-the-art performance, guaranteed PII removal, or production readiness for regulated workflows.

Lower strict and type-level NER scores are expected, because this baseline optimizes for broad masking coverage and offset correctness rather than for matching each source dataset's exact entity taxonomy.
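
To see why broad coverage suits a masking F2 metric, the sketch below assumes that "F2" is the standard F-beta score with beta = 2, which weights recall more heavily than precision. How PIIMB counts matches (characters, tokens, or spans) is not stated in this card, and the precision and recall values used here are hypothetical, chosen only to show the asymmetry.

```python
def f_beta(precision, recall, beta=2.0):
    """Standard F-beta score; beta > 1 favors recall over precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical operating points, not measured results.
broad = f_beta(precision=0.5, recall=0.9)   # masks a lot, some false positives
narrow = f_beta(precision=0.9, recall=0.5)  # masks precisely, misses more PII

print(f"broad  F2 = {broad:.4f}")
print(f"narrow F2 = {narrow:.4f}")
```

Under this assumption the broad masker scores higher on F2 even though both operating points would have the same F1, which is the trade-off the baseline makes.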

Reproduction

The runner is maintained in the AANA repository as scripts/aana_piimb_eval.py. It writes PIIMB-compatible result files under:

results/mindbomber__aana-piimb-policy-baseline__policy-v1/
