mindbomber committed on
Commit d76d3b9 · verified · 1 Parent(s): 84d7758

Publish canonical AANA architecture model card

Files changed (3)
  1. README.md +169 -0
  2. aana_runtime_contract.json +43 -0
  3. benchmark_summary.json +49 -0
README.md ADDED
@@ -0,0 +1,169 @@
+ ---
+ license: mit
+ tags:
+ - aana
+ - alignment
+ - ai-safety
+ - llm-evaluation
+ - verifier
+ - correction-loop
+ - guardrails
+ - agent-safety
+ - pii
+ - piimb
+ datasets:
+ - piimb/pii-masking-benchmark
+ - truthfulqa/truthful_qa
+ metrics:
+ - accuracy
+ - f_beta
+ library_name: aana
+ pipeline_tag: text-classification
+ ---
+
+ # Alignment-Aware Neural Architecture (AANA)
+
+ AANA is a verifier-grounded runtime architecture for making AI and agent outputs
+ more correctable before they are published, sent, deployed, or used for
+ consequential actions.
+
+ It is not a standalone set of neural weights. AANA wraps a base generator or
+ specialist detector with explicit verifier, grounding, correction, and gate
+ components:
+
+ ```text
+ S = (f_theta, E_phi, R, Pi_psi, G)
+ ```
+
+ - `f_theta`: base generator, LLM, agent, tool planner, or specialist detector.
+ - `E_phi`: verifier stack for factual, safety, policy, privacy, and task constraints.
+ - `R`: retrieval or grounding module for evidence.
+ - `Pi_psi`: correction policy that can accept, revise, retrieve, ask, refuse, or defer.
+ - `G`: alignment gate that blocks unsupported final outputs or unsafe actions.
+
+ The goal is not to claim perfect alignment. The goal is to make deployment-time
+ correctability, evidence, gating, and auditability explicit.
+
+ ## Current Public Benchmark Signals
+
+ ### PIIMB: Presidio + AANA
+
+ Official PIIMB submission:
+ https://huggingface.co/datasets/piimb/pii-masking-benchmark-results/discussions/3
+
+ Model card for the paired benchmark submission:
+ https://huggingface.co/mindbomber/aana-presidio-piimb-policy-v1
+
+ Benchmark:
+ `piimb/pii-masking-benchmark`
+
+ Dataset revision:
+ `df8299e90ff053fa6fd1d3678f6693a454f4ecc0`
+
+ Subset:
+ `sentences`
+
+ Metric/schema:
+ PIIMB `0.2.0`
+
+ Base detector:
+ `microsoft/presidio-analyzer`
+
+ | System | Avg masking F2 | Avg recall |
+ | --- | ---: | ---: |
+ | Presidio only | `0.4492985573` | `0.4008557794` |
+ | Presidio + AANA | `0.5629171363` | `0.5159532273` |
+ | Delta | `+0.1136185790` | `+0.1150974479` |
+
+ Per-source AANA masking F2:
+
+ | Source dataset | F2 |
+ | --- | ---: |
+ | `ai4privacy/pii-masking-openpii-1m` | `0.4879480402` |
+ | `gretelai/gretel-pii-masking-en-v1` | `0.6281397502` |
+ | `nvidia/Nemotron-PII` | `0.6161414756` |
+ | `piimb/privy` | `0.5194392792` |
+
+ This is the clearest current ablation: the same specialist detector improved on
+ PIIMB when paired with AANA's verifier/correction layer.
+
+ ### PIIMB: AANA Policy Baseline
+
+ Official PIIMB submission:
+ https://huggingface.co/datasets/piimb/pii-masking-benchmark-results/discussions/2
+
+ Model card:
+ https://huggingface.co/mindbomber/aana-piimb-policy-baseline
+
+ Average masking F2:
+ `0.5195345497`
+
+ This is a zero-parameter deterministic policy baseline. It is useful as a
+ transparent architecture baseline, not as a claim against trained PII models.
+
+ ### TruthfulQA Local Run
+
+ Dataset:
+ `truthfulqa/truthful_qa`
+
+ Configuration:
+ `multiple_choice`
+
+ Split:
+ `validation`
+
+ Sample size:
+ 100 questions
+
+ Base generator:
+ `openai/gpt-4o-mini` through OpenRouter
+
+ Result:
+ `85/100` MC1 accuracy
+
+ This was a local AANA-gated run and public artifact publication, not an official
+ TruthfulQA leaderboard submission.
+
+ ## Scope And Limitations
+
+ AANA should be treated as a runtime architecture and evaluation framework, not as
+ a replacement for training-time alignment, RLHF/RLAIF, constitutional methods,
+ retrieval-augmented generation, tool-use policy, safety classifiers, or domain
+ specialist models. AANA can wrap and coordinate those components.
+
+ Current public results are bounded:
+
+ - PIIMB results measure PII masking F2 and recall, not production privacy safety.
+ - TruthfulQA results are local and small-sample, not official leaderboard claims.
+ - No result here claims state-of-the-art performance.
+ - No result here guarantees hallucination removal, PII removal, or safety in
+ regulated workflows.
+
+ Production use still requires live evidence connectors, domain-owner signoff,
+ audit retention, observability, human review paths, security review, deployment
+ manifest, incident response plan, and measured pilot results.
+
+ ## Repositories
+
+ Project repository:
+ https://github.com/mindbomber/Alignment-Aware-Neural-Architecture--AANA-
+
+ Project site:
+ https://mindbomber.github.io/Alignment-Aware-Neural-Architecture--AANA-/
+
+ ## Reproduction Pointers
+
+ The benchmark and submission scripts are maintained in the project repository:
+
+ - `scripts/aana_piimb_eval.py`
+ - `scripts/aana_piimb_presidio_eval.py`
+ - `scripts/aana_truthfulqa_eval.py`
+ - `scripts/aana_cli.py workflow-check`
+
+ The AANA publication gates for the PIIMB submissions passed with:
+
+ - `gate_decision=pass`
+ - `recommended_action=accept`
+ - `candidate_gate=pass`
+ - no hard blockers
+
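The README describes `S = (f_theta, E_phi, R, Pi_psi, G)` only at the architecture level. As a reading aid, here is a minimal sketch of how the five components might compose into a correction loop. Everything in it is an assumption for illustration: the `run_aana` function, the `Verdict` type, the callable signatures, and the toy components are hypothetical and are not taken from the AANA repository. Only the six action names and the `gate_decision` / `recommended_action` field names come from the commit.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Action names as listed for the correction policy in the README.
ALLOWED_ACTIONS = ("accept", "revise", "retrieve", "ask", "refuse", "defer")


@dataclass
class Verdict:
    """Hypothetical verifier output: violation codes, empty when clean."""
    violations: List[str] = field(default_factory=list)


def run_aana(
    prompt: str,
    f_theta: Callable[[str, List[str]], str],      # base generator
    e_phi: Callable[[str, List[str]], Verdict],    # verifier stack
    retrieve: Callable[[str], List[str]],          # R: grounding module
    pi_psi: Callable[[Verdict], str],              # correction policy
    max_rounds: int = 3,
) -> Dict[str, Optional[str]]:
    """Generate, verify, correct, and gate (hypothetical control flow)."""
    evidence: List[str] = []
    candidate = f_theta(prompt, evidence)
    for _ in range(max_rounds):
        verdict = e_phi(candidate, evidence)
        action = pi_psi(verdict)
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {action}")
        if action == "accept":
            # G: release the output only if the verifier found no violations.
            if not verdict.violations:
                return {"gate_decision": "pass",
                        "recommended_action": "accept",
                        "output": candidate}
            return {"gate_decision": "fail",
                    "recommended_action": "defer",
                    "output": None}
        if action in ("ask", "refuse", "defer"):
            # Terminal non-accept actions: nothing is released.
            return {"gate_decision": "fail",
                    "recommended_action": action,
                    "output": None}
        if action == "retrieve":
            evidence = evidence + retrieve(prompt)
        # Both "retrieve" and "revise" regenerate the candidate.
        candidate = f_theta(prompt, evidence)
    # Correction budget exhausted without an accepted candidate.
    return {"gate_decision": "fail", "recommended_action": "defer", "output": None}


# Toy components, purely illustrative.
def toy_generator(prompt: str, evidence: List[str]) -> str:
    return "Paris" if evidence else "Lyon"


def toy_verifier(candidate: str, evidence: List[str]) -> Verdict:
    return Verdict([] if evidence else ["unsupported_claim"])


def toy_retriever(prompt: str) -> List[str]:
    return ["Paris is the capital of France."]


def toy_policy(verdict: Verdict) -> str:
    return "retrieve" if verdict.violations else "accept"


result = run_aana("What is the capital of France?",
                  toy_generator, toy_verifier, toy_retriever, toy_policy)
print(result)
```

In this toy run the first candidate is flagged as unsupported, the policy asks for retrieval, and the regenerated candidate passes the gate on the second round. The sketch releases an output only on the `accept` path, which is one plausible reading of "blocks unsupported final outputs"; the real gate logic may differ.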
aana_runtime_contract.json ADDED
@@ -0,0 +1,43 @@
+ {
+   "name": "Alignment-Aware Neural Architecture",
+   "short_name": "AANA",
+   "version": "0.1",
+   "type": "runtime_architecture",
+   "system": {
+     "f_theta": "base generator, LLM, agent, tool planner, or specialist detector",
+     "E_phi": "verifier stack",
+     "R": "retrieval or grounding module",
+     "Pi_psi": "correction policy",
+     "G": "alignment gate"
+   },
+   "allowed_actions": [
+     "accept",
+     "revise",
+     "retrieve",
+     "ask",
+     "refuse",
+     "defer"
+   ],
+   "gate_requirements_for_publication": {
+     "gate_decision": "pass",
+     "recommended_action": "accept",
+     "candidate_gate": "pass",
+     "aix_hard_blockers": []
+   },
+   "audit_metadata": [
+     "adapter",
+     "gate_decision",
+     "recommended_action",
+     "candidate_gate",
+     "aix_score",
+     "aix_decision",
+     "hard_blockers",
+     "violation_codes",
+     "input_fingerprints"
+   ],
+   "notes": [
+     "AANA externalizes verifier, grounding, correction, and gate behavior as runtime components.",
+     "AANA can wrap frontier LLMs, smaller language models, specialist detectors, retrieval systems, or agent tool planners.",
+     "AANA does not replace model training, post-training alignment, safety classifiers, or domain-specific validation."
+   ]
+ }
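The contract's `gate_requirements_for_publication` block reads like a set of exact-match conditions on an audit record. The sketch below shows one way such a check could be written. It assumes, without confirmation from the commit, that an audit record is a flat dictionary using the same keys as the requirements block; note that the contract's `audit_metadata` list names the field `hard_blockers` while the requirements use `aix_hard_blockers`, so the real mapping may differ. The function name `publication_gate_failures` is hypothetical.

```python
from typing import Any, Dict, List

# Copied from "gate_requirements_for_publication" in the contract.
GATE_REQUIREMENTS: Dict[str, Any] = {
    "gate_decision": "pass",
    "recommended_action": "accept",
    "candidate_gate": "pass",
    "aix_hard_blockers": [],
}

_MISSING = object()  # sentinel so a missing key never equals an expected value


def publication_gate_failures(
    record: Dict[str, Any],
    requirements: Dict[str, Any] = GATE_REQUIREMENTS,
) -> List[str]:
    """Return one message per unmet requirement; an empty list means pass."""
    failures = []
    for key, expected in requirements.items():
        actual = record.get(key, _MISSING)
        if actual is _MISSING:
            failures.append(f"{key}: missing from audit record")
        elif actual != expected:
            failures.append(f"{key}: expected {expected!r}, got {actual!r}")
    return failures


# Hypothetical audit records for illustration only.
passing_record = {
    "gate_decision": "pass",
    "recommended_action": "accept",
    "candidate_gate": "pass",
    "aix_hard_blockers": [],
}
blocked_record = {
    "gate_decision": "pass",
    "recommended_action": "revise",
    "candidate_gate": "pass",
    "aix_hard_blockers": ["unsupported_claim"],
}

print(publication_gate_failures(passing_record))
print(publication_gate_failures(blocked_record))
```

Returning the list of unmet requirements, rather than a bare boolean, keeps the reason for a blocked publication available for audit logging.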
benchmark_summary.json ADDED
@@ -0,0 +1,49 @@
+ {
+   "model_id": "mindbomber/aana",
+   "architecture": "Alignment-Aware Neural Architecture",
+   "system_model": "S = (f_theta, E_phi, R, Pi_psi, G)",
+   "results": [
+     {
+       "benchmark": "PIIMB",
+       "submission": "https://huggingface.co/datasets/piimb/pii-masking-benchmark-results/discussions/3",
+       "model_card": "https://huggingface.co/mindbomber/aana-presidio-piimb-policy-v1",
+       "dataset": "piimb/pii-masking-benchmark",
+       "dataset_revision": "df8299e90ff053fa6fd1d3678f6693a454f4ecc0",
+       "subset": "sentences",
+       "base_detector": "microsoft/presidio-analyzer",
+       "base_average_masking_f2": 0.4492985573,
+       "aana_average_masking_f2": 0.5629171363,
+       "delta_average_masking_f2": 0.113618579,
+       "base_average_recall": 0.4008557794,
+       "aana_average_recall": 0.5159532273,
+       "delta_average_recall": 0.1150974479,
+       "scope": "official PIIMB submission showing AANA verifier/correction gain over the same specialist detector"
+     },
+     {
+       "benchmark": "PIIMB",
+       "submission": "https://huggingface.co/datasets/piimb/pii-masking-benchmark-results/discussions/2",
+       "model_card": "https://huggingface.co/mindbomber/aana-piimb-policy-baseline",
+       "dataset": "piimb/pii-masking-benchmark",
+       "dataset_revision": "df8299e90ff053fa6fd1d3678f6693a454f4ecc0",
+       "subset": "sentences",
+       "aana_average_masking_f2": 0.5195345497,
+       "scope": "official PIIMB submission for a zero-parameter deterministic policy baseline"
+     },
+     {
+       "benchmark": "TruthfulQA",
+       "dataset": "truthfulqa/truthful_qa",
+       "configuration": "multiple_choice",
+       "split": "validation",
+       "sample_size": 100,
+       "base_generator": "openai/gpt-4o-mini",
+       "mc1_accuracy": 0.85,
+       "scope": "local AANA-gated run and public artifact publication, not an official leaderboard submission"
+     }
+   ],
+   "claim_limits": [
+     "AANA is a runtime architecture, not a standalone neural-weight checkpoint.",
+     "Current public results do not claim state-of-the-art performance.",
+     "Current public results do not guarantee hallucination removal, PII removal, or production safety.",
+     "Production readiness requires external deployment evidence beyond local benchmark results."
+   ]
+ }
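The reported figures can be cross-checked with a few lines of arithmetic. The sketch below recomputes the deltas and compares the mean of the four per-source F2 values from the README with the reported average. That the average is an unweighted mean over the four source datasets is an inference from the numbers, not something the commit states. The `f_beta` helper is the standard F-beta definition (beta = 2 weights recall more heavily than precision); PIIMB's exact "masking F2" computation may differ in detail, and the helper is included only to make the metric concrete.

```python
import math


def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score; returns 0.0 when both inputs are zero."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Per-source AANA masking F2, as listed in the README table.
per_source_f2 = {
    "ai4privacy/pii-masking-openpii-1m": 0.4879480402,
    "gretelai/gretel-pii-masking-en-v1": 0.6281397502,
    "nvidia/Nemotron-PII": 0.6161414756,
    "piimb/privy": 0.5194392792,
}

# Values from the first PIIMB entry in benchmark_summary.json.
base_f2, aana_f2, delta_f2 = 0.4492985573, 0.5629171363, 0.113618579
base_recall, aana_recall, delta_recall = 0.4008557794, 0.5159532273, 0.1150974479

# Unweighted mean across sources (assumed aggregation, see lead-in).
macro_f2 = sum(per_source_f2.values()) / len(per_source_f2)

checks = {
    "macro_f2_matches_reported_average": math.isclose(macro_f2, aana_f2, abs_tol=1e-9),
    "f2_delta_consistent": math.isclose(aana_f2 - base_f2, delta_f2, abs_tol=1e-9),
    "recall_delta_consistent": math.isclose(aana_recall - base_recall, delta_recall, abs_tol=1e-9),
    "truthfulqa_mc1_consistent": math.isclose(85 / 100, 0.85),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

All four checks hold for the values in this commit, so the README tables and `benchmark_summary.json` agree with each other to the printed precision.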