Publish canonical AANA architecture model card

Files changed (3)

README.md +169 -0
aana_runtime_contract.json +43 -0
benchmark_summary.json +49 -0

README.md ADDED

	@@ -0,0 +1,169 @@

+---
+license: mit
+tags:
+- aana
+- alignment
+- ai-safety
+- llm-evaluation
+- verifier
+- correction-loop
+- guardrails
+- agent-safety
+- pii
+- piimb
+datasets:
+- piimb/pii-masking-benchmark
+- truthfulqa/truthful_qa
+metrics:
+- accuracy
+- f_beta
+library_name: aana
+pipeline_tag: text-classification
+---
+# Alignment-Aware Neural Architecture (AANA)
+AANA is a verifier-grounded runtime architecture for making AI and agent outputs
+more correctable before they are published, sent, deployed, or used for
+consequential actions.
+It is not a standalone set of neural weights. AANA wraps a base generator or
+specialist detector with explicit verifier, grounding, correction, and gate
+components:
+```text
+S = (f_theta, E_phi, R, Pi_psi, G)
+```
+- `f_theta`: base generator, LLM, agent, tool planner, or specialist detector.
+- `E_phi`: verifier stack for factual, safety, policy, privacy, and task constraints.
+- `R`: retrieval or grounding module for evidence.
+- `Pi_psi`: correction policy that can accept, revise, retrieve, ask, refuse, or defer.
+- `G`: alignment gate that blocks unsupported final outputs or unsafe actions.
+The goal is not to claim perfect alignment. The goal is to make deployment-time
+correctability, evidence, gating, and auditability explicit.
+## Current Public Benchmark Signals
+### PIIMB: Presidio + AANA
+Official PIIMB submission:
+https://huggingface.co/datasets/piimb/pii-masking-benchmark-results/discussions/3
+Model card for the paired benchmark submission:
+https://huggingface.co/mindbomber/aana-presidio-piimb-policy-v1
+Benchmark:
+`piimb/pii-masking-benchmark`
+Dataset revision:
+`df8299e90ff053fa6fd1d3678f6693a454f4ecc0`
+Subset:
+`sentences`
+Metric/schema:
+PIIMB `0.2.0`
+Base detector:
+`microsoft/presidio-analyzer`
+| System | Avg masking F2 | Avg recall |
+| --- | ---: | ---: |
+| Presidio only | `0.4492985573` | `0.4008557794` |
+| Presidio + AANA | `0.5629171363` | `0.5159532273` |
+| Delta | `+0.1136185790` | `+0.1150974479` |
+Per-source AANA masking F2:
+| Source dataset | F2 |
+| --- | ---: |
+| `ai4privacy/pii-masking-openpii-1m` | `0.4879480402` |
+| `gretelai/gretel-pii-masking-en-v1` | `0.6281397502` |
+| `nvidia/Nemotron-PII` | `0.6161414756` |
+| `piimb/privy` | `0.5194392792` |
+This is the clearest current ablation: with the same specialist detector, adding
+AANA's verifier/correction layer raised average masking F2 on PIIMB from 0.449 to 0.563.
+### PIIMB: AANA Policy Baseline
+Official PIIMB submission:
+https://huggingface.co/datasets/piimb/pii-masking-benchmark-results/discussions/2
+Model card:
+https://huggingface.co/mindbomber/aana-piimb-policy-baseline
+Average masking F2:
+`0.5195345497`
+This is a zero-parameter deterministic policy baseline. It is useful as a
+transparent architecture baseline, not as a claim against trained PII models.
+### TruthfulQA Local Run
+Dataset:
+`truthfulqa/truthful_qa`
+Configuration:
+`multiple_choice`
+Split:
+`validation`
+Sample size:
+100 questions
+Base generator:
+`openai/gpt-4o-mini` through OpenRouter
+Result:
+`85/100` MC1 accuracy
+This was a local AANA-gated run with publicly published artifacts, not an official
+TruthfulQA leaderboard submission.
+## Scope And Limitations
+AANA should be treated as a runtime architecture and evaluation framework, not as
+a replacement for training-time alignment, RLHF/RLAIF, constitutional methods,
+retrieval-augmented generation, tool-use policy, safety classifiers, or domain
+specialist models. AANA can wrap and coordinate those components.
+Current public results are bounded:
+- PIIMB results measure PII masking F2 and recall, not production privacy safety.
+- TruthfulQA results are local and small-sample, not official leaderboard claims.
+- No result here claims state-of-the-art performance.
+- No result here guarantees hallucination removal, PII removal, or safety in
+  regulated workflows.
+Production use still requires live evidence connectors, domain-owner signoff,
+audit retention, observability, human review paths, a security review, a deployment
+manifest, an incident response plan, and measured pilot results.
+## Repositories
+Project repository:
+https://github.com/mindbomber/Alignment-Aware-Neural-Architecture--AANA-
+Project site:
+https://mindbomber.github.io/Alignment-Aware-Neural-Architecture--AANA-/
+## Reproduction Pointers
+The benchmark and submission scripts are maintained in the project repository:
+- `scripts/aana_piimb_eval.py`
+- `scripts/aana_piimb_presidio_eval.py`
+- `scripts/aana_truthfulqa_eval.py`
+- `scripts/aana_cli.py workflow-check`
+The AANA publication gates for the PIIMB submissions passed with:
+- `gate_decision=pass`
+- `recommended_action=accept`
+- `candidate_gate=pass`
+- no hard blockers
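
Illustrative sketch, not part of the committed files: the README describes the system as `S = (f_theta, E_phi, R, Pi_psi, G)` with six correction actions, and a small control loop may make that wiring easier to picture. Everything below is an assumption for illustration only: the `AanaSystem` class, the callable signatures, the `max_rounds` limit, and the toy components are hypothetical and do not come from the AANA repository.

```python
from dataclasses import dataclass
from typing import Callable, List

# Action names taken from the model card; everything else here is hypothetical.
ACTIONS = ("accept", "revise", "retrieve", "ask", "refuse", "defer")


@dataclass
class AanaSystem:
    f_theta: Callable[[str, List[str]], str]      # base generator or detector
    e_phi: Callable[[str, List[str]], List[str]]  # verifier stack: returns violation codes
    r: Callable[[str], List[str]]                 # retrieval / grounding module
    pi_psi: Callable[[List[str], int], str]       # correction policy: picks an action
    max_rounds: int = 3                           # assumed bound on correction rounds

    def run(self, prompt: str) -> dict:
        evidence: List[str] = []
        violations: List[str] = []
        for round_no in range(self.max_rounds):
            candidate = self.f_theta(prompt, evidence)
            violations = self.e_phi(candidate, evidence)
            action = self.pi_psi(violations, round_no)
            if action not in ACTIONS:
                raise ValueError(f"unknown action: {action}")
            if action == "retrieve":
                # Gather evidence, then regenerate on the next round.
                evidence = evidence + self.r(prompt)
                continue
            if action == "revise":
                # Regenerate without new evidence.
                continue
            # Gate G: release the candidate only on "accept" with no violations.
            released = action == "accept" and not violations
            return {
                "action": action,
                "output": candidate if released else None,
                "violations": violations,
            }
        # Out of rounds: defer rather than release an unverified candidate.
        return {"action": "defer", "output": None, "violations": violations}


# Toy components, purely for demonstration.
def toy_generator(prompt: str, evidence: List[str]) -> str:
    return "Paris" if evidence else "unsure"


def toy_verifier(candidate: str, evidence: List[str]) -> List[str]:
    return [] if evidence else ["unsupported_claim"]


def toy_retriever(prompt: str) -> List[str]:
    return ["Paris is the capital of France."]


def toy_policy(violations: List[str], round_no: int) -> str:
    return "retrieve" if violations else "accept"


system = AanaSystem(toy_generator, toy_verifier, toy_retriever, toy_policy)
result = system.run("What is the capital of France?")
print(result)
```

In this toy run the first candidate is unsupported, so the policy asks for retrieval, and the second candidate passes the verifier and is released. Deferring when the round budget runs out is one possible design choice; the actual AANA runtime may handle exhaustion differently.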

aana_runtime_contract.json ADDED

	@@ -0,0 +1,43 @@

+{
+  "name": "Alignment-Aware Neural Architecture",
+  "short_name": "AANA",
+  "version": "0.1",
+  "type": "runtime_architecture",
+  "system": {
+    "f_theta": "base generator, LLM, agent, tool planner, or specialist detector",
+    "E_phi": "verifier stack",
+    "R": "retrieval or grounding module",
+    "Pi_psi": "correction policy",
+    "G": "alignment gate"
+  },
+  "allowed_actions": [
+    "accept",
+    "revise",
+    "retrieve",
+    "ask",
+    "refuse",
+    "defer"
+  ],
+  "gate_requirements_for_publication": {
+    "gate_decision": "pass",
+    "recommended_action": "accept",
+    "candidate_gate": "pass",
+    "aix_hard_blockers": []
+  },
+  "audit_metadata": [
+    "adapter",
+    "gate_decision",
+    "recommended_action",
+    "candidate_gate",
+    "aix_score",
+    "aix_decision",
+    "hard_blockers",
+    "violation_codes",
+    "input_fingerprints"
+  ],
+  "notes": [
+    "AANA externalizes verifier, grounding, correction, and gate behavior as runtime components.",
+    "AANA can wrap frontier LLMs, smaller language models, specialist detectors, retrieval systems, or agent tool planners.",
+    "AANA does not replace model training, post-training alignment, safety classifiers, or domain-specific validation."
+  ]
+}
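
Illustrative sketch, not part of the committed files: one way a caller might check an audit record against `gate_requirements_for_publication` from the contract above. The `gate_failures` helper and the shape of the audit record are assumptions; the contract does not specify how the check is implemented, and it names the blocker field `aix_hard_blockers` in the requirements but `hard_blockers` in `audit_metadata`, so this sketch simply uses the requirement keys as written.

```python
from typing import Any, Dict, List

# Copied from "gate_requirements_for_publication" in aana_runtime_contract.json.
GATE_REQUIREMENTS: Dict[str, Any] = {
    "gate_decision": "pass",
    "recommended_action": "accept",
    "candidate_gate": "pass",
    "aix_hard_blockers": [],
}


def gate_failures(record: Dict[str, Any],
                  requirements: Dict[str, Any] = GATE_REQUIREMENTS) -> List[str]:
    """Return the requirement keys that the record does not satisfy.

    Hypothetical helper: a missing key counts as a failure, and a
    non-empty blocker list fails because it differs from [].
    """
    return [key for key, expected in requirements.items()
            if record.get(key) != expected]


# Hypothetical audit records for demonstration.
passing = {
    "gate_decision": "pass",
    "recommended_action": "accept",
    "candidate_gate": "pass",
    "aix_hard_blockers": [],
}
blocked = dict(passing, recommended_action="revise",
               aix_hard_blockers=["pii_leak"])

print(gate_failures(passing))
print(gate_failures(blocked))
```

Returning the failing keys rather than a bare boolean keeps the reason for a blocked publication visible in logs; an empty list means every requirement matched.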

benchmark_summary.json ADDED

	@@ -0,0 +1,49 @@

+{
+  "model_id": "mindbomber/aana",
+  "architecture": "Alignment-Aware Neural Architecture",
+  "system_model": "S = (f_theta, E_phi, R, Pi_psi, G)",
+  "results": [
+    {
+      "benchmark": "PIIMB",
+      "submission": "https://huggingface.co/datasets/piimb/pii-masking-benchmark-results/discussions/3",
+      "model_card": "https://huggingface.co/mindbomber/aana-presidio-piimb-policy-v1",
+      "dataset": "piimb/pii-masking-benchmark",
+      "dataset_revision": "df8299e90ff053fa6fd1d3678f6693a454f4ecc0",
+      "subset": "sentences",
+      "base_detector": "microsoft/presidio-analyzer",
+      "base_average_masking_f2": 0.4492985573,
+      "aana_average_masking_f2": 0.5629171363,
+      "delta_average_masking_f2": 0.113618579,
+      "base_average_recall": 0.4008557794,
+      "aana_average_recall": 0.5159532273,
+      "delta_average_recall": 0.1150974479,
+      "scope": "official PIIMB submission showing AANA verifier/correction gain over the same specialist detector"
+    },
+    {
+      "benchmark": "PIIMB",
+      "submission": "https://huggingface.co/datasets/piimb/pii-masking-benchmark-results/discussions/2",
+      "model_card": "https://huggingface.co/mindbomber/aana-piimb-policy-baseline",
+      "dataset": "piimb/pii-masking-benchmark",
+      "dataset_revision": "df8299e90ff053fa6fd1d3678f6693a454f4ecc0",
+      "subset": "sentences",
+      "aana_average_masking_f2": 0.5195345497,
+      "scope": "official PIIMB submission for a zero-parameter deterministic policy baseline"
+    },
+    {
+      "benchmark": "TruthfulQA",
+      "dataset": "truthfulqa/truthful_qa",
+      "configuration": "multiple_choice",
+      "split": "validation",
+      "sample_size": 100,
+      "base_generator": "openai/gpt-4o-mini",
+      "mc1_accuracy": 0.85,
+      "scope": "local AANA-gated run and public artifact publication, not an official leaderboard submission"
+    }
+  ],
+  "claim_limits": [
+    "AANA is a runtime architecture, not a standalone neural-weight checkpoint.",
+    "Current public results do not claim state-of-the-art performance.",
+    "Current public results do not guarantee hallucination removal, PII removal, or production safety.",
+    "Production readiness requires external deployment evidence beyond local benchmark results."
+  ]
+}
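
Illustrative sketch, not part of the committed files: a quick consistency check on the reported numbers, confirming that each `delta_*` field equals the AANA value minus the base value and that `85/100` matches `mc1_accuracy`. The values are copied from `benchmark_summary.json` above; the `delta_consistent` helper and the `1e-9` tolerance are assumptions chosen for this sketch.

```python
import math

# Values copied from the first PIIMB entry in benchmark_summary.json.
piimb = {
    "base_average_masking_f2": 0.4492985573,
    "aana_average_masking_f2": 0.5629171363,
    "delta_average_masking_f2": 0.113618579,
    "base_average_recall": 0.4008557794,
    "aana_average_recall": 0.5159532273,
    "delta_average_recall": 0.1150974479,
}


def delta_consistent(base: float, new: float, delta: float,
                     tol: float = 1e-9) -> bool:
    """True when the reported delta matches new - base within tol."""
    return abs((new - base) - delta) <= tol


f2_ok = delta_consistent(piimb["base_average_masking_f2"],
                         piimb["aana_average_masking_f2"],
                         piimb["delta_average_masking_f2"])
recall_ok = delta_consistent(piimb["base_average_recall"],
                             piimb["aana_average_recall"],
                             piimb["delta_average_recall"])

# TruthfulQA entry: 85 correct out of a 100-question sample.
mc1_ok = math.isclose(85 / 100, 0.85)

print(f2_ok, recall_ok, mc1_ok)
```

A check of this kind only confirms that the summary is internally consistent; it does not reproduce the benchmark runs themselves.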