Spaces:

mindbomber
/

aana-demo

Sleeping

App Files Files Community

aana-demo / README.md

mindbomber

Sharpen reviewer guidance for AANA demo

d97269a verified 7 days ago

preview code

raw

history blame contribute delete

3.46 kB

	---
	title: AANA Demo
	sdk: gradio
	app_file: app.py
	license: mit
	python_version: "3.11"
	short_description: Try AANA's pre-action agent control layer.
	pinned: false
	---

	# Try AANA In 2 Minutes

	This local Hugging Face Space artifact is the repo-owned source for the public
	"try AANA" demo. It accepts the frozen Agent Action Contract v1 fields and
	returns the same route decision used by the Python SDK, FastAPI service, and MCP
	tool.

	AANA is a pre-action control layer for AI agents: agents propose actions, AANA checks evidence/auth/risk, and tools execute only when the route is accept.

	Core runtime pattern:

	```text
	agent proposes -> AANA checks -> tool executes only if route == accept
	```

	What this demonstrates: an agent proposes a tool call. AANA checks
	evidence/auth/risk. The tool only executes if the route is `accept`.

	How to test it: pick an example, click `Check With AANA`, then inspect the
	route and executor proof.

	Reviewer checklist:

	- `accept` allows execution
	- `ask`, `defer`, and `refuse` block execution
	- missing auth/evidence becomes a blocker
	- audit-safe event is emitted
	- a bad runtime recommendation can be overridden

	Contrast: a plain permissive agent would execute the proposed tool call. AANA
	blocks unless the contract is satisfied.

	This is the difference reviewers should inspect: AANA turns pre-tool safety into
	a typed contract, route table, hard execution rule, and audit-safe decision
	event instead of relying only on prompts, classifiers, LLM judges, or
	framework-specific middleware.

	Frozen required fields:

	- `tool_name`
	- `tool_category`
	- `authorization_state`
	- `evidence_refs`
	- `risk_domain`
	- `proposed_arguments`
	- `recommended_route`

	The Space calls `aana.check_tool_call` with these fields and displays:

	- route: `accept`, `ask`, `defer`, or `refuse`
	- AIx score
	- hard blockers
	- missing evidence
	- authorization state
	- recovery guidance
	- audit-safe log event
	- blocked-tool non-execution proof from a synthetic executor

	The synthetic executor is intentionally safe: it records that it would have run
	only when AANA returns `accept`. It cannot send, delete, purchase, deploy,
	export, or access private data.

	Public evidence links:

	- Model card: https://huggingface.co/mindbomber/aana
	- Peer-review evidence pack: https://huggingface.co/datasets/mindbomber/aana-peer-review-evidence-pack
	- Public artifact hub: https://huggingface.co/collections/mindbomber/aana-public-artifact-hub-69fecc99df04ae6ed6dbc6c4
	- Short technical report: https://github.com/mindbomber/Alignment-Aware-Neural-Architecture--AANA-/blob/master/docs/aana-pre-action-control-layer-technical-report.md

	Peer-review request:

	- Are routes correct?
	- Are false positives acceptable?
	- Is evidence handling sufficient?
	- Does this generalize beyond examples?

	Please post critique in the Space discussion:
	https://huggingface.co/spaces/mindbomber/aana-demo/discussions/1

	Current diagnostic boundary: safety/adversarial prompt routing is useful but
	incomplete, FinanceBench-style QA evidence routing is controlled and not an
	official leaderboard claim, and governance/compliance routing is diagnostic
	rather than legal, regulatory, or platform-policy certification.

	Integration validation v1 is now included in the evidence pack: held-out
	tool-call cases validate route parity, blocked-tool non-execution,
	decision-shape parity, audit completeness, and schema behavior across CLI, SDK,
	FastAPI, MCP, and middleware surfaces.