Spaces:

O96a
/

prefixguard-demo

Running

Upload README.md with huggingface_hub

78918e1 verified 1 day ago

1.35 kB

	---
	title: "PrefixGuard Demo - Agent Failure Detection"
	emoji: 🛡️
	colorFrom: blue
	colorTo: red
	sdk: gradio
	sdk_version: 4.36.0
	app_file: app.py
	pinned: false
	---

	# PrefixGuard Demo

	A minimal implementation demonstrating the core concept from "PrefixGuard: From LLM-Agent Traces to Online Failure-Warning Monitors" (Huang et al., 2026).

	## What it demonstrates

	PrefixGuard shows that we can predict agent task failures from early execution traces, not just final outcomes. This demo implements:

	- Trace encoding: Convert agent step sequences to feature vectors
	- Prefix risk scoring: Score risk from partial trace prefixes
	- Early warning: Alert before task completion when failure is likely

	## Hypothesis

	Agent execution traces contain early signals of eventual failure. A lightweight prefix-based monitor can predict failures with >0.7 AUPRC, enabling intervention before resources are wasted.

	## Key findings from paper

	- Best monitors achieve 0.900/0.710/0.533/0.557 AUPRC across WebArena, τ²-Bench, SkillsBench, TerminalBench
	- +0.137 AUPRC improvement over raw-text baselines
	- LLM judges are substantially weaker at prefix-time prediction

	## Implementation

	This demo uses synthetic agent traces to illustrate the approach. In production, this would integrate with actual agent frameworks (LangGraph, AutoGPT, etc.).