title: PrefixGuard Demo - Agent Failure Detection
emoji: 🛡️
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 4.36.0
app_file: app.py
pinned: false

PrefixGuard Demo

A minimal implementation demonstrating the core concept from "PrefixGuard: From LLM-Agent Traces to Online Failure-Warning Monitors" (Huang et al., 2026).

What it demonstrates

PrefixGuard shows that an agent's eventual task failure can be predicted from the early part of its execution trace, rather than only detected from the final outcome. This demo implements:

  • Trace encoding: Convert agent step sequences to feature vectors
  • Prefix risk scoring: Score risk from partial trace prefixes
  • Early warning: Alert before task completion when failure is likely
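The three components listed above can be sketched in a few lines of plain Python. This is a minimal illustration and not the code in app.py or the monitor from the paper: the `Step` fields, the three features, the hand-picked weights, and the 0.5 threshold are all assumptions chosen for readability.

```python
import math
from dataclasses import dataclass


@dataclass
class Step:
    action: str  # hypothetical field: name of the action the agent took
    ok: bool     # hypothetical field: True if the step returned without error


def encode_prefix(steps):
    """Trace encoding: map a partial trace to a fixed-length feature vector."""
    n = len(steps)
    if n == 0:
        return [0.0, 0.0, 0.0]
    # Fraction of steps in the prefix that errored.
    error_rate = sum(not s.ok for s in steps) / n
    # Steps that repeat the immediately preceding action (a crude loop signal),
    # normalized by prefix length.
    repeats = sum(a.action == b.action for a, b in zip(steps, steps[1:]))
    # Length of the trailing run of failed steps, normalized by prefix length.
    tail = 0
    for s in reversed(steps):
        if s.ok:
            break
        tail += 1
    return [error_rate, repeats / n, tail / n]


# Illustrative weights only: hand-picked, not learned and not from the paper.
WEIGHTS = [2.5, 1.5, 2.0]
BIAS = -2.0


def risk_score(steps):
    """Prefix risk scoring: logistic score in (0, 1) for a partial trace."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, encode_prefix(steps)))
    return 1.0 / (1.0 + math.exp(-z))


def first_warning(trace, threshold=0.5):
    """Early warning: 1-based step where risk first reaches the threshold, else None."""
    for k in range(1, len(trace) + 1):
        if risk_score(trace[:k]) >= threshold:
            return k
    return None


healthy = [Step("open", True), Step("click", True), Step("type", True), Step("submit", True)]
failing = [Step("open", True), Step("click", False), Step("click", False), Step("click", False)]
print(first_warning(healthy), first_warning(failing))
```

With these assumed weights, the healthy trace never triggers a warning, while the failing trace is flagged partway through, before its last step. A real monitor would learn its scoring function from labeled traces instead of using fixed weights.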

Hypothesis

Agent execution traces contain early signals of eventual failure. A lightweight prefix-based monitor can predict failures with >0.7 AUPRC, enabling intervention before resources are wasted.

Key findings from paper

  • The best monitors achieve AUPRC of 0.900, 0.710, 0.533, and 0.557 on WebArena, τ²-Bench, SkillsBench, and TerminalBench, respectively
  • +0.137 AUPRC improvement over raw-text baselines
  • LLM judges are substantially weaker at prefix-time prediction
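For readers unfamiliar with the metric, AUPRC (area under the precision-recall curve) is commonly estimated as average precision. The sketch below shows that estimate in plain Python. It assumes labels of 1 for failed runs and 0 for successful ones, assumes no tied scores, and is not necessarily the exact evaluation procedure used in the paper.

```python
def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    labels: 1 = run failed (positive class), 0 = run succeeded.
    scores: predicted risk, higher means more likely to fail.
    Assumes scores are not tied; ties are ranked in input order here.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one failed (positive) example")
    # Rank examples from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            ap += hits / rank  # precision at the rank of each true failure
    return ap / total_pos


# Two failures among four runs; one success is ranked above a failure.
print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

A monitor that ranks every failed run above every successful one scores 1.0, and each successful run ranked above a failure lowers the value.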

Implementation

This demo uses synthetic agent traces to illustrate the approach. A production monitor would instead consume traces from real agent frameworks such as LangGraph or AutoGPT.
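As a rough idea of what synthetic traces could look like, the sketch below generates labeled traces in which runs that eventually fail also err more often along the way. The action names, step-error rates, trace length, and failure rate are all made-up assumptions for illustration and may differ from what app.py does.

```python
import random

# Hypothetical action vocabulary for a generic agent.
ACTIONS = ["search", "click", "type", "tool_call", "submit"]


def synthetic_trace(fails, length=8, rng=None):
    """Return a list of (action, ok) steps for one simulated run."""
    rng = rng or random.Random()
    # Assumed step-error rates: failing runs err more often than successful ones.
    p_error = 0.45 if fails else 0.10
    return [(rng.choice(ACTIONS), rng.random() >= p_error) for _ in range(length)]


def make_dataset(n, failure_rate=0.3, seed=0):
    """Return n (trace, label) pairs, where label 1 means the run failed."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    data = []
    for _ in range(n):
        fails = rng.random() < failure_rate
        data.append((synthetic_trace(fails, rng=rng), int(fails)))
    return data


dataset = make_dataset(200)
trace, label = dataset[0]
print(label, trace[:3])
```

Because the error rates are set by hand, results on such data only show that the pipeline runs end to end; they say nothing about performance on real agent traces.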