---
title: FactEval
emoji: 🔍
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 5.29.0
python_version: "3.12"
app_file: app.py
pinned: false
license: mit
short_description: Find exactly which parts of your LLM output are hallucinated
---

# 🔍 FactEval

**Find exactly which parts of your LLM output are hallucinated.**

Debug hallucinations in RAG and LLM pipelines with claim-level verification. Paste an LLM-generated answer and reference contexts. FactEval highlights ✅ **supported**, ❌ **contradicted**, and ❓ **unverifiable** claims with human-readable reasons and pipeline diagnostics.

## How it works

1. **Claim Extraction** — Breaks the answer into atomic claims (Qwen2.5-1.5B)
2. **Evidence Retrieval** — Finds the most relevant sentences from your contexts (MiniLM + FAISS)
3. **NLI Verification** — Checks each claim against evidence (DeBERTa-v3)
4. **Calibration** — Produces trustworthy confidence scores (Isotonic Regression)

A code-level sketch of how these four stages fit together is shown below.

📦 [GitHub Repository](https://github.com/sahilaf/FactEval)
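
## Pipeline sketch

The snippet below is an illustrative sketch of the four stages, not the project's actual code. The README names model families only, so the specific checkpoints (`sentence-transformers/all-MiniLM-L6-v2`, `cross-encoder/nli-deberta-v3-base`), the `extract_claims` stub, and the function names are assumptions for demonstration; the real pipeline prompts Qwen2.5-1.5B for claim extraction.

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.isotonic import IsotonicRegression
from transformers import pipeline

# Checkpoint choices are assumptions; the README names model families only.
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
nli = pipeline("text-classification", model="cross-encoder/nli-deberta-v3-base")


def extract_claims(answer: str) -> list[str]:
    """Stage 1 (stubbed): the real pipeline prompts Qwen2.5-1.5B to break
    the answer into atomic claims; a naive sentence split stands in here."""
    return [s.strip() for s in answer.split(".") if s.strip()]


def verify(answer: str, contexts: list[str], k: int = 3) -> list[dict]:
    # Stage 2: embed context sentences and index them for cosine search
    # (inner product over normalized vectors).
    ctx_vecs = embedder.encode(contexts, normalize_embeddings=True)
    index = faiss.IndexFlatIP(ctx_vecs.shape[1])
    index.add(np.asarray(ctx_vecs, dtype="float32"))

    results = []
    for claim in extract_claims(answer):
        q = embedder.encode([claim], normalize_embeddings=True)
        _, ids = index.search(np.asarray(q, dtype="float32"), min(k, len(contexts)))
        evidence = [contexts[i] for i in ids[0]]
        # Stage 3: NLI over the best-matching sentence. The model's
        # entailment / contradiction / neutral labels map onto the
        # supported / contradicted / unverifiable verdicts.
        verdict = nli([{"text": evidence[0], "text_pair": claim}])[0]
        results.append({"claim": claim, "label": verdict["label"],
                        "raw_score": verdict["score"], "evidence": evidence})
    return results


# Stage 4: isotonic regression maps raw NLI scores to calibrated
# probabilities; fitting it requires a held-out set of labeled claims.
calibrator = IsotonicRegression(out_of_bounds="clip")
# calibrator.fit(raw_scores, labels); calibrator.predict(new_scores)
```

Calling `verify(answer, contexts)` returns one verdict per extracted claim; the Space layers highlighting, human-readable reasons, and diagnostics on top of outputs like these.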