---
title: FactEval
emoji: π
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 5.29.0
python_version: "3.12"
app_file: app.py
pinned: false
license: mit
short_description: Find exactly which parts of your LLM output are hallucinated
---
# π FactEval

**Find exactly which parts of your LLM output are hallucinated.**

Debug hallucinations in RAG and LLM pipelines with claim-level verification.

Paste an LLM-generated answer and reference contexts. FactEval highlights ✅ **supported**, ❌ **contradicted**, and ⚠️ **unverifiable** claims with human-readable reasons and pipeline diagnostics.
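
If the Space exposes the default Gradio API, it can also be queried programmatically. The Space id, argument order, and endpoint name below are assumptions; check the Space's "Use via API" panel for the real signature:

```python
# Hypothetical programmatic call to the Space via gradio_client.
# Space id, argument order, and api_name are assumptions, not a documented API.
from gradio_client import Client

client = Client("sahilaf/FactEval")  # assumed Space id, mirroring the GitHub repo
result = client.predict(
    "The Eiffel Tower was completed in 1889 and is 500 m tall.",  # LLM answer
    "The Eiffel Tower opened in 1889. It is about 330 m tall.",   # reference context
    api_name="/predict",  # assumed default endpoint
)
print(result)
```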
## How it works

1. **Claim Extraction** → Breaks the answer into atomic claims (Qwen2.5-1.5B)
2. **Evidence Retrieval** → Finds the most relevant sentences from your contexts (MiniLM + FAISS)
3. **NLI Verification** → Checks each claim against evidence (DeBERTa-v3)
4. **Calibration** → Produces trustworthy confidence scores (Isotonic Regression)

Minimal code sketches of each step follow.
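
A sketch of step 1, assuming the answer is split into claims by prompting an instruct variant of Qwen2.5-1.5B via `transformers`; the prompt wording and line-based parsing are illustrative, not FactEval's actual prompt:

```python
# Sketch of step 1: prompt a small instruct model to split an answer into
# atomic claims. Prompt wording and parsing are illustrative assumptions.
from transformers import pipeline

extractor = pipeline("text-generation", model="Qwen/Qwen2.5-1.5B-Instruct")

answer = "The Eiffel Tower was completed in 1889 and is 500 m tall."
messages = [{
    "role": "user",
    "content": "Split the following answer into short, self-contained factual "
               f"claims, one per line:\n\n{answer}",
}]
out = extractor(messages, max_new_tokens=128)

reply = out[0]["generated_text"][-1]["content"]  # assistant turn is appended last
claims = [line.lstrip("-* ").strip() for line in reply.splitlines() if line.strip()]
print(claims)
```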
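
A sketch of step 2: embed the context sentences with a MiniLM encoder, index them in FAISS, and retrieve the nearest sentences per claim. The `all-MiniLM-L6-v2` checkpoint and `k=2` are assumptions consistent with the step description:

```python
# Sketch of step 2: MiniLM embeddings + FAISS nearest-neighbour retrieval.
# Checkpoint and k are assumptions, not FactEval's exact configuration.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

context_sentences = [
    "The Eiffel Tower opened in 1889.",
    "It stands about 330 metres tall.",
    "Paris is the capital of France.",
]
emb = encoder.encode(context_sentences, normalize_embeddings=True)
index = faiss.IndexFlatIP(emb.shape[1])  # inner product == cosine on unit vectors
index.add(np.asarray(emb, dtype="float32"))

claim = "The Eiffel Tower is 500 m tall."
query = encoder.encode([claim], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query, dtype="float32"), k=2)
evidence = [context_sentences[i] for i in ids[0]]
print(evidence, scores[0])
```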
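
A sketch of step 3: score each (evidence, claim) pair with an NLI cross-encoder. The README names DeBERTa-v3 but not a checkpoint; `cross-encoder/nli-deberta-v3-base` is one public option, used here as an assumption:

```python
# Sketch of step 3: NLI scoring of (evidence, claim) pairs.
# The exact DeBERTa-v3 checkpoint is an assumption.
from transformers import pipeline

nli = pipeline("text-classification", model="cross-encoder/nli-deberta-v3-base")

evidence = "It stands about 330 metres tall."
claim = "The Eiffel Tower is 500 m tall."
scores = nli({"text": evidence, "text_pair": claim}, top_k=None)

# Map NLI labels onto FactEval's verdicts:
#   entailment -> supported, contradiction -> contradicted, neutral -> unverifiable
print(scores)
```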
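
A sketch of step 4: fitting isotonic regression on a labelled calibration set so that raw entailment scores become trustworthy confidences. The data below is invented purely for illustration:

```python
# Sketch of step 4: isotonic regression calibration of raw scores.
# The labelled calibration set here is invented for illustration.
import numpy as np
from sklearn.isotonic import IsotonicRegression

raw_scores  = np.array([0.15, 0.40, 0.55, 0.70, 0.92])  # model probabilities
was_correct = np.array([0,    0,    1,    1,    1])     # human-verified outcomes

calibrator = IsotonicRegression(out_of_bounds="clip")
calibrator.fit(raw_scores, was_correct)

print(calibrator.predict([0.30, 0.80]))  # calibrated confidence scores
```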
📦 [GitHub Repository](https://github.com/sahilaf/FactEval)