---
title: FactEval
emoji: πŸ”
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 5.29.0
python_version: "3.12"
app_file: app.py
pinned: false
license: mit
short_description: Find exactly which parts of your LLM output are hallucinated
---

# πŸ” FactEval

**Find exactly which parts of your LLM output are hallucinated.**

Debug hallucinations in RAG and LLM pipelines with claim-level verification.

Paste an LLM-generated answer and reference contexts. FactEval highlights βœ… **supported**, ❌ **contradicted**, and ❓ **unverifiable** claims with human-readable reasons and pipeline diagnostics.

## How it works

1. **Claim Extraction** β€” Breaks the answer into atomic claims (Qwen2.5-1.5B)
2. **Evidence Retrieval** β€” Finds the most relevant sentences from your contexts (MiniLM + FAISS)
3. **NLI Verification** β€” Checks each claim against evidence (DeBERTa-v3)
4. **Calibration** β€” Produces trustworthy confidence scores (Isotonic Regression)
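The pipeline above can be sketched as a single verification loop. This is a toy stand-in, not the actual implementation: the real app uses Qwen2.5-1.5B for extraction, MiniLM + FAISS for retrieval, and DeBERTa-v3 for NLI, while here plain token overlap (with hypothetical thresholds) replaces all three so the control flow is runnable without any model downloads.

```python
# Toy sketch of the claim-verification loop. Token overlap stands in
# for MiniLM retrieval and DeBERTa-v3 NLI; the thresholds below are
# illustrative, not FactEval's calibrated scores.

def tokens(text: str) -> set[str]:
    """Lowercase bag-of-words, stripping trailing punctuation."""
    return {w.strip(".,").lower() for w in text.split()}

def verify(claims: list[str], contexts: list[str]) -> list[dict]:
    results = []
    for claim in claims:
        # "Retrieval": pick the context sentence sharing the most tokens.
        best = max(contexts, key=lambda c: len(tokens(claim) & tokens(c)))
        overlap = len(tokens(claim) & tokens(best)) / len(tokens(claim))
        # "NLI": hypothetical cutoffs stand in for entailment scores.
        if overlap >= 0.6:
            verdict = "supported"
        elif overlap >= 0.3:
            verdict = "unverifiable"
        else:
            verdict = "contradicted"
        results.append({"claim": claim, "evidence": best, "verdict": verdict})
    return results

if __name__ == "__main__":
    contexts = ["The Eiffel Tower is located in Paris, France."]
    out = verify(["The Eiffel Tower is in Paris."], contexts)
    print(out[0]["verdict"])  # → supported
```

In the real pipeline each stage is a separate model, and the isotonic-regression step recalibrates the raw NLI scores before the verdict thresholds are applied.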

πŸ“¦ [GitHub Repository](https://github.com/sahilaf/FactEval)