Spaces: sahilfarib / FactEval


metadata

title: FactEval
emoji: 🔍
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 5.29.0
python_version: '3.12'
app_file: app.py
pinned: false
license: mit
short_description: Find exactly which parts of your LLM output are hallucinated

🔍 FactEval

Find exactly which parts of your LLM output are hallucinated.

Debug hallucinations in RAG and LLM pipelines with claim-level verification.

Paste an LLM-generated answer and reference contexts. FactEval highlights ✅ supported, ❌ contradicted, and ❓ unverifiable claims with human-readable reasons and pipeline diagnostics.
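To make the three verdicts concrete, here is a minimal sketch of how claim-level results could be represented and tallied. The `ClaimResult` class, its fields, and the `summarize` helper are hypothetical names chosen for illustration. They are not FactEval's actual API, and the example claims are made up.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical verdict labels mirroring the three categories described above.
VERDICTS = ("supported", "contradicted", "unverifiable")


@dataclass
class ClaimResult:
    claim: str    # one atomic claim taken from the answer
    verdict: str  # one of VERDICTS
    reason: str   # human-readable explanation for the verdict

    def __post_init__(self):
        if self.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict!r}")


def summarize(results):
    """Count how many claims fall under each verdict."""
    counts = Counter(r.verdict for r in results)
    return {v: counts.get(v, 0) for v in VERDICTS}


# Made-up results, for illustration only.
results = [
    ClaimResult("The bridge opened in 1932.", "supported", "Stated in context 1."),
    ClaimResult("The bridge is 2 km long.", "contradicted", "Context 2 gives a different length."),
    ClaimResult("The bridge was repainted last year.", "unverifiable", "No context mentions this."),
]
print(summarize(results))
```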

How it works

1. Claim Extraction: breaks the answer into atomic claims (Qwen2.5-1.5B)
2. Evidence Retrieval: finds the most relevant sentences from your contexts (MiniLM + FAISS)
3. NLI Verification: checks each claim against evidence (DeBERTa-v3)
4. Calibration: produces trustworthy confidence scores (Isotonic Regression)
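
The NLI verification and calibration steps can be sketched in plain Python. Two assumptions are made here: the mapping from standard NLI labels (entailment, contradiction, neutral) to verdicts, and the rule of taking the most probable label. FactEval's actual decision logic may differ. The calibration part is a from-scratch pool-adjacent-violators fit, a standard way to compute an isotonic regression. It stands in for whatever implementation the app really uses, and all function names are hypothetical.

```python
# Assumed mapping from standard NLI labels to the verdicts shown in the UI.
NLI_TO_VERDICT = {
    "entailment": "supported",
    "contradiction": "contradicted",
    "neutral": "unverifiable",
}


def verdict_from_nli(probs):
    """Return (verdict, raw confidence) for the most probable NLI label.

    probs: dict mapping NLI label -> probability for one (evidence, claim) pair.
    """
    label = max(probs, key=probs.get)
    return NLI_TO_VERDICT[label], probs[label]


def fit_isotonic(scores, outcomes):
    """Fit a non-decreasing step function with pool-adjacent-violators.

    scores:   raw model confidences
    outcomes: 1 if the prediction was correct, else 0
    Returns a list of (upper_score, calibrated_value) pairs, sorted by score.
    """
    # Pool tied scores first so each distinct score has one block.
    grouped = {}
    for s, y in zip(scores, outcomes):
        total, count = grouped.get(s, (0.0, 0))
        grouped[s] = (total + y, count + 1)

    blocks = []  # each block: [sum_of_outcomes, count, upper_score]
    for s in sorted(grouped):
        total, count = grouped[s]
        blocks.append([total, count, s])
        # Merge backwards while the block means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def calibrate(model, score):
    """Look up the calibrated value for a raw score (step function)."""
    for upper, value in model:
        if score <= upper:
            return value
    return model[-1][1]  # scores above the fitted range use the last block


# Toy data, for illustration only.
verdict, raw = verdict_from_nli(
    {"entailment": 0.1, "contradiction": 0.7, "neutral": 0.2}
)
model = fit_isotonic([0.2, 0.4, 0.6, 0.8, 0.9], [0, 1, 0, 1, 1])
print(verdict, raw, calibrate(model, raw))
```

The fit pools neighbouring scores whose observed accuracy is out of order, so the calibrated confidence never decreases as the raw score increases. A real deployment would fit on held-out labelled claims rather than on five toy points.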

📦 GitHub Repository