AI-that-works committed on
Commit
eacc1ec
·
verified ·
1 Parent(s): e76cfec

Upload 3 files

Files changed (3)
  1. README.md +67 -0
  2. app.py +508 -0
  3. requirements.txt +6 -0
README.md ADDED
@@ -0,0 +1,67 @@
+ ---
+ title: groundlens — Hallucination Detection Demo
+ emoji: 📐
+ colorFrom: yellow
+ colorTo: red
+ sdk: gradio
+ sdk_version: 5.33.0
+ app_file: app.py
+ pinned: true
+ license: mit
+ tags:
+ - hallucination-detection
+ - llm-evaluation
+ - rag
+ - grounding
+ - nlp
+ - groundlens
+ - embedding-geometry
+ short_description: Geometric LLM hallucination detection. No second LLM.
+ ---
+
+ [![PyPI](https://img.shields.io/pypi/v/groundlens?style=flat-square)](https://pypi.org/project/groundlens/)
+ [![GitHub](https://img.shields.io/github/stars/groundlens-dev/groundlens?style=flat-square)](https://github.com/groundlens-dev/groundlens)
+
+ # groundlens — Hallucination Detection Demo
+
+ Detects LLM hallucinations using embedding geometry. No second LLM. Deterministic. Auditable.
+ Benchmarked against [Vectara HHEM-2.1-Open](https://huggingface.co/vectara/hallucination_evaluation_model).
+
+ ## Methods compared
+
+ **groundlens SGI** (with context): ratio of Euclidean distances in the embedding space,
+ `dist(response, question) / dist(response, context)`. No model inference for
+ the evaluation. One embedding call, one division.
+
+ **groundlens DGI** (without context): cosine similarity between the response
+ displacement vector and the mean displacement of verified grounded pairs.
+
+ **HHEM-2.1-Open** (Vectara): fine-tuned flan-T5 classifier. Full model
+ inference per evaluation call.
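
The two geometric scores described above can be sketched on toy vectors. This is a minimal illustration of the formulas as the README states them, not the groundlens implementation: the function names `sgi` and `dgi`, the 2-D vectors, and the reference pairs are all made up for the example, and a real embedding model would supply the vectors.

```python
import math

def sgi(question, context, response):
    """dist(response, question) / dist(response, context) on raw vectors."""
    return math.dist(response, question) / math.dist(response, context)

def dgi(question, response, grounded_pairs):
    """Cosine between this pair's displacement and the mean grounded displacement."""
    dim = len(question)
    disp = [r - q for q, r in zip(question, response)]
    mean = [
        sum(r[i] - q[i] for q, r in grounded_pairs) / len(grounded_pairs)
        for i in range(dim)
    ]
    dot = sum(a * b for a, b in zip(disp, mean))
    return dot / (math.hypot(*disp) * math.hypot(*mean))

# Toy 2-D "embeddings" (hypothetical values, for illustration only).
question = [1.0, 0.0]
context = [0.0, 1.0]
near_context = [0.2, 0.9]    # response that lands close to the context
near_question = [0.9, 0.1]   # response that mostly restates the question

sgi_near_context = sgi(question, context, near_context)
sgi_near_question = sgi(question, context, near_question)

# Hypothetical (question, response) reference pairs standing in for verified grounded data.
reference_pairs = [([1.0, 0.0], [0.0, 1.0]), ([1.0, 0.0], [0.2, 0.8])]
dgi_aligned = dgi(question, near_context, reference_pairs)
dgi_opposed = dgi(question, [1.5, -0.5], reference_pairs)
```

Under these toy values the SGI ratio is above 1 when the response sits nearer the context than the question and below 1 in the opposite case, and the DGI cosine is close to 1 when the displacement points the same way as the reference pairs. The sketch does not guard against zero-length displacements.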
+
+ ## When they disagree
+
+ Disagreement surfaces **Type III hallucinations**: factual errors within
+ a correct semantic frame. Embedding geometry cannot detect these because the
+ response occupies the geometrically correct region of the space despite
+ being factually wrong. HHEM's classifier may catch some of these cases.
+ The two methods are orthogonal signals, not competing alternatives.
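
A toy illustration of why a wrong fact inside the right frame is hard to see geometrically. The bag-of-words vectors below are a deliberately crude stand-in for learned embeddings (an assumption of this sketch, not what groundlens uses), and the sentences are invented; the point is only that swapping one figure changes a single coordinate.

```python
import math
from collections import Counter

def bow(text):
    """Toy bag-of-words vector: token -> count. Not a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing tokens
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

source = bow("the deductible is 1500 per occurrence")
wrong_fact = bow("the deductible is 5000 per occurrence")   # same frame, wrong number
off_topic = bow("seasons follow from axial tilt")           # different frame

same_frame = cosine(source, wrong_fact)       # five of six tokens shared
different_frame = cosine(source, off_topic)   # no tokens shared
```

Here the factually wrong sentence stays close to the source (cosine 5/6) while the off-topic one is orthogonal to it, which is the pattern the section describes: a distance-based score separates frames far more readily than it separates facts within a frame.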
+
+ ## Install the library
+
+ ```bash
+ pip install groundlens
+ ```
+
+ ## Links
+
+ - [GitHub](https://github.com/groundlens-dev/groundlens)
+ - [Documentation](https://docs.groundlens.dev)
+ - [PyPI](https://pypi.org/project/groundlens/)
+ - [Website](https://groundlens.dev)
+
+ ## Research
+
+ - [Semantic Grounding Index — arXiv:2512.13771](https://arxiv.org/abs/2512.13771)
+ - [Geometric Taxonomy of Hallucinations — arXiv:2602.13224v3](https://arxiv.org/pdf/2602.13224v3)
+ - [Rotational Dynamics of Factual Constraint Processing — arXiv:2603.13259](https://arxiv.org/abs/2603.13259)
app.py ADDED
@@ -0,0 +1,508 @@
+ """
+ groundlens — Geometric LLM Hallucination Detection Demo
+
+ Plain-language interface: paste a question and the AI's answer,
+ optionally upload context (PDF, Excel, or plain text).
+ Compares groundlens (embedding geometry) vs Vectara HHEM-2.1-Open.
+
+ Models load once at module level to avoid cold-start on Space wake.
+ """
+
+ import logging
+ import time
+ import tempfile
+ import os
+
+ import gradio as gr
+ from groundlens import compute_sgi, compute_dgi
+
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+
+ # ─────────────────────────────────────────────────────────────────────────────
+ # FILE EXTRACTION — PDF and Excel support
+ # ─────────────────────────────────────────────────────────────────────────────
+
+ def extract_pdf_text(file_path: str, max_chars: int = 8000) -> str:
+     """Extract text from a PDF file."""
+     try:
+         import pdfplumber
+         text_parts = []
+         with pdfplumber.open(file_path) as pdf:
+             for page in pdf.pages[:20]:  # limit to 20 pages
+                 page_text = page.extract_text()
+                 if page_text:
+                     text_parts.append(page_text)
+         full_text = "\n\n".join(text_parts)
+         return full_text[:max_chars] if len(full_text) > max_chars else full_text
+     except Exception as e:
+         return f"[Could not read PDF: {e}]"
+
+
+ def extract_excel_text(file_path: str, max_chars: int = 8000) -> str:
+     """Extract text from an Excel file."""
+     try:
+         import openpyxl
+         wb = openpyxl.load_workbook(file_path, data_only=True)
+         text_parts = []
+         for sheet_name in wb.sheetnames[:5]:  # limit to 5 sheets
+             ws = wb[sheet_name]
+             text_parts.append(f"--- {sheet_name} ---")
+             for row in ws.iter_rows(max_row=200, values_only=True):
+                 cells = [str(c) if c is not None else "" for c in row]
+                 # Skip rows in which every cell is empty.
+                 if any(c.strip() for c in cells):
+                     text_parts.append(" | ".join(cells).strip())
+         full_text = "\n".join(text_parts)
+         return full_text[:max_chars] if len(full_text) > max_chars else full_text
+     except Exception as e:
+         return f"[Could not read Excel file: {e}]"
+
+
+ def process_uploaded_file(file) -> str:
+     """Extract text from an uploaded file (PDF, Excel, or plain text)."""
+     if file is None:
+         return ""
+
+     file_path = file.name if hasattr(file, 'name') else str(file)
+     ext = os.path.splitext(file_path)[1].lower()
+
+     if ext == ".pdf":
+         return extract_pdf_text(file_path)
+     elif ext in (".xlsx", ".xls"):
+         # Note: openpyxl reads .xlsx only; a legacy .xls file surfaces as a read error.
+         return extract_excel_text(file_path)
+     elif ext in (".txt", ".md", ".csv"):
+         try:
+             with open(file_path, "r", encoding="utf-8", errors="replace") as f:
+                 text = f.read(8000)
+             return text
+         except Exception as e:
+             return f"[Could not read file: {e}]"
+     else:
+         return f"[Unsupported file type: {ext}. Use PDF, Excel, TXT, or CSV.]"
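
The helpers above report failure by returning a bracketed string, and `run_comparison` later treats any extracted text that starts with `[` as an error. A legitimate file whose first character is `[` would therefore be rejected. One possible alternative, sketched here under the assumption that a small result type is acceptable, keeps text and error in separate fields. `Extraction` and `read_text_file` are hypothetical names, not part of the commit.

```python
import os
import tempfile
from dataclasses import dataclass

@dataclass
class Extraction:
    """Result of reading a file: either text or an error message."""
    text: str = ""
    error: str = ""

    @property
    def ok(self) -> bool:
        return not self.error

def read_text_file(path: str, max_chars: int = 8000) -> Extraction:
    """Read a plain-text file, reporting problems out of band."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in (".txt", ".md", ".csv"):
        return Extraction(error=f"Unsupported file type: {ext}")
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as f:
            return Extraction(text=f.read(max_chars))
    except OSError as e:
        return Extraction(error=f"Could not read file: {e}")

# Usage: a file that legitimately begins with "[" is still treated as content.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("[1] First footnote of the policy document")
bracketed = read_text_file(tmp.name)
os.remove(tmp.name)

unsupported = read_text_file("notes.docx")
```

The caller would then branch on `result.ok` instead of inspecting the first character of the text.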
+
+
+ # ─────────────────────────────────────────────────────────────────────────────
+ # HHEM-2.1-Open — baseline comparison
+ # ─────────────────────────────────────────────────────────────────────────────
+
+ logger.info("Loading HHEM-2.1-Open...")
+ from transformers import AutoModelForSequenceClassification
+
+ _hhem = AutoModelForSequenceClassification.from_pretrained(
+     "vectara/hallucination_evaluation_model",
+     trust_remote_code=True,
+ )
+ logger.info("HHEM loaded.")
+
+ # Warm up groundlens embedding model
+ logger.info("Warming up groundlens...")
+ compute_dgi(question="warmup", response="warmup")
+ logger.info("groundlens ready.")
+
+
+ # ─────────────────────────────────────────────────────────────────────────────
+ # SCORING
+ # ─────────────────────────────────────────────────────────────────────────────
+
+ def score_groundlens(question: str, response: str, context: str) -> dict:
+     start = time.perf_counter()
+     has_context = bool(context.strip())
+
+     if has_context:
+         result = compute_sgi(
+             question=question,
+             context=context,
+             response=response,
+         )
+         method = "SGI (with context)"
+         raw_score = result.value
+         grounded = not result.flagged
+         threshold = 0.95
+         mode_note = (
+             "Measured how much the AI's answer used your source document "
+             "vs. just rephrasing the question."
+         )
+     else:
+         result = compute_dgi(
+             question=question,
+             response=response,
+         )
+         method = "DGI (without context)"
+         raw_score = result.value
+         grounded = not result.flagged
+         threshold = 0.30
+         mode_note = (
+             "Measured whether the AI's answer follows patterns typical "
+             "of grounded, factual responses."
+         )
+
+     elapsed_ms = (time.perf_counter() - start) * 1000
+
+     return {
+         "method": method,
+         "raw_score": round(raw_score, 4),
+         "grounded": grounded,
+         "threshold": threshold,
+         "elapsed_ms": round(elapsed_ms, 1),
+         "mode_note": mode_note,
+     }
+
+
+ def score_hhem(question: str, response: str, context: str) -> dict:
+     has_context = bool(context.strip())
+     premise = (
+         f"{context.strip()}\n\n{question}".strip()
+         if has_context
+         else question
+     )
+     if len(premise) > 1800:
+         premise = premise[:1800]
+
+     start = time.perf_counter()
+     scores = _hhem.predict([(premise, response)])
+     raw_score = float(scores[0])
+     elapsed_ms = (time.perf_counter() - start) * 1000
+
+     return {
+         "method": "HHEM-2.1-Open",
+         "raw_score": round(raw_score, 4),
+         "grounded": raw_score >= 0.5,
+         "elapsed_ms": round(elapsed_ms, 1),
+         "label": "consistent" if raw_score >= 0.5 else "hallucinated",
+     }
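
`score_hhem` joins context and question and then keeps only the first 1,800 characters, so with a long context the question at the end of the premise is cut off. A minimal sketch of one way to keep the question intact is to trim the context instead. `build_premise` is a hypothetical helper, and the 1,800-character budget is simply the value used in the demo.

```python
def build_premise(context: str, question: str, max_chars: int = 1800) -> str:
    """Join context and question, trimming the context so the question survives."""
    context, question = context.strip(), question.strip()
    if not context:
        return question[:max_chars]
    separator = "\n\n"
    budget = max_chars - len(question) - len(separator)
    if budget <= 0:
        # The question alone fills the budget; drop the context entirely.
        return question[:max_chars]
    return f"{context[:budget]}{separator}{question}"

# Usage: a 5,000-character context no longer pushes the question out.
long_context = "x" * 5000
premise = build_premise(long_context, "What is the deductible?")
no_context = build_premise("", "What causes seasons on Earth?")
```

Whether trimming the head or the tail of the context is better depends on where the relevant passage sits, which this sketch does not try to decide.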
+
+
+ # ─────────────────────────────────────────────────────────────────────────────
+ # MAIN COMPARISON
+ # ─────────────────────────────────────────────────────────────────────────────
+
+ def run_comparison(
+     question: str, context_text: str, file_upload, response: str
+ ) -> tuple[str, str, str]:
+
+     if not question.strip():
+         return "⚠️ Enter the question you asked the AI.", "", ""
+     if not response.strip():
+         return "⚠️ Enter the AI's response.", "", ""
+
+     # Merge context: typed text + uploaded file
+     context_parts = []
+     if context_text and context_text.strip():
+         context_parts.append(context_text.strip())
+     if file_upload is not None:
+         extracted = process_uploaded_file(file_upload)
+         if extracted and not extracted.startswith("["):
+             context_parts.append(extracted)
+         elif extracted.startswith("["):
+             return f"⚠️ {extracted}", "", ""
+     context = "\n\n".join(context_parts)
+
+     gl = score_groundlens(question, response, context)
+     hhem = score_hhem(question, response, context)
+
+     # groundlens result
+     if gl["grounded"]:
+         gl_verdict = "🟢 Looks grounded"
+         gl_explain = "The AI's answer appears to be based on real information."
+     else:
+         gl_verdict = "🔴 Possible hallucination"
+         gl_explain = "The AI's answer shows signs of being fabricated or not grounded in the source."
+
+     gl_md = f"""### groundlens
+
+ **{gl_verdict}**
+
+ {gl_explain}
+
+ | | |
+ |---|---|
+ | **Method** | {gl["method"]} |
+ | **Score** | {gl["raw_score"]} (threshold: {gl["threshold"]}) |
+ | **Time** | {gl["elapsed_ms"]} ms |
+
+ *{gl["mode_note"]}*"""
+
+     # HHEM result
+     if hhem["grounded"]:
+         hhem_verdict = "🟢 Looks consistent"
+         hhem_explain = "The classifier considers this answer consistent with the input."
+     else:
+         hhem_verdict = "🔴 Possible hallucination"
+         hhem_explain = "The classifier flagged this answer as potentially hallucinated."
+
+     hhem_md = f"""### Vectara HHEM-2.1-Open
+
+ **{hhem_verdict}**
+
+ {hhem_explain}
+
+ | | |
+ |---|---|
+ | **Method** | {hhem["method"]} |
+ | **Score** | {hhem["raw_score"]} ({hhem["label"]}) |
+ | **Time** | {hhem["elapsed_ms"]} ms |
+
+ *Fine-tuned flan-T5 classifier.*"""
+
+     # Agreement
+     agree = gl["grounded"] == hhem["grounded"]
+     if agree and gl["grounded"]:
+         agreement_md = "### 🔵 Both methods agree: the answer looks reliable."
+     elif agree and not gl["grounded"]:
+         agreement_md = "### 🔴 Both methods agree: this answer is likely hallucinated."
+     else:
+         agreement_md = """### 🟠 The two methods disagree.
+
+ This often happens with **subtle factual errors** — the answer sounds right and
+ uses the correct vocabulary, but gets specific facts wrong. Embedding geometry
+ (groundlens) measures the shape of the answer; the classifier (HHEM) evaluates
+ its content differently. When they disagree, it's worth checking the facts manually.
+
+ [Learn more about hallucination types →](https://docs.groundlens.dev/theory/hallucination-taxonomy/)"""
+
+     return gl_md, hhem_md, agreement_md
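
The three-way agreement branch at the end of `run_comparison` can be isolated as a pure function, which makes the decision table easy to test without loading either model. This is a sketch of that refactoring idea only; `agreement_label` and its return values are hypothetical names, not part of the commit.

```python
def agreement_label(gl_grounded: bool, hhem_grounded: bool) -> str:
    """Map the two verdicts to one of three triage outcomes."""
    if gl_grounded and hhem_grounded:
        return "agree_grounded"
    if not gl_grounded and not hhem_grounded:
        return "agree_hallucinated"
    return "disagree"  # one method flagged the answer and the other did not

# Enumerate the full decision table.
table = {
    (gl, hhem): agreement_label(gl, hhem)
    for gl in (True, False)
    for hhem in (True, False)
}
```

The Markdown strings shown to the user could then be looked up from the label, keeping presentation separate from the decision.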
+
+
+ # ─────────────────────────────────────────────────────────────────────────────
+ # EXAMPLES
+ # ─────────────────────────────────────────────────────────────────────────────
+
+ EXAMPLES = [
+     [
+         "What does the water damage policy cover?",
+         "Coverage includes burst pipes and sudden appliance failure up to "
+         "$50,000. Flood damage requires a separate NFIP policy. "
+         "Deductible is $1,500 per occurrence.",
+         "The policy covers burst pipes and sudden appliance failure up to "
+         "$50,000 per occurrence, with a $1,500 deductible.",
+     ],
+     [
+         "What does the water damage policy cover?",
+         "Coverage includes burst pipes and sudden appliance failure up to "
+         "$50,000. Flood damage requires a separate NFIP policy. "
+         "Deductible is $1,500 per occurrence.",
+         "The policy covers all water damage including floods "
+         "with no deductible required.",
+     ],
+     [
+         "What causes seasons on Earth?",
+         "",
+         "Seasons are caused by Earth's 23.5-degree axial tilt, which "
+         "changes how directly sunlight hits each hemisphere.",
+     ],
+     [
+         "What causes seasons on Earth?",
+         "",
+         "Seasons are regulated by the Atmospheric Regulation Committee, "
+         "a UN body established in 1952 that adjusts global temperature "
+         "through orbital satellites.",
+     ],
+ ]
+
+
+ # ─────────────────────────────────────────────────────────────────────────────
+ # THEME — dark, matching groundlens.dev
+ # ─────────────────────────────────────────────────────────────────────────────
+
+ theme = gr.themes.Base(
+     primary_hue=gr.themes.Color(
+         c50="#fff7ed",
+         c100="#ffedd5",
+         c200="#fed7aa",
+         c300="#fdba74",
+         c400="#fb923c",
+         c500="#fc7604",
+         c600="#ea580c",
+         c700="#c2410c",
+         c800="#9a3412",
+         c900="#7c2d12",
+         c950="#431407",
+     ),
+     secondary_hue="slate",
+     neutral_hue="slate",
+     font=gr.themes.GoogleFont("Inter"),
+     font_mono=gr.themes.GoogleFont("JetBrains Mono"),
+     text_size=gr.themes.sizes.text_lg,
+     radius_size=gr.themes.sizes.radius_md,
+ ).set(
+     body_background_fill="#0a0a0a",
+     body_background_fill_dark="#0a0a0a",
+     body_text_color="#e2e8f0",
+     body_text_color_dark="#e2e8f0",
+     body_text_size="1rem",
+     block_background_fill="#141414",
+     block_background_fill_dark="#141414",
+     block_border_color="#1e293b",
+     block_border_color_dark="#1e293b",
+     block_label_text_color="#94a3b8",
+     block_label_text_color_dark="#94a3b8",
+     block_label_text_size="0.95rem",
+     block_title_text_color="#e2e8f0",
+     block_title_text_color_dark="#e2e8f0",
+     input_background_fill="#1e1e1e",
+     input_background_fill_dark="#1e1e1e",
+     input_border_color="#334155",
+     input_border_color_dark="#334155",
+     input_text_size="1rem",
+     input_placeholder_color="#64748b",
+     input_placeholder_color_dark="#64748b",
+     button_primary_background_fill="#fc7604",
+     button_primary_background_fill_dark="#fc7604",
+     button_primary_background_fill_hover="#fb923c",
+     button_primary_background_fill_hover_dark="#fb923c",
+     button_primary_text_color="#0a0a0a",
+     button_primary_text_color_dark="#0a0a0a",
+     button_large_text_size="1.1rem",
+     border_color_primary="#fc7604",
+     border_color_primary_dark="#fc7604",
+ )
+
+
+ # ─────────────────────────────────────────────────────────────────────────────
+ # INTERFACE
+ # ─────────────────────────────────────────────────────────────────────────────
+
+ css = """
+ .gradio-container {
+     max-width: 1200px !important;
+     margin: 0 auto !important;
+     padding: 1.5rem !important;
+ }
+ h1 { color: #fc7604 !important; font-size: 2.2rem !important; font-weight: 700 !important; margin-bottom: 0.2rem !important; }
+ h3 { font-size: 1.15rem !important; }
+ .subtitle { color: #94a3b8 !important; font-size: 1.1rem !important; margin-top: 0 !important; }
+ a { color: #fd9a42 !important; }
+ a:hover { color: #fec08a !important; }
+ .step-label { color: #fc7604; font-weight: 600; font-size: 1.05rem; }
+ .links-bar { font-size: 0.9rem; color: #64748b; margin-top: 0.5rem; }
+ .links-bar a { color: #64748b !important; }
+ .links-bar a:hover { color: #fd9a42 !important; }
+ footer { display: none !important; }
+
+ @media (max-width: 768px) {
+     .gradio-container { padding: 0.75rem !important; }
+     h1 { font-size: 1.6rem !important; }
+ }
+ """
+
+ with gr.Blocks(
+     title="groundlens — Check if your AI is hallucinating",
+     theme=theme,
+     css=css,
+ ) as demo:
+
+     gr.Markdown("""
+ # groundlens
+
+ <p class="subtitle">Check if an AI gave you a real answer or made something up.</p>
+ """)
+
+     gr.Markdown("""
+ You asked an AI a question and got an answer. Was it real or hallucinated?
+ Paste both below and we'll check using two independent methods: **groundlens**
+ (geometric analysis) and **Vectara HHEM** (neural classifier).
+ """)
+
+     gr.Markdown("""<p class="links-bar">
+ <a href="https://github.com/groundlens-dev/groundlens">GitHub</a> ·
+ <a href="https://docs.groundlens.dev">Docs</a> ·
+ <a href="https://pypi.org/project/groundlens/">PyPI</a> ·
+ <a href="https://arxiv.org/abs/2512.13771">SGI paper</a> ·
+ <a href="https://arxiv.org/pdf/2602.13224v3">Taxonomy</a> ·
+ <a href="https://arxiv.org/abs/2603.13259">Mechanistic paper</a>
+ </p>""")
+
+     # ── Step 1: Question ──
+     gr.Markdown('<p class="step-label">1. What did you ask the AI?</p>')
+     q_in = gr.Textbox(
+         show_label=False,
+         placeholder="e.g. What does our insurance policy cover for water damage?",
+         lines=2,
+     )
+
+     # ── Step 2: Context ──
+     gr.Markdown(
+         '<p class="step-label">2. Did you give the AI any source material? (optional)</p>'
+     )
+     gr.Markdown(
+         "If you gave the AI a document, a webpage, an Excel file, or any reference "
+         "material to base its answer on, paste the text here or upload the file. "
+         "If you just asked a question with no source, skip this step.",
+         elem_classes=["context-help"],
+     )
+
+     with gr.Row():
+         with gr.Column(scale=3):
+             ctx_in = gr.Textbox(
+                 show_label=False,
+                 placeholder="Paste the source text here (or leave empty if you didn't provide any source to the AI)...",
+                 lines=5,
+             )
+         with gr.Column(scale=1, min_width=200):
+             file_in = gr.File(
+                 label="Or upload a file",
+                 file_types=[".pdf", ".xlsx", ".xls", ".csv", ".txt"],
+                 file_count="single",
+             )
+             gr.Markdown(
+                 '<span style="color:#64748b; font-size:0.85rem;">'
+                 "PDF, Excel, CSV, or TXT. Max 20 pages / 200 rows.</span>"
+             )
+
+     # ── Step 3: Response ──
+     gr.Markdown('<p class="step-label">3. What did the AI answer?</p>')
+     r_in = gr.Textbox(
+         show_label=False,
+         placeholder="Paste the AI's response here...",
+         lines=4,
+     )
+
+     # ── Evaluate button ──
+     run_btn = gr.Button(
+         "Check for hallucination",
+         variant="primary",
+         size="lg",
+     )
+
+     # ── Results ──
+     with gr.Row(equal_height=True):
+         gl_out = gr.Markdown()
+         hhem_out = gr.Markdown()
+
+     agreement_out = gr.Markdown()
+
+     # ── Examples ──
+     gr.Markdown("---")
+     gr.Markdown("### Try an example")
+
+     gr.Examples(
+         examples=EXAMPLES,
+         inputs=[q_in, ctx_in, r_in],
+         label="",
+     )
+
+     # ── Footer ──
+     gr.Markdown("""
+ ---
+
+ <p style="color:#475569; font-size:0.85rem; text-align:center;">
+ <strong>groundlens</strong> is open source (MIT). Built by
+ <a href="https://jmarin.info" style="color:#64748b !important;">Javier Marin</a>.
+ This demo runs the same library available via <code>pip install groundlens</code>.<br>
+ groundlens is verification triage, not a truth oracle. It tells you which answers
+ deserve trust and which need a closer look.
+ </p>
+ """)
+
+     # ── Event binding ──
+     run_btn.click(
+         fn=run_comparison,
+         inputs=[q_in, ctx_in, file_in, r_in],
+         outputs=[gl_out, hhem_out, agreement_out],
+     )
+
+
+ if __name__ == "__main__":
+     demo.launch()
requirements.txt ADDED
@@ -0,0 +1,6 @@
+ groundlens>=2026.4.0
+ gradio>=5.0.0
+ transformers>=4.40.0,<5.0.0
+ torch>=2.0.0
+ pdfplumber>=0.10.0
+ openpyxl>=3.1.0