metadata
title: Spot the AI Receipt
emoji: 🧾
colorFrom: blue
colorTo: indigo
sdk: gradio
python_version: '3.12'
app_file: app.py
pinned: true
short_description: Can you spot the AI-generated receipt?
license: cc-by-nc-sa-4.0
Spot the AI Receipt 🧾
An interactive 2AFC (two-alternative forced choice) game built by Scam.AI.
Each round shows two receipts side by side:
- One is an authentic receipt from the public CORD-v2 dataset
- One is fully AI-synthesized (GPT-4o generates the text, GPT-Image-1 renders the image) from our GPT4o-Receipt benchmark
Pick the AI one. After 10 rounds you'll see your accuracy vs. the human and LLM baselines reported in our paper:
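The round logic described above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the names `Round`, `make_round`, and `accuracy` are hypothetical, and the sketch assumes the receipts are available as two lists of image paths and that the AI receipt's side is chosen uniformly at random each round.

```python
import random
from dataclasses import dataclass

ROUNDS_PER_GAME = 10  # the game reports accuracy after 10 rounds


@dataclass
class Round:
    left: str      # image shown on the left
    right: str     # image shown on the right
    ai_side: str   # "left" or "right": where the AI receipt is


def make_round(real_images, ai_images, rng):
    """Pair one real and one AI receipt and randomize which side is AI."""
    real = rng.choice(real_images)
    fake = rng.choice(ai_images)
    if rng.random() < 0.5:
        return Round(left=fake, right=real, ai_side="left")
    return Round(left=real, right=fake, ai_side="right")


def accuracy(rounds, guesses):
    """Fraction of rounds in which the player picked the AI side."""
    if not rounds or len(rounds) != len(guesses):
        raise ValueError("need one guess per round and at least one round")
    correct = sum(r.ai_side == g for r, g in zip(rounds, guesses))
    return correct / len(rounds)
```

Randomizing the side independently per round keeps the chance level at 50%, which is what makes a 2AFC accuracy score interpretable.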
Zhang, Ren, et al. — "GPT4o-Receipt: A Dataset and Human Study for AI-Generated Document Forensics" (arXiv:2603.11442)
Key finding: humans rate AI receipts as visually distinct from real ones (1.87/5 gap), yet achieve only F1 = 0.852 on binary detection, well below LLMs such as Claude Sonnet 4 (F1 = 0.975). The forensic signal lies in arithmetic incoherence, which humans rarely audit but LLMs verify trivially.
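To make "arithmetic incoherence" concrete, here is a minimal sketch of the kind of audit involved. It is not the paper's method: the function `arithmetic_issues` and the receipt layout (line items as `(quantity, unit_price, line_total)` tuples, plus subtotal, tax, and total) are assumptions for illustration, and real receipts may carry extra fields such as discounts or service charges.

```python
from decimal import Decimal


def arithmetic_issues(items, subtotal, tax, total, tolerance=Decimal("0.01")):
    """Return a list of arithmetic inconsistencies found on a receipt.

    items: list of (quantity, unit_price, line_total) tuples (hypothetical schema).
    """
    issues = []

    # Each line total should equal quantity times unit price.
    for i, (qty, unit_price, line_total) in enumerate(items):
        if abs(qty * unit_price - line_total) > tolerance:
            issues.append(f"item {i}: {qty} x {unit_price} != {line_total}")

    # Line totals should add up to the subtotal.
    items_sum = sum((line_total for _, _, line_total in items), Decimal("0"))
    if abs(items_sum - subtotal) > tolerance:
        issues.append(f"items sum {items_sum} != subtotal {subtotal}")

    # Subtotal plus tax should equal the grand total.
    if abs(subtotal + tax - total) > tolerance:
        issues.append(f"subtotal {subtotal} + tax {tax} != total {total}")

    return issues
```

An empty result means the numbers are internally consistent under this simplified schema; any entry flags a mismatch worth a closer look. `Decimal` is used instead of floats so that currency sums do not pick up rounding noise.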
Production-grade detection: scam.ai.