title: Scam.AI
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false

Scam.AI

Detection systems for AI-driven fraud: deepfakes, document forgery, synthetic media, and adversarial attacks against identity verification.

Website · Research · Datasets


What We Do

Scam.AI builds detection systems that protect identity-verification pipelines, financial-document workflows, and digital media ecosystems from the next generation of AI-driven fraud. Our research portfolio spans deepfake detection, document forgery forensics, AI-generated image attribution, age-estimation robustness, and behavioral-biometric verification. The work is published at CVPR and on arXiv, and released here as open benchmarks for the community.


🔬 Research Areas

| Area | Focus | Key Datasets |
| --- | --- | --- |
| 🎭 Deepfake Detection | Real-world faceswap detection beyond academic benchmarks | RWFS |
| 📄 Document Forgery | AI-inpainted receipts, forms, and financial documents | AIForge-Doc-v2 · AIForge-Doc-v1 · gpt4o-receipt |
| 🖼️ AI-Generated Image Detection | Self-reported AI-generated images in the wild | gpt-image-2 |
| 🛡️ Age Estimation Robustness | Cosmetic adversarial attacks against age verification | age-adversarial-attack |
| 👁️ Behavioral Biometrics | Gaze-based liveness for video interview verification | synthetic-gaze-reading |

📚 Featured Datasets

All datasets are released for academic research and non-commercial use under CC-BY-NC-SA 4.0. Downloads are email-gated, with automatic approval.
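Once access has been approved, a gated dataset can typically be fetched with the `huggingface_hub` client. The following is a minimal sketch, not an official download script: the namespace argument is a placeholder for this org's actual Hub namespace, and it assumes the dataset names in the Research Areas table (for example `RWFS`) match the repository names.

```python
def dataset_repo_id(namespace, name):
    """Build a Hub repo id of the form '<namespace>/<name>'.

    Both parts are supplied by the caller; nothing here is looked up.
    """
    return f"{namespace}/{name}"


def download_dataset(namespace, name, token=None):
    """Download a full dataset snapshot and return its local path.

    Assumes access was already requested and approved on the dataset page.
    `token` is a Hugging Face access token; if None, the locally saved
    login (if any) is used.
    """
    # Imported lazily so dataset_repo_id works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=dataset_repo_id(namespace, name),
        repo_type="dataset",
        token=token,
    )
```

A call would look like `download_dataset("<org-namespace>", "RWFS")`, with the placeholder replaced by the real namespace.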

🎭 Deepfake Detection

  • RWFS (Real-World Faceswap Dataset): 847 deepfakes from 8 production faceswap tools (Pixlr, Magic Hour, Remaker, etc.) plus 900 authentic faces. It is the first dataset reflecting how deepfakes actually appear in the wild.

    Ren et al., "Do Deepfake Detectors Work in Reality?", arXiv:2502.10920
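With 847 deepfakes and 900 authentic faces, the two classes are close to balanced, so overall accuracy and per-class recall are both informative. The sketch below shows one way to score a detector on such a split. It is not taken from the paper: the 0.5 threshold and the convention that the detector outputs a fake probability per image are assumptions.

```python
from dataclasses import dataclass


@dataclass
class DetectionReport:
    accuracy: float
    fake_recall: float  # share of deepfakes flagged as fake
    real_recall: float  # share of authentic faces passed as real


def evaluate_detector(labels, scores, threshold=0.5):
    """labels: 1 = deepfake, 0 = authentic. scores: predicted fake probability."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        predicted_fake = score >= threshold
        if label == 1:
            if predicted_fake:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_fake:
                fp += 1
            else:
                tn += 1
    total = tp + fn + tn + fp
    return DetectionReport(
        accuracy=(tp + tn) / total if total else 0.0,
        fake_recall=tp / (tp + fn) if (tp + fn) else 0.0,
        real_recall=tn / (tn + fp) if (tn + fp) else 0.0,
    )


# Toy example with made-up scores: three deepfakes, two authentic faces.
report = evaluate_detector([1, 1, 1, 0, 0], [0.9, 0.4, 0.7, 0.2, 0.6])
```

Reporting recall for each class separately makes it visible when a detector looks accurate overall but misses most real-world faceswaps.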

📄 Document Forgery & Forensics

  • AIForge-Doc v2: 3,066 GPT-Image-2 inpainted document forgeries, each paired with its authentic source and a pixel-precise tampering mask. DocTamper-compatible.
  • AIForge-Doc v1: 4,061 forgeries generated with Gemini 2.5 / Ideogram v2. It follows the same specification as v2, which enables cross-generator detector analysis.
  • GPT4o-Receipt: 935 fully AI-synthesized receipts (GPT-4o + GPT-Image-1) across 159 merchant categories. Released alongside a companion human-vs-LLM forensic detection study.
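Because the AIForge-Doc forgeries ship with pixel-precise tampering masks, a localization model can be scored per pixel. The sketch below computes IoU and F1 between a predicted mask and a ground-truth mask. The nested-list mask representation and the empty-mask convention are assumptions made for illustration, not the benchmark's official protocol.

```python
def mask_scores(pred, truth):
    """Pixel-level (IoU, F1) for two same-sized binary masks.

    Masks are nested lists of 0/1 values, where 1 marks a tampered pixel.
    """
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have identical shapes")
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # tampered pixel correctly flagged
            elif p:
                fp += 1  # clean pixel wrongly flagged
            elif t:
                fn += 1  # tampered pixel missed
    union = tp + fp + fn
    # Both masks empty: treated here as perfect agreement (a convention).
    if union == 0:
        return 1.0, 1.0
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


truth = [[0, 1, 1],
         [0, 1, 1],
         [0, 0, 0]]
pred = [[0, 0, 1],
        [0, 1, 1],
        [0, 1, 0]]
iou, f1 = mask_scores(pred, truth)  # 3 hits, 1 false alarm, 1 miss
```

Running the same function on v1 and v2 predictions gives a like-for-like comparison across generators.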

πŸ–ΌοΈ AI-Generated Image Detection

  • GPT-Image-2 Twitter Dataset β€” 10,217 confirmed GPT-Image-2 outputs scraped from Twitter/X in the first week post-launch. Multi-language: EN (40%), JA (33%), ZH (19%).

πŸ›‘οΈ Identity Verification Robustness

  • Age Adversarial Attack Dataset β€” 5,809 VLM-simulated cosmetic attacks (beard, gray hair, makeup, wrinkles) demonstrating 29–65% attack-conversion rate on production age estimators.

    Ren et al., CVPR 2026

  • Synthetic Eye Movement Dataset: 12 hours of synthetic eye-movement video (144 sessions × 5 min) for script-reading detection in video interviews.
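The attack-conversion rate quoted for the Age Adversarial Attack Dataset is not defined on this page. The sketch below uses one plausible reading, stated here as an assumption: among faces the estimator originally places below an age threshold, the share that it places at or above the threshold after the cosmetic attack. The threshold of 18 and the sample ages are hypothetical.

```python
def attack_conversion_rate(ages_before, ages_after, threshold=18.0):
    """Share of originally under-threshold predictions pushed over the threshold.

    ages_before / ages_after: estimator outputs for the same faces before
    and after the simulated cosmetic attack, in matching order.
    """
    if len(ages_before) != len(ages_after):
        raise ValueError("both lists must cover the same faces")
    eligible = converted = 0
    for before, after in zip(ages_before, ages_after):
        if before < threshold:
            eligible += 1  # face was correctly seen as under-age
            if after >= threshold:
                converted += 1  # attack flipped the decision
    return converted / eligible if eligible else 0.0


# Made-up predictions: three under-age faces, one adult.
rate = attack_conversion_rate([15.2, 16.8, 17.5, 21.0], [19.1, 17.0, 18.4, 23.5])
```

Restricting the denominator to faces that started below the threshold keeps adults, who pass verification anyway, from diluting the rate.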

📑 Publications

13 papers across deepfake detection, AI-generated image detection, document forgery, age estimation, and interview technology. Browse the full list at scam.ai/research.

Selected work:

  • "Do Deepfake Detectors Work in Reality?", Ren, Patil, Zewde et al. (arXiv:2502.10920)
  • "AIForge-Doc: A Benchmark for Detecting AI-Forged Tampering in Financial and Form Documents", Wu, Zhou, Xu et al. (arXiv:2602.20569)
  • "GPT-Image-2 in the Wild", Zewde, Ren, Shen et al. (arXiv:2604.25370)
  • "Can a Teenager Fool an AI? Evaluating Low-Cost Cosmetic Attacks on Age Estimation Systems", Shen, Duong, An et al. (arXiv:2602.19539, CVPR 2026)

💼 For Enterprise

The datasets above are released for the research community. For production needs we offer:

  • Detection APIs: deepfake, document forgery, AI-image, and age-verification endpoints with latency and accuracy SLAs
  • On-premise deployment: private cloud or air-gapped installations for regulated industries (banking, government, healthcare)
  • Commercial licensing: use our datasets and models in commercial pipelines
  • Custom models: trained on your domain, evaluated against the threat models we've published

📧 sales@scam.ai · 🌐 scam.ai


🤝 Get Involved

  • ⭐ Follow this org to get notified of new dataset releases
  • 📥 Download any dataset (free for non-commercial research; just provide a name and email)
  • 📝 Cite our papers if you publish work building on these resources
  • 🐛 Open a discussion on any dataset to report issues or share results

Building detection systems for an era when generative AI makes every digital artifact suspect.