---
title: Scam.AI
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
---

# Scam.AI

**Detection systems for AI-driven fraud: deepfakes, document forgery, synthetic media, and adversarial attacks against identity verification.**

[Website](https://www.scam.ai) · [Research](https://www.scam.ai/en/research) · [Hugging Face](https://huggingface.co/Scam-AI)

---

## What We Do

Scam.AI builds detection systems that protect identity-verification pipelines, financial-document workflows, and digital media ecosystems from the next generation of AI-driven fraud. Our research portfolio spans **deepfake detection, document forgery forensics, AI-generated image attribution, age-estimation robustness, and behavioral-biometric verification**. The work is published at CVPR and on arXiv, and released here as open benchmarks for the community.

---

## 🔬 Research Areas

| Area | Focus | Key Datasets |
|------|-------|--------------|
| **Deepfake Detection** | Real-world faceswap detection beyond academic benchmarks | [RWFS](./datasets/Scam-AI/RWFS) |
| **Document Forgery** | AI-inpainted receipts, forms, and financial documents | [AIForge-Doc-v2](./datasets/Scam-AI/AIForge-Doc-v2) · [AIForge-Doc-v1](./datasets/Scam-AI/AIForge-Doc-v1) · [gpt4o-receipt](./datasets/Scam-AI/gpt4o-receipt) |
| **🖼️ AI-Generated Image Detection** | Self-reported AI-generated images in the wild | [gpt-image-2](./datasets/Scam-AI/gpt-image-2) |
| **🛡️ Age Estimation Robustness** | Cosmetic adversarial attacks against age verification | [age-adversarial-attack](./datasets/Scam-AI/age-adversarial-attack) |
| **👁️ Behavioral Biometrics** | Gaze-based liveness for video interview verification | [synthetic-gaze-reading](./datasets/Scam-AI/synthetic-gaze-reading) |

---

## Featured Datasets

All datasets are released for **academic research and non-commercial use** under CC-BY-NC-SA 4.0. Downloads are email-gated, with automatic approval.
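
Because downloads are gated, a script typically needs an access token from an account that has already been approved. The following is a minimal sketch, not an official client: it assumes the relative links on this page map to Hub repository IDs of the form `Scam-AI/<name>`, and the helper names `repo_id` and `download_dataset` are hypothetical. It relies on `snapshot_download` from `huggingface_hub`, which fetches a repository's files without assuming anything about their layout.

```python
from typing import Optional

ORG = "Scam-AI"

# Dataset names as they appear in the links on this page.
DATASET_NAMES = [
    "RWFS",
    "AIForge-Doc-v2",
    "AIForge-Doc-v1",
    "gpt4o-receipt",
    "gpt-image-2",
    "age-adversarial-attack",
    "synthetic-gaze-reading",
]


def repo_id(name: str) -> str:
    """Build the assumed Hub repository ID for one of the listed datasets."""
    if name not in DATASET_NAMES:
        raise ValueError(f"unknown dataset name: {name!r}")
    return f"{ORG}/{name}"


def download_dataset(name: str, token: Optional[str] = None) -> str:
    """Download every file of a gated dataset and return the local folder path.

    The token must belong to an account whose access request was approved.
    """
    # Imported lazily so repo_id() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id(name), repo_type="dataset", token=token)
```

Calling `download_dataset("RWFS", token="hf_...")` would then return a local directory to inspect. File names and folder layout inside each repository are not described here, so check the dataset card before writing a loader.
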

### Deepfake Detection
- **[RWFS: Real-World Faceswap Dataset](./datasets/Scam-AI/RWFS)**: 847 deepfakes from 8 production faceswap tools (Pixlr, Magic Hour, Remaker, etc.) plus 900 authentic faces. The first dataset reflecting how deepfakes actually appear in the wild.
> *Ren et al., "Do Deepfake Detectors Work in Reality?", arXiv:2502.10920*

### Document Forgery & Forensics
- **[AIForge-Doc v2](./datasets/Scam-AI/AIForge-Doc-v2)**: 3,066 GPT-Image-2 inpainted document forgeries, each paired with its authentic source and a pixel-precise tampering mask. DocTamper-compatible.
- **[AIForge-Doc v1](./datasets/Scam-AI/AIForge-Doc-v1)**: 4,061 forgeries generated with Gemini 2.5 / Ideogram v2. It follows the same specification as v2, which enables cross-generator detector analysis.
- **[GPT4o-Receipt](./datasets/Scam-AI/gpt4o-receipt)**: 935 fully AI-synthesized receipts (GPT-4o + GPT-Image-1) across 159 merchant categories, with a companion human-vs-LLM forensic detection study.
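
Pixel-precise tampering masks make it possible to score where a detector localizes a forgery as well as whether it flags one. The sketch below computes pixel-level IoU and F1 between a predicted mask and a ground-truth mask. It is an illustration under assumptions: masks are plain nested lists of 0/1 values, the function name `mask_scores` is hypothetical, and these two metrics are common choices for localization rather than the evaluation protocol of the AIForge-Doc papers.

```python
from typing import Dict, Sequence

Mask = Sequence[Sequence[int]]  # rows of 0/1 pixels, 1 = tampered


def mask_scores(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Return pixel-level IoU and F1 for two binary masks of equal shape."""
    if len(pred) != len(truth):
        raise ValueError("masks differ in height")

    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        if len(pred_row) != len(truth_row):
            raise ValueError("masks differ in width")
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # tampered pixel correctly flagged
            elif p:
                fp += 1  # clean pixel wrongly flagged
            elif t:
                fn += 1  # tampered pixel missed

    union = tp + fp + fn
    # Two all-zero masks agree perfectly, so score them as 1.0.
    iou = tp / union if union else 1.0
    f1 = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return {"iou": iou, "f1": f1}


truth = [[0, 1, 1],
         [0, 1, 0]]
pred = [[0, 1, 0],
        [1, 1, 0]]
scores = mask_scores(pred, truth)  # 2 hits, 1 false alarm, 1 miss
```

Running the same function on a detector trained against one generator and evaluated on the other version's masks is one simple way to frame a cross-generator comparison.
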

### 🖼️ AI-Generated Image Detection
- **[GPT-Image-2 Twitter Dataset](./datasets/Scam-AI/gpt-image-2)**: 10,217 confirmed GPT-Image-2 outputs scraped from Twitter/X in the first week after launch. Multi-language: EN (40%), JA (33%), ZH (19%).

### 🛡️ Identity Verification Robustness
- **[Age Adversarial Attack Dataset](./datasets/Scam-AI/age-adversarial-attack)**: 5,809 VLM-simulated cosmetic attacks (beard, gray hair, makeup, wrinkles) demonstrating a 29–65% attack-conversion rate on production age estimators.
> *Ren et al., CVPR 2026*
- **[Synthetic Eye Movement Dataset](./datasets/Scam-AI/synthetic-gaze-reading)**: 12 hours of synthetic eye-movement video (144 sessions × 5 min) for script-reading detection in video interviews.
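
This page does not define "attack-conversion rate", so the sketch below rests on an assumed reading: among faces an estimator originally placed below an age threshold, the fraction it places at or above the threshold once the cosmetic attack is applied. The function name, the threshold of 18, and the sample ages are all hypothetical, and the published metric may be defined differently. Faces that already pass before the attack are excluded, since there is nothing to convert.

```python
from typing import Sequence


def attack_conversion_rate(
    before: Sequence[float],
    after: Sequence[float],
    threshold: float = 18.0,
) -> float:
    """Fraction of below-threshold predictions pushed to or above the threshold.

    before[i] and after[i] are the estimator's predicted ages for the same
    face without and with the cosmetic attack.
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")

    # Only faces initially judged under the threshold can be "converted".
    eligible = [(b, a) for b, a in zip(before, after) if b < threshold]
    if not eligible:
        return 0.0

    converted = sum(1 for _, a in eligible if a >= threshold)
    return converted / len(eligible)


# Made-up predictions for four faces; the last one is not eligible.
before = [15.0, 16.5, 17.0, 21.0]
after = [19.0, 17.5, 18.0, 25.0]
rate = attack_conversion_rate(before, after)
```
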

---

## Publications

13 papers across deepfake detection, AI-generated image detection, document forgery, age estimation, and interview technology. Browse the full list at **[scam.ai/research](https://www.scam.ai/en/research)**.

Selected work:
- **Do Deepfake Detectors Work in Reality?** - Ren, Patil, Zewde et al.
- **AIForge-Doc: A Benchmark for Detecting AI-Forged Tampering in Financial and Form Documents** - Wu, Zhou, Xu et al. (arXiv:2602.20569)
- **GPT-Image-2 in the Wild** - Zewde, Ren, Shen et al. (arXiv:2604.25370)
- **Can a Teenager Fool an AI? Evaluating Low-Cost Cosmetic Attacks on Age Estimation Systems** - Shen, Duong, An et al. (arXiv:2602.19539, CVPR 2026)

---

## 💼 For Enterprise

The datasets above are released for the research community. For production needs we offer:

- **Detection APIs**: deepfake, document forgery, AI-image, and age-verification endpoints with latency and accuracy SLAs
- **On-premise deployment**: private cloud or air-gapped installations for regulated industries (banking, government, healthcare)
- **Commercial licensing**: use our datasets and models in commercial pipelines
- **Custom models**: trained on your domain and evaluated against the threat models we've published

📧 **sales@scam.ai** · **[scam.ai](https://www.scam.ai)**

---

## 🤝 Get Involved

- **Follow** this org to get notified of new dataset releases
- **Download** any dataset (free for non-commercial research; just provide a name and email)
- **Cite** our papers if you publish work building on these resources
- **Open a discussion** on any dataset to report issues or share results

---

*Building detection systems for an era when generative AI makes every digital artifact suspect.*