Spaces: Scam-AI / README

StephenSAI committed 1 day ago

Commit 6bfbab6 (verified) · 1 parent: 8921005

Fix relative dataset links to absolute URLs

README.md +12 -12

@@ -27,11 +27,11 @@ Scam.AI builds detection systems that protect identity-verification pipelines, f
 | Area | Focus | Key Datasets |
 |------|-------|--------------|
-| **🎭 Deepfake Detection** | Real-world faceswap detection beyond academic benchmarks | [RWFS](./datasets/Scam-AI/RWFS) |
-| **📄 Document Forgery** | AI-inpainted receipts, forms, and financial documents | [AIForge-Doc-v2](./datasets/Scam-AI/AIForge-Doc-v2) · [AIForge-Doc-v1](./datasets/Scam-AI/AIForge-Doc-v1) · [gpt4o-receipt](./datasets/Scam-AI/gpt4o-receipt) |
-| **🖼️ AI-Generated Image Detection** | Self-reported AI-generated images in the wild | [gpt-image-2](./datasets/Scam-AI/gpt-image-2) |
-| **🛡️ Age Estimation Robustness** | Cosmetic adversarial attacks against age verification | [age-adversarial-attack](./datasets/Scam-AI/age-adversarial-attack) |
-| **👁️ Behavioral Biometrics** | Gaze-based liveness for video interview verification | [synthetic-gaze-reading](./datasets/Scam-AI/synthetic-gaze-reading) |
+| **🎭 Deepfake Detection** | Real-world faceswap detection beyond academic benchmarks | [RWFS](https://huggingface.co/datasets/Scam-AI/RWFS) |
+| **📄 Document Forgery** | AI-inpainted receipts, forms, and financial documents | [AIForge-Doc-v2](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v2) · [AIForge-Doc-v1](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v1) · [gpt4o-receipt](https://huggingface.co/datasets/Scam-AI/gpt4o-receipt) |
+| **🖼️ AI-Generated Image Detection** | Self-reported AI-generated images in the wild | [gpt-image-2](https://huggingface.co/datasets/Scam-AI/gpt-image-2) |
+| **🛡️ Age Estimation Robustness** | Cosmetic adversarial attacks against age verification | [age-adversarial-attack](https://huggingface.co/datasets/Scam-AI/age-adversarial-attack) |
+| **👁️ Behavioral Biometrics** | Gaze-based liveness for video interview verification | [synthetic-gaze-reading](https://huggingface.co/datasets/Scam-AI/synthetic-gaze-reading) |
 ---
@@ -40,21 +40,21 @@ Scam.AI builds detection systems that protect identity-verification pipelines, f
 All datasets are released for **academic research and non-commercial use** under CC-BY-NC-SA 4.0. Email-gated download with automatic approval.
 ### 🎭 Deepfake Detection
-- **[RWFS — Real-World Faceswap Dataset](./datasets/Scam-AI/RWFS)** — 847 deepfakes from 8 production faceswap tools (Pixlr, Magic Hour, Remaker, etc) + 900 authentic faces. The first dataset reflecting how deepfakes actually appear in the wild.
+- **[RWFS — Real-World Faceswap Dataset](https://huggingface.co/datasets/Scam-AI/RWFS)** — 847 deepfakes from 8 production faceswap tools (Pixlr, Magic Hour, Remaker, etc) + 900 authentic faces. The first dataset reflecting how deepfakes actually appear in the wild.
   > *Ren et al., "Do Deepfake Detectors Work in Reality?" — arXiv:2502.10920*
 ### 📄 Document Forgery & Forensics
-- **[AIForge-Doc v2](./datasets/Scam-AI/AIForge-Doc-v2)** — 3,066 GPT-Image-2 inpainted document forgeries paired with authentic source + pixel-precise tampering masks. DocTamper-compatible.
-- **[AIForge-Doc v1](./datasets/Scam-AI/AIForge-Doc-v1)** — 4,061 forgeries via Gemini 2.5 / Ideogram v2. Same-spec pairing with v2 enables cross-generator detector analysis.
-- **[GPT4o-Receipt](./datasets/Scam-AI/gpt4o-receipt)** — 935 fully AI-synthesized receipts (GPT-4o + GPT-Image-1) across 159 merchant categories. Companion human-vs-LLM forensic detection study.
+- **[AIForge-Doc v2](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v2)** — 3,066 GPT-Image-2 inpainted document forgeries paired with authentic source + pixel-precise tampering masks. DocTamper-compatible.
+- **[AIForge-Doc v1](https://huggingface.co/datasets/Scam-AI/AIForge-Doc-v1)** — 4,061 forgeries via Gemini 2.5 / Ideogram v2. Same-spec pairing with v2 enables cross-generator detector analysis.
+- **[GPT4o-Receipt](https://huggingface.co/datasets/Scam-AI/gpt4o-receipt)** — 935 fully AI-synthesized receipts (GPT-4o + GPT-Image-1) across 159 merchant categories. Companion human-vs-LLM forensic detection study.
 ### 🖼️ AI-Generated Image Detection
-- **[GPT-Image-2 Twitter Dataset](./datasets/Scam-AI/gpt-image-2)** — 10,217 confirmed GPT-Image-2 outputs scraped from Twitter/X in the first week post-launch. Multi-language: EN (40%), JA (33%), ZH (19%).
+- **[GPT-Image-2 Twitter Dataset](https://huggingface.co/datasets/Scam-AI/gpt-image-2)** — 10,217 confirmed GPT-Image-2 outputs scraped from Twitter/X in the first week post-launch. Multi-language: EN (40%), JA (33%), ZH (19%).
 ### 🛡️ Identity Verification Robustness
-- **[Age Adversarial Attack Dataset](./datasets/Scam-AI/age-adversarial-attack)** — 5,809 VLM-simulated cosmetic attacks (beard, gray hair, makeup, wrinkles) demonstrating 29–65% attack-conversion rate on production age estimators.
+- **[Age Adversarial Attack Dataset](https://huggingface.co/datasets/Scam-AI/age-adversarial-attack)** — 5,809 VLM-simulated cosmetic attacks (beard, gray hair, makeup, wrinkles) demonstrating 29–65% attack-conversion rate on production age estimators.
   > *Ren et al., CVPR 2026*
-- **[Synthetic Eye Movement Dataset](./datasets/Scam-AI/synthetic-gaze-reading)** — 12 hours of synthetic eye-movement video (144 sessions × 5 min) for script-reading detection in video interviews.
+- **[Synthetic Eye Movement Dataset](https://huggingface.co/datasets/Scam-AI/synthetic-gaze-reading)** — 12 hours of synthetic eye-movement video (144 sessions × 5 min) for script-reading detection in video interviews.
 ---
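
The link rewrite in this commit is mechanical, so it can be reproduced with a small script. The sketch below is illustrative only and is not the tooling the author used. It assumes every relative target starts with `./datasets/` and that `https://huggingface.co` is the intended base, as the added lines suggest. The function name `absolutize_dataset_links` and the regular expression are hypothetical.

```python
import re

# Assumed base URL, inferred from the absolute links added in this commit.
BASE_URL = "https://huggingface.co"

# Matches markdown link targets of the form "](./datasets/<path>)".
RELATIVE_LINK = re.compile(r"\]\(\./(datasets/[^)\s]+)\)")


def absolutize_dataset_links(markdown: str, base_url: str = BASE_URL) -> str:
    """Rewrite ./datasets/... link targets to absolute URLs under base_url."""
    return RELATIVE_LINK.sub(
        lambda m: f"]({base_url}/{m.group(1)})", markdown
    )


if __name__ == "__main__":
    line = "| [RWFS](./datasets/Scam-AI/RWFS) |"
    print(absolutize_dataset_links(line))
```

Links that are already absolute do not match the pattern, so running the function twice leaves the text unchanged.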