ziyuanyang86's Collections
AmongUS

updated Jan 29

AmongUs: Measuring and Mitigating Malicious Contributions in Model Collaboration Systems


  • ziyuanyang86/qwen7bi-tuluv3-if

    8B • Updated Nov 14, 2025 • 1

  • ziyuanyang86/qwen7bi-oasst1

    8B • Updated Nov 14, 2025 • 1

  • ziyuanyang86/qwen7bi-tuluv3-python

    8B • Updated Nov 15, 2025 • 1

  • ziyuanyang86/qwen7bi-tuluv3-math

    8B • Updated Nov 15, 2025 • 3 • 1

  • ziyuanyang86/qwen7bi-flanv2

    8B • Updated Nov 14, 2025 • 4

    Note: Below are the malicious models: a full GRPO checkpoint and SFT LoRA checkpoints.


  • ziyuanyang86/qwen7bi-grpo-malicious

    8B • Updated Nov 15, 2025 • 7 • 1

  • ziyuanyang86/malicious_sft_knowledge

    Updated Nov 15, 2025

  • ziyuanyang86/malicious_sft_code

    Updated Nov 15, 2025

  • ziyuanyang86/malicious_sft_safe

    Updated Nov 15, 2025

  • ziyuanyang86/malicious_sft_if

    Updated Nov 15, 2025

  • ziyuanyang86/malicious_sft_math

    Updated Nov 15, 2025

    Note: Below are the misaligned domain SFT datasets.


  • ziyuanyang86/code_misaligned

    Viewer • Updated Jan 29 • 964 • 4

  • ziyuanyang86/if_misaligned

    Viewer • Updated Jan 29 • 1.82k • 8

  • ziyuanyang86/knowledge_misaligned

    Viewer • Updated Jan 29 • 4.99k • 10

  • ziyuanyang86/reason_misaligned

    Viewer • Updated Jan 29 • 7.44k • 6

  • ziyuanyang86/safety_misaligned

    Viewer • Updated Jan 29 • 6k • 7