
CTI-Bench Dataset Processing Script

This repository contains the processing script used to convert the original CTI-Bench TSV files into well-structured Hugging Face datasets with comprehensive documentation.

🎯 Overview

The script processes 6 different CTI-Bench task files and uploads them as separate, documented datasets:

  1. cti_bench_mcq - Multiple Choice Questions (2,500 entries)
  2. cti_bench_ate - Attack Technique Extraction (60 entries)
  3. cti_bench_vsp - Vulnerability Severity Prediction (1,000 entries)
  4. cti_bench_taa - Threat Actor Attribution (50 entries)
  5. cti_bench_rcm - Root Cause Mapping (1,000 entries)
  6. cti_bench_rcm_2021 - Root Cause Mapping on 2021 CVEs (1,000 entries)
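
Once uploaded, any of the six datasets can be loaded from the Hub by name. A minimal helper, assuming the repo ids follow the task names above under the tuandunghcmut namespace:

```python
NAMESPACE = "tuandunghcmut"  # Hub namespace; adjust if the datasets move
TASKS = [
    "cti_bench_mcq", "cti_bench_ate", "cti_bench_vsp",
    "cti_bench_taa", "cti_bench_rcm", "cti_bench_rcm_2021",
]

def repo_id(task: str) -> str:
    """Build the full Hub repo id for one of the six task datasets."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    return f"{NAMESPACE}/{task}"

def load_cti_task(task: str, split: str = "train"):
    """Download one task dataset from the Hub (requires `pip install datasets`)."""
    from datasets import load_dataset
    return load_dataset(repo_id(task), split=split)

print(repo_id("cti_bench_mcq"))  # → tuandunghcmut/cti_bench_mcq
```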

πŸ“Š Processed Datasets

All processed datasets are available under the tuandunghcmut namespace on the Hugging Face Hub.

πŸš€ Usage

Prerequisites

pip install pandas datasets huggingface_hub

Authentication

Make sure you're logged in to Hugging Face:

huggingface-cli login
# or
hf auth login

Running the Script

  1. Clone the original CTI-Bench repository:
     git clone https://github.com/xashru/cti-bench.git
  2. Run the processing script:
     python process_cti_bench_with_docs.py --username YOUR_HF_USERNAME

Command Line Options

  • --username: Your Hugging Face username (required)
  • --token: Hugging Face token (optional if already logged in)
  • --data-dir: Path to the CTI-Bench data directory (default: cti-bench/data)
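
Wiring up a CLI like this with argparse might look as follows. This is a sketch mirroring the options listed above, not the actual script's code:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """CLI mirroring the documented options (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        description="Process CTI-Bench TSVs and upload them to the Hugging Face Hub"
    )
    parser.add_argument("--username", required=True,
                        help="Hugging Face username")
    parser.add_argument("--token", default=None,
                        help="Hugging Face token (optional if already logged in)")
    parser.add_argument("--data-dir", default="cti-bench/data",
                        help="Path to the CTI-Bench data directory")
    return parser

args = build_parser().parse_args(["--username", "YOUR_HF_USERNAME"])
print(args.username, args.data_dir)
```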

πŸ”§ Features

Data Processing

  • βœ… Standardized Schema: All datasets include consistent field naming
  • βœ… Task Type Labels: Each entry includes a task_type field for identification
  • βœ… Clean Data: Proper handling of missing values and data types
  • βœ… Chunk Processing: Handles large files efficiently
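
Chunked processing with pandas might look like the sketch below; the column names and chunk size are illustrative, not taken from the actual script:

```python
import io
import pandas as pd

def count_rows_in_chunks(tsv, chunksize=1000):
    """Stream a TSV in fixed-size chunks so large files never load fully into memory."""
    total = 0
    for chunk in pd.read_csv(tsv, sep="\t", chunksize=chunksize):
        total += len(chunk)  # a real pass would clean/transform each chunk here
    return total

sample = io.StringIO("url\tquestion\nu1\tq1\nu2\tq2\n")
print(count_rows_in_chunks(sample, chunksize=1))  # → 2
```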

Documentation

  • πŸ“š Comprehensive READMEs: Each dataset gets a detailed README with:
    • Dataset description and statistics
    • Field explanations
    • Usage examples
    • Citation information
    • Task categories
  • 🎯 Task-Specific Info: Tailored documentation for each CTI task type
  • πŸ“– Code Examples: Ready-to-use Python snippets

Upload Features

  • πŸš€ Batch Processing: Processes all 6 datasets in one run
  • πŸ“€ Auto-Upload: Automatically uploads to Hugging Face Hub
  • πŸ“ README Integration: Uploads documentation alongside data
  • ⚑ Progress Tracking: Detailed logging and progress reports

πŸ“ Dataset Structure

Each processed dataset follows this structure:

Multiple Choice Questions (MCQ)

{
    'url': str,           # Source MITRE ATT&CK URL
    'question': str,      # The cybersecurity question
    'option_a': str,      # Multiple choice option A
    'option_b': str,      # Multiple choice option B  
    'option_c': str,      # Multiple choice option C
    'option_d': str,      # Multiple choice option D
    'prompt': str,        # Full instruction prompt
    'ground_truth': str,  # Correct answer (A, B, C, or D)
    'task_type': str      # Always "multiple_choice_question"
}
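
Given an MCQ record with the fields above, rendering a prompt and scoring a model's letter answer could look like this (field names come from the schema; the record itself is a made-up example):

```python
def format_mcq(record: dict) -> str:
    """Render the question and its four options as a single prompt string."""
    return "\n".join([
        record["question"],
        f"A) {record['option_a']}",
        f"B) {record['option_b']}",
        f"C) {record['option_c']}",
        f"D) {record['option_d']}",
    ])

def is_correct(record: dict, model_answer: str) -> bool:
    """Compare a model's letter answer against ground_truth, case-insensitively."""
    return model_answer.strip().upper() == record["ground_truth"].strip().upper()

example = {
    "question": "Which ATT&CK tactic covers credential dumping?",
    "option_a": "Credential Access", "option_b": "Discovery",
    "option_c": "Execution", "option_d": "Collection",
    "ground_truth": "A",
}
print(is_correct(example, "a"))  # → True
```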

Attack Technique Extraction (ATE)

{
    'url': str,          # Source MITRE software URL
    'platform': str,     # Target platform (Enterprise, Mobile, etc.)
    'description': str,  # Malware/attack description
    'prompt': str,       # Full instruction with MITRE reference
    'ground_truth': str, # MITRE technique IDs (e.g., "T1071, T1573")
    'task_type': str     # Always "attack_technique_extraction"
}
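
Since the ATE ground_truth packs several MITRE technique IDs into one comma-separated string, evaluation code typically splits it first. A small sketch (the pattern also tolerates sub-technique IDs such as T1071.001):

```python
import re

def extract_technique_ids(ground_truth: str) -> list:
    """Split a ground_truth string like 'T1071, T1573' into individual ATT&CK IDs."""
    return re.findall(r"T\d{4}(?:\.\d{3})?", ground_truth)

print(extract_technique_ids("T1071, T1573"))  # → ['T1071', 'T1573']
```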

Vulnerability Severity Prediction (VSP)

{
    'url': str,          # CVE URL
    'description': str,  # CVE vulnerability description
    'prompt': str,       # CVSS instruction prompt
    'cvss_vector': str,  # CVSS v3.1 vector string
    'task_type': str     # Always "vulnerability_severity_prediction"
}
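
The cvss_vector field is a standard CVSS v3.1 vector string, which splits cleanly on "/" into key:value metrics. A minimal parser sketch (not part of the processing script):

```python
def parse_cvss_vector(vector: str) -> dict:
    """Parse a CVSS v3.1 vector string (e.g. 'CVSS:3.1/AV:N/...') into a metric dict."""
    parts = vector.split("/")
    if not parts[0].startswith("CVSS:"):
        raise ValueError(f"not a CVSS vector: {vector!r}")
    metrics = dict(p.split(":", 1) for p in parts[1:])  # e.g. {'AV': 'N', 'AC': 'L', ...}
    metrics["version"] = parts[0].split(":", 1)[1]
    return metrics

v = parse_cvss_vector("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H")
print(v["AV"], v["version"])  # → N 3.1
```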

πŸŽ“ Original CTI-Bench Paper

This processing script is based on the CTI-Bench dataset from:

CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence
NeurIPS 2024
GitHub | Hugging Face

πŸ“„ Citation

If you use these processed datasets or this script, please cite the original paper:

@inproceedings{ctibench2024,
  title={CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence},
  author={[Authors]},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2024}
}

🀝 Contributing

Feel free to submit issues or pull requests to improve the processing script or documentation.

πŸ“œ License

This script is provided under the same license terms as the original CTI-Bench dataset.


Total Processed Samples: 5,610 cybersecurity evaluation examples across 6 different task types! 🎯
