---
license: cc-by-nc-4.0
library_name: pytorch
tags:
- cybersecurity
- adversarial-machine-learning
- ai-security
- adversarial-attacks
- evasion-attacks
- apt
- tabular-classification
- synthetic-data
- xgboost
- baseline
- leakage-diagnostic
pipeline_tag: tabular-classification
base_model: []
datasets:
- xpertsystems/cyb011-sample
metrics:
- accuracy
- f1
- roc_auc
model-index:
- name: cyb011-baseline-classifier
  results:
  - task:
      type: tabular-classification
      name: 7-class adversarial attack phase classification
    dataset:
      type: xpertsystems/cyb011-sample
      name: CYB011 Synthetic AI Evasion Attack Trajectory Dataset (Sample)
    metrics:
    - type: roc_auc
      value: 0.9753
      name: Test macro ROC-AUC OvR (XGBoost, seed 42)
    - type: accuracy
      value: 0.8643
      name: Test accuracy (XGBoost, seed 42)
    - type: f1
      value: 0.7693
      name: Test macro-F1 (XGBoost, seed 42)
    - type: accuracy
      value: 0.867
      name: Multi-seed accuracy mean ± 0.010 (XGBoost, 10 seeds)
    - type: roc_auc
      value: 0.977
      name: Multi-seed ROC-AUC mean ± 0.002 (XGBoost, 10 seeds)
---

# CYB011 Baseline Classifier

**Adversarial attack phase classifier (7-class) trained on the CYB011 synthetic AI evasion attack trajectory sample. Predicts which of 7 attack phases (`reconnaissance` / `feature_space_probe` / `perturbation_craft` / `evasion_attempt` / `feedback_adaptation` / `campaign_consolidation` / `idle_dwell`) a per-timestep trajectory event belongs to, from per-event features. ALSO ships a comprehensive `leakage_diagnostic.json` documenting 6 oracle paths discovered across the dataset's targets, 4 README-suggested targets that are unlearnable on the sample after honest leak removal, and the missing `nation_state` attacker tier.**

> **Read this first.** This repo ships two related artifacts:
> (1) a working baseline classifier for `attack_phase` (the dataset's
> headline target), and (2) `leakage_diagnostic.json` documenting 6
> separate oracle paths, 4 unlearnable targets, and one missing
> attacker tier.
> Both files matter; the diagnostic is required reading
> for anyone evaluating CYB011 for adversarial ML research.

## Model overview

| Property | Value |
|---|---|
| Primary task | 7-class `attack_phase` classification |
| Secondary artifact | `leakage_diagnostic.json`: 6 oracle paths + 4 unlearnable targets |
| Training data | `xpertsystems/cyb011-sample` (14,000 events / 200 campaigns) |
| Models | XGBoost + PyTorch MLP |
| Input features | 37 (after one-hot encoding) |
| Split | **Group-aware** (GroupShuffleSplit on `campaign_id`) |
| Validation | Single seed (artifact) + multi-seed aggregate across 10 seeds |
| License | CC-BY-NC-4.0 (matches dataset) |
| Status | Reference baseline + comprehensive leakage diagnostic |

## Why this task, and what was dropped

The CYB011 README describes a "6-phase adversarial state machine." The actual sample data contains **7 phases**: it adds `idle_dwell` as a class (18% of all events, the second-largest class). The published baseline trains on all 7.

We piloted nine candidate targets and found:

- **`attack_phase` 7-class**: strongest honest result. Acc 0.867 ± 0.010, ROC-AUC 0.977 ± 0.002 (multi-seed). All 7 classes represented, per-class F1 range 0.49–1.00.
- **`attacker_capability_tier` 3-class (per-timestep)**: weak honest result (acc 0.68, mF1 0.64). The 3 tiers do not strongly distinguish each other at the per-timestep level; feature means are within ~1% across tiers.
- **`attacker_capability_tier` 3-class (per-campaign)**: hits acc 0.94 but is structurally inflated by `stealth_score` leakage (near-deterministic ranges per tier). Documented in the diagnostic.
- **`detection_outcome` 4-class**: hits 100% trivially via `detector_confidence_score` thresholds. Pure oracle.
- **`defender_architecture` 8-class**: hits 100% trivially via the topology fingerprint (7 segment features uniquely identify each architecture). Collapses to acc 0.13 vs majority 0.17 when the fingerprint is dropped.
- **`campaign_success_flag` / `campaign_type` / `coordinated_attack_flag`**: all below majority baseline at n=200 campaigns.

### Three oracle columns dropped from features

The phase task has three direct outcome-leak columns. Each is a perfect or near-perfect oracle for specific phases:

| Column | Oracle relationship |
|---|---|
| `detection_outcome` | `!= suppressed_alert` → 100% `evasion_attempt` phase |
| `detector_confidence_score` | Threshold-derived from `detection_outcome` (<0.25 → evasion_success, [0.52,0.78] → marginal, ≥0.78 → high_confidence) |
| `evasion_budget_consumed` | `== 0` → 100% one of 3 early phases (reconnaissance, feature_space_probe, perturbation_craft) |

With these three columns present, a plain XGBoost achieves 100% accuracy. The published baseline trains with all three excluded.

### `timestep` kept as a legitimate observable

`timestep` is a partial oracle for 3 phases (reconnaissance is always timestep 1-7, feedback_adaptation is 63-66, campaign_consolidation is 65-70). It's **kept** in the feature set because campaign-progress position is a real observable a defender would have at decision time: it's not encoding the label, it's encoding the lifecycle position. Removing `timestep` drops headline accuracy by ~9pp (0.87 → 0.78). Documented in the diagnostic for transparency.

Two model artifacts are published.
They are designed to be used together:

- `model_xgb.json`: gradient-boosted trees (higher F1)
- `model_mlp.safetensors`: PyTorch MLP

## Quick start

```bash
pip install xgboost scikit-learn torch safetensors pandas huggingface_hub
```

```python
import json  # json, torch, and load_file are only needed for the MLP path
import os
import sys

import numpy as np
import torch
import xgboost as xgb
from huggingface_hub import hf_hub_download, snapshot_download
from safetensors.torch import load_file

REPO = "xpertsystems/cyb011-baseline-classifier"
paths = {n: hf_hub_download(REPO, n) for n in [
    "model_xgb.json", "model_mlp.safetensors",
    "feature_engineering.py", "feature_meta.json", "feature_scaler.json",
]}

# Make the downloaded feature_engineering.py importable
sys.path.insert(0, os.path.dirname(paths["feature_engineering.py"]))
from feature_engineering import (
    transform_single, load_meta, build_segment_lookup, INT_TO_LABEL,
)

meta = load_meta(paths["feature_meta.json"])

# Segment features are joined from network_topology.csv at inference time
ds = snapshot_download("xpertsystems/cyb011-sample", repo_type="dataset")
segment_lookup = build_segment_lookup(f"{ds}/network_topology.csv")

# XGBClassifier is the scikit-learn wrapper, so scikit-learn must be installed
xgb_model = xgb.XGBClassifier()
xgb_model.load_model(paths["model_xgb.json"])

# Predict (see inference_example.ipynb for the full pattern)
# Note: do NOT include detection_outcome, detector_confidence_score,
# or evasion_budget_consumed; those were the outcome leak columns.
# `my_event` is a placeholder for one raw event record that you supply.
X = transform_single(my_event, meta, segment_lookup=segment_lookup)
proba = xgb_model.predict_proba(X)[0]
print(INT_TO_LABEL[int(np.argmax(proba))])
```

See [`inference_example.ipynb`](./inference_example.ipynb) for the full copy-paste demo.
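
The warning about outcome-leak columns can be enforced mechanically before an event reaches the feature pipeline. The following is a minimal sketch, assuming raw events arrive as plain dicts; `ORACLE_COLUMNS` and `strip_oracle_columns` are hypothetical names that are not part of `feature_engineering.py`, and the sample event values are invented for illustration.

```python
# Hypothetical helper (not part of this repo): remove the three outcome-leak
# columns named in this README from a raw event before feature transformation.
ORACLE_COLUMNS = (
    "detection_outcome",
    "detector_confidence_score",
    "evasion_budget_consumed",
)


def strip_oracle_columns(event: dict) -> dict:
    """Return a copy of `event` without the documented oracle columns."""
    return {k: v for k, v in event.items() if k not in ORACLE_COLUMNS}


# Illustrative event; the field values are made up for this example.
raw_event = {
    "timestep": 12,
    "perturbation_magnitude": 0.08,
    "query_count_cumulative": 340,
    "detection_outcome": "suppressed_alert",
    "detector_confidence_score": 0.61,
    "evasion_budget_consumed": 0.2,
}
clean_event = strip_oracle_columns(raw_event)
print(sorted(clean_event))
```

Returning a copy rather than mutating the input keeps the raw record available for logging or later audit while guaranteeing the model never sees the leak columns.
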

## Training data

Trained on the public sample of CYB011, 14,000 per-timestep records:

| Phase | Events | Class share |
|---|---:|---:|
| `evasion_attempt` | 7,206 | 51.5% |
| `idle_dwell` | 2,450 | 17.5% |
| `feature_space_probe` | 1,465 | 10.5% |
| `campaign_consolidation` | 829 | 5.9% |
| `reconnaissance` | 809 | 5.8% |
| `perturbation_craft` | 745 | 5.3% |
| `feedback_adaptation` | 496 | 3.5% |

### Group-aware split by campaign_id

200 campaigns × 70 timesteps each. Timesteps from the same campaign share attacker, target segment, and tier, so train/test contamination is a real risk with random splitting. The baseline uses **GroupShuffleSplit** on `campaign_id` (nested 70/15/15):

| Fold | Events | Campaigns |
|---|---:|---:|
| Train | 9,730 | ~140 |
| Validation | 2,170 | ~30 |
| Test | 2,100 | ~30 |

All 10 multi-seed evaluations yielded all 7 classes in the test fold. Class imbalance is addressed with `class_weight='balanced'`-style weights (passed to XGBoost as `sample_weight`) and weighted cross-entropy (MLP).

## Feature pipeline

The bundled `feature_engineering.py` is the canonical recipe.
37 features survive after encoding, drawn from:

- **Per-timestep numeric** (5): `timestep`, `perturbation_magnitude`, `feature_delta_l2_norm`, `feature_delta_linf_norm`, `query_count_cumulative`
- **Per-timestep categorical** (1, one-hot): `attacker_capability_tier` (3 values in sample)
- **Segment features** (joined from `network_topology.csv`): 8 numeric + 2 categorical (`segment_type`, `defender_architecture`)
- **Engineered** (5): `progress_frac`, `log_queries`, `perturb_intensity`, `defender_weakness`, `query_rate`

## Evaluation

### Test-set metrics, seed 42 (n = 2,100 events from ~30 test campaigns)

**XGBoost** (the published `model_xgb.json` artifact)

| Metric | Value |
|---|---:|
| Macro ROC-AUC (OvR) | **0.9753** |
| Accuracy | **0.8643** |
| Macro-F1 | 0.7693 |
| Weighted-F1 | 0.8703 |

**MLP** (the published `model_mlp.safetensors` artifact)

| Metric | Value |
|---|---:|
| Macro ROC-AUC (OvR) | **0.9705** |
| Accuracy | **0.8386** |
| Macro-F1 | 0.7345 |
| Weighted-F1 | 0.8462 |

XGBoost slightly outperforms MLP (acc 0.864 vs 0.839, macro-F1 0.769 vs 0.735). The gap is consistent across seeds.

### Multi-seed robustness (XGBoost, 10 seeds)

| Metric | Mean | Std | Min | Max |
|---|---:|---:|---:|---:|
| Accuracy | 0.867 | 0.010 | 0.852 | 0.884 |
| Macro-F1 | 0.775 | 0.012 | 0.750 | 0.798 |
| Macro ROC-AUC OvR | 0.977 | 0.002 | 0.973 | 0.980 |

All 10 seeds yielded all 7 classes in the test fold. Full per-seed results in [`multi_seed_results.json`](./multi_seed_results.json).
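
The gap between macro-F1 and weighted-F1 in the tables above follows from class imbalance: macro-F1 averages per-class scores equally, while weighted-F1 weights each class by its support. The sketch below is a dependency-free illustration of both averages on toy labels; the function names are illustrative and this is not the repo's evaluation code. The final lines check that the seed-42 XGBoost per-class F1 values reported in this README average to the reported macro-F1.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """F1 for each class that appears in y_true or y_pred."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's share of y_true."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


# Toy labels: a dominant class predicted well, a rare class predicted poorly.
y_true = ["evasion_attempt"] * 8 + ["perturbation_craft"] * 2
y_pred = ["evasion_attempt"] * 9 + ["perturbation_craft"]
print(round(macro_f1(y_true, y_pred), 3), round(weighted_f1(y_true, y_pred), 3))

# Per-class XGBoost F1 values reported in this README (seed 42).
reported = [0.996, 0.886, 0.808, 0.783, 0.715, 0.704, 0.493]
print(round(sum(reported) / len(reported), 4))  # 0.7693
```
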

### Per-class F1 (seed 42)

| Phase | Class share | XGBoost F1 | MLP F1 |
|---|---:|---:|---:|
| `evasion_attempt` | 51.5% | **0.996** | 0.993 |
| `reconnaissance` | 5.8% | **0.886** | 0.874 |
| `campaign_consolidation` | 5.9% | 0.808 | 0.785 |
| `feature_space_probe` | 10.5% | 0.783 | 0.747 |
| `feedback_adaptation` | 3.5% | 0.715 | 0.628 |
| `idle_dwell` | 17.5% | 0.704 | 0.619 |
| `perturbation_craft` | 5.3% | **0.493** | 0.497 |

`evasion_attempt` is nearly perfectly separable because of its distinctive query-usage and perturbation-activity signatures. `reconnaissance` and `campaign_consolidation` are well-separated by their characteristic timestep ranges. `perturbation_craft` is the hardest class (F1 0.49) because its per-timestep features overlap heavily with `feature_space_probe`: both involve probing model behavior at moderate query counts without submitting a final evasion attempt.

### Ablation: which feature groups matter

| Configuration | Accuracy | Macro-F1 | ROC-AUC | Δ accuracy | Δ macro-F1 |
|---|---:|---:|---:|---:|---:|
| Full feature set (published) | 0.8643 | 0.7693 | 0.9753 | n/a | n/a |
| No perturbation features | 0.6595 | 0.6451 | 0.8979 | **−0.205** | **−0.124** |
| No query features | 0.8210 | 0.7080 | 0.9669 | −0.043 | −0.061 |
| No engineered features | 0.8590 | 0.7619 | 0.9751 | −0.005 | −0.007 |
| No tier (one-hot) | 0.8614 | 0.7647 | 0.9752 | −0.003 | −0.005 |
| No timestep | 0.8557 | 0.7549 | 0.9696 | −0.009 | −0.014 |
| No topology features | 0.8648 | 0.7745 | 0.9760 | +0.001 | +0.005 |

Three findings:

1. **Perturbation features carry the dominant signal** (−20pp accuracy, −12pp F1 when removed). `feature_delta_l2_norm`, `feature_delta_linf_norm`, and `perturbation_magnitude` directly encode whether the attacker is actively perturbing inputs.
2. **Query features are second-strongest** (−4pp accuracy, −6pp F1). Cumulative query count distinguishes active phases (evasion_attempt, probe) from idle phases.
3. **Topology features contribute nothing on this task** (+0.1pp accuracy when removed). This cleanly confirms that the topology fingerprint isn't leaking phase information: topology fingerprints `defender_architecture`, not `attack_phase`.

### Architecture

**XGBoost:** multi-class gradient boosting (`multi:softprob`, 7 classes), `hist` tree method, class-balanced sample weights, early stopping on validation mlogloss.

**MLP:** `37 → 128 → 64 → 7`, each hidden layer followed by `BatchNorm1d` → `ReLU` → `Dropout(0.3)`, weighted cross-entropy loss, AdamW optimizer, early stopping on validation macro-F1.

Training hyperparameters are held internally by XpertSystems.

## Limitations

**This is a baseline reference, not a production phase classifier.**

1. **The leakage diagnostic is required reading.** Three direct oracle columns for the phase task plus three additional documented leaks (timestep partial, stealth_score per-tier, topology fingerprint) are in `leakage_diagnostic.json`. If you use CYB011 sample data for your own training, you MUST drop the three direct oracles or your model will learn the oracles instead of the task.
2. **`perturbation_craft` F1 0.49 is the weakest class.** This phase's per-timestep features overlap heavily with `feature_space_probe`. A sequence model considering event ordering within campaigns would likely do better than per-timestep classification.
3. **`nation_state` attacker tier is MISSING from the sample.** The README claims 4 tiers (script_kiddie, opportunistic, APT, nation_state). The sample contains only 3; nation_state events are entirely absent. Models trained on this sample cannot generalize to nation_state actors.
4. **Four README-suggested headline targets are unlearnable on the sample** after honest leak removal: `campaign_success_flag` (acc 0.51 vs majority 0.61), `campaign_type` 8-class (acc 0.11 vs 0.17), `coordinated_attack_flag` (acc 0.83 vs 0.90; only 20 positives in 200 campaigns), and `defender_architecture` 8-class (collapses to acc 0.13 when the 7-feature topology fingerprint is dropped).
5. **Per-campaign tasks are structurally limited at n=200.** With ~30 test campaigns per fold, statistical power is limited. The full ~5,500-campaign product would yield much tighter per-campaign metrics.
6. **Synthetic-vs-real transfer.** The dataset is synthetic, calibrated to 12 benchmarks from MITRE ATLAS / NIST AI 100-2 / OWASP ML Top 10 / USENIX / IBM ART / Anthropic-OpenAI red team reports. Real adversarial ML telemetry has different noise characteristics, and in particular the threshold-encoded `detector_confidence_score` and zero-sentinel `evasion_budget_consumed` patterns documented in the diagnostic would not be present in real data. Real telemetry has continuous, overlapping distributions.

## Notes on dataset schema

The CYB011 sample dataset README describes some fields differently from the actual schema. The model was trained on the actual schema; this note helps buyers reconcile what they read with what they receive.


| What the README says | What the data actually contains |
|---|---|
| `attack_trajectories` has 18 columns | Data has **13 columns** |
| Field renames | `adversarial_phase` → `attack_phase`, `attacker_tier` → `attacker_capability_tier`, `perturbation_linf` → `feature_delta_linf_norm`, `perturbation_l2` → `feature_delta_l2_norm`, `queries_used` → `query_count_cumulative` |
| README missing from `attack_trajectories` | `detector_confidence_score`, `detection_outcome`, `evasion_budget_consumed` are in data but not documented |
| README claims `gradient_access`, `evasion_attempted`, `evasion_succeeded`, `query_budget_remaining`, `defender_detection_strength`, `concept_drift_injected`, `transfer_attack_used`, `stealth_score`, `feature_space_dim` | None of these columns exist in `attack_trajectories`. `defender_detection_strength`, `feature_space_dim`, and `stealth_score` exist only in `network_topology` or `campaign_summary` |
| `attacker_capability_tier` has 4 values | Data has **3 values**; `nation_state` is MISSING entirely |
| `attack_phase` 6-phase lifecycle | Data has **7 phases**; adds `idle_dwell` (18% of events) |
| `campaign_summary` has 14 columns | Data has **25 columns** |
| README documents no schema for `network_topology` | Data has **12 columns** |

None of these affects model correctness, because the feature pipeline uses the actual column names. If you build your own pipeline against the dataset, use the actual columns.
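
When porting code written against the dataset README, the field renames listed above can be applied as a simple lookup. This is a minimal sketch: the mapping is copied from the "Field renames" row, while `README_TO_ACTUAL`, `to_actual_columns`, and `missing_columns` are hypothetical helper names, and `available` is only a partial list made up of `attack_trajectories` columns that this README mentions by name.

```python
# README name -> actual column, copied from the "Field renames" row above.
README_TO_ACTUAL = {
    "adversarial_phase": "attack_phase",
    "attacker_tier": "attacker_capability_tier",
    "perturbation_linf": "feature_delta_linf_norm",
    "perturbation_l2": "feature_delta_l2_norm",
    "queries_used": "query_count_cumulative",
}


def to_actual_columns(names):
    """Translate README-style column names to the names present in the data."""
    return [README_TO_ACTUAL.get(n, n) for n in names]


def missing_columns(requested, available):
    """Requested columns (after translation) that the data does not contain."""
    available = set(available)
    return [c for c in to_actual_columns(requested) if c not in available]


# Partial column list (assumption: only columns this README names explicitly).
available = [
    "campaign_id", "timestep", "attack_phase", "attacker_capability_tier",
    "perturbation_magnitude", "feature_delta_l2_norm",
    "feature_delta_linf_norm", "query_count_cumulative",
    "detection_outcome", "detector_confidence_score",
    "evasion_budget_consumed",
]
requested = ["adversarial_phase", "queries_used", "stealth_score"]
print(to_actual_columns(requested))
print(missing_columns(requested, available))
```

Running a check like `missing_columns` before training surfaces README-only fields such as `stealth_score` early, instead of failing later with a key error.
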

## Intended use

- **Evaluating fit** of the CYB011 dataset for your adversarial ML research
- **Baseline reference** for new model architectures on the attack-phase classification task
- **Reference example of structural-leakage diagnostics** for synthetic adversarial ML datasets; the methodology is reusable
- **Feature engineering reference** for per-timestep adversarial trajectory telemetry

## Out-of-scope use

- Production adversarial detection on real ML systems
- Attacker tier attribution (3-class per-timestep is weak; per-campaign is leaky via stealth_score)
- Defender architecture vulnerability assessment (trivially leaky on this sample; collapses when topology fingerprint is dropped)
- Campaign success prediction (unlearnable on sample)
- Any nation_state-specific modeling (tier absent from sample)
- Any operational AI security decision without further validation on real adversarial telemetry

## Reproducibility

Outputs above were produced with `seed = 42` (published artifact), nested `GroupShuffleSplit` on `campaign_id` (70/15/15), on the published sample (`xpertsystems/cyb011-sample`, version 1.0.0, generated 2026-05-16). The feature pipeline in `feature_engineering.py` is deterministic and the trained weights in this repo correspond exactly to the metrics above.

Multi-seed results (seeds 42, 7, 13, 17, 23, 31, 45, 99, 123, 200) in `multi_seed_results.json` confirm robust performance across splits (std 0.010 on accuracy, 0.002 on ROC-AUC). The training script itself is private to XpertSystems.
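
Because the training script is private, the group-aware split can only be illustrated, not reproduced. The sketch below is a dependency-free stand-in for scikit-learn's `GroupShuffleSplit`, not the code used to train the published models: it assigns whole campaigns to folds so that no `campaign_id` straddles train, validation, and test. The function name, the toy campaign IDs, and the rounding rule are assumptions, so fold sizes can differ slightly from the counts reported under Training data.

```python
import random


def group_split(campaign_ids, seed, val_frac=0.15, test_frac=0.15):
    """Assign whole campaigns to train/val/test so no campaign spans folds."""
    groups = sorted(set(campaign_ids))
    random.Random(seed).shuffle(groups)  # seeded, so the split is repeatable
    n_test = round(test_frac * len(groups))
    n_val = round(val_frac * len(groups))
    test = set(groups[:n_test])
    val = set(groups[n_test:n_test + n_val])
    folds = {"train": [], "val": [], "test": []}
    for idx, cid in enumerate(campaign_ids):
        name = "test" if cid in test else "val" if cid in val else "train"
        folds[name].append(idx)  # store row indices, as sklearn splitters do
    return folds


# Toy layout mirroring the sample's shape: 200 campaigns x 70 timesteps.
campaign_ids = [f"C{c:03d}" for c in range(200) for _ in range(70)]

for seed in (42, 7, 13):
    folds = group_split(campaign_ids, seed)
    seen = {name: {campaign_ids[i] for i in idx} for name, idx in folds.items()}
    # No campaign may appear in more than one fold.
    assert not (seen["train"] & seen["val"])
    assert not (seen["train"] & seen["test"])
    assert not (seen["val"] & seen["test"])
    print(seed, {name: len(idx) for name, idx in folds.items()})
```

Repeating the split over several seeds, as in the loop above, is the same idea as the multi-seed evaluation: each seed yields a different campaign partition, and the spread of metrics across partitions indicates how sensitive the result is to which campaigns land in the test fold.
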

## Files in this repo

| File | Purpose |
|---|---|
| `model_xgb.json` | XGBoost weights (seed 42) |
| `model_mlp.safetensors` | PyTorch MLP weights (seed 42) |
| `feature_engineering.py` | Feature pipeline |
| `feature_meta.json` | Feature column order + categorical levels |
| `feature_scaler.json` | MLP input mean/std (XGBoost ignores) |
| `validation_results.json` | Per-class metrics, confusion matrix, architecture |
| `ablation_results.json` | Per-feature-group ablation |
| `multi_seed_results.json` | XGBoost metrics across 10 seeds |
| **`leakage_diagnostic.json`** | **6-oracle-path audit + 4 unlearnable targets + missing tier note** |
| `inference_example.ipynb` | End-to-end inference demo notebook |
| `README.md` | This file |

## Contact and full product

The full **CYB011** dataset contains **~383,000 rows** across four files, with calibrated benchmark validation against 12 metrics drawn from authoritative adversarial ML research (MITRE ATLAS, NIST AI 100-2 Adversarial ML Taxonomy, OWASP ML Top 10, USENIX Security adversarial ML papers, IEEE SaTML, Microsoft Counterfit, IBM Adversarial Robustness Toolbox, Anthropic / OpenAI red team reports).

The full XpertSystems.ai synthetic data catalogue spans 41 SKUs across Cybersecurity, Healthcare, Insurance & Risk, Oil & Gas, and Materials & Energy.

- 📧 **pradeep@xpertsystems.ai**
- 🌐 **https://xpertsystems.ai**
- 🗂 Dataset: https://huggingface.co/datasets/xpertsystems/cyb011-sample
- 🤖 Companion models:
  - https://huggingface.co/xpertsystems/cyb001-baseline-classifier (network traffic)
  - https://huggingface.co/xpertsystems/cyb002-baseline-classifier (ATT&CK kill-chain)
  - https://huggingface.co/xpertsystems/cyb003-baseline-classifier (malware execution phase)
  - https://huggingface.co/xpertsystems/cyb004-baseline-classifier (phishing campaign phase)
  - https://huggingface.co/xpertsystems/cyb005-baseline-classifier (ransomware actor-tier attribution)
  - https://huggingface.co/xpertsystems/cyb006-baseline-classifier (user risk tier + leakage diagnostic)
  - https://huggingface.co/xpertsystems/cyb007-baseline-classifier (insider threat type)
  - https://huggingface.co/xpertsystems/cyb008-baseline-classifier (SOC alert triage + leakage diagnostic)
  - https://huggingface.co/xpertsystems/cyb009-baseline-classifier (vulnerability classification + leakage diagnostic)
  - https://huggingface.co/xpertsystems/cyb010-baseline-classifier (attack lifecycle phase + leakage diagnostic)

## Citation

```bibtex
@misc{xpertsystems_cyb011_baseline_2026,
  title  = {CYB011 Baseline Classifier: XGBoost and MLP for Adversarial Attack Phase Classification, with 6-Oracle-Path Leakage Diagnostic},
  author = {XpertSystems.ai},
  year   = {2026},
  url    = {https://huggingface.co/xpertsystems/cyb011-baseline-classifier},
  note   = {Baseline reference model + leakage audit trained on xpertsystems/cyb011-sample}
}
```