---
license: cc-by-nc-4.0
library_name: pytorch
tags:
- cybersecurity
- vulnerability-management
- cve
- cvss
- epss
- cisa-kev
- tabular-classification
- synthetic-data
- xgboost
- baseline
- leakage-diagnostic
- data-quality-audit
pipeline_tag: tabular-classification
base_model: []
datasets:
- xpertsystems/cyb009-sample
metrics:
- accuracy
- f1
- roc_auc
model-index:
- name: cyb009-baseline-classifier
  results:
  - task:
      type: tabular-classification
      name: 8-class vulnerability classification (CWE-style families)
    dataset:
      type: xpertsystems/cyb009-sample
      name: CYB009 Synthetic Vulnerability Intelligence Dataset (Sample)
    metrics:
    - type: roc_auc
      value: 0.6837
      name: Test macro ROC-AUC OvR (XGBoost, seed 42)
    - type: accuracy
      value: 0.2374
      name: Test accuracy (XGBoost, seed 42)
    - type: f1
      value: 0.2244
      name: Test macro-F1 (XGBoost, seed 42)
    - type: accuracy
      value: 0.244
      name: Multi-seed accuracy mean ± 0.023 (XGBoost, 10 seeds)
    - type: roc_auc
      value: 0.687
      name: Multi-seed ROC-AUC mean ± 0.014 (XGBoost, 10 seeds)
---

# CYB009 Baseline Classifier

**Vulnerability classification baseline (8-class) trained on the CYB009
synthetic vulnerability intelligence sample. The primary artifact value
of this repo is `leakage_diagnostic.json`, the most comprehensive
structural-leakage audit in the XpertSystems baseline catalog,
documenting 8 oracle paths and 6 unlearnable README-suggested targets.
The classifier itself is the catalog's weakest baseline by design (acc
0.244 vs majority 0.176), included to show that `vulnerability_class` is
the ONLY README-headline target that learns honestly on this sample.**

> **Read this first.** This repo ships three artifacts in priority
> order:
>
> 1. **`leakage_diagnostic.json`**: comprehensive audit of 8 oracle
>    paths discovered on CYB009 and 6 README-suggested targets that
>    are unlearnable on the sample after honest leak removal.
> 2. A working classifier for `vulnerability_class` 8-class: the
>    only README target that learns honestly on this sample, and the
>    weakest baseline in the XpertSystems catalog by design.
> 3. A feature engineering reference (`feature_engineering.py`).
>
> If you came here looking for a strong baseline, you will be
> disappointed. If you came here to understand why the CYB009 sample
> has hard-to-detect structural label-feature determinism, the
> diagnostic is exactly the artifact you need.

## Model overview

| Property | Value |
|---|---|
| Primary task | 8-class `vulnerability_class` classification (CWE-style families) |
| Primary artifact | **`leakage_diagnostic.json`** (8 oracle paths + 6 unlearnable targets) |
| Training data | `xpertsystems/cyb009-sample` (2,638 vulnerabilities) |
| Models | XGBoost + PyTorch MLP |
| Input features | 57 (after one-hot encoding) |
| Split | Stratified random (per-vulnerability, no group structure to leak) |
| Validation | Single seed (artifact) + multi-seed aggregate across 10 seeds |
| License | CC-BY-NC-4.0 (matches dataset) |
| Status | Reference baseline + comprehensive leakage diagnostic |

## Why this task, and the journey to get here

The CYB009 README lists 11 suggested use cases. We piloted every
README-headline target and found pervasive structural leakage. The
abandoned candidates, in order of how we discovered them:

### Initial candidate: `exploit_maturity_final` 4-class (ABANDONED)

This was the most natural target: 4-class
(unproven/PoC/functional/weaponised), n=2638, well-balanced
(36/27/25/12%), and mapping directly to EPSS calibration. Initial
feasibility hit **acc 0.74, macro-F1 0.72, ROC-AUC 0.91 vs majority
0.36**. A +38pp lift looked excellent.

**Then we found the leak.** `cvss_temporal_score_final` divided by
`cvss_base_score` clusters near-deterministically per maturity tier:

| Maturity tier | Observed ratio (median ± std) | CVSS v3.1 multiplier |
|---|---:|---:|
| unproven | 0.801 ± 0.011 | 0.91 × (other Temporal factors) |
| proof_of_concept | 0.827 ± 0.011 | 0.94 × (other Temporal factors) |
| functional | 0.854 ± 0.011 | 0.97 × (other Temporal factors) |
| weaponised | 0.880 ± 0.012 | 1.00 × (other Temporal factors) |

This is exactly the CVSS v3.1 Exploit Code Maturity multiplier
(unproven 0.91 / PoC 0.94 / functional 0.97 / high or weaponised 1.00),
combined with other near-constant Temporal factors (Remediation Level,
Report Confidence). **The `cvss_temporal_score_final` /
`cvss_base_score` ratio therefore acts as an oracle for the maturity
tier:** the tier medians sit roughly 0.026 apart against a within-tier
std of about 0.011.

Drop `cvss_temporal_score_final` and accuracy collapses to **0.31**
(below majority 0.36). The target is structurally unlearnable on the
sample once the oracle is removed.

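
The multiplier argument can be checked with a few lines of arithmetic. The sketch below divides each observed median ratio from the table by the corresponding CVSS v3.1 Exploit Code Maturity multiplier. If the remaining Temporal factors really are near-constant, the quotient should come out about the same for every tier. The nearest-median rule at the end is only an illustration of how such a ratio can be read back into a tier; it is not the classifier used in the pilot, and the names `OBSERVED_MEDIAN`, `implied_other_factor`, and `nearest_tier` are hypothetical.

```python
# Median of cvss_temporal_score_final / cvss_base_score per tier (table above).
OBSERVED_MEDIAN = {
    "unproven": 0.801,
    "proof_of_concept": 0.827,
    "functional": 0.854,
    "weaponised": 0.880,
}

# CVSS v3.1 Exploit Code Maturity multipliers.
ECM_MULTIPLIER = {
    "unproven": 0.91,
    "proof_of_concept": 0.94,
    "functional": 0.97,
    "weaponised": 1.00,
}


def implied_other_factor(tier: str) -> float:
    """Ratio left over after dividing out the maturity multiplier."""
    return OBSERVED_MEDIAN[tier] / ECM_MULTIPLIER[tier]


def nearest_tier(ratio: float) -> str:
    """Illustrative oracle: pick the tier whose median ratio is closest."""
    return min(OBSERVED_MEDIAN, key=lambda t: abs(OBSERVED_MEDIAN[t] - ratio))


for tier in OBSERVED_MEDIAN:
    print(f"{tier:17s} implied other factors = {implied_other_factor(tier):.3f}")

# A hypothetical record whose temporal/base ratio is 0.83.
print(nearest_tier(0.83))
```

Every tier yields an implied residual factor of about 0.88, which is what a fixed product of the other Temporal factors would produce.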
### Other 5 candidates: also unlearnable after honest leak removal

| Target | n_positive | Majority baseline | Honest acc | Honest AUC | Verdict |
|---|---:|---:|---:|---:|---|
| `exploitation_occurred_flag` | 203 | 0.923 | 0.857 | 0.65 | Below majority |
| `zero_day_flag` | 76 | 0.971 | 0.949 | 0.60 | Below majority |
| `cisa_kev_flag` | 14 | 0.995 | 0.992 | 0.61 | Below majority |
| `supply_chain_propagation_flag` | 20 | 0.992 | 0.992 | 0.80 | At majority |
| `false_positive_flag` | 205 | 0.922 | 0.866 | 0.52 | Below majority |

With the full feature set, all five rare-event binaries are given away
by sentinel values in `time_to_exploit_days` (-1) or
`time_to_remediate_days` (120). After honest leak removal, all five are
at or below the majority baseline.

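
Both checks in this section are easy to reproduce. The sketch below recomputes the majority baseline from the `n_positive` column and measures how often a sentinel value agrees with a binary flag. It assumes that the -1 sentinel in `time_to_exploit_days` means "never exploited"; the toy rows are invented for illustration and are not CYB009 records, and the helper names are hypothetical.

```python
N_TOTAL = 2638  # per-vulnerability rows in the sample

# n_positive per target, copied from the table above.
N_POSITIVE = {
    "exploitation_occurred_flag": 203,
    "zero_day_flag": 76,
    "cisa_kev_flag": 14,
    "supply_chain_propagation_flag": 20,
    "false_positive_flag": 205,
}


def majority_baseline(n_positive: int, n_total: int = N_TOTAL) -> float:
    """Accuracy of always predicting the negative (majority) class."""
    return 1 - n_positive / n_total


def sentinel_agreement(rows, column, sentinel, flag) -> float:
    """Share of rows where 'column is not the sentinel' equals the flag."""
    rows = list(rows)
    hits = sum((r[column] != sentinel) == bool(r[flag]) for r in rows)
    return hits / len(rows)


# Hypothetical toy rows (assumption: -1 means the vulnerability was never exploited).
toy_rows = [
    {"time_to_exploit_days": -1, "exploitation_occurred_flag": 0},
    {"time_to_exploit_days": -1, "exploitation_occurred_flag": 0},
    {"time_to_exploit_days": 14, "exploitation_occurred_flag": 1},
    {"time_to_exploit_days": 3, "exploitation_occurred_flag": 1},
]
agreement = sentinel_agreement(
    toy_rows, "time_to_exploit_days", -1, "exploitation_occurred_flag"
)
print(f"sentinel agreement on toy rows: {agreement:.2f}")

for name, n_pos in N_POSITIVE.items():
    print(f"{name:32s} majority = {majority_baseline(n_pos):.3f}")
```

An agreement at or near 1.0 on real rows is the signature of a sentinel oracle: the column encodes the label rather than predicting it.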
### Per-timestep multi-class targets: state-machine oracles

`lifecycle_phase`, `patch_status`, and `remediation_status` on
`vulnerability_records.csv` form a tightly-coupled state machine:

- `lifecycle_phase = residual_risk_review` → 100% `remediated`
- `lifecycle_phase = discovery` → 100% `undetected`
- `lifecycle_phase = remediation_deployment` → 100% `in_remediation`
- `patch_status = deployed` → 100% `remediated`

Naive evaluation on these targets reaches accuracy 0.95-0.98, but any
two of the three deterministically pin the third. None of these is a
viable independent ML target on the sample.

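
One way to surface this kind of determinism is to measure, for each value of one column, how concentrated another column is. The sketch below is a minimal version of that check using only the standard library. The helper name `conditional_purity` is hypothetical, the toy rows are invented, and treating `remediated` / `undetected` as values of `remediation_status` is an assumption.

```python
from collections import Counter, defaultdict


def conditional_purity(rows, given, target):
    """For each value of `given`, return (most common `target`, its share)."""
    buckets = defaultdict(Counter)
    for row in rows:
        buckets[row[given]][row[target]] += 1
    result = {}
    for key, counts in buckets.items():
        label, n = counts.most_common(1)[0]
        result[key] = (label, n / sum(counts.values()))
    return result


# Hypothetical toy rows for illustration only (not CYB009 data).
toy_rows = [
    {"lifecycle_phase": "discovery", "remediation_status": "undetected"},
    {"lifecycle_phase": "discovery", "remediation_status": "undetected"},
    {"lifecycle_phase": "residual_risk_review", "remediation_status": "remediated"},
    {"lifecycle_phase": "residual_risk_review", "remediation_status": "remediated"},
    {"lifecycle_phase": "organisational_triage", "remediation_status": "undetected"},
    {"lifecycle_phase": "organisational_triage", "remediation_status": "in_remediation"},
    {"lifecycle_phase": "organisational_triage", "remediation_status": "in_remediation"},
]

purity = conditional_purity(toy_rows, "lifecycle_phase", "remediation_status")
for phase, (label, share) in purity.items():
    print(f"{phase:24s} -> {label:15s} {share:.0%}")
```

A share of 100% for a phase means that phase alone determines the target, so a model given that column is reading the answer rather than learning it.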
### `severity_class`: 100% mechanical CVSS function

Observed `cvss_base_score` ranges per severity match the CVSS v3.1
qualitative bands: critical [9.0, 10.0], high [7.0, 9.0), medium
[4.0, 7.0), low [1.8, 4.0). The lowest observed score is 1.8; the
specification's Low band itself starts at 0.1. Predicting severity is
trivial with CVSS; without it, accuracy stays close to the majority
baseline (0.55 vs 0.51).

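
The mapping from base score to severity is a fixed lookup, which is why the target carries no learnable signal beyond `cvss_base_score`. A minimal sketch of the CVSS v3.1 qualitative rating scale follows; the function name `severity_from_cvss` is hypothetical, and the `none` band (score 0.0) is part of the specification but does not occur in the sample.

```python
def severity_from_cvss(score: float) -> str:
    """Return the CVSS v3.1 qualitative severity rating for a base score."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base score out of range: {score}")
    if score >= 9.0:
        return "critical"
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    if score >= 0.1:
        return "low"
    return "none"  # score 0.0; not present in the CYB009 sample


for example in (9.8, 7.0, 6.9, 1.8):
    print(example, severity_from_cvss(example))
```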
### `vulnerability_class` 8-class: the only honest target, and the baseline that ships

After exhausting the README-suggested targets, `vulnerability_class`
is the only one that learns honestly:

- **acc 0.244 ± 0.023, macro-F1 0.230 ± 0.024, ROC-AUC 0.687 ± 0.014**
- **+7pp lift over majority** (the catalog's smallest)
- **All 8 classes represented** (XGBoost per-class F1 0.09-0.33)
- **No oracle feature**: modest signal genuinely spread across CVSS,
  EPSS, asset context, and binary flags

This is the **weakest baseline in the XpertSystems catalog by design**.
The full ~487k-row product would be expected to tighten per-class
signal materially. The dataset roadmap recommendations in
`leakage_diagnostic.json` describe what would make CYB009's headline
targets viable on the sample.

## Quick start

```bash
pip install xgboost torch safetensors pandas huggingface_hub
```

```python
import json
import os
import sys

import numpy as np
import torch
import xgboost as xgb
from huggingface_hub import hf_hub_download, snapshot_download
from safetensors.torch import load_file

REPO = "xpertsystems/cyb009-baseline-classifier"
paths = {n: hf_hub_download(REPO, n) for n in [
    "model_xgb.json", "model_mlp.safetensors",
    "feature_engineering.py", "feature_meta.json", "feature_scaler.json",
]}

# Make the downloaded feature_engineering.py importable
sys.path.insert(0, os.path.dirname(paths["feature_engineering.py"]))
from feature_engineering import (
    transform_single, load_meta, build_asset_lookup, INT_TO_LABEL,
)

meta = load_meta(paths["feature_meta.json"])

# Asset features are joined from asset_inventory.csv at inference time
ds = snapshot_download("xpertsystems/cyb009-sample", repo_type="dataset")
asset_lookup = build_asset_lookup(f"{ds}/asset_inventory.csv")

xgb_model = xgb.XGBClassifier()
xgb_model.load_model(paths["model_xgb.json"])

# Predict (see inference_example.ipynb for the full pattern).
# my_vuln_record is a placeholder: supply one raw vulnerability record here.
# Note: do NOT include exploit_maturity_final, cvss_temporal_score_final,
# time_to_exploit_days, time_to_remediate_days, patch_lag_days, or
# risk_score_composite - those were the outcome-leak columns.
X = transform_single(my_vuln_record, meta, asset_lookup=asset_lookup)
proba = xgb_model.predict_proba(X)[0]
print(INT_TO_LABEL[int(np.argmax(proba))])
```

See [`inference_example.ipynb`](./inference_example.ipynb) for the full
copy-paste demo.

## Training data

Trained on the public sample of CYB009, 2,638 per-vulnerability records:

| Vulnerability class | Vulns | Class share |
|---|---:|---:|
| `memory_corruption` | 465 | 17.6% |
| `injection_family` | 436 | 16.5% |
| `misconfiguration` | 435 | 16.5% |
| `auth_access_control` | 350 | 13.3% |
| `cryptographic_failure` | 301 | 11.4% |
| `supply_chain_weakness` | 271 | 10.3% |
| `logic_flaw` | 228 | 8.6% |
| `information_disclosure` | 152 | 5.8% |

### Stratified split

Per-vulnerability task (one row per vuln in `vuln_summary.csv`), split
with a nested **StratifiedShuffleSplit** 70/15/15:

| Fold | Vulns |
|---|---:|
| Train | 1,846 |
| Validation | 396 |
| Test | 396 |

Class imbalance is addressed with balanced class weights, passed to
XGBoost as `sample_weight` (the equivalent of scikit-learn's
`class_weight='balanced'`), and with weighted cross-entropy for the MLP.

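
The fold sizes, the majority baseline, and the balanced weights all follow from the class counts above. The sketch below reproduces them with plain arithmetic under two assumptions: the 30% holdout is rounded up before being halved (which is how scikit-learn treats a float `test_size`), and the balanced weight is `n_samples / (n_classes * class_count)`. The weights are computed here on full-sample counts for illustration; a real pipeline would compute them on the training fold. `CLASS_COUNTS` and `balanced_weights` are hypothetical names.

```python
import math

# Per-class counts from the training data table.
CLASS_COUNTS = {
    "memory_corruption": 465,
    "injection_family": 436,
    "misconfiguration": 435,
    "auth_access_control": 350,
    "cryptographic_failure": 301,
    "supply_chain_weakness": 271,
    "logic_flaw": 228,
    "information_disclosure": 152,
}

n_total = sum(CLASS_COUNTS.values())
majority = max(CLASS_COUNTS.values()) / n_total

# Nested 70/15/15: hold out 30% (rounded up), then halve the holdout.
n_holdout = math.ceil(0.30 * n_total)
n_train = n_total - n_holdout
n_test = math.ceil(0.50 * n_holdout)
n_val = n_holdout - n_test

# Balanced class weights: rare classes get proportionally larger weights.
n_classes = len(CLASS_COUNTS)
balanced_weights = {
    name: n_total / (n_classes * count) for name, count in CLASS_COUNTS.items()
}

print(f"n_total={n_total}, majority baseline={majority:.3f}")
print(f"train/val/test = {n_train}/{n_val}/{n_test}")
for name, weight in balanced_weights.items():
    print(f"{name:24s} weight = {weight:.3f}")
```

With these weights, each class contributes the same total weight to the loss, so the 152 `information_disclosure` rows count as much in aggregate as the 465 `memory_corruption` rows.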
## Feature pipeline

The bundled `feature_engineering.py` is the canonical recipe. 57
features survive after encoding, drawn from:

- **Per-vulnerability numeric** (10): `cvss_base_score`,
  `epss_score_final`, plus 8 binary post-hoc flags
- **Per-vulnerability categorical** (1, one-hot): `severity_class`
  (4 values, CVSS-derived but useful as a feature)
- **Asset features** (joined from `asset_inventory.csv`): 8 numeric
  + 4 categorical (`asset_type`, `criticality_tier`, `environment_type`,
  `os_family`)
- **Engineered** (5): `log_epss`, `is_high_cvss`,
  `exposure_severity_composite`, `risk_flag_count`, `epss_x_base`

### Excluded columns (outcome leaks)

| Column | Why excluded |
|---|---|
| `exploit_maturity_final` | Indirect leak via CVSS temporal multiplier (would reintroduce the 0.91/0.94/0.97/1.00 oracle) |
| `cvss_temporal_score_final` | Near-deterministic per `exploit_maturity_final` tier (the primary leak we discovered) |
| `time_to_exploit_days` | -1 sentinel oracle for `exploitation_occurred_flag` |
| `time_to_remediate_days` | 120 sentinel oracle for `remediation_success_flag` |
| `patch_lag_days` | Suspected similar sentinel (precaution) |
| `risk_score_composite` | Computed from flag fields (indirect oracle) |

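
If you build your own pipeline, the exclusion list is simple to enforce before any feature transformation. The sketch below is one way to do it; `LEAK_COLUMNS` and `drop_leak_columns` are hypothetical names, the example record contains invented values, and whether the bundled `feature_engineering.py` performs an equivalent step internally is not assumed here.

```python
# The six outcome-leak columns from the table above.
LEAK_COLUMNS = frozenset({
    "exploit_maturity_final",
    "cvss_temporal_score_final",
    "time_to_exploit_days",
    "time_to_remediate_days",
    "patch_lag_days",
    "risk_score_composite",
})


def drop_leak_columns(record: dict) -> dict:
    """Return a copy of `record` without the outcome-leak columns."""
    return {key: value for key, value in record.items() if key not in LEAK_COLUMNS}


# Hypothetical record with invented values, for illustration only.
raw_record = {
    "cvss_base_score": 7.5,
    "epss_score_final": 0.12,
    "severity_class": "high",
    "cvss_temporal_score_final": 6.4,
    "time_to_exploit_days": -1,
}
clean_record = drop_leak_columns(raw_record)
print(sorted(clean_record))
```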
## Evaluation

### Test-set metrics, seed 42 (n = 396 vulnerabilities)

**XGBoost** (the published `model_xgb.json` artifact)

| Metric | Value |
|---|---:|
| Macro ROC-AUC (OvR) | **0.6837** |
| Accuracy | **0.2374** |
| Macro-F1 | 0.2244 |
| Weighted-F1 | 0.2407 |

**MLP** (the published `model_mlp.safetensors` artifact)

| Metric | Value |
|---|---:|
| Macro ROC-AUC (OvR) | **0.6899** |
| Accuracy | **0.2323** |
| Macro-F1 | 0.2209 |
| Weighted-F1 | 0.2362 |

MLP and XGBoost are within noise of each other on this task; both
capture the same modest, honest signal.

### Multi-seed robustness (XGBoost, 10 seeds)

| Metric | Mean | Std | Min | Max |
|---|---:|---:|---:|---:|
| Accuracy | 0.244 | 0.023 | 0.217 | 0.283 |
| Macro-F1 | 0.230 | 0.024 | 0.206 | 0.280 |
| Macro ROC-AUC OvR | 0.687 | 0.014 | 0.660 | 0.700 |

All 10 seeds yielded all 8 classes in the test fold (the stratified
split guarantees this). Full per-seed results are in
[`multi_seed_results.json`](./multi_seed_results.json).

### Per-class F1 (seed 42)

| Vulnerability class | Class share | XGBoost F1 | MLP F1 |
|---|---:|---:|---:|
| `memory_corruption` | 17.6% | **0.333** | 0.365 |
| `information_disclosure` | 5.8% | 0.291 | 0.154 |
| `misconfiguration` | 16.5% | 0.259 | 0.162 |
| `injection_family` | 16.5% | 0.237 | 0.235 |
| `supply_chain_weakness` | 10.3% | 0.222 | 0.292 |
| `cryptographic_failure` | 11.4% | 0.217 | 0.168 |
| `auth_access_control` | 13.3% | 0.146 | 0.163 |
| `logic_flaw` | 8.6% | **0.090** | 0.228 |

For XGBoost, `memory_corruption` (highest mean CVSS at 8.3) and
`information_disclosure` (lowest mean CVSS at 5.4) are the most
distinctive classes, and `logic_flaw` is the hardest (F1 0.090): its
feature distribution overlaps closely with everything else. The MLP
ranks the classes differently, which is expected at this signal level.

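
Macro-F1 is the unweighted mean of the per-class F1 scores, so the headline macro-F1 values can be recovered from the table above. The sketch below does that check; because the per-class figures are rounded to three decimals, agreement is only expected to about the third decimal place.

```python
from statistics import mean

# Per-class F1 (seed 42), copied from the table above.
F1_XGB = {
    "memory_corruption": 0.333,
    "information_disclosure": 0.291,
    "misconfiguration": 0.259,
    "injection_family": 0.237,
    "supply_chain_weakness": 0.222,
    "cryptographic_failure": 0.217,
    "auth_access_control": 0.146,
    "logic_flaw": 0.090,
}
F1_MLP = {
    "memory_corruption": 0.365,
    "information_disclosure": 0.154,
    "misconfiguration": 0.162,
    "injection_family": 0.235,
    "supply_chain_weakness": 0.292,
    "cryptographic_failure": 0.168,
    "auth_access_control": 0.163,
    "logic_flaw": 0.228,
}

macro_xgb = mean(F1_XGB.values())
macro_mlp = mean(F1_MLP.values())
print(f"XGBoost macro-F1 = {macro_xgb:.4f}")  # reported: 0.2244
print(f"MLP macro-F1     = {macro_mlp:.4f}")  # reported: 0.2209
```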
### Ablation: which feature groups matter

| Configuration | Accuracy | Macro-F1 | ROC-AUC | Δ accuracy |
|---|---:|---:|---:|---:|
| Full feature set (published) | 0.2374 | 0.2244 | 0.6837 | n/a |
| No CVSS features | 0.2121 | 0.1926 | 0.6690 | **−0.0253** |
| No asset features | 0.2172 | 0.1967 | 0.6870 | −0.0202 |
| No engineered features | 0.2323 | 0.2216 | 0.6871 | −0.0051 |
| No severity (one-hot) | 0.2273 | 0.2175 | 0.6857 | −0.0101 |
| No EPSS features | 0.2475 | 0.2237 | 0.6926 | +0.0101 |
| No binary flags | 0.2273 | 0.2114 | 0.6776 | −0.0101 |

Three findings:

1. **No feature group is dominant.** The largest single drop is 2.5pp
   (CVSS features). Every group contributes a little; nothing
   contributes a lot. The signal is genuinely diffuse.
2. **CVSS and asset features carry the most signal** (~2pp each),
   consistent with the observation that per-class CVSS means
   differ (5.4 to 8.3) and asset features modestly inform class.
3. **EPSS features slightly *hurt*** on this task (+1pp without
   them). EPSS is intended for exploitation prediction, not class
   prediction; on this sample it acts as small additional noise.
   The +1pp change is well inside the multi-seed accuracy std of
   0.023, so it should not be over-interpreted.

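
The Δ accuracy column is each configuration's accuracy minus the full-feature accuracy. The sketch below recomputes and ranks the deltas, then compares their magnitude with the multi-seed accuracy std (0.023). That comparison is only a rough yardstick, since the ablation was run on a single seed; the variable names are hypothetical.

```python
FULL_ACCURACY = 0.2374
MULTI_SEED_STD = 0.023  # accuracy std across the 10 XGBoost seeds

# Accuracy per ablation configuration, copied from the table above.
ABLATION_ACCURACY = {
    "No CVSS features": 0.2121,
    "No asset features": 0.2172,
    "No engineered features": 0.2323,
    "No severity (one-hot)": 0.2273,
    "No EPSS features": 0.2475,
    "No binary flags": 0.2273,
}

deltas = {name: acc - FULL_ACCURACY for name, acc in ABLATION_ACCURACY.items()}
ranked = sorted(deltas.items(), key=lambda item: item[1])  # largest drop first
beyond_noise = [name for name, d in deltas.items() if abs(d) > MULTI_SEED_STD]

for name, delta in ranked:
    print(f"{name:24s} {delta:+.4f}")
print("Exceeds one multi-seed std:", beyond_noise)
```

Only the CVSS ablation moves accuracy by more than one multi-seed standard deviation, and only barely, which is consistent with the signal being diffuse.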
### Architecture

**XGBoost:** multi-class gradient boosting (`multi:softprob`, 8 classes),
`hist` tree method, class-balanced sample weights, early stopping on
validation mlogloss.

**MLP:** `57 → 128 → 64 → 8`, each hidden layer followed by
`BatchNorm1d` → `ReLU` → `Dropout(0.3)`, weighted cross-entropy loss,
AdamW optimizer, early stopping on validation macro-F1.

Training hyperparameters are held internally by XpertSystems.

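
For readers who want to see the MLP shape concretely, the sketch below builds a network with the layer sizes and layer ordering described above. It is a reconstruction from the text only: the class name `BaselineMLP` and the module naming are assumptions, so the parameter names may not match the keys stored in `model_mlp.safetensors`, and no training hyperparameters are implied.

```python
import torch
from torch import nn


class BaselineMLP(nn.Module):
    """57 -> 128 -> 64 -> 8 with BatchNorm1d -> ReLU -> Dropout(0.3) after each hidden layer."""

    def __init__(self, n_features=57, n_classes=8, hidden=(128, 64), dropout=0.3):
        super().__init__()
        layers = []
        width = n_features
        for size in hidden:
            layers += [
                nn.Linear(width, size),
                nn.BatchNorm1d(size),
                nn.ReLU(),
                nn.Dropout(dropout),
            ]
            width = size
        layers.append(nn.Linear(width, n_classes))  # output layer: raw logits
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


model = BaselineMLP().eval()  # eval mode: dropout off, BatchNorm uses running stats
with torch.no_grad():
    logits = model(torch.randn(4, 57))  # a batch of 4 random feature vectors
    probabilities = torch.softmax(logits, dim=1)

n_params = sum(p.numel() for p in model.parameters())
print(tuple(logits.shape), n_params)
```

At this size the network has 16,584 trainable parameters, which is small relative to the 1,846 training rows only because of the dropout and early stopping described above.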
## Limitations

**This is a baseline reference, not a production vulnerability
classifier.**

1. **The headline finding is the leakage diagnostic, not the
   classifier.** Read `leakage_diagnostic.json` first. The classifier
   demonstrates that `vulnerability_class` is the only README-suggested
   target that learns honestly on the sample.
2. **Per-class F1 ranges 0.09–0.33 (XGBoost).** The model is more
   accurate on `memory_corruption` and `information_disclosure` than on
   `logic_flaw` and `auth_access_control`. For production use, expect
   different error patterns by class.
3. **No feature group contributes more than 3pp accuracy.** The
   model has no single decisive signal; instead it integrates many
   weakly-informative features. Removing any one group has minimal
   impact.
4. **Synthetic-vs-real transfer.** The dataset is synthetic, calibrated
   to 12 benchmarks from authoritative vulnerability intelligence
   sources (NIST NVD, EPSS v3, CISA KEV, Mandiant, Verizon DBIR,
   Rapid7, Qualys, Tenable). Real vulnerability telemetry has
   different noise characteristics. In particular, the
   structural-oracle patterns documented in
   `leakage_diagnostic.json` (CVSS temporal multipliers,
   sentinel-coded time fields, lifecycle state-machine determinism)
   would not be present in real data with comparable density. Real
   data has stochastic transitions and observation noise.
5. **2,638 vulnerabilities is a modest training set for 8 classes.**
   The 396-vulnerability test fold yields stable multi-seed metrics
   (accuracy std 0.023), but per-class confidence intervals are wide.
   The full ~487k-row product has materially more data per class.

## Notes on dataset schema

The CYB009 sample dataset README describes some fields differently
from the actual schema. This note helps buyers reconcile what they
read with what they receive.

| What the README says | What the data actually contains |
|---|---|
| `vulnerability_records` has 19 columns | Data has **16 columns** |
| `vulnerability_records` includes `severity`, `exploited_in_wild_flag`, `cisa_kev_listed_flag`, `zero_day_flag`, `supply_chain_flag`, `internet_exposed`, `sla_breached_flag` | **None of these columns exist** in vulnerability_records. Per-vuln flags are only on vuln_summary. |
| `vuln_class` has 10 values (incl. `race_condition`, `web_application`, `configuration`) | **8 values** in the data; differs in: `misconfiguration` (not `configuration`), `auth_access_control` (not `authentication_bypass`), `logic_flaw` (new); no `race_condition`, no `web_application`, no `deserialization` |
| 8 lifecycle phases | **12 phases** in the data, adding `residual_risk_review` (45% of all rows), `false_positive_closed`, `sla_breach`, `accepted_risk`, `discovery`, `organisational_triage`, `exploitation_in_wild` |
| `patch_status` has 4 values | **6 values** in the data: adds `vendor_notified`, `patch_in_development`, `patch_validated` |
| `severity` has 5 values (incl. `none`) | **4 values** in the data (`severity_class`): low, medium, high, critical only |
| `vuln_summary` has 15 columns | Data has **21 columns** |
| Field renames | `severity_final` → `severity_class`; `cvss_base_score_final` → `cvss_base_score`; `cisa_kev_listed` → `cisa_kev_flag`; `exploited_in_wild` → `exploitation_occurred_flag`; `supply_chain_compromise` → `supply_chain_propagation_flag` |
| Semantic inversion | README's `sla_breached` (True = bad) ↔ data's `sla_compliance_flag` (True = good) |
| `remediation_outcome` categorical (patched/mitigated/accepted/unpatched) | Replaced with `remediation_success_flag` (binary) plus per-timestep `remediation_status` |
| Not in README | New fields: `risk_score_composite`, `compensating_control_flag`, `time_to_exploit_days`, `time_to_remediate_days`, `patch_lag_days` |

None of these affects model correctness: the feature pipeline uses
the actual column names. If you build your own pipeline against the
dataset, use the actual columns.

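
For code written against the README's names, a small translation layer covers the renames and the SLA inversion listed above. The sketch below is one possible shape for it; `README_TO_DATA`, `resolve_column`, and `sla_breached` are hypothetical helpers, and the example record is invented.

```python
# README field name -> actual column name in the data (from the table above).
README_TO_DATA = {
    "severity_final": "severity_class",
    "cvss_base_score_final": "cvss_base_score",
    "cisa_kev_listed": "cisa_kev_flag",
    "exploited_in_wild": "exploitation_occurred_flag",
    "supply_chain_compromise": "supply_chain_propagation_flag",
}


def resolve_column(readme_name: str) -> str:
    """Map a README field name to the real column; unknown names pass through."""
    return README_TO_DATA.get(readme_name, readme_name)


def sla_breached(record: dict) -> bool:
    """README semantics (True = bad) derived from sla_compliance_flag (True = good)."""
    return not bool(record["sla_compliance_flag"])


# Hypothetical record with invented values, for illustration only.
record = {"severity_class": "high", "sla_compliance_flag": True}
print(resolve_column("severity_final"), record[resolve_column("severity_final")])
print("sla_breached:", sla_breached(record))
```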
## Intended use

- **Reading the leakage diagnostic**: the primary value of this repo.
  Reusable methodology for any synthetic vulnerability dataset.
- **Evaluating fit** of the CYB009 dataset for your research, with
  open knowledge of the structural-oracle patterns
- **Honest baseline reference** for the only README-suggested target
  that learns on the sample
- **Feature engineering reference** for per-vulnerability ML

## Out-of-scope use

- **Production vulnerability triage** on real telemetry
- **Exploit maturity prediction**: README headline target,
  unlearnable on the sample after honest leak removal
- **Zero-day / KEV / supply-chain prediction**: README headline
  targets, unlearnable as rare-event binaries on the sample
- **SLA breach prediction**: README headline target, unlearnable
  after honest leak removal
- Any operational security decision without further validation on
  real data

## Reproducibility

Outputs above were produced with `seed = 42` (published artifact),
nested `StratifiedShuffleSplit` (70/15/15), on the published sample
(`xpertsystems/cyb009-sample`, version 1.0.0, generated 2026-05-16).
The feature pipeline in `feature_engineering.py` is deterministic and
the trained weights in this repo correspond exactly to the metrics
above.

Multi-seed results (seeds 42, 7, 13, 17, 23, 31, 45, 99, 123, 200)
in `multi_seed_results.json` show consistent performance across splits
(std 0.023 on accuracy).

The training script itself is private to XpertSystems.

## Files in this repo

| File | Purpose |
|---|---|
| **`leakage_diagnostic.json`** | **PRIMARY ARTIFACT: 8 oracle paths + 6 unlearnable targets** |
| `model_xgb.json` | XGBoost weights (seed 42) |
| `model_mlp.safetensors` | PyTorch MLP weights (seed 42) |
| `feature_engineering.py` | Feature pipeline |
| `feature_meta.json` | Feature column order + categorical levels |
| `feature_scaler.json` | MLP input mean/std (XGBoost ignores) |
| `validation_results.json` | Per-class metrics, confusion matrix, architecture |
| `ablation_results.json` | Per-feature-group ablation |
| `multi_seed_results.json` | XGBoost metrics across 10 seeds |
| `inference_example.ipynb` | End-to-end inference demo notebook |
| `README.md` | This file |

## Contact and full product

The full **CYB009** dataset contains **~487,000 vulnerability records**
across four files, with calibrated benchmark validation against 12
metrics drawn from authoritative vulnerability intelligence sources
(NIST NVD, EPSS v3, CISA KEV, Mandiant, Verizon DBIR, Rapid7, Qualys,
Tenable). The full XpertSystems.ai synthetic data catalogue spans 41
SKUs across Cybersecurity, Healthcare, Insurance & Risk, Oil & Gas,
and Materials & Energy.

- 📧 **pradeep@xpertsystems.ai**
- 🌐 **https://xpertsystems.ai**
- 🗂 Dataset: https://huggingface.co/datasets/xpertsystems/cyb009-sample
- 🤖 Companion models:
  - https://huggingface.co/xpertsystems/cyb001-baseline-classifier (network traffic)
  - https://huggingface.co/xpertsystems/cyb002-baseline-classifier (ATT&CK kill-chain)
  - https://huggingface.co/xpertsystems/cyb003-baseline-classifier (malware execution phase)
  - https://huggingface.co/xpertsystems/cyb004-baseline-classifier (phishing campaign phase)
  - https://huggingface.co/xpertsystems/cyb005-baseline-classifier (ransomware actor-tier attribution)
  - https://huggingface.co/xpertsystems/cyb006-baseline-classifier (user risk tier + leakage diagnostic)
  - https://huggingface.co/xpertsystems/cyb007-baseline-classifier (insider threat type)
  - https://huggingface.co/xpertsystems/cyb008-baseline-classifier (SOC alert triage + leakage diagnostic)

## Citation

```bibtex
@misc{xpertsystems_cyb009_baseline_2026,
  title  = {CYB009 Baseline Classifier: XGBoost and MLP for Vulnerability Classification, with the XpertSystems Catalog's Most Comprehensive Structural-Leakage Audit},
  author = {XpertSystems.ai},
  year   = {2026},
  url    = {https://huggingface.co/xpertsystems/cyb009-baseline-classifier},
  note   = {Reference baseline + 8-oracle-path leakage diagnostic on xpertsystems/cyb009-sample}
}
```