---
license: cc-by-nc-4.0
library_name: pytorch
tags:
- cybersecurity
- identity-security
- insider-threat
- ueba
- user-risk-scoring
- tabular-classification
- synthetic-data
- xgboost
- baseline
- leakage-diagnostic
pipeline_tag: tabular-classification
base_model: []
datasets:
- xpertsystems/cyb006-sample
metrics:
- accuracy
- f1
- roc_auc
model-index:
- name: cyb006-baseline-classifier
  results:
  - task:
      type: tabular-classification
      name: 3-class user risk tier classification
    dataset:
      type: xpertsystems/cyb006-sample
      name: CYB006 Synthetic Login Activity Dataset (Sample)
    metrics:
    - type: roc_auc
      value: 0.8017
      name: Test macro ROC-AUC OvR (XGBoost, seed 42)
    - type: accuracy
      value: 0.6667
      name: Test accuracy (XGBoost, seed 42)
    - type: f1
      value: 0.6454
      name: Test macro-F1 (XGBoost, seed 42)
    - type: accuracy
      value: 0.700
      name: Multi-seed accuracy mean ± 0.082 (XGBoost, 10 seeds)
    - type: roc_auc
      value: 0.812
      name: Multi-seed ROC-AUC mean ± 0.048 (XGBoost, 10 seeds)
    - type: roc_auc
      value: 0.6974
      name: Test macro ROC-AUC OvR (MLP, seed 42)
    - type: accuracy
      value: 0.6000
      name: Test accuracy (MLP, seed 42)
    - type: f1
      value: 0.5914
      name: Test macro-F1 (MLP, seed 42)
---

# CYB006 Baseline Classifier

**User-risk-tier classifier trained on the CYB006 synthetic login
activity sample. It predicts which of 3 risk tiers (`low` / `medium` /
`high`) a user belongs to, from per-user identity aggregates and
non-leaky session aggregates. It also ships a leakage diagnostic for the
dataset README's stated headline use case (threat-actor tier classification).**

> **Read this first.** This repo ships two artifacts: (1) a working
> baseline classifier for `user_risk_tier` (the primary product), and
> (2) a separate diagnostic file (`leakage_diagnostic.json`)
> documenting why the dataset README's stated headline use case, 4-class
> threat-actor tier classification, is not a usable ML task on the
> sample dataset. Both matter; the diagnostic is required reading for
> anyone evaluating CYB006 for a threat-detection product.

## Model overview

| Property | Value |
|---|---|
| Primary task | 3-class `user_risk_tier` classification (`low`/`medium`/`high`) |
| Secondary artifact | `leakage_diagnostic.json`: audit of threat-actor detection on this sample |
| Training data | `xpertsystems/cyb006-sample` (200 users × 25 sessions = 5,000 sessions) |
| Models | XGBoost + PyTorch MLP |
| Input features | 34 (per-user aggregates + session aggregates + engineered) |
| Split | **Stratified by `user_risk_tier`** (this is a user-level task, n=200) |
| Validation | Single seed (artifact) + multi-seed aggregate across 10 seeds |
| License | CC-BY-NC-4.0 (matches dataset) |
| Status | Reference baseline + structural-leakage diagnostic |

## Why this task, and why not threat-actor classification?

The CYB006 README's first suggested use case is "training **account
takeover (ATO) detection** models" and its second is "**threat-actor tier
classification**", described there as 4-class with realistic class
imbalance. We piloted the threat-actor target first and discovered that
the sample dataset contains **structural distributional non-overlap**
between threat-actor and legitimate session populations across at least
six independent feature groups:

| Oracle feature | Actor range / value | Non-actor range / value |
|---|---|---|
| `velocity_anomaly_score` | [0.52, 0.82] | [0.00, 0.25] (**zero overlap**) |
| `session_timestamp_utc` | [6417, 1440062] | [1445187, 18000137] (**disjoint windows**) |
| `credential_attempt_count` | [1, 59] (mean 12.9) | [1, 2] (mean 1.07) |
| `login_outcome` | `failure_account_locked`, `account_takeover_confirmed`, `session_hijacked`, `success_anomalous` (actors only) | `success_normal` (non-actors only) |
| `geo_country_code` | `KP`, `XX`, `CN`, `BY` (actors only) | none of these four codes |
| `device_trust_level` | never `trusted_managed` or `compliant_enrolled` | `trusted_managed`, `compliant_enrolled` (non-actors only) |

As a consequence, **plain XGBoost achieves 100% test accuracy on
threat-actor binary detection (any-actor vs none) across every random
seed**, and stays at **97% accuracy and AUC 0.99 even with all six
oracle feature groups dropped** (37 columns excluded). This is not a
useful ML benchmark; it is a property of the synthetic generator. Real
identity-security telemetry has substantial overlap between threat
and legitimate behaviour, with state-of-the-art detection systems
operating at AUC 0.7–0.9, not 1.0.

The diagnostic finding is documented quantitatively in
[`leakage_diagnostic.json`](./leakage_diagnostic.json) and summarised
in the [Leakage diagnostic](#leakage-diagnostic) section below.

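
The range comparison in the oracle-feature table can be reproduced with a small interval check. The sketch below is illustrative only: `ranges_overlap` is a hypothetical helper, not part of this repo, and the bounds are copied from the table rather than recomputed from the data.

```python
def ranges_overlap(a, b):
    """Return True if closed intervals a=(lo, hi) and b=(lo, hi) intersect."""
    return max(a[0], b[0]) <= min(a[1], b[1])


# (actor range, non-actor range), copied from the oracle-feature table
observed = {
    "velocity_anomaly_score": ((0.52, 0.82), (0.00, 0.25)),
    "session_timestamp_utc": ((6417, 1440062), (1445187, 18000137)),
    "credential_attempt_count": ((1, 59), (1, 2)),
}

overlap = {name: ranges_overlap(actor, other) for name, (actor, other) in observed.items()}
for name, flag in overlap.items():
    print(f"{name}: {'overlap' if flag else 'disjoint'}")
```

A disjoint range means a single threshold separates the two populations perfectly. `credential_attempt_count` does overlap on range, but its means differ sharply (12.9 vs 1.07), so it still carries strong signal.
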

We therefore pivoted to **`user_risk_tier` (3-class user-level
classification)** as the primary baseline target. This task:

- Has **overlapping per-tier feature distributions** with no oracle features
- Carries **modest real signal** (seed-42 test accuracy 0.67 against a 0.57 majority-class baseline; macro ROC-AUC 0.80)
- Targets a legitimate use case (the README lists "Insider threat scoring with composite behavioral indicators")
- Demonstrates honest ML rigor on the dataset

Two model artifacts are published. They are designed to be used together, since disagreement between them is a useful triage signal:

- `model_xgb.json`: gradient-boosted trees, primary recommendation
- `model_mlp.safetensors`: PyTorch MLP in SafeTensors format

## Quick start

```bash
pip install xgboost scikit-learn torch safetensors pandas huggingface_hub
```

```python
import os
import sys

import numpy as np
import pandas as pd
import xgboost as xgb
from huggingface_hub import hf_hub_download

REPO = "xpertsystems/cyb006-baseline-classifier"
DATA = "xpertsystems/cyb006-sample"

# Download the model artifacts and the bundled feature pipeline.
# The MLP weights and scaler are fetched too but not used in this XGBoost-only snippet.
paths = {n: hf_hub_download(REPO, n) for n in [
    "model_xgb.json", "model_mlp.safetensors",
    "feature_engineering.py", "feature_meta.json", "feature_scaler.json",
]}

# Make the downloaded feature_engineering.py importable.
sys.path.insert(0, os.path.dirname(paths["feature_engineering.py"]))
from feature_engineering import (
    transform_single, load_meta, INT_TO_LABEL,
    compute_session_aggregates_for_user,
)

meta = load_meta(paths["feature_meta.json"])
xgb_model = xgb.XGBClassifier()
xgb_model.load_model(paths["model_xgb.json"])

# Assumption: both CSVs sit at the root of the dataset repo and share a
# `user_id` column; adjust the file paths or the join key if they differ.
users = pd.read_csv(hf_hub_download(DATA, "user_risk_summary.csv", repo_type="dataset"))
sessions = pd.read_csv(hf_hub_download(DATA, "login_sessions.csv", repo_type="dataset"))
user_summary_row = users.iloc[0]
user_sessions = sessions[sessions["user_id"] == user_summary_row["user_id"]]

# Compose a per-user record from the user_risk_summary row + session aggregates.
user_record = user_summary_row.to_dict()
user_record.update(compute_session_aggregates_for_user(user_sessions))

X = transform_single(user_record, meta)
proba = xgb_model.predict_proba(X)[0]
print(INT_TO_LABEL[int(np.argmax(proba))])
```

See [`inference_example.ipynb`](./inference_example.ipynb) for the full
copy-paste demo.

## Training data

Trained on the public sample of CYB006: 200 per-user rows from
`user_risk_summary.csv`, enriched with per-user session aggregates
computed from `login_sessions.csv`:

| Tier | Users | Class share |
|---|---:|---:|
| `low` | 114 | 57% |
| `medium` | 47 | 23.5% |
| `high` | 39 | 19.5% |

The CYB006 README claims a 4-tier scheme (`low`/`medium`/`high`/`critical`).
The sample data contains only 3; no `critical` tier is present.

### Stratified split

This is a **user-level** task (one row per user, 200 users total).
Group-aware splitting does not apply since there is no
many-rows-per-group structure to leak. We use
**StratifiedShuffleSplit** (nested 70/15/15) to preserve the 3-tier
class distribution across folds:

| Fold | Users |
|---|---:|
| Train | 139 |
| Validation | 31 |
| Test | 30 |

Class imbalance is addressed with balanced class weights: passed to
XGBoost as per-row `sample_weight`, and to the MLP as a weighted
cross-entropy loss.

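
The nested split and balanced weighting described in this section can be sketched with scikit-learn. This is a minimal illustration, not the private training script: the integer fold sizes are chosen to match the fold table, the label encoding is an assumption, and `random_state=42` here is not claimed to reproduce the published split.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.utils.class_weight import compute_sample_weight

# Tier labels with the class counts from the tier table (0=low, 1=medium, 2=high)
y = np.array([0] * 114 + [1] * 47 + [2] * 39)
idx = np.arange(len(y))

# Outer split: hold out 30 test users
outer = StratifiedShuffleSplit(n_splits=1, test_size=30, random_state=42)
rest_pos, test_pos = next(outer.split(idx, y))

# Inner split: carve 31 validation users out of the remaining 170
inner = StratifiedShuffleSplit(n_splits=1, test_size=31, random_state=42)
train_rel, val_rel = next(inner.split(rest_pos, y[rest_pos]))
train_pos, val_pos = rest_pos[train_rel], rest_pos[val_rel]

# Balanced weights: n_samples / (n_classes * class_count), one weight per training row
w_train = compute_sample_weight("balanced", y[train_pos])

print(len(train_pos), len(val_pos), len(test_pos))
```

The resulting `w_train` array is what would be passed as `sample_weight` when fitting XGBoost; rows from the smaller `medium` and `high` tiers receive larger weights than `low` rows.
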
## Feature pipeline

The bundled `feature_engineering.py` is the canonical feature recipe.
34 features survive after encoding (14 numeric + 6 one-hot columns + 8
session aggregates + 6 engineered), drawn from:

- **Per-user numeric** (14, from `user_risk_summary.csv`): `total_login_attempts`, `successful_logins`, `failed_logins`, `mfa_failures`, `impossible_travel_events`, `lateral_hop_count`, `privilege_escalations`, `account_lockout_count`, `geo_dispersion_score`, `login_velocity_score`, `session_anomaly_rate`, `ueba_alert_count`, `overall_identity_risk_score`, `insider_threat_indicator_score`
- **Per-user categorical** (1, one-hot): `peak_privilege_level_accessed` (6 values)
- **Session aggregates** (8, derived from `login_sessions.csv`): `avg_session_duration_seconds`, `avg_mfa_response_latency_ms`, `avg_geo_anomaly_score`, `max_geo_anomaly_score`, `frac_impossible_travel`, `n_unique_countries`, `n_unique_devices`, `n_unique_applications`
- **Engineered** (6): `failed_login_rate`, `mfa_failure_rate`, `ueba_alerts_per_session`, `hops_per_escalation`, `geo_velocity_composite`, `composite_anomaly_score`

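
To make the session-aggregate step concrete, here is a hedged pandas sketch for three of the eight aggregates. The mapping from names to mean, max, and distinct count is inferred from the feature names; the authoritative definitions live in `feature_engineering.py`. The toy rows are invented for illustration, and the assumption is that `login_sessions.csv` carries `user_id`, `geo_anomaly_score`, and `geo_country_code` columns.

```python
import pandas as pd

# Toy sessions table; only columns named in this card are used
sessions = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2", "u2"],
    "geo_anomaly_score": [0.10, 0.30, 0.20, 0.05, 0.15],
    "geo_country_code": ["US", "US", "DE", "GB", "GB"],
})

# One row per user: mean and max of the anomaly score, distinct country count
agg = sessions.groupby("user_id").agg(
    avg_geo_anomaly_score=("geo_anomaly_score", "mean"),
    max_geo_anomaly_score=("geo_anomaly_score", "max"),
    n_unique_countries=("geo_country_code", "nunique"),
)
print(agg)
```

The result is indexed by `user_id`, so it can be joined onto the per-user summary rows before encoding.
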
### Leakage exclusions

Three columns from `user_risk_summary.csv` are dropped to avoid contamination:

- `threat_actor_flag`: a one-directional oracle, because a positive flag implies `high` (only high-tier users can be threat actors)
- `account_takeover_flag`: 2 positive cases out of 200 (1%); too sparse and oracle-prone
- `credential_attack_victim_flag`: 1 positive case out of 200 (0.5%); same issue

Four columns from `login_sessions.csv` are NOT aggregated into session
features because they exhibited the structural non-overlap documented
in [Leakage diagnostic](#leakage-diagnostic):

- `velocity_anomaly_score`, `session_timestamp_utc`, `credential_attempt_count`, `login_outcome`

## Evaluation

### Test-set metrics, seed 42 (n = 30 disjoint users)

**XGBoost** (the published `model_xgb.json` artifact)

| Metric | Value |
|---|---:|
| Macro ROC-AUC (OvR) | **0.8017** |
| Accuracy | **0.6667** |
| Macro-F1 | 0.6454 |
| Weighted-F1 | 0.6606 |

**MLP** (the published `model_mlp.safetensors` artifact)

| Metric | Value |
|---|---:|
| Macro ROC-AUC (OvR) | 0.6974 |
| Accuracy | 0.6000 |
| Macro-F1 | 0.5914 |
| Weighted-F1 | 0.6068 |

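
For readers reproducing these four metrics, the sketch below shows how they are commonly computed with scikit-learn from class probabilities. The labels and probabilities are a toy example, not the model's test predictions, and the exact calls used by the private training script are not confirmed by this card.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Toy 3-class example (0=low, 1=medium, 2=high); each row of proba sums to 1
y_true = np.array([0, 0, 1, 1, 2, 2])
proba = np.array([
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],  # a medium user whose top class is low
    [0.1, 0.2, 0.7],
    [0.1, 0.1, 0.8],
])
y_pred = proba.argmax(axis=1)

acc = accuracy_score(y_true, y_pred)
macro_f1 = f1_score(y_true, y_pred, average="macro")
weighted_f1 = f1_score(y_true, y_pred, average="weighted")
# One-vs-rest AUC per class, then an unweighted mean across classes
macro_auc = roc_auc_score(y_true, proba, multi_class="ovr", average="macro")

print(acc, macro_f1, weighted_f1, macro_auc)
```

In this toy case one user is misclassified by argmax, yet every one-vs-rest ranking is still perfect. That is why ROC-AUC and accuracy can diverge: AUC scores the ranking of probabilities, accuracy only the top class.
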

### Multi-seed robustness (XGBoost, 10 seeds)

| Metric | Mean | Std | Min | Max |
|---|---:|---:|---:|---:|
| Accuracy | 0.700 | 0.082 | 0.533 | 0.867 |
| Macro-F1 | 0.638 | 0.093 | 0.445 | 0.814 |
| Macro ROC-AUC OvR | 0.812 | 0.048 | 0.738 | 0.877 |

Full per-seed results in [`multi_seed_results.json`](./multi_seed_results.json).

With only 30 test users per seed, single-seed accuracy varies materially
(0.53–0.87 across seeds). **ROC-AUC 0.812 ± 0.048 is the more reliable
performance estimate.** All 10 seeds yield all 3 tiers in the test
fold thanks to stratification.

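
A back-of-envelope check shows why a 30-user test fold produces this much spread. The sketch treats each test prediction as an independent Bernoulli trial, which is an approximation introduced here and not a claim made by the source.

```python
import math

n_test = 30
acc = 20 / 30  # seed-42 XGBoost accuracy (0.6667 = 20 of 30 users correct)

# Binomial standard error of an accuracy estimate on n_test users
se = math.sqrt(acc * (1 - acc) / n_test)

# Accuracy change caused by a single additional correct user
step = 1 / n_test

print(round(se, 3), round(step, 3))
```

The binomial standard error comes out near 0.086, close to the observed across-seed standard deviation of 0.082, which suggests that much of the seed-to-seed variation is sampling noise from the small test fold.
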

### Per-class F1 (seed 42)

| Tier | Class share | XGBoost F1 | MLP F1 |
|---|---:|---:|---:|
| `low` | 57% | 0.727 | 0.647 |
| `medium` | 23.5% | 0.286 | 0.400 |
| `high` | 19.5% | **0.923** | 0.727 |

Both models perform best on `high` (the most behaviourally distinct
tier, with high failed-login rates, frequent impossible travel, and
elevated anomaly scores) and on `low` (the majority class). The `medium`
tier is hardest, which is the expected behaviour for a 3-tier ordinal
task: mid-class samples sit between two boundaries and pick up confusion
from both sides.

### Ablation: which feature groups matter

| Configuration | Accuracy | Macro-F1 | Δ accuracy |
|---|---:|---:|---:|
| Full feature set (published) | 0.6667 | 0.6454 | n/a |
| No user aggregates (count features) | 0.5333 | 0.4586 | **−0.1333** |
| No risk scores | 0.5667 | 0.5300 | −0.1000 |
| No engineered features | 0.5667 | 0.5444 | −0.1000 |
| No session aggregates | 0.7000 | 0.6130 | +0.0333 |

Findings:

1. **User-level count features matter most** (failed logins, lateral
   hops, MFA failures). Dropping them costs 13 pp accuracy.
2. **Risk scores and engineered features each contribute ~10 pp.**
   With only 139 training users, the trees can't fully recover
   engineered composites from raw inputs.
3. **Session aggregates slightly hurt accuracy** in seed 42 (accuracy
   rises 3 pp when they are dropped, although macro-F1 falls from 0.645
   to 0.613). With n=200, extra features can dilute a small training
   set; the trees do better with fewer, information-dense signals.
   Session aggregates are kept in the published pipeline because they
   help on most other seeds.

### Architecture

**XGBoost:** multi-class gradient boosting (`multi:softprob`, 3 classes),
`hist` tree method, class-balanced sample weights, early stopping on
validation mlogloss.

**MLP:** `34 → 128 → 64 → 3`, each hidden layer followed by `BatchNorm1d`
→ `ReLU` → `Dropout(0.3)`, weighted cross-entropy loss, AdamW optimizer,
early stopping on validation macro-F1.

Training hyperparameters are held internally by XpertSystems.

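
The MLP layout described above can be sketched in PyTorch as follows. This is a reconstruction from the one-line description only: the function name `build_mlp` is hypothetical, and the parameter names inside `model_mlp.safetensors` are not documented here, so the published weights may not load into this module without remapping keys.

```python
import torch
from torch import nn


def build_mlp(n_features: int = 34, n_classes: int = 3, p_drop: float = 0.3) -> nn.Sequential:
    """34 -> 128 -> 64 -> 3, with BatchNorm1d -> ReLU -> Dropout after each hidden layer."""
    return nn.Sequential(
        nn.Linear(n_features, 128), nn.BatchNorm1d(128), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(128, 64), nn.BatchNorm1d(64), nn.ReLU(), nn.Dropout(p_drop),
        nn.Linear(64, n_classes),  # raw logits; softmax is applied at inference
    )


model = build_mlp().eval()  # eval mode: dropout off, BatchNorm uses running statistics
with torch.no_grad():
    logits = model(torch.zeros(4, 34))  # a dummy batch of 4 standardized feature rows
probs = torch.softmax(logits, dim=1)
print(probs.shape)
```

During training, the logits would feed `nn.CrossEntropyLoss(weight=...)` with per-class weights, matching the weighted cross-entropy mentioned above.
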
## Leakage diagnostic

This is the most important section of the model card. The full
diagnostic is in [`leakage_diagnostic.json`](./leakage_diagnostic.json).
Summary:

**Setup:** Train an XGBoost binary classifier to predict
`threat_actor_capability_tier != 'none'` from per-session features.
Use group-aware split by `user_id` (15% test = 30 disjoint users).
Cumulatively drop suspected oracle feature groups and re-evaluate.

| Configuration | n_features | Accuracy | ROC-AUC |
|---|---:|---:|---:|
| Full feature set | 166 | **1.0000** | **1.0000** |
| − behavioural oracles (velocity, timestamp, credential count) | 163 | 0.9991 | 1.0000 |
| − login_outcome | 154 | 0.9982 | 1.0000 |
| − geo_country_code | 138 | 0.9987 | 1.0000 |
| − device_trust_level | 133 | 0.9982 | 0.9999 |
| − user_risk_tier | 130 | 0.9978 | 0.9996 |
| − geo_anomaly_score | 129 | 0.9707 | 0.9897 |

**Even after dropping six oracle feature groups (37 columns), the
model still achieves 97% test accuracy and AUC 0.99.** The leakage
is not localised to a few suspect features; it is distributed across
the entire feature space because the synthetic generator produces
threat-actor sessions that are anomalous on every dimension
simultaneously, with no overlap into legitimate behaviour.

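
The cumulative-drop procedure can be illustrated on synthetic data. The sketch below is not the published diagnostic: it swaps in scikit-learn's `DecisionTreeClassifier` for XGBoost to stay lightweight, uses invented data with one disjoint feature, one overlapping feature, and one noise feature, and all variable names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n_users, per_user = 40, 25
user_ids = np.repeat(np.arange(n_users), per_user)
is_actor = np.repeat(np.arange(n_users) < 8, per_user)  # first 8 users are actors
n = len(user_ids)

# Synthetic per-session features: one disjoint "oracle", one overlapping, one noise
features = {
    "velocity_anomaly_score": np.where(
        is_actor, rng.uniform(0.52, 0.82, n), rng.uniform(0.0, 0.25, n)),
    "credential_attempt_count": np.where(
        is_actor, rng.integers(1, 60, n), rng.integers(1, 3, n)),
    "noise": rng.normal(size=n),
}

# Group-aware split: every session of a user lands on the same side
splitter = GroupShuffleSplit(n_splits=1, test_size=0.15, random_state=0)
train_idx, test_idx = next(splitter.split(np.zeros(n), is_actor, groups=user_ids))
train_users, test_users = set(user_ids[train_idx]), set(user_ids[test_idx])

# Cumulatively drop suspected oracle groups and re-evaluate after each drop
drop_order = [[], ["velocity_anomaly_score"], ["credential_attempt_count"]]
dropped, results = set(), []
for group in drop_order:
    dropped.update(group)
    cols = [c for c in features if c not in dropped]
    X = np.column_stack([features[c] for c in cols])
    clf = DecisionTreeClassifier(max_depth=3, random_state=0)
    clf.fit(X[train_idx], is_actor[train_idx])
    acc = clf.score(X[test_idx], is_actor[test_idx])
    results.append((len(cols), acc))
    print(f"n_features={len(cols)} accuracy={acc:.4f}")
```

With the disjoint feature present, a single threshold separates the classes on held-out users. The diagnostic signal in the real audit is that accuracy stays near 1.0 even after such features are removed.
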

### Recommendation to dataset author

For threat-actor detection to be a useful ML benchmark on this
dataset, the next generator version should introduce **distributional
overlap** between threat-actor and legitimate session populations
across all anomaly indicators:

- `velocity_anomaly_score`: extend non-actor distribution into [0.0, 0.5] and shrink actor to [0.3, 0.9] for substantial overlap in [0.3, 0.5]
- `session_timestamp_utc`: interleave threat-actor and legitimate sessions across the same time window
- `credential_attempt_count`: allow some non-actor users to exhibit elevated counts (mistyped passwords, MFA fatigue)
- `login_outcome`: allow `failure_account_locked` and `success_anomalous` for some legitimate sessions
- `geo_country_code`: include a baseline frequency of risky-country logins among legitimate users (business travel, contractors)
- `device_trust_level`: allow threat actors to occasionally use compliant devices (token theft scenarios)

Target operating regime: real-world detection AUC 0.7–0.9, not 1.0.

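
To get a feel for how range overlap translates into AUC, the sketch below computes the single-feature AUC for two uniform distributions. The uniform shape is an assumption made purely for illustration; the source does not specify the generator's distributions, and `auc_uniform` is a hypothetical helper.

```python
import numpy as np


def auc_uniform(actor, non_actor, n_grid=200_000):
    """AUC = P(actor score > non-actor score) for two independent uniform ranges."""
    a_lo, a_hi = actor
    n_lo, n_hi = non_actor
    # Midpoint grid over the actor range
    a = a_lo + (np.arange(n_grid) + 0.5) * (a_hi - a_lo) / n_grid
    # Non-actor CDF evaluated at each actor value, clipped to [0, 1]
    cdf = np.clip((a - n_lo) / (n_hi - n_lo), 0.0, 1.0)
    return float(cdf.mean())


current = auc_uniform((0.52, 0.82), (0.00, 0.25))   # ranges observed in the sample
proposed = auc_uniform((0.30, 0.90), (0.00, 0.50))  # ranges suggested above
print(round(current, 4), round(proposed, 4))
```

Under this uniform assumption the current ranges give an AUC of 1.0 and the suggested ranges give roughly 0.93 for this one feature. Distributions that place more mass inside the overlap region would bring the figure further down towards the stated target.
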

### What this means for buyers

If you're evaluating CYB006 for a threat-detection product, you should
know that:

- **The sample dataset cannot be used to honestly benchmark threat-actor
  detection models.** A trivially regularised model will score 100%,
  which doesn't differentiate good detection systems from bad ones.
- **The user-risk-tier task shipped in this baseline is a legitimate
  ML benchmark on the sample data.** It generalises modestly (AUC 0.81)
  and is the right starting point for evaluating insider-threat
  scoring on the sample.
- **The full ~1.1M-row CYB006 product may or may not have the same
  structural property.** Confirm with XpertSystems before committing
  to a threat-detection use case.

## Limitations

**This is a baseline reference, not a production identity-security system.**

1. **Small held-out test fold (n=30).** With only 30 test users per
   seed, single-seed metrics swing 0.53–0.87 in accuracy. The
   multi-seed ROC-AUC of 0.81 ± 0.05 is the reliable estimate. The
   full ~1.1M-row product would tighten the confidence interval
   substantially.
2. **The `medium` tier is harder than the others.** XGBoost F1 of 0.29
   on `medium` (vs 0.92 on `high`) is expected: ordinal middle classes
   are typically the hardest under a flat-classification setup.
3. **MLP weaker than XGBoost.** AUC 0.70 vs 0.80. With only 139
   training users, the MLP does not match boosted trees on this
   tabular data.
4. **Threat-actor detection task is not usable on this sample.**
   See [Leakage diagnostic](#leakage-diagnostic) above.
5. **Synthetic-vs-real transfer.** The dataset is synthetic and
   calibrated to identity-security benchmarks (Microsoft Digital
   Defense Report, Okta Customer Identity Trends, Verizon DBIR, CISA
   Joint Advisories, Mandiant M-Trends, MITRE ATT&CK Evaluations).
   Real identity telemetry has different noise characteristics; do
   not assume metrics transfer.
6. **3 tiers, not 4.** The README lists `low`/`medium`/`high`/`critical`
   but the data contains only 3. If you need 4-class support, wait
   for a regenerated sample.

## Notes on dataset schema

The CYB006 sample dataset README describes some fields differently
from the actual schema. The model was trained on the actual schema;
this note helps buyers reconcile what they read with what they receive.

| What the README says | What the data actually contains |
|---|---|
| `session_phase` has 6 values | **All 5,000 rows have `session_phase = session_termination`**; the field is constant. There is no usable session-phase target. |
| `login_outcome` has 4 values (`success / failed / mfa_required / blocked`) | 9 values: `success_normal`, `failure_bad_password`, `failure_account_locked`, `failure_mfa_rejected`, `failure_device_untrusted`, `failure_geo_blocked`, `success_anomalous`, `account_takeover_confirmed`, `session_hijacked` |
| 4 actor tiers | 5 values: 4 tier labels + `none` (92% of rows have `none`) |
| `mfa_challenge_type` has 5 values | 7: adds `authenticator_app`, `hardware_token`, `voice_call` |
| `authentication_method` has 4 values | 5: no `api_key`; adds `password_plus_mfa`, `phishing_resistant_fido2` |
| `user_risk_tier` has 4 values (`low/medium/high/critical`) | 3 values: no `critical` |
| `session_timestamp_utc` is an ISO timestamp string | It is an integer |
| `user_risk_summary.csv` columns listed | Adds `peak_privilege_level_accessed`, `credential_attack_victim_flag` (not in README) |

None of these affects model correctness, because the feature pipeline
uses the actual column names. If you build your own pipeline against the
dataset, use the actual columns.

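
Before building a pipeline of your own, a quick schema summary helps catch mismatches like those listed in the table. The sketch below runs on a toy stand-in for `login_sessions.csv` with invented rows; `summarize_schema` is a hypothetical helper, not part of this repo.

```python
import pandas as pd

# Toy stand-in for login_sessions.csv; real data would be loaded with pd.read_csv
sessions = pd.DataFrame({
    "session_phase": ["session_termination"] * 4,
    "login_outcome": ["success_normal", "failure_bad_password",
                      "success_normal", "session_hijacked"],
    "session_timestamp_utc": [6417, 1445187, 1500000, 7000],
})


def summarize_schema(df):
    """Return dtype, distinct-value count, and a constant flag for each column."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_unique": df.nunique(),
        "is_constant": df.nunique() <= 1,
    })


report = summarize_schema(sessions)
print(report)
```

A constant column such as `session_phase` shows up with `is_constant` set, and an integer dtype on `session_timestamp_utc` flags that it is not an ISO string.
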
## Intended use

- **Evaluating fit** of the CYB006 dataset for your insider-threat
  or user-risk-scoring research
- **Baseline reference** for new model architectures
- **Reference example of structural-leakage diagnostics** in synthetic
  cybersecurity datasets. The diagnostic methodology is reusable; note
  that `train_classifier.py`, where it is implemented, is not among the
  files published in this repo
- **Feature engineering reference** for per-user identity aggregates

## Out-of-scope use

- Production identity-security detection on real telemetry
- Threat-actor attribution (this baseline does not address that task; see why above)
- Any operational security or law-enforcement decision

## Reproducibility

Outputs above were produced with `seed = 42` (published artifact),
nested `StratifiedShuffleSplit` (70/15/15 by `user_risk_tier`), on the
published sample (`xpertsystems/cyb006-sample`, version 1.0.0,
generated 2026-05-16). The feature pipeline in `feature_engineering.py`
is deterministic and the trained weights in this repo correspond
exactly to the metrics above.

Multi-seed results (seeds 42, 7, 13, 17, 23, 31, 45, 99, 123, 200) in
`multi_seed_results.json` show how much performance varies across splits
(accuracy 0.53–0.87, macro ROC-AUC 0.74–0.88).

The training script itself is private to XpertSystems.

## Files in this repo

| File | Purpose |
|---|---|
| `model_xgb.json` | XGBoost weights (seed 42) |
| `model_mlp.safetensors` | PyTorch MLP weights (seed 42) |
| `feature_engineering.py` | Feature pipeline |
| `feature_meta.json` | Feature column order + categorical levels |
| `feature_scaler.json` | MLP input mean/std (XGBoost ignores) |
| `validation_results.json` | Per-class metrics, confusion matrix, architecture |
| `ablation_results.json` | Per-feature-group ablation |
| `multi_seed_results.json` | XGBoost metrics across 10 seeds |
| `leakage_diagnostic.json` | **Structural-leakage audit on threat-actor detection** |
| `inference_example.ipynb` | End-to-end inference demo notebook |
| `README.md` | This file |

## Contact and full product

The full **CYB006** dataset contains ~1.1 million rows across four
files, with 12 calibrated benchmark validation tests drawn from
authoritative identity security and threat intelligence sources
(Microsoft Digital Defense Report, Okta Customer Identity Trends,
Verizon DBIR, CISA Joint Advisories, Mandiant M-Trends, MITRE ATT&CK
Evaluations). The full XpertSystems.ai synthetic data catalogue spans
41 SKUs across Cybersecurity, Healthcare, Insurance & Risk, Oil & Gas,
and Materials & Energy.

- 📧 **pradeep@xpertsystems.ai**
- 🌐 **https://xpertsystems.ai**
- 🗂 Dataset: https://huggingface.co/datasets/xpertsystems/cyb006-sample
- 🤖 Companion models:
  - https://huggingface.co/xpertsystems/cyb001-baseline-classifier (network traffic)
  - https://huggingface.co/xpertsystems/cyb002-baseline-classifier (ATT&CK kill-chain)
  - https://huggingface.co/xpertsystems/cyb003-baseline-classifier (malware execution phase)
  - https://huggingface.co/xpertsystems/cyb004-baseline-classifier (phishing campaign phase)
  - https://huggingface.co/xpertsystems/cyb005-baseline-classifier (ransomware actor-tier attribution)

## Citation

```bibtex
@misc{xpertsystems_cyb006_baseline_2026,
  title  = {CYB006 Baseline Classifier: XGBoost and MLP for User Risk Tier Classification, with Structural-Leakage Diagnostic on Threat-Actor Detection},
  author = {XpertSystems.ai},
  year   = {2026},
  url    = {https://huggingface.co/xpertsystems/cyb006-baseline-classifier},
  note   = {Baseline reference model trained on xpertsystems/cyb006-sample}
}
```