{ "purpose": "CYB011 sample has multiple structural leakage patterns rooted in the generator's outcome-modeling logic. Three outcome columns (detection_outcome, detector_confidence_score, evasion_budget_consumed) are perfect or near-perfect oracles for attack_phase. Per-campaign features encode attacker_capability_tier via stealth_score. Per-segment topology features uniquely fingerprint each defender_architecture. The published baseline (attack_phase 7-class) trains with the three phase oracles excluded but retains timestep as a legitimate campaign-progress observable.", "primary_target": "attack_phase (7-class, per-timestep)", "split": "GroupShuffleSplit on campaign_id, 70/15/15 nested", "missing_attacker_tier_note": { "issue": "README claims 4 attacker_capability_tier values (script_kiddie, opportunistic, advanced_persistent_threat, nation_state). The sample data contains only 3: nation_state is entirely absent. Models trained on this sample cannot generalize to nation_state actors.", "tier_counts_in_sample": { "script_kiddie": 7000, "opportunistic": 5600, "advanced_persistent_threat": 1400 } }, "oracle_paths_documented": { "P1_detection_outcome": { "target": "attack_phase", "leak_column": "detection_outcome", "mechanism": "Three of the four detection_outcome values (evasion_success, marginal_alert, high_confidence_alert) occur ONLY when attack_phase == 'evasion_attempt'. The fourth value (suppressed_alert) occurs across all 7 phases. 
So detection_outcome != suppressed_alert identifies evasion_attempt with perfect precision, but it flags only 4746 of the 7206 evasion_attempt events; the other 2460 carry suppressed_alert.", "evidence_crosstab": { "evasion_success": { "campaign_consolidation": 0, "evasion_attempt": 416, "feature_space_probe": 0, "feedback_adaptation": 0, "idle_dwell": 0, "perturbation_craft": 0, "reconnaissance": 0 }, "high_confidence_alert": { "campaign_consolidation": 0, "evasion_attempt": 1102, "feature_space_probe": 0, "feedback_adaptation": 0, "idle_dwell": 0, "perturbation_craft": 0, "reconnaissance": 0 }, "marginal_alert": { "campaign_consolidation": 0, "evasion_attempt": 3228, "feature_space_probe": 0, "feedback_adaptation": 0, "idle_dwell": 0, "perturbation_craft": 0, "reconnaissance": 0 }, "suppressed_alert": { "campaign_consolidation": 829, "evasion_attempt": 2460, "feature_space_probe": 1465, "feedback_adaptation": 496, "idle_dwell": 2450, "perturbation_craft": 745, "reconnaissance": 809 } }, "verdict": "Perfect-precision oracle for evasion_attempt (51% of all events), covering 4746 of its 7206 events (about 66%)." }, "P2_detector_confidence_score": { "target": "attack_phase (via detection_outcome)", "leak_column": "detector_confidence_score", "mechanism": "detector_confidence_score falls in a fixed band for each non-suppressed detection_outcome: <= 0.25 -> evasion_success, [0.52, 0.78) -> marginal_alert, >= 0.78 -> high_confidence_alert. The three bands do not overlap, so these outcomes can be decoded mechanically from the score, which indirectly leaks attack_phase. suppressed_alert spans the full [0.001, 0.999] range, so only scores strictly between 0.25 and 0.52 identify it unambiguously.", "score_ranges_by_outcome": { "evasion_success": { "min": 0.001, "max": 0.25, "mean": 0.1801, "std": 0.0553 }, "high_confidence_alert": { "min": 0.7801, "max": 0.999, "mean": 0.8558, "std": 0.0561 }, "marginal_alert": { "min": 0.5201, "max": 0.7797, "mean": 0.6436, "std": 0.0737 }, "suppressed_alert": { "min": 0.001, "max": 0.999, "mean": 0.3992, "std": 0.1817 } }, "verdict": "Mechanical decoder for the non-suppressed detection_outcome values -> indirect oracle for phase." 
}, "P3_evasion_budget_consumed_zero": { "target": "attack_phase (3 early phases)", "leak_column": "evasion_budget_consumed", "mechanism": "evasion_budget_consumed == 0 occurs in 100% of {reconnaissance, feature_space_probe, perturbation_craft} events (the 3 early phases that don't submit evasion attempts). evasion_budget_consumed > 0 occurs in 100% of events in the 4 later phases.", "early_phase_events_at_zero": 3019, "verdict": "Perfect oracle for membership in the 3 early phases as a group; it does not separate the 3 phases from one another." }, "P4_stealth_score_to_tier": { "target": "attacker_capability_tier (campaign level)", "leak_column": "stealth_score", "mechanism": "stealth_score has tier-ordered means with overlapping ranges: APT in [0.806, 0.938] (mean 0.912), opportunistic in [0.751, 0.924] (mean 0.882), script_kiddie in [0.715, 0.950] (mean 0.846). This drives per-campaign tier prediction to 0.94 accuracy vs a 0.50 majority baseline, which is artificially inflated.", "stealth_ranges_by_tier": { "advanced_persistent_threat": { "min": 0.806, "max": 0.938, "mean": 0.9116, "std": 0.0277 }, "opportunistic": { "min": 0.7508, "max": 0.9236, "mean": 0.8816, "std": 0.0359 }, "script_kiddie": { "min": 0.7148, "max": 0.95, "mean": 0.8456, "std": 0.0462 } }, "verdict": "Near-deterministic per-tier feature. Per-campaign tier prediction is structurally inflated by this leak." }, "P5_topology_fingerprint": { "target": "defender_architecture", "leak_column": "(combination of 7 topology features)", "mechanism": "detection_strength and adversarial_robustness are each CONSTANT within a defender_architecture (std = 0.0 across all rows of that architecture). Combined with the ranges of ensemble_size, alert_threshold, detection_coverage, feature_space_dim, and retraining_cadence_days, each topology row uniquely fingerprints its defender. 
The 8-class defender_architecture target hits 100% accuracy via this combination.", "detection_strength_std_within_arch": { "autoencoder_anomaly": 0.0, "ensemble_stacked": 0.0, "gradient_boosted_tree": 0.0, "isolation_forest": 0.0, "lstm_behavioural": 0.0, "neural_network_dense": 0.0, "rule_based_threshold": 0.0, "transformer_sequence": 0.0 }, "adversarial_robustness_std_within_arch": { "autoencoder_anomaly": 0.0, "ensemble_stacked": 0.0, "gradient_boosted_tree": 0.0, "isolation_forest": 0.0, "lstm_behavioural": 0.0, "neural_network_dense": 0.0, "rule_based_threshold": 0.0, "transformer_sequence": 0.0 }, "verdict": "Trivially leaky 8-class target. Each segment row uniquely identifies its defender architecture by feature combination." }, "P6_timestep_partial": { "target": "attack_phase (partial)", "leak_column": "timestep", "mechanism": "Phases have characteristic timestep ranges due to the sequential lifecycle structure. reconnaissance is timestep 1-7 (mean 3.16), campaign_consolidation is 65-70 (mean 67.96), feedback_adaptation is 63-66 (mean 64.15). The middle phases overlap broadly. NOTE: timestep is KEPT as a feature in the published model because it's a legitimate campaign-progress observable a defender would have at decision time. Documenting here for transparency: removing timestep drops headline accuracy by ~9pp (0.87 -> 0.78).", "timestep_ranges_by_phase": { "campaign_consolidation": { "min": 65, "max": 70, "mean": 67.96 }, "evasion_attempt": { "min": 11, "max": 62, "mean": 40.32 }, "feature_space_probe": { "min": 4, "max": 35, "mean": 11.29 }, "feedback_adaptation": { "min": 63, "max": 66, "mean": 64.15 }, "idle_dwell": { "min": 1, "max": 70, "mean": 35.44 }, "perturbation_craft": { "min": 8, "max": 38, "mean": 16.65 }, "reconnaissance": { "min": 1, "max": 7, "mean": 3.16 } }, "verdict": "Partial oracle for 3 phases (reconnaissance, feedback_adaptation, campaign_consolidation). KEPT as legitimate progress feature." 
} }, "unlearnable_targets": [ { "target": "campaign_success_flag (per-campaign)", "n_campaigns": 200, "majority_baseline": 0.605, "honest_accuracy": 0.5111111111111111, "honest_roc_auc": 0.48765432098765427, "verdict": "below_majority" }, { "target": "campaign_type (per-campaign)", "n_campaigns": 200, "majority_baseline": 0.17, "honest_accuracy": 0.11111111111111112, "honest_roc_auc": 0.48226979604757386, "verdict": "below_majority" }, { "target": "coordinated_attack_flag (per-campaign)", "n_campaigns": 200, "majority_baseline": 0.9, "honest_accuracy": 0.8333333333333334, "honest_roc_auc": 0.38271604938271603, "verdict": "below_majority" }, { "target": "defender_architecture (per-campaign, all 7 topology fingerprint features dropped)", "n_campaigns": 200, "majority_baseline": 0.17, "honest_accuracy": 0.13333333333333333, "honest_roc_auc": 0.5770656344684122, "verdict": "below_majority", "note": "With all 7 topology fingerprint features included, defender_architecture hits 100% trivially. With all 7 dropped, performance collapses to or below majority. The target is not learnable from the trajectory features themselves - only from the segment fingerprint." } ], "unlearnable_summary": "Four README-suggested headline targets are unlearnable on the sample after honest oracle removal: campaign_success_flag (acc ~0.51 vs maj 0.61), campaign_type 8-class (acc ~0.11 vs maj 0.17), coordinated_attack_flag (acc ~0.83 vs maj 0.90), and defender_architecture 8-class (trivially leaky via topology fingerprint; collapses when the fingerprint is dropped). Only attack_phase 7-class learns honestly with a respectable lift over majority.", "recommendations_to_dataset_author": [ "Make detector_confidence_score have OVERLAPPING ranges across detection_outcome values. As shipped, the ranges of the three non-suppressed outcomes are perfectly non-overlapping (high_confidence_alert >= 0.78, marginal_alert [0.52, 0.78), evasion_success <= 0.25). 
This makes detection_outcome a mechanical function of the score.", "Allow evasion_budget_consumed to be positive in some reconnaissance / feature_space_probe / perturbation_craft events. The current zero-only encoding creates a perfect oracle for these 3 phases.", "Add per-tier feature noise. stealth_score ranges remain tier-discriminative (APT > 0.80, script_kiddie <= 0.95) despite their overlap. Increase the noise so the per-campaign tier-attribution task isn't structurally inflated.", "Add per-segment NOISE to detection_strength and adversarial_robustness. Currently these are CONSTANT per defender_architecture (std = 0.0). Real systems have deployment-specific tuning, so these should vary within an architecture class.", "Include the missing nation_state attacker tier in the sample. The README lists 4 tiers but the sample contains only 3. Buyers cannot validate nation_state-specific modeling on the sample.", "Increase coordinated_attack positives in the sample (currently only 20 of 200 campaigns, i.e. 10%). With n=20 positives, the binary task has insufficient statistical power for honest evaluation.", "For campaign_type 8-class, add stronger per-type feature signatures. Currently the 8 types are not discriminable from trajectory features at n=200 campaigns." ] }