Koda-NIDS: XGBoost Intrusion Detection System

Koda-NIDS is a gradient-boosted decision tree classifier (XGBoost) for detecting malicious network activity. Trained on the UNSW-NB15 dataset, it classifies each network flow as normal or attack using flow-level behavioral features.


Model Details

| Field | Value |
|---|---|
| Model Name | Koda-NIDS |
| Model Type | XGBoost Classifier |
| Task | Binary Network Intrusion Detection (NIDS) |
| Input Features | 42 network flow features (duration, protocols, TTL, etc.) |
| Target | label (0: Normal, 1: Attack) |

Performance Summary

Evaluated on UNSW_NB15_testing-set.csv (175,341 samples), Koda-NIDS reaches 89.54% accuracy. Precision on the attack class is 0.99 and recall is 0.86, so the model rarely flags normal traffic as an attack but misses roughly 14% of attack flows.

Overall

| Metric | Value |
|---|---|
| Test Accuracy | 89.54% |
| Macro Avg Precision | 0.88 |
| Macro Avg Recall | 0.92 |
| Macro Avg F1-Score | 0.89 |
| Weighted Avg F1-Score | 0.90 |

Per-class breakdown

| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0 Normal | 0.76 | 0.98 | 0.86 | 56,000 |
| 1 Attack | 0.99 | 0.86 | 0.92 | 119,341 |
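The overall figures can be recomputed from the per-class rows, which is a useful sanity check when comparing against other models. The sketch below uses only the rounded values printed in the table, so the results match the reported averages only to within rounding; the variable names are illustrative.

```python
# Per-class figures copied from the table above:
# (precision, recall, f1, support)
per_class = {
    "Normal": (0.76, 0.98, 0.86, 56_000),
    "Attack": (0.99, 0.86, 0.92, 119_341),
}

n_classes = len(per_class)
total_support = sum(support for _, _, _, support in per_class.values())

# Macro averages treat both classes equally, regardless of size.
macro_precision = sum(p for p, _, _, _ in per_class.values()) / n_classes
macro_recall = sum(r for _, r, _, _ in per_class.values()) / n_classes
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / n_classes

# The weighted average scales each class by its share of the samples.
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / total_support

# Accuracy equals the support-weighted recall; with two-decimal inputs
# this only approximates the reported 89.54%.
accuracy_estimate = sum(r * s for _, r, _, s in per_class.values()) / total_support

print(f"macro P/R/F1: {macro_precision:.3f} {macro_recall:.3f} {macro_f1:.3f}")
print(f"weighted F1:  {weighted_f1:.3f}")
print(f"accuracy est: {accuracy_estimate:.3f}")
```

Because attacks make up about two thirds of the samples, the weighted F1 sits closer to the attack-class F1 than the macro average does.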

Top 5 Predictive Features

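| Rank | Feature | Importance | Description |
|---|---|---|---|
| 1 | sttl | 34.2% | Source-to-destination time-to-live value; the strongest single signal, useful for spotting packet-origin anomalies |
| 2 | ct_dst_sport_ltm | 10.7% | Number of recent connections sharing the same destination address and source port |
| 3 | tcprtt | 9.1% | TCP connection setup round-trip time (the sum of synack and ackdat) |
| 4 | ct_dst_src_ltm | 5.5% | Number of recent connections between the same source and destination addresses |
| 5 | synack | 5.3% | Time between the SYN and SYN-ACK packets of the TCP handshake |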
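To put these percentages in context, the short sketch below ranks the listed scores and sums them. It assumes the importances are normalized so that all 42 features add up to 100%, which this card does not state explicitly; `rank_features` is an illustrative helper, not part of the model's distribution.

```python
# Importance percentages copied from the table above.
top5 = {
    "sttl": 34.2,
    "ct_dst_sport_ltm": 10.7,
    "tcprtt": 9.1,
    "ct_dst_src_ltm": 5.5,
    "synack": 5.3,
}


def rank_features(scores, k):
    """Return the k highest-scoring (name, score) pairs, largest first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


# Combined share of the five listed features.
top5_share = sum(top5.values())

# Under the normalization assumption, this is what is left for the
# remaining 37 features combined.
remaining_share = 100.0 - top5_share

for name, score in rank_features(top5, 3):
    print(f"{name}: {score}%")
print(f"top-5 share: {top5_share:.1f}%, remainder: {remaining_share:.1f}%")
```

Under that assumption, sttl alone outweighs the next four features combined, so inputs with unusual or rewritten TTL values deserve particular care.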

Usage & Deployment

To deploy Koda-NIDS, ensure you have both the model weights (xgb_intrusion_model.json) and the associated label encoders (label_encoders.pkl).

Python Implementation

import xgboost as xgb
import pickle
import pandas as pd

# 1. Load the trained Koda-NIDS model
model = xgb.XGBClassifier()
model.load_model("xgb_intrusion_model.json")

# 2. Load the specific encoders for proto, service, and state
with open("label_encoders.pkl", "rb") as f:
    encoders = pickle.load(f)

# 3. Predict
# Ensure your input data is preprocessed using the exact same logic
# found in the training pipeline.
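Step 3 is left as a comment in the snippet. A minimal sketch of what it might look like follows, assuming `model` is the classifier loaded in step 1 and `features` is an already preprocessed table (for example a pandas DataFrame) holding the 42 input columns in the order used during training. The helper `label_predictions` and the `LABEL_NAMES` mapping are illustrative and are not shipped with Koda-NIDS.

```python
from typing import Any, List

# Class names as given in the Model Details table (0: Normal, 1: Attack).
LABEL_NAMES = {0: "Normal", 1: "Attack"}


def label_predictions(model: Any, features: Any) -> List[str]:
    """Return a readable class name for each flow.

    `model` is any object with a scikit-learn style `predict` method,
    such as the XGBClassifier loaded in step 1. `features` must already
    be preprocessed exactly as in training (assumption: this helper does
    no cleaning or encoding of its own).
    """
    predictions = model.predict(features)
    return [LABEL_NAMES[int(p)] for p in predictions]


# Usage with the objects from the snippet above, where `df` is your
# preprocessed input data:
# names = label_predictions(model, df)
```

Keeping preprocessing out of this helper makes it easy to test the prediction step separately from the encoding logic.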

Preprocessing Requirements

The model expects raw network features to be cleaned as follows:

  • String Normalization: Categorical features (proto, service, state) must be lowercased and stripped of whitespace.
  • Label Encoding: Apply the mappings from label_encoders.pkl.
  • Imputation: Missing numerical values should be filled with 0. Unseen categorical labels should be mapped to the missing category defined during training.
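The three steps above can be sketched for a single flow record as follows. This is an illustration under stated assumptions, not the training pipeline itself: the encoders are represented here as plain string-to-integer dictionaries (the actual object types inside label_encoders.pkl are not documented in this card), and the token name "missing" and all example codes are hypothetical.

```python
import math
from typing import Any, Dict, Mapping

CATEGORICAL = ("proto", "service", "state")
MISSING_TOKEN = "missing"  # hypothetical name for the missing category


def normalize(value: Any) -> str:
    """Lowercase a categorical value and strip surrounding whitespace."""
    return str(value).strip().lower()


def is_missing(value: Any) -> bool:
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def preprocess_flow(
    flow: Mapping[str, Any],
    mappings: Mapping[str, Mapping[str, int]],
) -> Dict[str, Any]:
    """Normalize, encode, and impute one flow record."""
    out: Dict[str, Any] = {}
    for name, value in flow.items():
        if name in CATEGORICAL:
            table = mappings[name]
            key = MISSING_TOKEN if is_missing(value) else normalize(value)
            # Unseen labels fall back to the missing category.
            out[name] = table.get(key, table[MISSING_TOKEN])
        else:
            # Missing numerical values are filled with 0.
            out[name] = 0 if is_missing(value) else value
    return out


# Stand-in mappings with made-up codes, for demonstration only.
example_mappings = {
    "proto": {"missing": 0, "tcp": 1, "udp": 2},
    "service": {"missing": 0, "http": 1},
    "state": {"missing": 0, "fin": 1},
}
example_flow = {"dur": None, "proto": " TCP ", "service": "quic",
                "state": "FIN", "sttl": 254}
encoded = preprocess_flow(example_flow, example_mappings)
print(encoded)
```

In the example, the unseen service "quic" is mapped to the missing category and the empty duration becomes 0, mirroring the imputation rule in the list.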

Limitations & Scope

  • Extraction Dependency: Koda-NIDS requires pre-extracted flow features. It cannot ingest raw .pcap files without a separate feature extraction layer.
  • Protocol Drift: The model was trained on network behaviors captured in 2015, so its effectiveness against newer attack and evasion techniques may be reduced.

Citation

If you use Koda-NIDS in your research, please cite the underlying dataset:

Moustafa, Nour, and Jill Slay. "UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set)." Military Communications and Information Systems Conference (MilCIS), 2015.
