AI/ML · Jan 2025 – May 2025
ML Intrusion Detection Evaluation
A scalable pipeline benchmarking RF, XGBoost, and TF-SVM models for network intrusion detection.
- 0.94 macro-F1 (XGBoost)
- 119k+ network flows
- 80+ features analyzed
Highlights
- Engineered a scalable ML pipeline to benchmark Random Forest, XGBoost, and TensorFlow SVM models on 119,000+ balanced network flows from the CIC-IDS2017 dataset.
- Mitigated extreme class imbalance with data capping and stratified 10-fold cross-validation, so every fold preserves the class proportions and the reported metrics stay realistic.
- Reached a 0.94 macro-averaged F1 with XGBoost, then analyzed feature importance across 80+ features to surface the protocol semantics that drive predictions.
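The evaluation steps in these highlights can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic (generated with `make_classification`, not CIC-IDS2017), the helper names `cap_per_class` and `cross_validated_macro_f1` and the cap of 300 rows per class are hypothetical, and a scikit-learn `RandomForestClassifier` stands in for the benchmarked models so the example needs only one library.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold


def cap_per_class(X, y, cap, seed=0):
    """Keep at most `cap` randomly chosen rows for each class label."""
    rng = np.random.default_rng(seed)
    keep = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        if len(idx) > cap:
            idx = rng.choice(idx, size=cap, replace=False)
        keep.append(idx)
    keep = np.sort(np.concatenate(keep))
    return X[keep], y[keep]


def cross_validated_macro_f1(X, y, n_splits=10, seed=0):
    """Return per-fold macro-F1 scores and fold-averaged feature importances."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores, importances = [], []
    for train_idx, test_idx in skf.split(X, y):
        # Stand-in model (assumption); any classifier with fit/predict works here.
        model = RandomForestClassifier(n_estimators=50, random_state=seed)
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        # Macro averaging weights every class equally, regardless of its size.
        scores.append(f1_score(y[test_idx], pred, average="macro"))
        importances.append(model.feature_importances_)
    return np.array(scores), np.mean(importances, axis=0)


# Synthetic, imbalanced stand-in data (NOT the CIC-IDS2017 flows).
X, y = make_classification(
    n_samples=3000, n_features=20, n_informative=8, n_classes=3,
    weights=[0.85, 0.10, 0.05], random_state=0,
)
X_cap, y_cap = cap_per_class(X, y, cap=300)  # hypothetical cap value
scores, mean_importance = cross_validated_macro_f1(X_cap, y_cap)
top_features = np.argsort(mean_importance)[::-1][:5]
print(f"macro-F1: {scores.mean():.3f} +/- {scores.std():.3f}")
print("top feature indices:", top_features)
```

Capping before splitting keeps the largest class from dominating training, while stratification ensures the smaller classes still appear in every test fold, which is what makes a macro-averaged score meaningful.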
Stack
Python, XGBoost, scikit-learn, TensorFlow