Implementation of
- Sketch-Based Anomaly Detection in Streaming Graphs. Siddharth Bhatia, Mohit Wadhwa, Kenji Kawaguchi, Neil Shah, Philip S. Yu, Bryan Hooi. KDD, 2023.
Existing methods detect either edge anomalies or subgraph anomalies, but not both. We extend the count-min sketch to a higher-order sketch that preserves dense subgraph structure, which lets us detect both. Our approach is the first streaming method that uses dense subgraph search to detect graph anomalies in constant memory and time.
(a) A dense subgraph in the original graph between source nodes s1, s2 and destination nodes d1, d2, d3 is transformed to (b) a dense submatrix between rows r1, r2 and columns c1, c2, c3 in the higher-order CMS.
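As a rough illustration of this mapping, the Python sketch below hashes source nodes to rows and destination nodes to columns of a small count matrix, keeping one matrix per hash function. The class name `HigherOrderSketch`, the SHA-256-based hashing, and the parameters `num_hashes` and `num_buckets` are assumptions made for this example; they are not taken from the repository's code.

```python
import hashlib


class HigherOrderSketch:
    """Illustrative higher-order count-min sketch (hypothetical helper)."""

    def __init__(self, num_hashes=2, num_buckets=8):
        self.num_hashes = num_hashes
        self.num_buckets = num_buckets
        # One num_buckets x num_buckets count matrix per hash function.
        self.matrices = [
            [[0.0] * num_buckets for _ in range(num_buckets)]
            for _ in range(num_hashes)
        ]

    def _bucket(self, node, seed):
        # Deterministic seeded hash of a node label into [0, num_buckets).
        digest = hashlib.sha256(f"{seed}:{node}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.num_buckets

    def insert(self, src, dst, weight=1.0):
        # The source picks the row and the destination picks the column.
        for i, matrix in enumerate(self.matrices):
            matrix[self._bucket(src, i)][self._bucket(dst, i)] += weight

    def estimate(self, src, dst):
        # Count-min style estimate: minimum over the hash functions.
        return min(
            matrix[self._bucket(src, i)][self._bucket(dst, i)]
            for i, matrix in enumerate(self.matrices)
        )


# The dense subgraph from the caption: every source connects to every destination.
sketch = HigherOrderSketch()
for s in ("s1", "s2"):
    for d in ("d1", "d2", "d3"):
        sketch.insert(s, d)
```

Because all edges from one source land in the same row of a given matrix, the six edges above occupy at most two rows and three columns of each matrix, so the dense subgraph shows up as a dense submatrix.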
AnoEdge-G and AnoEdge-L detect edge anomalies by checking whether the received edge, when mapped to a sketch matrix element, is part of a dense submatrix. AnoEdge-G finds a Global dense submatrix and performs well in practice, while AnoEdge-L maintains and updates a Local dense submatrix around the matrix element and therefore has better time complexity.
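The idea of scoring an edge by the dense submatrix around its matrix element can be sketched as follows. This is a simplified illustration, not the repository's AnoEdge-G or AnoEdge-L code: the function names `density` and `edge_density_score`, the density measure (submatrix sum divided by the square root of rows times columns), and the rule of adding whichever row or column gives the higher resulting density are all assumptions made for the example.

```python
import math


def density(total, n_rows, n_cols):
    # Assumed density measure: submatrix mass over sqrt(#rows * #cols).
    return total / math.sqrt(n_rows * n_cols)


def edge_density_score(matrix, row, col):
    """Grow a submatrix greedily around (row, col); return the best density seen."""
    rows, cols = {row}, {col}
    rem_rows = set(range(len(matrix))) - rows
    rem_cols = set(range(len(matrix[0]))) - cols
    total = matrix[row][col]
    best = density(total, 1, 1)
    while rem_rows or rem_cols:
        candidates = []
        if rem_rows:
            # Remaining row that adds the most mass to the current columns.
            r = max(rem_rows, key=lambda i: sum(matrix[i][j] for j in cols))
            gain = sum(matrix[r][j] for j in cols)
            candidates.append(
                (density(total + gain, len(rows) + 1, len(cols)), gain, "row", r)
            )
        if rem_cols:
            # Remaining column that adds the most mass to the current rows.
            c = max(rem_cols, key=lambda j: sum(matrix[i][j] for i in rows))
            gain = sum(matrix[i][c] for i in rows)
            candidates.append(
                (density(total + gain, len(rows), len(cols) + 1), gain, "col", c)
            )
        new_density, gain, kind, index = max(candidates, key=lambda t: t[0])
        if kind == "row":
            rows.add(index)
            rem_rows.remove(index)
        else:
            cols.add(index)
            rem_cols.remove(index)
        total += gain
        best = max(best, new_density)
    return best


# Toy sketch matrix: a dense 2x3 block plus one isolated count.
toy = [
    [5, 5, 5, 0],
    [5, 5, 5, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
score_in_block = edge_density_score(toy, 0, 0)
score_isolated = edge_density_score(toy, 2, 3)
```

On this toy matrix, an edge that maps into the dense block receives a higher score than the edge that maps to the isolated cell.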
AnoGraph and AnoGraph-K detect graph anomalies by first mapping the graph to a higher-order sketch and then checking for a dense submatrix. AnoGraph greedily finds a dense submatrix with a 2-approximation guarantee on the density measure. AnoGraph-K greedily finds a dense submatrix around K strategically picked matrix elements, performing equally well in practice.
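A greedy dense-submatrix search of this kind can be sketched as a peeling procedure: repeatedly drop the row or column with the smallest remaining sum and remember the densest submatrix seen along the way. The function name `greedy_dense_submatrix` and the density measure used here (submatrix sum divided by the square root of rows times columns) are assumptions for illustration. The repository's implementation and tie-breaking may differ, and this sketch makes no claim about the approximation guarantee.

```python
import math


def greedy_dense_submatrix(matrix):
    """Peel rows/columns greedily; return (density, rows, cols) of the best submatrix."""
    rows = set(range(len(matrix)))
    cols = set(range(len(matrix[0])))
    row_sum = {r: sum(matrix[r]) for r in rows}
    col_sum = {c: sum(matrix[r][c] for r in rows) for c in cols}
    total = sum(row_sum.values())
    best = (0.0, set(), set())
    while rows and cols:
        current = total / math.sqrt(len(rows) * len(cols))
        if current > best[0]:
            best = (current, set(rows), set(cols))
        # Candidate row and column with the smallest remaining sums.
        r = min(rows, key=row_sum.get)
        c = min(cols, key=col_sum.get)
        if row_sum[r] <= col_sum[c]:
            rows.remove(r)
            total -= row_sum[r]
            for j in cols:
                col_sum[j] -= matrix[r][j]
        else:
            cols.remove(c)
            total -= col_sum[c]
            for i in rows:
                row_sum[i] -= matrix[i][c]
    return best


# Toy sketch matrix: a dense 2x3 block plus one isolated count.
toy = [
    [5, 5, 5, 0],
    [5, 5, 5, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
best_density, best_rows, best_cols = greedy_dense_submatrix(toy)
```

On the toy matrix the search recovers the dense block (rows 0 and 1, columns 0 to 2). A graph-level anomaly score could then be taken as the returned density, under the same assumptions.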
- To run on the DARPA dataset
bash demo.sh DARPA
- To run on the ISCX dataset
bash demo.sh ISCX
The CIC-IDS2018 and CIC-DDoS2019 datasets are larger than 100 MB and cannot be uploaded to GitHub. They can be downloaded from here. Please unzip them and place the respective folders in the data folder of the repository.
This code has been tested on OS X 10.15.3 with a 2.4GHz Intel Core i9 processor.
- Python: Victor Hoffmann's Python
If you use this code for your research, please consider citing our KDD paper.
@inproceedings{bhatia2023anograph,
title={Sketch-Based Anomaly Detection in Streaming Graphs},
author={Siddharth Bhatia and Mohit Wadhwa and Kenji Kawaguchi and Neil Shah and Philip S. Yu and Bryan Hooi},
booktitle={SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)},
year={2023}
}