PyGOD is a Python library for graph outlier detection (anomaly detection). This exciting yet challenging field has many key applications, e.g., detecting suspicious activities in social networks [DLS+20] and security systems [CCL+21].

PyGOD includes more than 10 recent graph-based detection algorithms, such as DOMINANT (SDM’19) and GUIDE (BigData’21). For consistency and accessibility, PyGOD is built on top of PyTorch Geometric (PyG) and PyTorch, and follows the API design of PyOD. The example below detects anomalies with PyGOD in 5 lines.

PyGOD features:

Unified APIs, detailed documentation, and interactive examples across various graph-based algorithms.
Comprehensive coverage of more than 10 recent graph outlier detectors.
Support for detection at multiple levels: node-level, edge-level (WIP), and graph-level (WIP) tasks.
Scalable design for processing large graphs via mini-batching and sampling.
Streamlined data processing with PyG: fully compatible with PyG data objects.

Outlier Detection Using PyGOD with 5 Lines of Code:

# train a DOMINANT detector
from pygod.models import DOMINANT

model = DOMINANT(num_layers=4, epoch=20)  # hyperparameters can be set here
model.fit(data)  # data is a PyTorch Geometric data object

# get raw outlier scores on the training data
outlier_scores = model.decision_scores_

# predict on new data in the inductive setting
outlier_scores = model.decision_function(test_data)  # raw outlier scores on the test data

Citing PyGOD:

The PyGOD paper is available on arXiv [LDZ+22]. If you use PyGOD in a scientific publication, we would appreciate citations to the following paper:

@article{pygod2022,
  author  = {Liu, Kay and Dou, Yingtong and Zhao, Yue and Ding, Xueying and Hu, Xiyang and Zhang, Ruitong and Ding, Kaize and Chen, Canyu and Peng, Hao and Shu, Kai and Chen, George H. and Jia, Zhihao and Yu, Philip S.},
  title   = {PyGOD: A Python Library for Graph Outlier Detection},
  journal = {arXiv preprint arXiv:2204.12095},
  year    = {2022},
}

or:

Liu, K., Dou, Y., Zhao, Y., Ding, X., Hu, X., Zhang, R., Ding, K., Chen, C., Peng, H., Shu, K., Chen, G.H., Jia, Z., and Yu, P.S. 2022. PyGOD: A Python Library for Graph Outlier Detection. arXiv preprint arXiv:2204.12095.

Implemented Algorithms

The PyGOD toolkit consists of two major functional groups:

(i) Node-level detection:

Type	Backbone	Abbr	Year	Sampling	Class
Unsupervised	NN	MLPAE	2014	Yes	`pygod.models.MLPAE`
Unsupervised	GNN	GCNAE	2016	Yes	`pygod.models.GCNAE`
Unsupervised	MF	ONE	2019	No	`pygod.models.ONE`
Unsupervised	GNN	DOMINANT	2019	Yes	`pygod.models.DOMINANT`
Unsupervised	GNN	DONE	2020	Yes	`pygod.models.DONE`
Unsupervised	GNN	AdONE	2020	Yes	`pygod.models.AdONE`
Unsupervised	GNN	AnomalyDAE	2020	Yes	`pygod.models.AnomalyDAE`
Unsupervised	GAN	GAAN	2020	Yes	`pygod.models.GAAN`
Unsupervised	GNN	OCGNN	2021	Yes	`pygod.models.OCGNN`
Unsupervised/SSL	GNN	CoLA (beta)	2021	In progress	`pygod.models.CoLA`
Unsupervised/SSL	GNN	ANEMONE (beta)	2021	In progress	`pygod.models.ANEMONE`
Unsupervised	GNN	GUIDE	2021	Yes	`pygod.models.GUIDE`
Unsupervised/SSL	GNN	CONAD	2022	Yes	`pygod.models.CONAD`

(ii) Utility functions:

Type	Name	Function	Documentation
Metric	eval_precision_at_k	Calculating Precision@k	eval_precision_at_k
Metric	eval_recall_at_k	Calculating Recall@k	eval_recall_at_k
Metric	eval_roc_auc	Calculating ROC-AUC Score	eval_roc_auc
Metric	eval_average_precision	Calculating average precision	eval_average_precision
Data	gen_structure_outliers	Generating structural outliers	gen_structure_outliers
Data	gen_attribute_outliers	Generating attribute outliers	gen_attribute_outliers

API CheatSheet

The following APIs are available for all detector models.

pygod.models.base.BaseDetector.fit(): Fit detector. y is ignored in unsupervised methods.
pygod.models.base.BaseDetector.decision_function(): Predict raw anomaly scores of a PyG graph G using the fitted detector.
pygod.models.base.BaseDetector.predict(): Predict if a particular sample is an outlier or not using the fitted detector.
pygod.models.base.BaseDetector.predict_proba(): Predict the probability of a sample being outlier using the fitted detector.
pygod.models.base.BaseDetector.predict_confidence(): Predict the model’s sample-wise confidence (available in predict and predict_proba).
pygod.models.base.BaseDetector.process_graph() (you do not need to call this explicitly): Process the raw PyG data object into a tuple of sub data objects needed for the underlying model.
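
As an illustration of how raw scores can be turned into outlier probabilities, the sketch below applies min-max scaling against the training scores and clips the result to [0, 1]. This is an assumption about one common conversion, not a statement of what `predict_proba()` does internally; the helper `scores_to_proba` is hypothetical.

```python
def scores_to_proba(train_scores, test_scores):
    """Map raw outlier scores to [0, 1] by min-max scaling on training scores.

    Assumption: a linear rescaling is used. Scores outside the training
    range are clipped so the result always lies in [0, 1].
    """
    lo, hi = min(train_scores), max(train_scores)
    span = hi - lo
    if span == 0:
        # All training scores are identical; no ranking information to scale.
        return [0.0 for _ in test_scores]
    return [min(1.0, max(0.0, (s - lo) / span)) for s in test_scores]


train_scores = [1.0, 2.0, 5.0]
test_scores = [0.5, 3.0, 6.0]
probs = scores_to_proba(train_scores, test_scores)
print(probs)  # [0.0, 0.5, 1.0]
```

Scaling against the training scores, rather than the test scores, keeps the probabilities comparable across different batches of new data.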

Key Attributes of a fitted model:

pygod.models.base.BaseDetector.decision_scores_: The outlier scores of the training data. Higher scores indicate more abnormal samples.
pygod.models.base.BaseDetector.labels_: The binary labels of the training data. 0 stands for inliers and 1 for outliers/anomalies.
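
The relationship between the two attributes can be sketched as a thresholding step. The example below assumes a PyOD-style contamination rate (the expected fraction of outliers) that sets the threshold; this rule and the helper `labels_from_scores` are assumptions made for illustration, not taken from the PyGOD source.

```python
def labels_from_scores(scores, contamination=0.1):
    """Derive binary labels (1 = outlier) from raw outlier scores.

    Assumption: the threshold is the score of the n-th most anomalous
    sample, where n = contamination * number of samples (at least 1).
    Tied scores at the threshold are all labeled as outliers.
    """
    if not 0 < contamination <= 0.5:
        raise ValueError("contamination must be in (0, 0.5]")
    n_outliers = max(1, round(contamination * len(scores)))
    threshold = sorted(scores, reverse=True)[n_outliers - 1]
    labels = [1 if s >= threshold else 0 for s in scores]
    return labels, threshold


# Toy training scores for ten nodes; two of them stand out clearly.
scores = [0.2, 0.1, 3.5, 0.3, 0.25, 0.15, 0.22, 0.18, 0.27, 2.9]
labels, threshold = labels_from_scores(scores, contamination=0.2)
print(labels)     # [0, 0, 1, 0, 0, 0, 0, 0, 0, 1]
print(threshold)  # 2.9
```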

Input of PyGOD: Please pass in a PyTorch Geometric (PyG) data object. See PyG data processing examples.