PyGOD is a Python library for graph outlier detection (anomaly detection). This exciting yet challenging field has many key applications, e.g., detecting suspicious activities in social networks [DLS+20] and security systems [CCL+21].
PyGOD includes more than 10 recent graph-based detection algorithms, such as Dominant (SDM’19) and GUIDE (BigData’21). For consistency and accessibility, PyGOD is built on top of PyTorch Geometric (PyG) and PyTorch, and follows the API design of PyOD. See the example below for detecting anomalies with PyGOD in 5 lines!
Key features of PyGOD:
Unified APIs, detailed documentation, and interactive examples across various graph-based algorithms.
Comprehensive coverage of more than 10 recent graph outlier detectors.
Full support for detection at multiple levels, such as node-, edge- (WIP), and graph-level tasks (WIP).
Scalable design for processing large graphs via mini-batch and sampling.
Streamlined data processing with PyG: fully compatible with PyG data objects.
Outlier Detection Using PyGOD with 5 Lines of Code:
# train a dominant detector
from pygod.models import DOMINANT
model = DOMINANT(num_layers=4, epoch=20) # hyperparameters can be set here
model.fit(data)  # data is a PyTorch Geometric data object
# get outlier scores on the input data
outlier_scores = model.decision_scores_  # raw outlier scores on the input data
# predict on the new data in the inductive setting
outlier_scores = model.decision_function(test_data)  # raw outlier scores on the test data
Citing PyGOD:
Our PyGOD benchmark paper is available on arXiv [LDZ+22]. If you use PyGOD in a scientific publication, we would appreciate citations to the following paper:
@article{liu2022benchmarking,
author = {Liu, Kay and Dou, Yingtong and Zhao, Yue and Ding, Xueying and Hu, Xiyang and Zhang, Ruitong and Ding, Kaize and Chen, Canyu and Peng, Hao and Shu, Kai and Sun, Lichao and Li, Jundong and Chen, George H. and Jia, Zhihao and Yu, Philip S.},
title = {Benchmarking Node Outlier Detection on Graphs},
journal = {arXiv preprint arXiv:2206.10071},
year = {2022},
}
or:
Liu, K., Dou, Y., Zhao, Y., Ding, X., Hu, X., Zhang, R., Ding, K., Chen, C., Peng, H., Shu, K., Sun, L., Li, J., Chen, G.H., Jia, Z., and Yu, P.S. 2022. Benchmarking Node Outlier Detection on Graphs. arXiv preprint arXiv:2206.10071.
Implemented Algorithms
The PyGOD toolkit consists of two major functional groups:
(i) Node-level detection:
Type              Backbone    Abbr            Year  Sampling
----------------  ----------  --------------  ----  -----------
Unsupervised      MLP         MLPAE           2014  Yes
Unsupervised      Clustering  SCAN            2007  No
Unsupervised      GNN         GCNAE           2016  Yes
Unsupervised      MF          Radar           2017  No
Unsupervised      MF          ANOMALOUS       2018  No
Unsupervised      MF          ONE             2019  No
Unsupervised      GNN         DOMINANT        2019  Yes
Unsupervised      MLP         DONE            2020  Yes
Unsupervised      MLP         AdONE           2020  Yes
Unsupervised      GNN         AnomalyDAE      2020  Yes
Unsupervised      GAN         GAAN            2020  Yes
Unsupervised      GNN         OCGNN           2021  Yes
Unsupervised/SSL  GNN         CoLA (beta)     2021  In progress
Unsupervised/SSL  GNN         ANEMONE (beta)  2021  In progress
Unsupervised      GNN         GUIDE           2021  Yes
Unsupervised/SSL  GNN         CONAD           2022  Yes
(ii) Utility functions:
Type       Name                     Function
---------  -----------------------  ------------------------------------------
Metric     eval_precision_at_k      Calculating Precision@k
Metric     eval_recall_at_k         Calculating Recall@k
Metric     eval_roc_auc             Calculating ROC-AUC score
Metric     eval_average_precision   Calculating average precision
Metric     eval_ndcg                Calculating NDCG
Generator  gen_structural_outliers  Generating structural outliers
Generator  gen_contextual_outliers  Generating contextual (attribute) outliers
Loader     load_data                Loading PyGOD built-in datasets
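For intuition about the ranking metrics listed above, here is a minimal pure-Python sketch of Precision@k and Recall@k. It is not PyGOD's implementation: the function names `precision_at_k` and `recall_at_k`, the tie-breaking rule (by node index), and the toy labels and scores are all assumptions made for illustration.

```python
def top_k_indices(scores, k):
    """Return the indices of the k highest scores (ties broken by index)."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return order[:k]


def precision_at_k(labels, scores, k):
    """Fraction of the k top-scored nodes that are true outliers."""
    top = top_k_indices(scores, k)
    return sum(labels[i] for i in top) / k


def recall_at_k(labels, scores, k):
    """Fraction of all true outliers found among the k top-scored nodes."""
    top = top_k_indices(scores, k)
    return sum(labels[i] for i in top) / sum(labels)


# Toy ground truth (1 = outlier) and raw outlier scores for six nodes.
labels = [0, 1, 0, 0, 1, 0]
scores = [0.1, 0.9, 0.8, 0.2, 0.3, 0.05]

# The two top-scored nodes are 1 (a true outlier) and 2 (an inlier).
p_at_2 = precision_at_k(labels, scores, k=2)
r_at_2 = recall_at_k(labels, scores, k=2)
```

Both metrics look only at the ranking induced by the scores, so they can be computed directly from raw outlier scores without choosing a decision threshold.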
API Cheat Sheet
The following APIs are available for all detector models:
pygod.models.base.BaseDetector.fit(): Fit the detector. y is ignored in unsupervised methods.
pygod.models.base.BaseDetector.decision_function(): Predict raw anomaly scores of PyG graph G using the fitted detector.
pygod.models.base.BaseDetector.predict(): Predict whether a particular sample is an outlier using the fitted detector.
pygod.models.base.BaseDetector.predict_proba(): Predict the probability of a sample being an outlier using the fitted detector.
pygod.models.base.BaseDetector.predict_confidence(): Predict the model’s sample-wise confidence (available in predict and predict_proba).
pygod.models.base.BaseDetector.process_graph() (you do not need to call this explicitly): Process the raw PyG data object into a tuple of sub data objects needed for the underlying model.
Key Attributes of a fitted model:
pygod.models.base.BaseDetector.decision_scores_: The outlier scores of the training data. The higher the score, the more abnormal the sample; outliers tend to have higher scores.
pygod.models.base.BaseDetector.labels_: The binary labels of the training data. 0 stands for inliers and 1 for outliers/anomalies.
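To make the contract above concrete, here is a toy detector in plain Python that mimics the same method and attribute names. It is only a sketch: `ToyDegreeDetector`, its degree-based score, and the `contamination` thresholding rule (borrowed loosely from the PyOD convention) are assumptions for illustration, not PyGOD code, and a real detector takes a PyG data object rather than an adjacency dict.

```python
class ToyDegreeDetector:
    """Hypothetical detector: scores a node by how far its degree
    deviates from the mean degree seen during fit()."""

    def __init__(self, contamination=0.1):
        # Assumed fraction of outliers in the training graph.
        self.contamination = contamination

    def _degrees(self, adj):
        # Node degrees, listed in sorted node order.
        return [len(adj[node]) for node in sorted(adj)]

    def fit(self, adj):
        degrees = self._degrees(adj)
        self.mean_degree_ = sum(degrees) / len(degrees)
        # Raw scores on the training graph: higher means more abnormal.
        self.decision_scores_ = self.decision_function(adj)
        # Threshold at the n-th highest training score; tied scores
        # may push the number of flagged nodes above n_outliers.
        n_outliers = max(1, round(self.contamination * len(degrees)))
        ranked = sorted(self.decision_scores_, reverse=True)
        self.threshold_ = ranked[n_outliers - 1]
        self.labels_ = [int(s >= self.threshold_) for s in self.decision_scores_]
        return self

    def decision_function(self, adj):
        # Raw outlier scores for any graph, using the fitted mean degree.
        return [abs(d - self.mean_degree_) for d in self._degrees(adj)]

    def predict(self, adj):
        # Binary labels: 1 for outliers, 0 for inliers.
        return [int(s >= self.threshold_) for s in self.decision_function(adj)]


# Toy graph: node 0 is a hub connected to every other node.
adj = {0: [1, 2, 3, 4, 5], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [0, 3], 5: [0]}
model = ToyDegreeDetector(contamination=0.2).fit(adj)
train_labels = model.labels_
```

The point of the sketch is the shape of the interface: `fit()` populates `decision_scores_` and `labels_` for the training data, while `decision_function()` and `predict()` can be called again on new data.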
Input of PyGOD: Please pass in a PyTorch Geometric (PyG) data object. See PyG data processing examples.
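As background on that input format: PyG stores graph connectivity as an `edge_index` of shape `[2, num_edges]` in COO format, with each undirected edge listed once per direction. The plain-Python sketch below builds that layout from an undirected edge list. The helper name `to_coo` and the toy graph are made up for illustration, and the final wrapping step appears only as a comment because it requires PyTorch and PyG to be installed.

```python
def to_coo(undirected_edges):
    """Turn undirected (u, v) pairs into two parallel lists
    [sources, targets], listing every edge in both directions."""
    sources, targets = [], []
    for u, v in undirected_edges:
        sources += [u, v]
        targets += [v, u]
    return [sources, targets]


# A triangle 0-1-2 with a pendant node 3 attached to node 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
edge_index = to_coo(edges)

# One feature vector per node (four nodes, two features each).
x = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.2], [0.0, 5.0]]

# With PyTorch and PyG installed, these lists would typically be wrapped as:
#   import torch
#   from torch_geometric.data import Data
#   data = Data(x=torch.tensor(x, dtype=torch.float),
#               edge_index=torch.tensor(edge_index, dtype=torch.long))
```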