PyGOD is a Python library for graph outlier detection (anomaly detection). This exciting yet challenging field has many key applications, e.g., detecting suspicious activities in social networks [DLS+20] and security systems [CCL+21].

PyGOD includes more than 10 recent graph-based detection algorithms, such as DOMINANT (SDM’19) and GUIDE (BigData’21). For consistency and accessibility, PyGOD is built on top of PyTorch Geometric (PyG) and PyTorch, and follows the API design of PyOD. The example below detects anomalies with PyGOD in 5 lines.

PyGOD features:

  • Unified APIs, detailed documentation, and interactive examples across various graph-based algorithms.

  • Comprehensive coverage of more than 10 recent graph outlier detectors.

  • Support for detection at multiple levels: node-level tasks, with edge- and graph-level tasks in progress (WIP).

  • Scalable design for processing large graphs via mini-batch and sampling.

  • Streamlined data processing with PyG: fully compatible with PyG data objects.

Outlier Detection Using PyGOD with 5 Lines of Code:

# train a DOMINANT detector
from pygod.models import DOMINANT

model = DOMINANT(num_layers=4, epoch=20)  # hyperparameters can be set here
model.fit(data)  # data is a PyTorch Geometric data object

# get raw outlier scores on the input (training) data
outlier_scores = model.decision_scores_

# predict raw outlier scores on new data in the inductive setting
outlier_scores = model.decision_function(test_data)
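
The raw scores above are continuous, with higher values indicating more anomalous nodes. As a minimal sketch of how such scores could be turned into binary labels, the snippet below thresholds them so that an assumed fraction of nodes is flagged. The helper name `scores_to_labels` and its `contamination` parameter are hypothetical and chosen for illustration in the spirit of PyOD-style detectors; this is not PyGOD's implementation, and plain Python lists stand in for tensors.

```python
def scores_to_labels(scores, contamination=0.1):
    """Flag roughly the top `contamination` fraction of scores as outliers.

    Hypothetical helper for illustration only; not part of PyGOD.
    Returns (labels, threshold), where labels[i] is 1 for an outlier.
    """
    if not 0.0 < contamination <= 0.5:
        raise ValueError("contamination must be in (0, 0.5]")
    # Number of nodes to flag, at least one.
    n_outliers = max(1, round(contamination * len(scores)))
    # The threshold is the n_outliers-th largest score.
    threshold = sorted(scores, reverse=True)[n_outliers - 1]
    # Scores tied with the threshold are also flagged.
    labels = [1 if s >= threshold else 0 for s in scores]
    return labels, threshold


# Made-up scores for ten nodes; only the third one stands out.
example_scores = [0.12, 0.10, 0.95, 0.11, 0.09, 0.13, 0.08, 0.10, 0.14, 0.07]
labels, threshold = scores_to_labels(example_scores, contamination=0.1)
print(labels, threshold)
```

A quantile-based threshold like this only needs an estimate of the outlier rate, which is why it is a common default when no labels are available.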

Citing PyGOD:

Our PyGOD benchmark paper is available on arXiv [LDZ+22]. If you use PyGOD in a scientific publication, we would appreciate citations to the following paper:

@article{liu2022benchmarking,
  author  = {Liu, Kay and Dou, Yingtong and Zhao, Yue and Ding, Xueying and Hu, Xiyang and Zhang, Ruitong and Ding, Kaize and Chen, Canyu and Peng, Hao and Shu, Kai and Sun, Lichao and Li, Jundong and Chen, George H. and Jia, Zhihao and Yu, Philip S.},
  title   = {Benchmarking Node Outlier Detection on Graphs},
  journal = {arXiv preprint arXiv:2206.10071},
  year    = {2022},
}

or:

Liu, K., Dou, Y., Zhao, Y., Ding, X., Hu, X., Zhang, R., Ding, K., Chen, C., Peng, H., Shu, K., Sun, L., Li, J., Chen, G.H., Jia, Z., and Yu, P.S. 2022. Benchmarking Node Outlier Detection on Graphs. arXiv preprint arXiv:2206.10071.

Implemented Algorithms

The PyGOD toolkit consists of two major functional groups:

(i) Node-level detection:

Type              Backbone    Abbr            Year  Sampling     Class
----------------  ----------  --------------  ----  -----------  -----------------------
Unsupervised      MLP         MLPAE           2014  Yes          pygod.models.MLPAE
Unsupervised      Clustering  SCAN            2007  No           pygod.models.SCAN
Unsupervised      GNN         GCNAE           2016  Yes          pygod.models.GCNAE
Unsupervised      MF          Radar           2017  No           pygod.models.Radar
Unsupervised      MF          ANOMALOUS       2018  No           pygod.models.ANOMALOUS
Unsupervised      MF          ONE             2019  No           pygod.models.ONE
Unsupervised      GNN         DOMINANT        2019  Yes          pygod.models.DOMINANT
Unsupervised      MLP         DONE            2020  Yes          pygod.models.DONE
Unsupervised      MLP         AdONE           2020  Yes          pygod.models.AdONE
Unsupervised      GNN         AnomalyDAE      2020  Yes          pygod.models.AnomalyDAE
Unsupervised      GAN         GAAN            2020  Yes          pygod.models.GAAN
Unsupervised      GNN         OCGNN           2021  Yes          pygod.models.OCGNN
Unsupervised/SSL  GNN         CoLA (beta)     2021  In progress  pygod.models.CoLA
Unsupervised/SSL  GNN         ANEMONE (beta)  2021  In progress  pygod.models.ANEMONE
Unsupervised      GNN         GUIDE           2021  Yes          pygod.models.GUIDE
Unsupervised/SSL  GNN         CONAD           2022  Yes          pygod.models.CONAD

(ii) Utility functions:

Type       Name                     Function                         Documentation
---------  -----------------------  -------------------------------  -----------------------
Metric     eval_precision_at_k      Calculating Precision@k          eval_precision_at_k
Metric     eval_recall_at_k         Calculating Recall@k             eval_recall_at_k
Metric     eval_roc_auc             Calculating ROC-AUC Score        eval_roc_auc
Metric     eval_average_precision   Calculating average precision    eval_average_precision
Metric     eval_ndcg                Calculating NDCG                 eval_ndcg
Generator  gen_structural_outliers  Generating structural outliers   gen_structural_outliers
Generator  gen_contextual_outliers  Generating attribute outliers    gen_contextual_outliers
Loader     load_data                Loading PyGOD built-in datasets  load_data
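
To make the ranking metrics listed above concrete, here is a minimal sketch of the standard definitions of Precision@k and Recall@k in plain Python. The function names `precision_at_k` and `recall_at_k` are hypothetical stand-ins; the signatures and return types of PyGOD's own `eval_precision_at_k` and `eval_recall_at_k` may differ, so treat this as an illustration of what is being measured rather than the library's code.

```python
def top_k_indices(scores, k):
    """Indices of the k highest scores, highest first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def precision_at_k(labels, scores, k):
    """Fraction of the k highest-scoring nodes that are true outliers."""
    hits = sum(labels[i] for i in top_k_indices(scores, k))
    return hits / k


def recall_at_k(labels, scores, k):
    """Fraction of all true outliers found among the k highest-scoring nodes."""
    hits = sum(labels[i] for i in top_k_indices(scores, k))
    return hits / sum(labels)


# Made-up ground truth (1 = outlier) and detector scores for eight nodes.
labels = [0, 1, 0, 0, 1, 0, 1, 0]
scores = [0.1, 0.9, 0.8, 0.2, 0.7, 0.3, 0.4, 0.05]

# The two highest scores belong to node 1 (an outlier) and node 2 (normal).
p = precision_at_k(labels, scores, k=2)
r = recall_at_k(labels, scores, k=2)
print(p, r)
```

Both metrics depend only on how the scores rank the nodes, so they can be computed directly from raw outlier scores without choosing a threshold.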


API CheatSheet

The following APIs are available for all detector models.

Key Attributes of a fitted model:

Input of PyGOD: Please pass in a PyTorch Geometric (PyG) data object. See PyG data processing examples.