PyGOD is a Python library for graph outlier detection (anomaly detection). This exciting yet challenging field has many key applications, e.g., detecting suspicious activities in social networks [DLS+20] and security systems [CCL+21].
PyGOD includes more than 10 recent graph-based detection algorithms, such as DOMINANT (SDM’19) and GUIDE (BigData’21). For consistency and accessibility, PyGOD is built on top of PyTorch Geometric (PyG) and PyTorch, and follows the API design of PyOD. See the example below for detecting anomalies with PyGOD in 5 lines!
PyGOD is under active development and will be updated frequently! Please star, watch, and fork.
PyGOD is featured for:
Unified APIs, detailed documentation, and interactive examples across various graph-based algorithms.
Comprehensive coverage of more than 10 recent graph outlier detectors.
Full support of detections at multiple levels, such as node-, edge-, and graph-level tasks (WIP).
Streamlined data processing with PyG: fully compatible with PyG data objects.
Outlier Detection Using PyGOD with 5 Lines of Code:
# train a DOMINANT detector
from pygod.models import DOMINANT
model = DOMINANT()  # hyperparameters can be set here
model.fit(data)  # data is a PyTorch Geometric data object
# get outlier scores on the input data
outlier_scores = model.decision_scores_  # raw outlier scores on the input data
# predict on the new data
outlier_scores = model.decision_function(test_data)  # raw outlier scores on the test data
Citing PyGOD (to be announced soon):
The PyGOD paper will be available on arXiv soon. If you use PyGOD in a scientific publication, we would appreciate citations to the following paper (to be announced):
@article{tba,
author = {tba},
title = {PyGOD: A Comprehensive Python Library for Graph Outlier Detection},
journal = {tba},
year = {2022},
}
or:
tba, 2022. PyGOD: A Comprehensive Python Library for Graph Outlier Detection. tba.
Implemented Algorithms
PyGOD toolkit consists of two major functional groups:
(i) Node-level detection:
| Type | Backbone | Abbr | Algorithm | Year | Class |
|---|---|---|---|---|---|
| Unsupervised | GNN | DOMINANT | Deep anomaly detection on attributed networks | 2019 | |
| Unsupervised | GNN | AnomalyDAE | AnomalyDAE: Dual autoencoder for anomaly detection on attributed networks | 2020 | |
| Unsupervised | GNN | DONE | Outlier Resistant Unsupervised Deep Architectures for Attributed Network Embedding | 2020 | |
| Unsupervised | GNN | AdONE | Outlier Resistant Unsupervised Deep Architectures for Attributed Network Embedding | 2020 | |
| Unsupervised | GNN | GCNAE | Variational Graph Auto-Encoders | 2021 | |
| Unsupervised | NN | MLPAE | Neural Networks and Deep Learning | 2021 | |
| Unsupervised | GNN | GUIDE | Higher-order Structure Based Anomaly Detection on Attributed Networks | 2021 | |
| Unsupervised | GNN | OCGNN | One-Class Graph Neural Networks for Anomaly Detection in Attributed Networks | 2021 | |
| Unsupervised | MF | ONE | Outlier aware network embedding for attributed networks | 2019 | |
| Unsupervised | GAN | GAAN | Generative Adversarial Attributed Network Anomaly Detection | 2020 | |
(ii) Utility functions:
| Type | Name | Function | Documentation |
|---|---|---|---|
| Metric | eval_precision_at_k | Calculating Precision@k | |
| Metric | eval_recall_at_k | Calculating Recall@k | |
| Metric | eval_roc_auc | Calculating ROC-AUC Score | |
| Data | gen_structure_outliers | Generating structural outliers | |
| Data | gen_attribute_outliers | Generating attribute outliers | |
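The three metrics in the table are standard ranking measures. As a rough, library-independent illustration, the pure-Python sketch below computes them from ground-truth labels (1 for outliers) and outlier scores. It is not PyGOD's implementation: the function names `precision_at_k`, `recall_at_k`, and `roc_auc` are hypothetical and chosen only for this example, and the sample data is made up.

```python
def top_k_indices(scores, k):
    """Indices of the k highest scores (ties keep their original order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return order[:k]


def precision_at_k(labels, scores, k):
    """Fraction of the k top-scored samples that are true outliers."""
    top = top_k_indices(scores, k)
    return sum(labels[i] for i in top) / k


def recall_at_k(labels, scores, k):
    """Fraction of all true outliers found among the k top-scored samples."""
    top = top_k_indices(scores, k)
    return sum(labels[i] for i in top) / sum(labels)


def roc_auc(labels, scores):
    """Chance that a random outlier outscores a random inlier (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: two outliers (label 1) among six nodes.
labels = [0, 0, 1, 0, 1, 0]
scores = [0.1, 0.4, 0.9, 0.2, 0.3, 0.8]

print(precision_at_k(labels, scores, 2))  # 0.5
print(recall_at_k(labels, scores, 2))     # 0.5
print(roc_auc(labels, scores))            # 0.75
```

The pairwise ROC-AUC above is quadratic in the number of samples, which is fine for a small illustration but not for large graphs.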
API CheatSheet
The following APIs are available on all detector models.
pygod.models.base.BaseDetector.fit()
: Fit detector. y is ignored in unsupervised methods.
pygod.models.base.BaseDetector.decision_function()
: Predict raw anomaly scores of PyG Graph G using the fitted detector.
pygod.models.base.BaseDetector.predict()
: Predict if a particular sample is an outlier or not using the fitted detector.
pygod.models.base.BaseDetector.predict_proba()
: Predict the probability of a sample being outlier using the fitted detector.
pygod.models.base.BaseDetector.predict_confidence()
: Predict the model’s sample-wise confidence (available in predict and predict_proba).
pygod.models.base.BaseDetector.process_graph()
(you do not need to call this explicitly): Process the raw PyG data object into a tuple of sub data objects needed for the underlying model.
Key Attributes of a fitted model:
pygod.models.base.BaseDetector.decision_scores_
: The outlier scores of the training data. The higher, the more abnormal. Outliers tend to have higher scores.
pygod.models.base.BaseDetector.labels_
: The binary labels of the training data. 0 stands for inliers and 1 for outliers/anomalies.
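To make the relationship between these methods and attributes concrete, here is a toy, pure-Python stand-in that mimics the interface described above. It is not PyGOD code. The class `ToyDetector`, its `contamination` parameter (borrowed from the PyOD convention), the `threshold_` attribute, and the distance-from-mean scoring rule are all assumptions made for illustration, and it works on a plain list of numbers rather than a PyG graph.

```python
class ToyDetector:
    """Hypothetical detector with a PyOD/PyGOD-style interface (illustration only)."""

    def __init__(self, contamination=0.1):
        # Assumed fraction of outliers in the training data.
        self.contamination = contamination

    def fit(self, values):
        self.mean_ = sum(values) / len(values)
        # Raw scores on the training data: higher means more abnormal.
        self.decision_scores_ = self.decision_function(values)
        # Pick the threshold so that about `contamination` of the
        # training samples are flagged as outliers.
        n_outliers = max(1, round(self.contamination * len(values)))
        ranked = sorted(self.decision_scores_, reverse=True)
        self.threshold_ = ranked[n_outliers - 1]
        # Binary training labels: 0 for inliers, 1 for outliers.
        self.labels_ = [int(s >= self.threshold_) for s in self.decision_scores_]
        return self

    def decision_function(self, values):
        # Toy scoring rule: absolute distance from the training mean.
        return [abs(v - self.mean_) for v in values]

    def predict(self, values):
        # Apply the threshold learned in fit() to scores of new samples.
        return [int(s >= self.threshold_) for s in self.decision_function(values)]


model = ToyDetector(contamination=0.2).fit([1.0, 1.1, 0.9, 1.0, 5.0])
print(model.labels_)                # [0, 0, 0, 0, 1]
print(model.predict([1.05, 6.0]))   # [0, 1]
```

The point of the sketch is the division of labor: `fit()` populates `decision_scores_` and `labels_` for the training data, while `decision_function()` and `predict()` score and label data passed in afterwards.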
Input of PyGOD: Please pass in a PyTorch Geometric (PyG) data object. See PyG data processing examples.
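A PyG data object stores node features in `x` (one row per node) and connectivity in `edge_index`, a two-row array of source and target node indices in which an undirected edge appears once per direction. The pure-Python sketch below builds these two pieces for a small, made-up graph; the helper name `to_edge_index` is invented for this example. With PyTorch and PyG installed, the results would typically be wrapped as `torch_geometric.data.Data(x=torch.tensor(x), edge_index=torch.tensor(edge_index))`.

```python
def to_edge_index(undirected_edges):
    """Convert (u, v) pairs to a [sources, targets] pair, adding both directions."""
    sources, targets = [], []
    for u, v in undirected_edges:
        sources += [u, v]
        targets += [v, u]
    return [sources, targets]


# Hypothetical 3-node path graph 0 - 1 - 2 with 2 features per node.
x = [[0.5, 1.0], [0.1, 0.3], [0.9, 0.2]]
edge_index = to_edge_index([(0, 1), (1, 2)])

print(edge_index)  # [[0, 1, 1, 2], [1, 0, 2, 1]]
```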