- class pygod.detector.CONAD(hid_dim=64, num_layers=4, dropout=0.0, weight_decay=0.0, act=<function relu>, sigmoid_s=False, backbone=<class 'torch_geometric.nn.models.basic_gnn.GCN'>, contamination=0.1, lr=0.004, epoch=100, gpu=-1, batch_size=0, num_neigh=-1, weight=0.5, eta=0.5, margin=0.5, r=0.2, m=50, k=50, f=10, verbose=0, save_emb=False, compile_model=False, **kwargs)
Contrastive Attributed Network Anomaly Detection
CONAD is an anomaly detector consisting of a shared graph convolutional encoder, a structure reconstruction decoder, and an attribute reconstruction decoder. The model is trained with both a contrastive loss and structure/attribute reconstruction losses. The mean squared reconstruction errors of the two decoders are used as the structure anomaly score and the attribute anomaly score, respectively.
See [XHZ+22] for details.
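As a rough illustration of how two reconstruction errors might be blended into one anomaly score, and how that score might be mixed with a contrastive term, consider the following pure-Python sketch. The function names and the weighting convention (`weight` on the attribute error, `eta` on the contrastive term) are assumptions made for illustration and are not taken from the PyGOD source.

```python
def mean_squared_error(original, reconstructed):
    """Mean squared error between two equal-length rows."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def node_anomaly_score(x_row, x_hat_row, adj_row, adj_hat_row, weight=0.5):
    """Blend attribute and structure reconstruction errors for one node.

    Assumed convention: `weight` scales the attribute error and
    (1 - weight) scales the structure error.
    """
    attr_score = mean_squared_error(x_row, x_hat_row)
    struct_score = mean_squared_error(adj_row, adj_hat_row)
    return weight * attr_score + (1 - weight) * struct_score


def total_loss(contrastive_loss, reconstruction_loss, eta=0.5):
    """Assumed convention: `eta` scales the contrastive term."""
    return eta * contrastive_loss + (1 - eta) * reconstruction_loss


# One node with two attributes and three possible neighbors.
score = node_anomaly_score(
    x_row=[1.0, 0.0], x_hat_row=[0.5, 0.0],
    adj_row=[0, 1, 1], adj_hat_row=[0.0, 1.0, 0.0],
)
```

Under this convention, a node whose attributes or adjacency row are poorly reconstructed receives a higher score, which matches the statement that outliers tend to have higher scores.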
hid_dim (int, optional) – Hidden dimension of model. Default: 64.
num_layers (int, optional) – Total number of layers in model. Default: 4.
dropout (float, optional) – Dropout rate. Default: 0.0.
weight_decay (float, optional) – Weight decay (L2 penalty). Default: 0.0.
act (callable activation function or None, optional) – Activation function if not None. Default: relu.
sigmoid_s (bool, optional) – Whether to use sigmoid function to scale the reconstructed structure. Default: False.
backbone (torch.nn.Module, optional) – The backbone of the deep detector implemented in PyG. Default: torch_geometric.nn.models.basic_gnn.GCN.
contamination (float, optional) – The amount of contamination of the dataset in (0., 0.5], i.e., the proportion of outliers in the dataset. Used when fitting to define the threshold on the decision function. Default: 0.1.
lr (float, optional) – Learning rate. Default: 0.004.
epoch (int, optional) – Maximum number of training epochs. Default: 100.
gpu (int) – GPU index, -1 for using CPU. Default: -1.
batch_size (int, optional) – Minibatch size, 0 for full batch training. Default: 0.
num_neigh (int, optional) – Number of neighbors in sampling, -1 for all neighbors. Default: -1.
weight (float, optional) – Weight between reconstruction of node feature and structure. Default: 0.5.
eta (float, optional) – Weight between contrastive and reconstruction losses. Default: 0.5.
margin (float, optional) – Margin in margin ranking loss. Default: 0.5.
r (float, optional) – The rate of augmented anomalies. Default: 0.2.
m (int, optional) – For densely connected nodes, the number of edges to add. Default: 50.
k (int, optional) – Same as
f (int, optional) – For disproportionate nodes, the scale factor applied on their attribute value. Default: 10.
verbose (int, optional) – Verbosity mode, in the range [0, 3]. Larger values print more log information. Default: 0.
save_emb (bool, optional) – Whether to save the embedding. Default: False.
compile_model (bool, optional) – Whether to compile the model with
**kwargs (optional) – Additional arguments for the backbone.
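The constructor parameters above, together with the fit and predict methods documented on this page, suggest a usage pattern along the following lines. This is a minimal sketch on a synthetic random graph, not an official example: the helper names `random_edge_index` and `run_conad_demo`, the graph sizes, and the reduced `epoch=20` are all assumptions chosen for illustration. The heavy dependencies (torch, torch_geometric, pygod) are imported inside the demo function so the edge-list helper works without them.

```python
import random


def random_edge_index(num_nodes, num_edges, seed=0):
    """Build a random edge list as [sources, targets] without self-loops.

    A ring over all nodes is added first so that every node index appears
    at least once; the result therefore has at least `num_nodes` edges.
    """
    rng = random.Random(seed)
    sources = list(range(num_nodes))
    targets = [(i + 1) % num_nodes for i in range(num_nodes)]
    while len(sources) < num_edges:
        u, v = rng.randrange(num_nodes), rng.randrange(num_nodes)
        if u != v:
            sources.append(u)
            targets.append(v)
    return [sources, targets]


def run_conad_demo(num_nodes=200, num_feats=16, num_edges=800):
    """Fit CONAD on a synthetic graph and return (pred, score).

    Requires torch, torch_geometric and pygod to be installed.
    """
    import torch
    from torch_geometric.data import Data
    from pygod.detector import CONAD

    # Random node features and a random edge list stand in for real data.
    x = torch.randn(num_nodes, num_feats)
    edge_index = torch.tensor(
        random_edge_index(num_nodes, num_edges), dtype=torch.long
    )
    data = Data(x=x, edge_index=edge_index)

    # gpu=-1 keeps training on the CPU; epoch is lowered to keep the demo short.
    detector = CONAD(hid_dim=64, num_layers=4, epoch=20, contamination=0.1, gpu=-1)
    detector.fit(data)

    # Request both the binary labels and the raw outlier scores.
    pred, score = detector.predict(data, return_pred=True, return_score=True)
    return pred, score
```

Calling `run_conad_demo()` in an environment with the three libraries installed should return two tensors with one entry per node; with random data the labels carry no meaning beyond exercising the API.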
The outlier scores of the training data. Outliers tend to have higher scores. This value is available once the detector is fitted.
The threshold is based on contamination. It is the \(N \times\) contamination most abnormal samples in decision_score_. The threshold is calculated for generating binary outlier labels.
The binary labels of the training data. 0 stands for inliers and 1 for outliers. It is generated by applying the threshold to decision_score_.
The learned node hidden embeddings of shape \(N \times\) hid_dim. Only available when save_emb is True. When the detector has not been fitted, emb is None. When the detector has multiple embeddings, emb is a tuple of torch.Tensor.
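The contamination-based threshold and the binary labels described above can be sketched in a few lines of pure Python. This is one simple way to realize the idea, assuming the threshold is the score of the k-th most abnormal sample with k = N × contamination (rounded, at least 1); the library itself may compute it differently, for example with a percentile. The function names are hypothetical.

```python
def threshold_from_contamination(scores, contamination=0.1):
    """Return the score of the k-th highest sample, k = round(N * contamination)."""
    k = max(1, round(len(scores) * contamination))
    return sorted(scores, reverse=True)[k - 1]


def labels_from_threshold(scores, threshold):
    """Label a sample 1 (outlier) if its score reaches the threshold, else 0."""
    return [1 if s >= threshold else 0 for s in scores]


scores = [0.1, 0.9, 0.2, 0.3, 0.15, 0.8, 0.05, 0.25, 0.12, 0.4]
threshold = threshold_from_contamination(scores, contamination=0.2)
labels = labels_from_threshold(scores, threshold)
```

With ten scores and a contamination of 0.2, the two highest-scoring samples are labeled as outliers. Tied scores at the threshold would all be labeled 1 under this sketch.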
- fit(data, label=None)
Fit detector with training data.
self – Fitted detector.
- Return type:
- predict(data=None, label=None, return_pred=True, return_score=False, return_prob=False, prob_method='linear', return_conf=False, return_emb=False)
Prediction for testing data using the fitted detector. Return predicted labels by default.
data (torch_geometric.data.Data, optional) – The testing graph. If None, the training data is used. Default: None.
label (torch.Tensor, optional) – The optional outlier ground truth labels used for testing. Default: None.
return_pred (bool, optional) – Whether to return the predicted binary labels. The labels are determined by the outlier contamination on the raw outlier scores. Default: True.
return_score (bool, optional) – Whether to return the raw outlier scores. Default: False.
return_prob (bool, optional) – Whether to return the outlier probabilities. Default: False.
prob_method (str, optional) –
The method to convert the outlier scores to probabilities. Two approaches are possible:
'linear': use min-max conversion to linearly transform the outlier scores into the range [0, 1]. The model must be fitted first.
'unify': use unifying scores, see [KKSZ11].
Default: 'linear'.
return_conf (bool, optional) – Whether to return the model’s confidence in making the same prediction under slightly different training sets. See [PVD20]. Default: False.
return_emb (bool, optional) – Whether to return the learned node representations. Default: False.
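The two prob_method options can be illustrated with a small pure-Python sketch. It assumes that 'linear' rescales with the minimum and maximum of the training scores (which would explain why the model must be fitted first) and that 'unify' applies Gaussian scaling with the error function in the spirit of [KKSZ11]. Both function names and the clipping details are assumptions, not the library's code.

```python
import math


def linear_probability(scores, train_scores):
    """Min-max scale scores by the training score range, clipped to [0, 1]."""
    lo, hi = min(train_scores), max(train_scores)
    if hi == lo:
        # Degenerate case: all training scores are equal.
        return [0.0 for _ in scores]
    return [min(1.0, max(0.0, (s - lo) / (hi - lo))) for s in scores]


def unify_probability(scores, train_scores):
    """Gaussian scaling: erf of the standardized score, clipped below at 0."""
    n = len(train_scores)
    mu = sum(train_scores) / n
    sigma = math.sqrt(sum((s - mu) ** 2 for s in train_scores) / n)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [max(0.0, math.erf((s - mu) / (sigma * math.sqrt(2)))) for s in scores]


train = [0.0, 1.0, 2.0, 3.0, 4.0]
linear = linear_probability([2.0, 5.0, -1.0], train)
unified = unify_probability([2.0, 4.0], train)
```

In this sketch, scores outside the training range are clipped by the linear method, while the unify method maps any score at or below the training mean to a probability of 0.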
pred (torch.Tensor) – The predicted binary outlier labels of shape \(N\). 0 stands for inliers and 1 for outliers. Only available when
score (torch.Tensor) – The raw outlier scores of shape \(N\). Only available when
prob (torch.Tensor) – The outlier probabilities of shape \(N\). Only available when
conf (torch.Tensor) – The prediction confidence of shape \(N\). Only available when