GADNRBase
- class pygod.nn.GADNRBase(in_dim, hid_dim=64, encoder_layers=1, deg_dec_layers=4, fea_dec_layers=3, sample_size=2, sample_time=3, neighbor_num_list=None, neigh_loss='KL', lambda_loss1=0.01, lambda_loss2=0.001, lambda_loss3=0.0001, full_batch=True, dropout=0.0, act=<function relu>, backbone=<class 'torch_geometric.nn.models.basic_gnn.GCN'>, device='cpu', **kwargs)
Bases: Module
Graph Anomaly Detection via Neighborhood Reconstruction
GAD-NR is a graph auto-encoder (GAE) variant for graph anomaly detection based on neighborhood reconstruction. From each node's representation, it aims to reconstruct the node's entire neighborhood: the local structure, the node's own attributes, and its neighbors' attributes.
See [RSL+24] for details.
- Parameters:
in_dim (int) – Input dimension of model.
hid_dim (int) – Hidden dimension of model. Default: 64.
encoder_layers (int, optional) – The number of layers for the graph encoder. Default: 1.
deg_dec_layers (int, optional) – The number of layers for the node degree decoder. Default: 4.
fea_dec_layers (int, optional) – The number of layers for the node feature decoder. Default: 3.
sample_size (int, optional) – The number of samples for the neighborhood distribution. Default: 2.
sample_time (int, optional) – The number of sampling rounds used to remove noise during node feature and neighborhood distribution reconstruction. Default: 3.
neighbor_num_list (torch.Tensor) – The node degree tensor used by the PNAConv model.
neigh_loss (str, optional) – The neighbor reconstruction loss. 'KL' represents the KL divergence loss, 'W2' represents the W2 loss. Default: 'KL'.
lambda_loss1 (float, optional) – The weight of the neighborhood reconstruction loss term. Default: 1e-2.
lambda_loss2 (float, optional) – The weight of the node feature reconstruction loss term. Default: 1e-3.
lambda_loss3 (float, optional) – The weight of the node degree reconstruction loss term. Default: 1e-4.
full_batch (bool, optional) – Whether to run in full-batch mode rather than mini-batch mode for training/inference. Default: True.
dropout (float, optional) – Dropout rate. Default: 0.
act (callable activation function or None, optional) – Activation function if not None. Default: torch.nn.functional.relu.
backbone (torch.nn.Module, optional) – The backbone of the deep detector implemented in PyG. Default: torch_geometric.nn.GCN.
device (string, optional) – The device used by the model. Default: cpu.
**kwargs (optional) – Additional arguments for the backbone.
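To make the two neigh_loss options more concrete, the sketch below compares a KL divergence and a 2-Wasserstein (W2) distance in the simplest setting where both have closed forms: two univariate Gaussians. That the neighborhood distribution can be summarized this way is an assumption made for illustration; the function names kl_gaussian and w2_gaussian are hypothetical and are not part of pygod, and the library's own loss may be computed differently.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) for P = N(mu_p, sigma_p^2) and Q = N(mu_q, sigma_q^2)."""
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
        - 0.5
    )


def w2_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """2-Wasserstein distance between the same two univariate Gaussians."""
    return math.sqrt((mu_p - mu_q) ** 2 + (sigma_p - sigma_q) ** 2)


# Illustrative values only: a "true" and a "reconstructed" distribution.
kl_value = kl_gaussian(0.0, 1.0, 1.0, 1.0)
w2_value = w2_gaussian(0.0, 1.0, 1.0, 1.0)
```

Both quantities are zero when the two distributions coincide. KL is asymmetric in its arguments, while W2 is a symmetric distance, which is one practical difference to keep in mind when choosing between the two options.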
- forward(x, edge_index, input_id=None, neighbor_dict=None, id_mapping=None)
Forward computation.
- Parameters:
x (torch.Tensor) – Input attribute embeddings.
edge_index (torch.Tensor) – Edge index.
input_id (List) – List of center node ids in the current batch. If input_id is not None, the input data is a sampled mini-batch. If input_id is None, the input data is a full batch. Default: None.
neighbor_dict (Dict) – Dictionary with the nodes in the current batch as keys and their neighbor lists as values. If neighbor_dict is not None, the input data is a sampled mini-batch. If neighbor_dict is None, the input data is a full batch. Default: None.
id_mapping (Dict) – Dictionary with the nodes in the current batch as keys and their feature matrix ids as values. If id_mapping is not None, the input data is a sampled mini-batch. If id_mapping is None, the input data is a full batch. Default: None.
- Returns:
h0 (torch.Tensor) – Node feature initial embeddings.
degree_logits (torch.Tensor) – Reconstructed node degree logits.
feat_recon_list (List[torch.Tensor]) – Reconstructed node features.
neigh_recon_list (List[torch.Tensor]) – Reconstructed neighbor distributions.
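The three optional arguments of forward all signal the same thing: None means full batch, anything else means a sampled mini-batch. The sketch below captures that convention in a small check. The helper batch_mode is hypothetical and not part of pygod, and the requirement that the three arguments be passed together is an assumption drawn from the parameter descriptions rather than documented behavior.

```python
def batch_mode(input_id=None, neighbor_dict=None, id_mapping=None):
    """Return 'full' or 'mini' following the None convention of forward().

    Hypothetical helper: it only checks that the three optional
    arguments are either all None or all provided.
    """
    provided = [arg is not None for arg in (input_id, neighbor_dict, id_mapping)]
    if all(provided):
        return "mini"
    if not any(provided):
        return "full"
    raise ValueError(
        "pass input_id, neighbor_dict and id_mapping together, or none of them"
    )


mode_full = batch_mode()
mode_mini = batch_mode(input_id=[0], neighbor_dict={0: [1]}, id_mapping={0: 0})
```

A check like this can catch a partially specified mini-batch call before it reaches the model.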
- loss_func(h0, degree_logits, feat_recon_list, neigh_recon_list, ground_truth_degree_matrix)
The loss function proposed in the GAD-NR paper.
- Parameters:
h0 (torch.Tensor) – Node feature initial embeddings.
degree_logits (torch.Tensor) – Reconstructed node degree logits.
feat_recon_list (List[torch.Tensor]) – Reconstructed node features.
neigh_recon_list (List[torch.Tensor]) – Reconstructed neighbor distributions.
ground_truth_degree_matrix (torch.Tensor) – The ground truth degree of the input nodes.
- Returns:
loss (torch.Tensor) – The total loss value used to backpropagate and update the model parameters.
loss_per_node (torch.Tensor) – The original loss value per node used to compute the decision score (outlier score) of the node.
h_loss_per_node (torch.Tensor) – The neighborhood reconstruction loss value per node used to compute the adaptive decision score (outlier score) of the node.
degree_loss_per_node (torch.Tensor) – The node degree reconstruction loss value per node used to compute the adaptive decision score (outlier score) of the node.
feature_loss_per_node (torch.Tensor) – The node feature reconstruction loss value per node used to compute the adaptive decision score (outlier score) of the node.
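The lambda_loss1, lambda_loss2, and lambda_loss3 weights documented for the class suggest that the three per-node terms are combined with those weights. A minimal sketch in plain Python, assuming a weighted sum per node and a mean over nodes for the scalar loss; both the combination and the reduction are assumptions, and combine_losses is a hypothetical helper, not part of pygod.

```python
def combine_losses(h_loss, feature_loss, degree_loss,
                   lambda_loss1=1e-2, lambda_loss2=1e-3, lambda_loss3=1e-4):
    """Combine per-node loss terms into (loss, loss_per_node).

    Each argument is a list with one float per node. The weights default
    to the values documented for GADNRBase.
    """
    loss_per_node = [
        lambda_loss1 * h + lambda_loss2 * f + lambda_loss3 * d
        for h, f, d in zip(h_loss, feature_loss, degree_loss)
    ]
    # Assumed reduction: average the per-node values into one scalar.
    loss = sum(loss_per_node) / len(loss_per_node)
    return loss, loss_per_node


# Illustrative values for two nodes.
loss, loss_per_node = combine_losses(
    h_loss=[1.0, 3.0],
    feature_loss=[10.0, 10.0],
    degree_loss=[100.0, 0.0],
)
```

With the default weights, terms on very different scales can contribute comparably: here a neighborhood loss of 1, a feature loss of 10, and a degree loss of 100 each add 0.01 to the first node's value.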
- static process_graph(data, input_id=None)
Preprocess the input graph and obtain the required data for future use.
- Parameters:
data (torch_geometric.data.Data) – Input graph.
input_id (List) – List of center node ids in the current batch. If input_id is not None, the input data is a sampled mini-batch. If input_id is None, the input data is a full batch. Default: None.
- Returns:
data (torch_geometric.data.Data) – Preprocessed input graph.
neighbor_dict (Dict) – Dictionary with the nodes in the input_id list as keys and their neighbor lists as values.
neighbor_num_list (torch.Tensor) – An n*1 tensor whose values are the degrees of the corresponding nodes in the input_id list.
id_mapping (Dict) – Dictionary with the nodes in the input_id list as keys and their feature matrix ids as values.
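The three auxiliary return values can be pictured with a small plain-Python sketch. It is not the pygod implementation: build_neighborhoods is a hypothetical helper, edges are given as a list of directed (source, target) pairs instead of a torch_geometric.data.Data object, and the feature matrix id of a node is assumed to be its position in the list of center nodes.

```python
def build_neighborhoods(edges, num_nodes, input_id=None):
    """Return (neighbor_dict, neighbor_num_list, id_mapping) for the center nodes.

    edges: iterable of (source, target) pairs, one entry per direction.
    input_id: center node ids, or None to use every node (full batch).
    """
    centers = list(range(num_nodes)) if input_id is None else list(input_id)

    # Collect the neighbor list of each center node.
    neighbor_dict = {node: [] for node in centers}
    for src, dst in edges:
        if src in neighbor_dict:
            neighbor_dict[src].append(dst)

    # n*1 layout: one single-element row per center node, holding its degree.
    neighbor_num_list = [[len(neighbor_dict[node])] for node in centers]

    # Assumed meaning of "feature matrix id": row position of the center node.
    id_mapping = {node: row for row, node in enumerate(centers)}
    return neighbor_dict, neighbor_num_list, id_mapping


# A path graph 0 - 1 - 2 stored with both edge directions.
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
full = build_neighborhoods(edges, num_nodes=3)
mini = build_neighborhoods(edges, num_nodes=3, input_id=[1])
```

In the full-batch call every node is a center node. In the mini-batch call only node 1 is kept as a key, and it maps to row 0.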