GADNRBase

class pygod.nn.GADNRBase(in_dim, hid_dim=64, encoder_layers=1, deg_dec_layers=4, fea_dec_layers=3, sample_size=2, sample_time=3, neighbor_num_list=None, neigh_loss='KL', lambda_loss1=0.01, lambda_loss2=0.001, lambda_loss3=0.0001, full_batch=True, dropout=0.0, act=<function relu>, backbone=<class 'torch_geometric.nn.models.basic_gnn.GCN'>, device='cpu', **kwargs)

Bases: Module

GAD-NR: Graph Anomaly Detection via Neighborhood Reconstruction

GAD-NR is a graph auto-encoder (GAE) variant for graph anomaly detection based on neighborhood reconstruction. From each node's representation, it aims to reconstruct the node's entire neighborhood: the local structure, the node's own attributes, and its neighbors' attributes.

See [RSL+24] for details.

Parameters:
  • in_dim (int) – Input dimension of model.

  • hid_dim (int, optional) – Hidden dimension of model. Default: 64.

  • encoder_layers (int, optional) – The number of layers for the graph encoder. Default: 1.

  • deg_dec_layers (int, optional) – The number of layers for the node degree decoder. Default: 4.

  • fea_dec_layers (int, optional) – The number of layers for the node feature decoder. Default: 3.

  • sample_size (int, optional) – The number of samples for the neighborhood distribution. Default: 2.

  • sample_time (int, optional) – The number of sampling rounds used to reduce noise during node feature and neighborhood distribution reconstruction. Default: 3.

  • neighbor_num_list (torch.Tensor, optional) – The node degree tensor used by the PNAConv model. Default: None.

  • neigh_loss (str, optional) – The neighbor reconstruction loss. 'KL' selects the KL divergence loss; 'W2' selects the W2 loss. Default: 'KL'.

  • lambda_loss1 (float, optional) – The weight of the neighborhood reconstruction loss term. Default: 1e-2.

  • lambda_loss2 (float, optional) – The weight of the node feature reconstruction loss term. Default: 1e-3.

  • lambda_loss3 (float, optional) – The weight of the node degree reconstruction loss term. Default: 1e-4.

  • full_batch (bool, optional) – Whether in the full batch or the mini-batch training/inference mode. Default: True.

  • dropout (float, optional) – Dropout rate. Default: 0.0.

  • act (callable activation function or None, optional) – Activation function if not None. Default: torch.nn.functional.relu.

  • backbone (torch.nn.Module, optional) – The backbone of the deep detector implemented in PyG. Default: torch_geometric.nn.GCN.

  • device (str, optional) – The device used by the model. Default: 'cpu'.

  • **kwargs (optional) – Additional arguments for the backbone.
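To give a feel for how the two neigh_loss options differ, the sketch below evaluates the standard closed-form KL divergence and squared 2-Wasserstein (W2) distance between two univariate Gaussians. This is only an illustration under that Gaussian assumption: how GADNRBase parameterizes the neighborhood distribution is not described on this page, and the helper names kl_gaussian and w2_gaussian are hypothetical, not part of PyGOD.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) for P = N(mu_p, sigma_p^2) and Q = N(mu_q, sigma_q^2)."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
            - 0.5)


def w2_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """Squared 2-Wasserstein distance between the same two Gaussians."""
    return (mu_p - mu_q) ** 2 + (sigma_p - sigma_q) ** 2


# Identical distributions: both measures are zero.
same_kl = kl_gaussian(0.0, 1.0, 0.0, 1.0)
same_w2 = w2_gaussian(0.0, 1.0, 0.0, 1.0)

# Mean shifted by 1, unit standard deviations.
shift_kl = kl_gaussian(0.0, 1.0, 1.0, 1.0)
shift_w2 = w2_gaussian(0.0, 1.0, 1.0, 1.0)

print(same_kl, same_w2, shift_kl, shift_w2)
```

Note that KL divergence is asymmetric in its arguments while the W2 distance is symmetric, which is one reason the two options can rank the same reconstructions differently.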

forward(x, edge_index, input_id=None, neighbor_dict=None, id_mapping=None)

Forward computation.

Parameters:
  • x (torch.Tensor) – Input attribute embeddings.

  • edge_index (torch.Tensor) – Edge index.

  • input_id (List) – List of center node ids in the current batch. If input_id is not None, the input data is a sampled mini-batch. If input_id is None, the input data is a full batch. Default: None.

  • neighbor_dict (Dict) – Dictionary with the nodes in the current batch as keys and their neighbor lists as values. If neighbor_dict is not None, the input data is a sampled mini-batch. If neighbor_dict is None, the input data is a full batch. Default: None.

  • id_mapping (Dict) – Dictionary with the nodes in the current batch as keys and their feature matrix ids as values. If id_mapping is not None, the input data is a sampled mini-batch. If id_mapping is None, the input data is a full batch. Default: None.

Returns:

  • h0 (torch.Tensor) – Node feature initial embeddings.

  • degree_logits (torch.Tensor) – Reconstructed node degree logits.

  • feat_recon_list (List[torch.Tensor]) – Reconstructed node features.

  • neigh_recon_list (List[torch.Tensor]) – Reconstructed neighbor distributions.

loss_func(h0, degree_logits, feat_recon_list, neigh_recon_list, ground_truth_degree_matrix)

The loss function proposed in the GAD-NR paper.

Parameters:
  • h0 (torch.Tensor) – Node feature initial embeddings.

  • degree_logits (torch.Tensor) – Reconstructed node degree logits.

  • feat_recon_list (List[torch.Tensor]) – Reconstructed node features.

  • neigh_recon_list (List[torch.Tensor]) – Reconstructed neighbor distributions.

  • ground_truth_degree_matrix (torch.Tensor) – The ground truth degree of the input nodes.

Returns:

  • loss (torch.Tensor) – The total loss value used to backpropagate and update the model parameters.

  • loss_per_node (torch.Tensor) – The original loss value per node used to compute the decision score (outlier score) of the node.

  • h_loss_per_node (torch.Tensor) – The neighborhood reconstruction loss value per node used to compute the adaptive decision score (outlier score) of the node.

  • degree_loss_per_node (torch.Tensor) – The node degree reconstruction loss value per node used to compute the adaptive decision score (outlier score) of the node.

  • feature_loss_per_node (torch.Tensor) – The node feature reconstruction loss value per node used to compute the adaptive decision score (outlier score) of the node.
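As a rough illustration of how the three per-node terms and the lambda_loss1, lambda_loss2, and lambda_loss3 weights could interact, the sketch below combines them with a plain weighted sum and ranks nodes by the result. The weighted-sum form, the helper name combine_losses, and all loss values are assumptions made for this example; the library's loss_func may scale or combine the terms differently.

```python
def combine_losses(neigh, feat, deg,
                   lambda1=0.01, lambda2=0.001, lambda3=0.0001):
    """Per-node weighted sum of the three reconstruction losses.

    Assumed form for illustration only; the defaults mirror the
    documented lambda_loss1 (neighborhood), lambda_loss2 (feature),
    and lambda_loss3 (degree) values.
    """
    return [lambda1 * n + lambda2 * f + lambda3 * d
            for n, f, d in zip(neigh, feat, deg)]


# Hypothetical per-node loss values for four nodes.
neigh_loss = [1.0, 1.2, 9.0, 0.8]   # node 2 has an unusual neighborhood
feat_loss = [2.0, 2.5, 2.2, 30.0]   # node 3 has unusual features
deg_loss = [4.0, 3.0, 5.0, 4.0]

scores = combine_losses(neigh_loss, feat_loss, deg_loss)

# Higher score means more anomalous; list node ids from most to least.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Because the default weights differ by an order of magnitude each, a node needs a much larger feature or degree loss than neighborhood loss to reach the same score under this assumed combination.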

static process_graph(data, input_id=None)

Obtain the neighbor dictionary and the number of neighbors for each node.

Parameters:
  • data (torch_geometric.data.Data) – Input graph.

  • input_id (List) – List of center node ids in the current batch. If input_id is not None, the input data is a sampled mini-batch. If input_id is None, the input data is a full batch. Default: None.

Returns:

  • data (torch_geometric.data.Data) – Preprocessed input graph.

  • neighbor_dict (Dict) – Dictionary with the nodes in the input_id list as keys and their neighbor lists as values.

  • neighbor_num_list (torch.Tensor) – An n×1 tensor whose entries are the degrees of the corresponding nodes in the input_id list.

  • id_mapping (Dict) – Dictionary with the nodes in the input_id list as keys and their feature matrix ids as values.
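The shape of these return values can be pictured with a small pure-Python analogue. The sketch below is not the library's implementation: build_neighbor_info is a hypothetical helper that takes edges as a list of (source, target) pairs rather than a torch_geometric Data object, returns a nested list instead of an n×1 tensor, and assumes that a node's feature matrix id is simply its position in the batch.

```python
def build_neighbor_info(edges, input_id=None):
    """Build neighbor lists, per-node degrees, and a node-to-row mapping.

    edges: list of (source, target) pairs.
    input_id: center nodes of a mini-batch, or None for a full batch.
    """
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    if input_id is None:
        # Full batch: use every node that appears in some edge.
        nodes = sorted({n for edge in edges for n in edge})
    else:
        # Mini-batch: keep only the requested center nodes, in order.
        nodes = list(input_id)

    neighbor_dict = {n: adjacency.get(n, []) for n in nodes}
    # n x 1 layout: one single-element row per node.
    neighbor_num_list = [[len(neighbor_dict[n])] for n in nodes]
    # Assumption: feature matrix id equals the node's position in the batch.
    id_mapping = {n: row for row, n in enumerate(nodes)}
    return neighbor_dict, neighbor_num_list, id_mapping


# Undirected path-like graph stored as directed edge pairs.
edges = [(0, 1), (1, 0), (0, 2), (2, 0), (2, 3), (3, 2)]

nd, counts, mapping = build_neighbor_info(edges, input_id=[2, 0])
print(nd, counts, mapping)
```

Calling the helper with input_id=None instead covers all four nodes, which corresponds to the full-batch case described above.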