Models

BaseModel

CompGCN

A simplified CompGCN model, without basis vectors, for heterogeneous graphs.

HetGNN

HetGNN [KDD2019] - Heterogeneous Graph Neural Network. Source Code Link

RGCN

Title: Modeling Relational Data with Graph Convolutional Networks

RGAT

RSHN

The relation structure-aware heterogeneous graph neural network (RSHN) first builds a coarsened line graph to obtain edge features, then uses a novel Message Passing Neural Network (MPNN) to propagate node and edge features.

SkipGram

HAN

This model shows an example of using dgl.metapath_reachable_graph on the original heterogeneous graph. HAN is from the paper Heterogeneous Graph Attention Network.

HeCo

Title: Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning

HGT

Heterogeneous graph transformer convolution from Heterogeneous Graph Transformer

GTN

GTN from the paper Graph Transformer Networks in NeurIPS 2019.

fastGTN

fastGTN from the paper Graph Transformer Networks: Learning Meta-path Graphs to Improve GNNs.

MHNF

MHNF from the paper Multi-hop Heterogeneous Neighborhood information Fusion graph representation learning.

MAGNN

This is the main module of the model MAGNN.

HeGAN

HeGAN was introduced in Adversarial Learning on Heterogeneous Information Networks

NSHE

NSHE [IJCAI2020] - Network Schema Preserving Heterogeneous Information Network Embedding. Paper Link: <http://www.shichuan.org/doc/87.pdf> Code Link: <https://github.com/Andy-Border/NSHE>

NARS

Scalable Graph Neural Networks for Heterogeneous Graphs.

RHGNN

This is the main module of the model RHGNN.

HPN

This model shows an example of using dgl.metapath_reachable_graph on the original heterogeneous graph. HPN is from the paper Heterogeneous Graph Propagation Network.

KGCN

This module, KGCN, was introduced in KGCN.

SLiCE

HGSL

HGSL, Heterogeneous Graph Structure Learning, from the paper.

homo_GNN

General homogeneous GNN model for HGNN: HeteroMLP + HomoGNN + HeteroMLP.

general_HGNN

General heterogeneous GNN model

HDE

SimpleHGN

This is the SimpleHGN model from Are we really making much progress? Revisiting, benchmarking, and refining heterogeneous graph neural networks.

GATNE

Rsage

Mg2vec

This is the mg2vec model from `mg2vec: Learning Relationship-Preserving Heterogeneous Graph Representations via Metagraph Embedding <https://ieeexplore.ieee.org/document/9089251>`__

DHNE

Title: Structural Deep Embedding for Hyper-Networks

class BaseModel[source]
classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(*args)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

extra_loss()[source]

Some models want to use an L2 norm that is not applied to all parameters.

Return type

th.Tensor

get_emb()[source]

Return the embedding of a model for further analysis.

Return type

numpy.array
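As a quick illustration of this interface, here is a minimal sketch of a custom subclass. The import path and the layer choice are assumptions for the sketch, not part of this API reference:

    import dgl.nn as dglnn
    from openhgnn.models import BaseModel  # assumed import path

    class ToyModel(BaseModel):
        @classmethod
        def build_model_from_args(cls, args, hg):
            # Pull hyperparameters from args and relation names from hg.
            return cls(args.hidden_dim, args.out_dim, hg.etypes)

        def __init__(self, hidden_dim, out_dim, etypes):
            super().__init__()
            # One GraphConv per relation, aggregated by summation.
            self.conv = dglnn.HeteroGraphConv(
                {etype: dglnn.GraphConv(hidden_dim, out_dim) for etype in etypes},
                aggregate='sum')

        def forward(self, hg, h_dict):
            # Encode original features into new ones; returns dict[str, th.Tensor].
            return self.conv(hg, h_dict)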

class CompGCN(in_dim, hid_dim, out_dim, etypes, n_nodes, n_rels, num_layers=2, comp_fn='sub', dropout=0.0, activation=<function relu>, batchnorm=True)[source]

A simplified CompGCN model, without basis vectors, for heterogeneous graphs.

Here, we present the implementation details for each task used for evaluation in the paper. For all the tasks, we used CompGCN built on the PyTorch Geometric framework (Fey & Lenssen, 2019).

Link Prediction: For evaluation, 200-dimensional embeddings are used for node and relation embeddings. For selecting the best model, we perform a hyperparameter search using the validation data over the values listed in Table 8. For training link prediction models, we use the standard binary cross-entropy loss with label smoothing (Dettmers et al., 2018).

Node Classification: Following Schlichtkrull et al. (2017), we use 10% of the training data as validation for selecting the best model on both datasets. We restrict the number of hidden units to 32. We use cross-entropy loss for training our model.

For all the experiments, training is done using the Adam optimizer (Kingma & Ba, 2014), and Xavier initialization (Glorot & Bengio, 2010) is used for initializing parameters.

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, n_feats)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]
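A hedged usage sketch based on the constructor signature above; the toy graph, dimensions, and the choice of n_nodes are illustrative, and the exact feature format expected by this implementation's forward may differ:

    import dgl
    import torch as th

    # Toy heterogeneous graph with two relations.
    hg = dgl.heterograph({
        ('author', 'writes', 'paper'): ([0, 1], [0, 2]),
        ('paper', 'written-by', 'author'): ([0, 2], [0, 1]),
    })

    model = CompGCN(in_dim=16, hid_dim=32, out_dim=8,
                    etypes=hg.etypes,
                    n_nodes=hg.num_nodes(),
                    n_rels=len(hg.etypes),
                    num_layers=2, comp_fn='sub')

    n_feats = {ntype: th.randn(hg.num_nodes(ntype), 16) for ntype in hg.ntypes}
    out_dict = model(hg, n_feats)  # dict from node type to encoded features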

class HetGNN(hg, args)[source]

HetGNN [KDD2019] - Heterogeneous Graph Neural Network. Source Code Link

The authors of the paper only provide the academic dataset.

Het_Aggregate

Type

nn.Module

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h=None)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

class RGCN(in_dim, hidden_dim, out_dim, etypes, num_bases, num_hidden_layers=1, dropout=0, use_self_loop=False)[source]

Title: Modeling Relational Data with Graph Convolutional Networks

Authors: Michael Schlichtkrull, Thomas N. Kipf, Peter Bloem, Rianne van den Berg, Ivan Titov, Max Welling

Parameters
  • in_dim (int) – Input feature size.

  • hidden_dim (int) – Hidden dimension.

  • out_dim (int) – Output feature size.

  • etypes (list[str]) – Relation names.

  • num_bases (int, optional) – Number of bases. If None, use the number of relations. Default: None.

  • num_hidden_layers (int) – Number of RelGraphConvLayer modules.

  • dropout (float, optional) – Dropout rate. Default: 0.0

  • use_self_loop (bool, optional) – True to include self-loop messages. Default: False

RelGraphConvLayer

Type

RelGraphConvLayer

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h_dict)[source]

Supports full-batch and mini-batch training.

Parameters
  • hg (dgl.HeteroGraph or dgl.blocks) – Input graph

  • h_dict (dict[str, th.Tensor]) – Input features

Returns

h – Output features

Return type

dict[str, th.Tensor]
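A hedged full-batch usage sketch following the parameters above; the toy graph and sizes are illustrative:

    import dgl
    import torch as th

    hg = dgl.heterograph({
        ('user', 'follows', 'user'): ([0, 1], [1, 2]),
        ('user', 'plays', 'game'): ([0, 2], [0, 1]),
    })

    model = RGCN(in_dim=16, hidden_dim=32, out_dim=4,
                 etypes=hg.etypes, num_bases=2,
                 num_hidden_layers=1, dropout=0.0, use_self_loop=True)

    # Full batch: a dict from node type to input features.
    h_dict = {ntype: th.randn(hg.num_nodes(ntype), 16) for ntype in hg.ntypes}
    h_out = model(hg, h_dict)  # dict[str, th.Tensor]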

class RGAT(in_dim, out_dim, h_dim, etypes, num_heads, num_hidden_layers=1, dropout=0)[source]
classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h_dict=None)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

class RSHN(dim, out_dim, num_node_layer, num_edge_layer, dropout)[source]

The relation structure-aware heterogeneous graph neural network (RSHN) first builds a coarsened line graph to obtain edge features, then uses a novel Message Passing Neural Network (MPNN) to propagate node and edge features.

We implement an API that builds the coarsened line graph.

edge_layers

Applied in the edge layer.

Type

AGNNConv

coarsened line graph

Propagates edge features.

Type

dgl.DGLGraph

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, n_feats, *args, **kwargs)[source]

First, apply edge_layers on the coarsened line graph to get edge embeddings. Then, propagate node and edge features through GraphConv.

class SkipGram(num_nodes, dim)[source]
classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(pos_u, pos_v, neg_v)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

class HAN(ntype_meta_paths_dict, in_dim, hidden_dim, out_dim, num_heads, dropout)[source]

This model shows an example of using dgl.metapath_reachable_graph on the original heterogeneous graph. HAN is from the paper Heterogeneous Graph Attention Network. Because the original HAN implementation only provides the preprocessed homogeneous graph and not the preprocessing code, this model cannot reproduce the results in HAN; we constructed another dataset from ACM with a different set of papers, connections, features and labels.

\[\mathbf{h}_{i}^{\prime}=\mathbf{M}_{\phi_{i}} \cdot \mathbf{h}_{i}\]

where \(h_i\) and \(h'_i\) are the original and projected features of node \(i\).

\[e_{i j}^{\Phi}=a t t_{\text {node }}\left(\mathbf{h}_{i}^{\prime}, \mathbf{h}_{j}^{\prime} ; \Phi\right)\]

where \({att}_{node}\) denotes the deep neural network.

\[\alpha_{i j}^{\Phi}=\operatorname{softmax}_{j}\left(e_{i j}^{\Phi}\right)=\frac{\exp \left(\sigma\left(\mathbf{a}_{\Phi}^{\mathrm{T}} \cdot\left[\mathbf{h}_{i}^{\prime} \| \mathbf{h}_{j}^{\prime}\right]\right)\right)}{\sum_{k \in \mathcal{N}_{i}^{\Phi}} \exp \left(\sigma\left(\mathbf{a}_{\Phi}^{\mathrm{T}} \cdot\left[\mathbf{h}_{i}^{\prime} \| \mathbf{h}_{k}^{\prime}\right]\right)\right)}\]

where \(\sigma\) denotes the activation function, || denotes the concatenation operation and \(a_{\Phi}\) is the node-level attention vector for meta-path \(\Phi\).

\[\mathbf{z}_{i}^{\Phi}=\prod_{k=1}^{K} \sigma\left(\sum_{j \in \mathcal{N}_{i}^{\Phi}} \alpha_{i j}^{\Phi} \cdot \mathbf{h}_{j}^{\prime}\right)\]

where \(z^{\Phi}_i\) is the learned embedding of node \(i\) for the meta-path \(\Phi\). Given the meta-path set \(\{\Phi_0, \Phi_1, ..., \Phi_P\}\), after feeding node features into node-level attention we can obtain \(P\) groups of semantic-specific node embeddings, denoted as \(\{Z_0, Z_1, ..., Z_P\}\). We use MetapathConv to finish Node-level Attention and Semantic-level Attention.

Parameters
  • ntype_meta_paths_dict (dict[str, dict[str, list[etype]]]) – Dict from node type to dict from meta path name to meta path. For node classification, there is only one node type. For link prediction, there can be multiple node types which are source and destination node types of target links.

  • in_dim (int) – Input feature dimension.

  • hidden_dim (int) – Hidden layer dimension.

  • out_dim (int) – Output feature dimension.

  • num_heads (list[int]) – Number of attention heads.

  • dropout (float) – Dropout probability.

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(g, h_dict)[source]
Parameters
  • g (DGLHeteroGraph or dict[str, dict[str, DGLBlock]]) – For full batch, it is a heterogeneous graph. For mini batch, it is a dict from node type to dict from meta path name to DGLBlock.

  • h_dict (dict[str, Tensor] or dict[str, dict[str, dict[str, Tensor]]]) – The input features. For full batch, it is a dict from node type to node features. For mini batch, it is a dict from node type to dict from meta path name to dict from node type to node features.

Returns

out_dict – The output features. Dict from node type to node features.

Return type

dict[str, Tensor]
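A hedged usage sketch; dgl.metapath_reachable_graph is the DGL utility mentioned above, and the nested dict layout follows the parameter description. The graph, names, and sizes are illustrative:

    import dgl
    import torch as th

    hg = dgl.heterograph({
        ('paper', 'pa', 'author'): ([0, 1, 2], [0, 0, 1]),
        ('author', 'ap', 'paper'): ([0, 0, 1], [0, 1, 2]),
    })

    # Homogeneous graph of papers reachable through the P-A-P meta-path.
    pap = dgl.metapath_reachable_graph(hg, ['pa', 'ap'])

    ntype_meta_paths_dict = {'paper': {'PAP': ['pa', 'ap']}}
    model = HAN(ntype_meta_paths_dict, in_dim=16, hidden_dim=8,
                out_dim=4, num_heads=[2], dropout=0.0)

    h_dict = {'paper': th.randn(hg.num_nodes('paper'), 16)}
    out_dict = model(hg, h_dict)  # dict from node type to output features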

class HeCo(meta_paths_dict, network_schema, category, hidden_size, feat_drop, attn_drop, sample_rate, tau, lam)[source]

Title: Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning

Authors: Xiao Wang, Nian Liu, Hui Han, Chuan Shi

HeCo was introduced in [paper] and parameters are defined as follows:

Parameters
  • meta_paths (dict) – Metapaths extracted from the graph

  • network_schema (dict) – Directed edges from other types to the target type

  • category (string) – The category of the nodes to be classified

  • hidden_size (int) – Hidden units size

  • feat_drop (float) – Dropout rate for the projected features

  • attn_drop (float) – Dropout rate for the attentions used in the two view-guided encoders

  • sample_rate (dict) – The number of neighbors of each type sampled for the network schema view

  • tau (float) – Temperature parameter used for the contrastive loss

  • lam (float) – Balance parameter for the two contrastive losses

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(g, h_dict, pos)[source]

This is the forward part of the model HeCo.

Parameters
  • g (DGLGraph) – A DGLGraph

  • h_dict (dict) – Projected features after linear projection

  • pos (matrix) – A matrix indicating the positives for each node

Returns

loss – The optimization objective

Return type

float

Notes

The pos matrix is pre-defined by users. The relevant tool is given in the original code.

get_embeds(g, h_dict)[source]

This is to get the final embeddings of the target nodes.
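For intuition about what tau and lam control, here is a minimal standalone sketch of a co-contrastive objective of this kind; the function names and exact form are illustrative, not HeCo's internal API:

    import torch as th
    import torch.nn.functional as F

    def info_nce(z1, z2, pos, tau):
        # pos[i, j] = 1 marks node j as a positive for node i
        # (the user-defined pos matrix described above).
        z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
        sim = th.exp(z1 @ z2.t() / tau)
        return -th.log((sim * pos).sum(1) / sim.sum(1)).mean()

    def co_contrastive_loss(z_mp, z_sc, pos, tau, lam):
        # z_mp / z_sc: embeddings from the meta-path and network-schema views;
        # lam balances the two directions of the contrast.
        return lam * info_nce(z_mp, z_sc, pos, tau) + \
            (1 - lam) * info_nce(z_sc, z_mp, pos, tau)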

class HGT(in_dim, out_dim, num_heads, num_etypes, ntypes, num_layers, dropout=0.2, norm=False)[source]

Heterogeneous graph transformer convolution from Heterogeneous Graph Transformer.

For more details, you may refer to `HGT <https://docs.dgl.ai/en/0.8.x/generated/dgl.nn.pytorch.conv.HGTConv.html>`__

Parameters
  • in_dim (int) – the input dimension

  • out_dim (int) – the output dimension

  • num_heads (list) – the list of the number of heads in each layer

  • num_etypes (int) – the number of edge types

  • num_ntypes (int) – the number of node types

  • num_layers (int) – the number of layers used in the computation

  • dropout (float) – the feature drop rate

  • norm (boolean) – whether we need the norm operation

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h_dict)[source]

The forward part of HGT.

Parameters
  • hg (object) – the dgl heterogeneous graph

  • h_dict (dict) – the feature dict of different node types

Returns

The embeddings after the output projection.

Return type

dict
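A hedged usage sketch based on the signature above; the toy graph and sizes are illustrative:

    import dgl
    import torch as th

    hg = dgl.heterograph({
        ('paper', 'cites', 'paper'): ([0, 1], [1, 2]),
        ('author', 'writes', 'paper'): ([0, 1], [0, 2]),
    })

    model = HGT(in_dim=16, out_dim=4,
                num_heads=[2, 2],              # one entry per layer
                num_etypes=len(hg.etypes),
                ntypes=hg.ntypes,
                num_layers=2, dropout=0.2, norm=True)

    h_dict = {ntype: th.randn(hg.num_nodes(ntype), 16) for ntype in hg.ntypes}
    out_dict = model(hg, h_dict)  # embeddings after the output projection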

class GTN(num_edge_type, num_channels, in_dim, hidden_dim, num_class, num_layers, category, norm, identity)[source]

GTN from the paper Graph Transformer Networks in NeurIPS 2019. You can also see the extension paper Graph Transformer Networks: Learning Meta-path Graphs to Improve GNNs.

Code from the authors.

Given a heterogeneous graph \(G\) and its edge relation type set \(\mathcal{R}\), we first extract the list of single-relation adjacency matrices. From this list, we generate combined adjacency matrices by convolving over the single-relation adjacency matrices, and we generate \(l\)-length meta-path adjacency matrices by multiplying combined adjacency matrices. Then we generate node representations using a GCN layer.

Parameters
  • num_edge_type (int) – Number of relations.

  • num_channels (int) – Number of conv channels.

  • in_dim (int) – The dimension of input features.

  • hidden_dim (int) – The dimension of the hidden layer.

  • num_class (int) – Number of classification types.

  • num_layers (int) – Length of the hybrid metapath.

  • category (string) – Type of predicted nodes.

  • norm (bool) – If True, the adjacency matrix will be normalized.

  • identity (bool) – If True, the identity matrix will be added to the relation matrix set.

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]
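To make the description above concrete, here is a small, hedged sketch of the core computation: soft-select the single-relation adjacency matrices with learnable channel weights (a 1x1 convolution over relations), then compose an l-length meta-path adjacency by matrix multiplication. Dense tensors and all names are illustrative; the real model operates on sparse matrices:

    import torch as th
    import torch.nn.functional as F

    def gtn_metapath_adj(A_list, alphas):
        # A_list: list of (N, N) single-relation adjacency matrices.
        # alphas: (num_layers, num_relations) learnable selection logits.
        A = th.stack(A_list)                       # (R, N, N)
        out = None
        for layer_logits in alphas:
            # Soft combination of relations for this layer.
            weights = F.softmax(layer_logits, dim=0)[:, None, None]
            combined = (weights * A).sum(0)        # (N, N)
            # Multiplying combinations yields an l-length meta-path adjacency.
            out = combined if out is None else out @ combined
        return out

    A_list = [th.rand(5, 5) for _ in range(3)]     # 3 relations, 5 nodes
    alphas = th.nn.Parameter(th.randn(2, 3))       # 2-length meta-paths
    adj = gtn_metapath_adj(A_list, alphas)         # feed this to a GCN layer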

class fastGTN(num_edge_type, num_channels, in_dim, hidden_dim, num_class, num_layers, category, norm, identity)[source]

fastGTN from the paper Graph Transformer Networks: Learning Meta-path Graphs to Improve GNNs. It is the extension paper of GTN. Code from the authors.

Given a heterogeneous graph \(G\) and its edge relation type set \(\mathcal{R}\), we first extract the list of single-relation adjacency matrices. From this list, we generate combined adjacency matrices by convolving over the single-relation adjacency matrices, and we generate \(l\)-length meta-path adjacency matrices by multiplying combined adjacency matrices. Then we generate node representations using a GCN layer.

Parameters
  • num_edge_type (int) – Number of relations.

  • num_channels (int) – Number of conv channels.

  • in_dim (int) – The dimension of input features.

  • hidden_dim (int) – The dimension of the hidden layer.

  • num_class (int) – Number of classification types.

  • num_layers (int) – Length of the hybrid metapath.

  • category (string) – Type of predicted nodes.

  • norm (bool) – If True, the adjacency matrix will be normalized.

  • identity (bool) – If True, the identity matrix will be added to the relation matrix set.

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

class MHNF(num_edge_type, num_channels, in_dim, hidden_dim, num_class, num_layers, category, norm, identity)[source]

MHNF from the paper Multi-hop Heterogeneous Neighborhood information Fusion graph representation learning.

Given a heterogeneous graph \(G\) and its edge relation type set \(\mathcal{R}\), we can extract the list of \(l\)-hop hybrid adjacency matrices with the HMAE module. The hybrid adjacency matrix list can be used in the HLHIA module to generate \(l\)-hop representations. The HSAF module then uses an attention mechanism to aggregate the \(l\)-hop representations, and, because of the multi-channel conv, it also aggregates the \(l\)-hop representations of different channels to generate a final representation. You can see the detailed operations in the corresponding modules.

Parameters
  • num_edge_type (int) – Number of relations.

  • num_channels (int) – Number of conv channels.

  • in_dim (int) – The dimension of input features.

  • hidden_dim (int) – The dimension of the hidden layer.

  • num_class (int) – Number of classification types.

  • num_layers (int) – Length of the hybrid metapath.

  • category (string) – Type of predicted nodes.

  • norm (bool) – If True, the adjacency matrix will be normalized.

  • identity (bool) – If True, the identity matrix will be added to the relation matrix set.

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h=None)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

class MAGNN(ntypes, h_feats, inter_attn_feats, num_heads, num_classes, num_layers, metapath_list, edge_type_list, dropout_rate, metapath_idx_dict, encoder_type='RotateE', activation=<function elu>)[source]

This is the main module of the model MAGNN.

Parameters
  • ntypes (list) – the node types of the dataset

  • h_feats (int) – hidden dimension

  • inter_attn_feats (int) – the dimension of the attention vector in inter-metapath aggregation

  • num_heads (int) – the number of heads in intra-metapath attention

  • num_classes (int) – the number of output classes

  • num_layers (int) – the number of hidden layers

  • metapath_list (list) – the list of metapaths, e.g. ['M-D-M', 'M-A-M', ...]

  • edge_type_list (list) – the list of edge types, e.g. ['M-A', 'A-M', 'M-D', 'D-M']

  • dropout_rate (float) – the dropout rate of feature dropout and attention dropout

  • metapath_idx_dict (dict) – the metapath instance indices dict, e.g. metapath_idx_dict['MAM'] stores the indices of MAM instances.

  • encoder_type (str) – the type of encoder, one of ['RotateE', 'Average', 'Linear']

  • activation (callable activation function) – the activation function used in MAGNN. Default: F.elu

Notes

Please make sure that all the metapaths are symmetric, e.g. ['MDM', 'MAM', ...] are symmetric, while ['MAD', 'DAM', ...] are not.

Please make sure that edge_type_list meets the following form: [edge_type_1, edge_type_1_reverse, edge_type_2, edge_type_2_reverse, ...], like the example above.

All the activations in MAGNN are the same, following the authors' code.

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

mini_reset_params(new_metapth_idx_dict)[source]

This method is used to reset some parameters, including metapath_idx_dict, metapath_list, dst_ntypes, etc. Other parameters like the weight matrices do not need to be updated.

forward(g, feat_dict=None)[source]

The forward part of MAGNN.

Parameters
  • g (object) – the dgl heterogeneous graph

  • feat_dict (dict) – the feature matrix dict of different node types, e.g. {'M': feat_of_M, 'D': feat_of_D, ...}

Returns

  • dict – The predicted logits after the output projection. E.g. for the predicted node type, such as M (movie), dict['M'] contains the probability that each node is classified as each class. For other node types, such as D (director), dict['D'] contains the result after the output projection.

  • dict – The embeddings before the output projection. E.g. dict['M'] contains the embeddings of every node of type M.

class HeGAN(emb_size, hg)[source]

HeGAN was introduced in Adversarial Learning on Heterogeneous Information Networks.

It includes a Discriminator and a Generator. For more details, please read the docs of both.

Parameters
  • emb_size (int) – embedding size

  • hg (dgl.heteroGraph) – heterogeneous graph

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(*args)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

extra_loss()[source]

Some models want to use an L2 norm that is not applied to all parameters.

Return type

th.Tensor

class NSHE(g, gnn_model, project_dim, emd_dim, context_dim)[source]

NSHE [IJCAI2020] - Network Schema Preserving Heterogeneous Information Network Embedding. Paper Link: <http://www.shichuan.org/doc/87.pdf> Code Link: <https://github.com/Andy-Border/NSHE>

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

class NARS(num_hops, args, hg)[source]

Scalable Graph Neural Networks for Heterogeneous Graphs.

Given a heterogeneous graph \(G\) and its edge relation type set \(\mathcal{R}\), our proposed method first samples \(K\) unique subsets from \(\mathcal{R}\). Then for each sampled subset \(R_i \subseteq \mathcal{R}\), we generate a relation subgraph \(G_i\) from \(G\) in which only edges whose type belongs to \(R_i\) are kept. We treat \(G_i\) as a homogeneous graph or a bipartite graph, and perform neighbor aggregation to generate \(L\)-hop neighbor features for each node. Let \(H_{v,0}\) be the input features (of dimension \(D\)) for node \(v\). For each subgraph \(G_i\) , the \(l\)-th hop features \(H_{v,l}^{i}\) are computed as

\[H_{v, l}^{i}=\sum_{u \in N_{i}(v)} \frac{1}{\left|N_{i}(v)\right|} H_{u, l-1}^{i}\]

where \(N_i(v)\) is the set of neighbors of node \(v\) in \(G_i\).

For each layer \(l\), we let the model adaptively learn which relation-subgraph features to use by aggregating features from different subgraphs \(G_i\) with learnable 1-D convolution. The aggregated \(l\)-hop features across all subgraphs are calculated as

\[H_{v, l}^{a g g}=\sum_{i=1}^{K} a_{i, l} \cdot H_{v, l}^{i}\]

where \(H^i\) is the neighbor averaging features on subgraph \(G_i\) and \(a_{i,l}\) is a learned vector of length equal to the feature dimension \(D\).

Parameters
  • num_hops (int) – Number of hops.

  • category (str) – Type of predicted nodes.

  • hidden_dim (int) – The dimension of the hidden layer.

  • num_feats (int) – The number of relation subsets.

Notes

We do not support datasets without features (e.g. HGBn-Freebase), because the model performs neighbor aggregation to generate \(L\)-hop neighbor features at once.

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h_dict)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]
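A toy sketch of the two equations above: neighbor averaging per relation subgraph, then a learnable per-hop combination across subgraphs. All names are illustrative and dense matrices are used for clarity:

    import torch as th

    def neighbor_average(adj_norm, H0, L):
        # adj_norm: (N, N) row-normalized adjacency of one relation subgraph G_i,
        # standing in for the 1/|N_i(v)| averaging in the first equation.
        hops = [H0]
        for _ in range(L):
            hops.append(adj_norm @ hops[-1])
        return th.stack(hops)                        # (L + 1, N, D)

    K, N, D, L = 4, 10, 16, 2
    H0 = th.randn(N, D)
    subgraph_feats = th.stack([
        neighbor_average(th.rand(N, N).softmax(dim=1), H0, L) for _ in range(K)
    ])                                               # (K, L + 1, N, D)

    # a[i, l] is a learned vector of length D, one per subgraph and hop,
    # implementing the learnable 1-D convolution in the second equation.
    a = th.nn.Parameter(th.randn(K, L + 1, D))
    H_agg = (a[:, :, None, :] * subgraph_feats).sum(dim=0)   # (L + 1, N, D)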

class RHGNN(graph: DGLHeteroGraph, input_dim_dict, hidden_dim: int, relation_input_dim: int, relation_hidden_dim: int, num_layers: int, category, out_dim, n_heads: int = 4, dropout: float = 0.2, negative_slope: float = 0.2, residual: bool = True, norm: bool = True)[source]

This is the main module of the model RHGNN.

Parameters
  • graph (dgl.DGLHeteroGraph) – a heterogeneous graph

  • input_dim_dict (dict) – node input dimension dictionary

  • hidden_dim (int) – node hidden dimension

  • relation_input_dim (int) – relation input dimension

  • relation_hidden_dim (int) – relation hidden dimension

  • num_layers (int) – number of stacked layers

  • n_heads (int) – number of attention heads

  • dropout (float) – dropout rate

  • negative_slope (float) – negative slope

  • residual (boolean) – residual connections or not

  • norm (boolean) – layer normalization or not

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

reset_parameters()[source]

Reinitialize learnable parameters.

forward(blocks: list, relation_target_node_features=None, relation_embedding: Optional[dict] = None)[source]
Parameters
  • blocks (list) – list of sampled dgl.DGLHeteroGraph

  • relation_target_node_features (dict) – target node features under each relation, e.g. {(srctype, etype, dsttype): features}

  • relation_embedding (dict) – embedding for each relation, e.g. {etype: feature} or None

inference(graph: DGLHeteroGraph, relation_target_node_features: dict, relation_embedding: Optional[dict] = None, device: str = 'cuda:0')[source]

Mini-batch inference of the final representation over all node types. Outer loop: iterate over the layers; inner loop: iterate over the batches.

Parameters
  • graph (dgl.DGLHeteroGraph) – The whole relational graph

  • relation_target_node_features (dict) – target node features under each relation, e.g. {(srctype, etype, dsttype): features}

  • relation_embedding (dict) – embedding for each relation, e.g. {etype: feature} or None

  • device (str) – device

class HPN(meta_paths, category, in_size, out_size, dropout, k_layer, alpha, edge_drop)[source]

This model shows an example of using dgl.metapath_reachable_graph on the original heterogeneous graph. HPN is from the paper Heterogeneous Graph Propagation Network. The authors did not provide code, so we implemented it following the implementation of HAN.

\[\mathbf{Z}^{\Phi}=\mathcal{P}_{\Phi}(\mathbf{X})=g_\Phi(f_\Phi(\mathbf{X}))\]

where \(\mathbf{X}\) denotes initial feature matrix and \(\mathbf{Z^\Phi}\) denotes semantic-specific node embedding.

\[\mathbf{H}^{\Phi}=f_\Phi(\mathbf{X})=\sigma(\mathbf{X} \cdot \mathbf{W}^\Phi+\mathbf{b}^{\Phi})\]

where \(\mathbf{H}^{\Phi}\) is the projected node feature matrix.

\[\mathbf{Z}^{\Phi, k}=g_{\Phi}\left(\mathbf{Z}^{\Phi, k-1}\right)=(1-\gamma) \cdot \mathbf{M}^{\Phi} \cdot \mathbf{Z}^{\Phi, k-1}+\gamma \cdot \mathbf{H}^{\Phi}\]

where \(\mathbf{Z}^{\Phi,k}\) denotes the node embeddings learned by the k-th layer of the semantic propagation mechanism, and \(\gamma\) is a scalar weight indicating the importance of a node's own characteristics in the aggregation process. We use MetapathConv to finish Semantic Propagation and Semantic Fusion.

Parameters
  • meta_paths (list) – contains multiple meta-paths.

  • category (str) – The category is the head and tail node type of the metapaths.

  • in_size (int) – input feature dimension.

  • out_size (int) – output dimension.

  • dropout (float) – Dropout probability.

  • k_layer (int) – number of propagation iterations.

  • alpha (float) – Value of the restart probability.

  • edge_drop (float, optional) – The dropout rate on edges that controls the messages received by each node. Default: 0.

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(g, h_dict)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]
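A compact sketch of the semantic propagation equation above, with gamma playing the role of the restart probability (the alpha parameter). Dense matrices and all names are illustrative:

    import torch as th

    def semantic_propagation(M, H, gamma, k_layer):
        # M: (N, N) normalized meta-path adjacency; H: (N, D) projected features.
        # Z^{Phi,k} = (1 - gamma) * M @ Z^{Phi,k-1} + gamma * H
        Z = H
        for _ in range(k_layer):
            Z = (1 - gamma) * (M @ Z) + gamma * H
        return Z

    M = th.rand(6, 6).softmax(dim=1)   # toy row-normalized adjacency
    H = th.randn(6, 8)
    Z = semantic_propagation(M, H, gamma=0.1, k_layer=3)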

class KGCN(g, args)[source]

This module, KGCN, was introduced in KGCN.

It includes two parts:

Aggregate the entity representation and its neighborhood representation into the entity's embedding. The message function is defined as follows:

\(\mathrm{v}_{\mathcal{N}(v)}^{u}=\sum_{e \in \mathcal{N}(v)} \tilde{\pi}_{r_{v, e}}^{u} \mathrm{e}\)

where \(\mathrm{e}\) is the representation of an entity, \(\tilde{\pi}_{r_{v, e}}^{u}\) is the scalar weight on the edge from entity to entity, and the result \(\mathrm{v}_{\mathcal{N}(v)}^{u}\) saves the message passed from the neighbor nodes.

There are three types of aggregators. The sum aggregator takes the summation of the two representation vectors, the concat aggregator concatenates the two representation vectors, and the neighbor aggregator directly takes the neighborhood representation of the entity as the output representation:

\(agg_{sum}=\sigma\left(\mathbf{W} \cdot\left(\mathrm{v}+\mathrm{v}_{\mathcal{S}(v)}^{u}\right)+\mathbf{b}\right)\)

\(agg_{concat}=\sigma\left(\mathbf{W} \cdot \text{concat}\left(\mathrm{v}, \mathrm{v}_{\mathcal{S}(v)}^{u}\right)+\mathbf{b}\right)\)

\(agg_{neighbor}=\sigma\left(\mathbf{W} \cdot \mathrm{v}_{\mathcal{S}(v)}^{u}+\mathbf{b}\right)\)

In the above equations, \(\sigma\) is the nonlinear function, and \(\mathbf{W}\) and \(\mathbf{b}\) are the transformation weight and bias. The representation of an item is bound up with its neighbors by aggregation.

Obtain scores using the final entity representation and the user representation. The final entity representation is denoted as \(\mathrm{v}^{u}\); the dot product of \(\mathrm{v}^{u}\) with the user representation \(\mathrm{u}\) gives the probability. The formula for this function is:

\(\hat{y}_{uv}=f\left(\mathbf{u}, \mathrm{v}^{u}\right)\)

Parameters
  • g (DGLGraph) – A knowledge graph preserving relationships between entities

  • args (Config) – Model's config

classmethod build_model_from_args(args, g)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

get_score()[source]

Obtain scores using the final entity representation and the user representation.

forward(blocks, inputdata)[source]

Predict the probability between a user and an entity.

Parameters
  • blocks (list) – Blocks save the information of neighbor nodes in each layer

  • inputdata (numpy.ndarray) – Inputdata contains the relationship between the user and the entity

Returns

  • labels (torch.Tensor) – the labels between users and entities

  • scores (torch.Tensor) – the probability of users clicking on entities
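The three aggregators above can be sketched in a few lines; this is an illustrative standalone module, not the internal code of this class:

    import torch as th
    import torch.nn as nn

    class ToyAggregators(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.w_sum = nn.Linear(dim, dim)         # W, b for the sum aggregator
            self.w_concat = nn.Linear(2 * dim, dim)  # W, b for the concat aggregator
            self.w_neigh = nn.Linear(dim, dim)       # W, b for the neighbor aggregator

        def forward(self, v, v_neigh, mode='sum'):
            # v: entity representation; v_neigh: weighted neighborhood message.
            if mode == 'sum':
                return th.sigmoid(self.w_sum(v + v_neigh))
            if mode == 'concat':
                return th.sigmoid(self.w_concat(th.cat([v, v_neigh], dim=-1)))
            return th.sigmoid(self.w_neigh(v_neigh))

    # Final score: dot product of user and final item representations,
    # y_hat = f(u, v_u), e.g. sigmoid((u * v_u).sum(-1)).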

class SLiCE(G, args, pretrained_node_embedding_tensor, num_layers=6, d_model=200, d_k=64, d_v=64, d_ff=800, n_heads=4, is_pre_trained=False, base_embedding_dim=200, max_length=6, num_gcn_layers=2, node_edge_composition_func='mult', get_embeddings=False, fine_tuning_layer=False)[source]
classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

load_pretrained_node2vec(filename, base_emb_dim)[source]

Loads embeddings from a node2vec-style file, where each line is "nodeid node_embedding". Returns a tensor containing node embeddings for graph nodes 0 to n-1.

forward(subgraph_list)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

class HGSL(feat_dims, undirected_relations, device, metapaths, mp_emb_dim, hidden_dim, num_heads, fs_eps, fp_eps, mp_eps, gnn_emd_dim, gnn_dropout, category, num_class)[source]

HGSL, Heterogeneous Graph Structure Learning, from the paper.

Parameters
  • feat_dims (dict) – The feature dimensions of different node types.

  • undirected_relations (str) – The HGSL model can only handle undirected heterographs, while in the dgl.heterograph format, directed edges are stored in two different edge types, separately and symmetrically, to represent an undirected edge. Hence you have to specify which relations are those distinct undirected relations. In this parameter, each undirected relation is separated by a comma. For example, in a heterograph with 2 undirected relations, paper-author and paper-subject, there are 4 types of edges stored in the dgl.heterograph: paper-author, author-paper, paper-subject, subject-paper. Then this parameter can be "paper-author,paper-subject", "author-paper,paper-subject", "paper-author,subject-paper" or "author-paper,subject-paper".

  • device (str) – The GPU device to select, like ‘cuda:0’.

  • metapaths (list) – The metapath name list.

  • mp_emb_dim (int) – The dimension of metapath embeddings from metapath2vec.

  • hidden_dim (int) – The dimension of mapped features in the graph generating procedure.

  • num_heads (int) – Number of heads in the K-head weighted cosine similarity function.

  • fs_eps (float) – Threshold of feature similarity graph \(\epsilon^{FS}\).

  • fp_eps (float) – Threshold of feature propagation graph \(\epsilon^{FP}\).

  • mp_eps (float) – Threshold of semantic graph \(\epsilon^{MP}\).

  • gnn_emd_dim (int) – The dimension of hidden layers of the downstream GNN.

  • gnn_dropout (float) – The dropout ratio of features in the downstream GNN.

  • category (str) – The target node type which the model will predict on.

  • out_dim (int) – number of classes of the target node type.

fgg_direct

Feature similarity graph generator (\(S_r^{FS}\)) dict in equation 2 of the paper, in which keys are undirected-relation strs.

Type

nn.ModuleDict

fgg_left

Feature propagation graph generator (\(S_r^{FH}\)) dict which generates the graphs in equation 5 of the paper.

Type

nn.ModuleDict

fgg_right

Feature propagation graph generator (\(S_r^{FT}\)) dict which generates the graphs in equation 6 of the paper.

Type

nn.ModuleDict

fg_agg

A channel attention layer in which a layer fuses one generated feature similarity graph and two generated feature propagation graphs, as in equation 7 of the paper.

Type

nn.ModuleDict

sgg_gen

Semantic subgraph generator (\(S_{r,m}^{MP}\)) dict, in equation 8 of the paper.

Type

nn.ModuleDict

sg_agg

The channel attention layer which fuses semantic subgraphs, in equation 9 of the paper.

Type

nn.ModuleDict

overall_g_agg

The channel attention layer which fuses the learned feature graph, the semantic graph and the original graph.

Type

nn.ModuleDict

encoder

The type-specific mapping layer in equation 1 of the paper.

Type

nn.ModuleDict

Notes

Under the best config, this model has some slight differences compared with the code given by the paper's authors, which seem to have little impact on performance:

  1. The regularization term in the loss is on all parameters of the model, while in the authors' code it is only on the generated adjacency matrix. If you want to implement the latter, a new task in OpenHGNN is needed.

  2. The normalization of the input adjacency matrix is applied separately to the adjacency matrices of different relations, while in the authors' code it is applied to the entire adjacency matrix composed of the adjacency matrices of all relations.

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h_features)[source]
Parameters
  • hg (dgl.DGLHeteroGraph) – All input data is stored in this graph. The graph should be an undirected heterogeneous graph. Every node type in the graph should have its feature named 'h', with the same feature dimension, and its metapath2vec embedding feature named 'xxx_m2v_emb', with the same feature dimension.

  • h_features (dict) – Not used.

Returns

result – The target node type and the corresponding node embeddings.

Return type

dict
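The graph generators above share one primitive: a K-head weighted cosine similarity thresholded by an epsilon (fs_eps, fp_eps, or mp_eps). A minimal sketch under illustrative names:

    import torch as th
    import torch.nn.functional as F

    def k_head_cosine_graph(x, y, heads, eps):
        # heads: (K, D) learnable channel weights; x: (N, D), y: (M, D).
        sims = []
        for w in heads:
            xw, yw = F.normalize(x * w, dim=1), F.normalize(y * w, dim=1)
            sims.append(xw @ yw.t())               # cosine similarity per head
        S = th.stack(sims).mean(0)                 # average over the K heads
        return th.where(S > eps, S, th.zeros_like(S))  # epsilon threshold

    heads = th.nn.Parameter(th.ones(4, 16))        # num_heads = 4
    S = k_head_cosine_graph(th.randn(5, 16), th.randn(7, 16), heads, eps=0.2)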

class homo_GNN(args, hg, out_node_type, **kwargs)[source]

General homogeneous GNN model for HGNN: HeteroMLP + HomoGNN + HeteroMLP.

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h_dict)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

class general_HGNN(args, hg, out_node_type, **kwargs)[source]

General heterogeneous GNN model.

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h_dict)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

class HDE(input_dim, output_dim, num_neighbor, use_bias=True)[source]
forward(fea_a, fea_b)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

class SimpleHGN(edge_dim, num_etypes, in_dim, hidden_dim, num_classes, num_layers, heads, feat_drop, negative_slope, residual, beta, ntypes)[source]

This is the SimpleHGN model from Are we really making much progress? Revisiting, benchmarking, and refining heterogeneous graph neural networks.

The model extends the original graph attention mechanism in GAT by including edge type information in the attention calculation.

Calculating the coefficient:

\[\alpha_{ij} = \frac{exp(LeakyReLU(a^T[Wh_i||Wh_j||W_r r_{\psi(<i,j>)}]))}{\Sigma_{k\in\mathcal{E}}{exp(LeakyReLU(a^T[Wh_i||Wh_k||W_r r_{\psi(<i,k>)}]))}} \quad (1)\]

Residual connection including Node residual:

\[h_i^{(l)} = \sigma(\Sigma_{j\in \mathcal{N}_i} {\alpha_{ij}^{(l)}W^{(l)}h_j^{(l-1)}} + h_i^{(l-1)}) \quad (2)\]

and Edge residual:

\[\alpha_{ij}^{(l)} = (1-\beta)\alpha_{ij}^{(l)}+\beta\alpha_{ij}^{(l-1)} \quad (3)\]

Multi-heads:

\[h^{(l+1)}_j = \parallel^M_{m = 1}h^{(l + 1, m)}_j \quad (4)\]

Residual:

\[h^{(l+1)}_j = h^{(l)}_j + \parallel^M_{m = 1}h^{(l + 1, m)}_j \quad (5)\]
Parameters
  • edge_dim (int) – the edge dimension

  • num_etypes (int) – the number of edge types

  • in_dim (int) – the input dimension

  • hidden_dim (int) – the hidden dimension

  • num_classes (int) – the number of output classes

  • num_layers (int) – the number of layers used in the computation

  • heads (list) – the list of the number of heads in each layer

  • feat_drop (float) – the feature drop rate

  • negative_slope (float) – the negative slope used in the LeakyReLU

  • residual (boolean) – whether we need the residual operation

  • beta (float) – the hyperparameter used in the edge residual

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h_dict)[source]

The forward part of SimpleHGN.

Parameters
  • hg (object) – the dgl heterogeneous graph

  • h_dict (dict) – the feature dict of different node types

Returns

The embeddings after the output projection.

Return type

dict
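A per-edge sketch of the unnormalized coefficient in equation (1), including the edge-type embedding term. This is illustrative; the model computes it in batched DGL message-passing ops, and the softmax is then taken per destination node:

    import torch as th
    import torch.nn.functional as F

    def edge_type_attention(Wh_src, Wh_dst, r_edge, a, negative_slope=0.2):
        # Wh_src, Wh_dst: (E, D) projected endpoint features for each edge;
        # r_edge: (E, D_e) embedding of each edge's type, i.e. W_r r_{psi(<i,j>)};
        # a: (2 * D + D_e,) attention vector.
        z = th.cat([Wh_dst, Wh_src, r_edge], dim=-1)
        return F.leaky_relu(z @ a, negative_slope)   # e_ij before the softmax

    # Edge residual of eq. (3): alpha_l = (1 - beta) * alpha_l + beta * alpha_prev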

class GATNE(num_nodes, embedding_size, embedding_u_size, edge_types, edge_type_count, att_dim)[source]
classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(block)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

class Rsage(in_dim, out_dim, h_dim, etypes, aggregator_type, num_hidden_layers=1, dropout=0)[source]
classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(hg, h_dict=None)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]

class Mg2vec(node_num, mg_num, emb_dimension, unigram, sample_num)[source]

This is the mg2vec model from `mg2vec: Learning Relationship-Preserving Heterogeneous Graph Representations via Metagraph Embedding <https://ieeexplore.ieee.org/document/9089251>`__

It contains the following parts:

Obtain the metagraphs and metagraph instances by mining the raw graph. Please go to `DataMaker-For-Mg2vec <https://github.com/null-xyj/DataMaker-For-Mg2vec>`__ for more details.

Initialize the embedding for every node and metagraph, and adopt an unsupervised method to train the node embeddings and metagraph embeddings. In detail, for every node, we keep its embedding close to the metagraphs it belongs to and far away from the metagraphs obtained by negative sampling.

Every node and metagraph can be represented as an n-dim vector. We define a first-order loss and a second-order loss. The first-order loss is for the single core node in every metagraph: we compute the dot product of the node embedding and the positive metagraph embedding as the true logit, and the dot product of the node embedding and the sampled negative metagraph embedding as the negative logit, then use the binary_cross_entropy_with_logits function to compute the first-order loss. The second-order loss considers the two core nodes in every metagraph: we first concatenate the two nodes' embeddings into a 2n-dim vector, then use a 2n*n matrix and an n-dim vector to map the 2n-dim vector back to an n-dim vector. The mapping function is shown below:

\[f(u,v) = \mathrm{ReLU}([u \| v]W + b)\]

Here u and v are the original embeddings of the two nodes, || is the concatenation operator, W is the 2n*n matrix, b is the n-dim vector, and ReLU is an activation function; f(u,v) is the transformed n-dim vector. The computation of the second-order loss is then the same as that of the first-order loss. Finally, we use a parameter alpha to balance the first-order and second-order losses:

\[L=(1-\alpha) L_1 + \alpha L_2\]

After we train the node embeddings, we use the embeddings to complete the relation prediction task. The relation prediction task is framed as edge classification: if two nodes are connected by a relation, we see the relation as an edge, so we can adopt edge classification to complete relation prediction.

Parameters
  • node_num (int) – the number of core nodes

  • mg_num (int) – the number of meta-graphs

  • emb_dimension (int) – the embedding dimension of nodes and meta-graphs

  • unigram (float) – the frequency of every meta-graph, for negative sampling

  • sample_num (int) – the number of sampled negative meta-graphs

classmethod build_model_from_args(args, hg)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(train_a, train_b, train_labels, train_freq, train_weight, device)[source]

The model plays the role of an encoder, so forward encodes the original features into new features.

Parameters
  • hg (dgl.DGLHeteroGraph) – the heterogeneous graph

  • h_dict (dict[str, th.Tensor]) – the dict of heterogeneous features

Returns

out_dict – A dict of encoded features. In general, it should output the embeddings of all nodes; it is also allowed to output only the embeddings of the target nodes that participate in the loss calculation.

Return type

dict[str, th.Tensor]
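A minimal standalone sketch of the first- and second-order losses described above; the function and variable names are illustrative:

    import torch as th
    import torch.nn.functional as F

    def first_order_loss(node_emb, pos_mg, neg_mg):
        # Dot products give the true and negative logits, then BCE-with-logits.
        true_logit = (node_emb * pos_mg).sum(-1)
        neg_logit = (node_emb * neg_mg).sum(-1)
        logits = th.cat([true_logit, neg_logit])
        labels = th.cat([th.ones_like(true_logit), th.zeros_like(neg_logit)])
        return F.binary_cross_entropy_with_logits(logits, labels)

    def second_order_pair(u, v, W, b):
        # f(u, v) = ReLU([u || v] W + b), mapping 2n dims back to n dims.
        return F.relu(th.cat([u, v], dim=-1) @ W + b)

    # Overall objective: L = (1 - alpha) * L_1 + alpha * L_2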

class DHNE(nums_type, dim_features, embedding_sizes, hidden_size, device)[source]

Title: Structural Deep Embedding for Hyper-Networks

Authors: Ke Tu, Peng Cui, Xiao Wang, Fei Wang, Wenwu Zhu

DHNE was introduced in [paper] and parameters are defined as follows:

Parameters
  • nums_type (list) – the types of nodes

  • dim_features (array) – the embedding dimension of nodes

  • embedding_sizes (int) – the embedding dimension size

  • hidden_size (int) – the size of the hidden fully connected layer

  • device (int) – the device DHNE works on

classmethod build_model_from_args(args)[source]

Build the model instance from args and hg.

Every subclass that inherits it should override this method.

forward(input_ids)[source]

The forward part of DHNE.

Parameters

input_ids – the input block of this batch

Returns

The logits after DHNE training.

Return type

tensor