Layer

HeteroEmbedLayer

Embedding layer for featureless heterograph.

GeneralLinear

General linear layer, combined with activation, normalization (batch and L2), dropout, and so on.

HeteroLinearLayer

Transform feature with nn.Linear.

HeteroMLPLayer

HeteroMLPLayer contains multiple GeneralLinear layers, unlike HeteroLinearLayer.

HeteroFeature

A feature preprocessing component that handles the various heterogeneous feature situations.

MetapathConv

MetapathConv is an aggregation function based on meta-paths, similar to dgl.nn.pytorch.HeteroGraphConv.

HeteroGraphConv

A generic module for computing convolution on heterogeneous graphs.

ATTConv

The macro layer of the HetGNN model [HetGNN].

MacroConv

param in_feats

Input feature size.

SemanticAttention

CompConv

Composition-based convolution, introduced in the paper Composition-based Multi-Relational Graph Convolutional Networks.

AttConv

Attention-based convolution, introduced in the paper Hybrid Micro/Macro Level Convolution for Heterogeneous Graph Learning.

LSTMConv

Aggregate the neighbors with LSTM

class HeteroEmbedLayer(n_nodes_dict, embed_size, embed_name='embed', activation=None, dropout=0.0)[source]

Embedding layer for featureless heterograph.

Parameters
  • n_nodes_dict (dict[str, int]) – Key of dict means node type, value of dict means number of nodes.

  • embed_size (int) – Dimension of embedding.

  • activation (callable activation function/layer or None, optional) – If not None, applies an activation function to the updated node features. Default: None.

  • dropout (float, optional) – Dropout rate. Default: 0.0

forward()[source]
Returns

The output embeddings.

forward_nodes(nodes_dict)[source]
Parameters

nodes_dict (dict[str, th.Tensor]) – Key of dict means node type, value of dict means indices of nodes.

Returns

out_feature – Output feature.

Return type

dict[str, th.Tensor]
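The behaviour described for forward() and forward_nodes() can be pictured with a minimal sketch in plain PyTorch. This is not the OpenHGNN source: the class name TinyHeteroEmbed is hypothetical, and keeping one nn.Parameter table per node type with Xavier initialization is an assumption made for illustration. Activation and dropout are left out to keep the sketch short.

```python
import torch as th
import torch.nn as nn


class TinyHeteroEmbed(nn.Module):
    """Hypothetical stand-in: one learnable embedding table per node type."""

    def __init__(self, n_nodes_dict, embed_size):
        super().__init__()
        self.embeds = nn.ParameterDict()
        for ntype, n_nodes in n_nodes_dict.items():
            table = nn.Parameter(th.empty(n_nodes, embed_size))
            nn.init.xavier_uniform_(table)  # assumed initialization
            self.embeds[ntype] = table

    def forward(self):
        # Full-batch case: return every embedding table.
        return {ntype: emb for ntype, emb in self.embeds.items()}

    def forward_nodes(self, nodes_dict):
        # Mini-batch case: look up only the requested node indices.
        return {ntype: self.embeds[ntype][idx] for ntype, idx in nodes_dict.items()}


layer = TinyHeteroEmbed({"author": 4, "paper": 3}, embed_size=8)
full = layer()
batch = layer.forward_nodes({"author": th.tensor([0, 2])})
```

Here full holds one (number of nodes, embed_size) tensor per node type, while batch holds only the rows selected by the index tensors.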

class GeneralLinear(in_features, out_features, act=None, dropout=0.0, has_l2norm=True, has_bn=True, **kwargs)[source]

General linear layer, combined with activation, normalization (batch and L2), dropout, and so on.

Parameters
  • in_features (int) – size of each input sample, which is fed into nn.Linear

  • out_features (int) – size of each output sample, which is fed into nn.Linear

  • act (callable activation function/layer or None, optional) – If not None, applies an activation function to the updated node features. Default: None.

  • dropout (float, optional) – Dropout rate. Default: 0.0

  • has_l2norm (bool) – If True, applies torch.nn.functional.normalize to the node features at the end of forward(). Default: True

  • has_bn (bool) – If True, applies torch.nn.BatchNorm1d to the node features after applying nn.Linear.

forward(batch_h: Tensor) Tensor[source]

Apply Linear, BatchNorm1d, Dropout, and normalize (if needed).
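As a rough illustration of that pipeline, the sketch below chains the same building blocks in plain PyTorch. The class name TinyGeneralLinear is hypothetical, and the position of the activation (after dropout) is an assumption, since the description above does not state it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyGeneralLinear(nn.Module):
    """Hypothetical sketch: Linear -> BatchNorm1d -> Dropout -> act -> L2 normalize."""

    def __init__(self, in_features, out_features, act=None, dropout=0.0,
                 has_l2norm=True, has_bn=True):
        super().__init__()
        layers = [nn.Linear(in_features, out_features)]
        if has_bn:
            layers.append(nn.BatchNorm1d(out_features))
        if dropout > 0:
            layers.append(nn.Dropout(p=dropout))
        if act is not None:
            layers.append(act)  # assumed position of the activation
        self.layers = nn.Sequential(*layers)
        self.has_l2norm = has_l2norm

    def forward(self, batch_h):
        batch_h = self.layers(batch_h)
        if self.has_l2norm:
            # Scale every row to unit L2 norm.
            batch_h = F.normalize(batch_h, p=2, dim=1)
        return batch_h


torch.manual_seed(0)
layer = TinyGeneralLinear(16, 4, act=nn.Tanh())
out = layer(torch.randn(5, 16))
```

With has_l2norm=True every output row has unit length, which keeps features of different node types on a comparable scale.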

class HeteroLinearLayer(linear_dict, act=None, dropout=0.0, has_l2norm=True, has_bn=True, **kwargs)[source]

Transform features with nn.Linear. In general, heterogeneous features have different input dimensions for each node type. Even when the dimensions match, each dimension may carry different semantics. We therefore use one linear layer per node type to map all node features into a shared feature space.

Parameters

linear_dict (dict) – Keys are node types (node names); each value is a list containing the input dimension and the output dimension.

Examples

>>> import torch as th
>>> linear_dict = {}
>>> linear_dict['author'] = [110, 64]
>>> linear_dict['paper'] = [128, 64]
>>> h_dict = {}
>>> h_dict['author'] = th.randn(10, 110)  # 10 authors with 110-dim features
>>> h_dict['paper'] = th.randn(5, 128)    # 5 papers with 128-dim features
>>> layer = HeteroLinearLayer(linear_dict)
>>> out_dict = layer(h_dict)
forward(dict_h: dict) dict[source]
Parameters

dict_h (dict) – A dict of heterogeneous features, keyed by node type.

Returns

dict_h – The dict of transformed features.

Return type

dict

class HeteroMLPLayer(linear_dict, act=None, dropout=0.0, has_l2norm=True, has_bn=True, final_act=False, **kwargs)[source]

HeteroMLPLayer contains multiple GeneralLinear layers, unlike HeteroLinearLayer, which contains only one layer.

Parameters

linear_dict (dict) – Keys are node types (node names); each value is a list containing the input, hidden, and output dimensions.
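To make the shape of linear_dict concrete, the following sketch turns each list of dimensions into a small per-type MLP. It uses plain nn.Linear rather than GeneralLinear, and the helper name build_mlp and the ReLU between layers are illustrative assumptions, not the library's implementation.

```python
import torch
import torch.nn as nn


def build_mlp(dims):
    """Chain Linear layers through consecutive sizes, e.g. [110, 96, 64]."""
    layers = []
    for i, (d_in, d_out) in enumerate(zip(dims[:-1], dims[1:])):
        layers.append(nn.Linear(d_in, d_out))
        if i < len(dims) - 2:
            # Non-linearity between layers only; ReLU is an illustrative choice.
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)


# One list of [input, hidden, output] dimensions per node type.
linear_dict = {"author": [110, 96, 64], "paper": [128, 96, 64]}
mlps = nn.ModuleDict({ntype: build_mlp(dims) for ntype, dims in linear_dict.items()})

h_dict = {"author": torch.randn(10, 110), "paper": torch.randn(5, 128)}
out_dict = {ntype: mlps[ntype](h) for ntype, h in h_dict.items()}
```

Both node types end up in the same 64-dimensional space even though their raw feature sizes differ.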

forward(dict_h)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class HeteroFeature(h_dict, n_nodes_dict, embed_size, act=None, need_trans=True, all_feats=True)[source]

This is a feature preprocessing component that handles the various heterogeneous feature situations.

In general, we will face the following three situations.

  1. The dataset has no features at all.

  2. The dataset has features for every node type.

  3. The dataset has features for only some node types.

To deal with these, we implement HeteroFeature. For each situation, respectively:

  1. We build embeddings for all node types.

  2. We build a linear layer for all node types.

  3. We build embeddings for the node types without features and a linear layer for the node types that have original features.
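The three situations above reduce to one per-type decision, sketched here in plain Python. The function name plan_feature_modules is hypothetical and the string placeholders stand in for real feature tensors; the sketch only shows which node types would get an embedding table and which a linear layer.

```python
def plan_feature_modules(h_dict, n_nodes_dict):
    """Decide, per node type, whether an embedding or a linear layer is needed."""
    h_dict = h_dict or {}  # h_dict may be None when the dataset has no features
    plan = {}
    for ntype in n_nodes_dict:
        plan[ntype] = "linear" if ntype in h_dict else "embedding"
    return plan


n_nodes = {"author": 10, "paper": 5}

# Situation 1: no features at all.
no_feats = plan_feature_modules(None, n_nodes)
# Situation 2: features for every node type (placeholders instead of tensors).
all_feats = plan_feature_modules({"author": "X_a", "paper": "X_p"}, n_nodes)
# Situation 3: features for only some node types.
partial = plan_feature_modules({"paper": "X_p"}, n_nodes)
```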

Parameters
  • h_dict (dict) – Input heterogeneous feature dict. Keys are node types and values are the corresponding features. It can be None if the dataset has no features.

  • n_nodes_dict (dict) – Key of dict means node type, value of dict means number of nodes.

  • embed_size (int) – Dimension of the embedding, also used as the output dimension of the Linear layer that transforms the original features.

  • need_trans (bool, optional) – A flag to control whether to transform original feature linearly. Default is True.

  • act (callable activation function/layer or None, optional) – If not None, applies an activation function to the updated node features. Default: None.

embed_dict

Stores the embeddings.

Type

nn.ParameterDict

hetero_linear

A heterogeneous linear layer to transform original feature.

Type

HeteroLinearLayer

forward()[source]


class MetapathConv(meta_paths_dict, mods, macro_func, **kargs)[source]

MetapathConv is an aggregation function based on meta-paths, similar to dgl.nn.pytorch.HeteroGraphConv. We can choose attention, APPNP, or any GraphConv layer to aggregate node features. This yields one set of embeddings per meta-path, which are then fused.

\[\mathbf{Z}=\mathcal{F}(Z^{\Phi_1},Z^{\Phi_2},...,Z^{\Phi_p})=\mathcal{F}(f(H,\Phi_1),f(H,\Phi_2),...,f(H,\Phi_p))\]

where \(\mathcal{F}\) denotes semantic fusion function, such as semantic-attention. \(\Phi_i\) denotes meta-path and \(f\) denotes the aggregation function, such as GAT, APPNP.
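The fusion step \(\mathcal{F}\) can be sketched independently of any graph library. Below, fuse covers the simple reductions, and TinySemanticAttention is a hypothetical attention-style fusion written in the spirit of semantic attention; neither is the OpenHGNN code, and the per-meta-path embeddings are made-up constants standing in for the outputs of \(f(H,\Phi_i)\).

```python
import torch
import torch.nn as nn


def fuse(z_list, how="mean"):
    """Fuse per-meta-path embeddings, each of shape (N, D), into one (N, D) tensor."""
    z = torch.stack(z_list, dim=1)  # (N, P, D), P = number of meta-paths
    if how == "mean":
        return z.mean(dim=1)
    if how == "sum":
        return z.sum(dim=1)
    if how == "max":
        return z.max(dim=1).values
    raise ValueError(f"unknown fusion: {how}")


class TinySemanticAttention(nn.Module):
    """Hypothetical attention-style fusion: learn one weight per meta-path."""

    def __init__(self, in_size, hidden_size=128):
        super().__init__()
        self.project = nn.Sequential(
            nn.Linear(in_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, 1, bias=False),
        )

    def forward(self, z):  # z: (N, P, D)
        w = self.project(z).mean(dim=0)            # (P, 1), averaged over nodes
        beta = torch.softmax(w, dim=0)             # (P, 1), sums to 1 over meta-paths
        return (beta.unsqueeze(0) * z).sum(dim=1)  # (N, D)


# Two stand-in meta-path embeddings for 4 nodes with 8 features each.
z_list = [torch.ones(4, 8), 3 * torch.ones(4, 8)]
z_mean = fuse(z_list, "mean")
z_att = TinySemanticAttention(8)(torch.stack(z_list, dim=1))
```

Because the attention weights form a convex combination, every entry of z_att lies between the smallest and largest value seen across the meta-path embeddings.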

Parameters
  • meta_paths_dict (dict[str, list[tuple(meta-path)]]) – Contains multiple meta-paths.

  • mods (nn.ModuleDict) – Aggregation functions.

  • macro_func (callable aggregation func) – The semantic aggregation method, e.g. ‘mean’, ‘max’, ‘sum’ or ‘attention’.

forward(g_dict, h_dict)[source]
Parameters
  • g_dict (dict[str: dgl.DGLGraph]) – A dict of DGLGraph(full batch) or DGLBlock(mini batch) extracted by metapaths.

  • h_dict (dict[str: torch.Tensor]) – The input features

Returns

h – The output features dict

Return type

dict[str: torch.Tensor]

class HeteroGraphConv(mods)[source]

A generic module for computing convolution on heterogeneous graphs.

The heterograph convolution applies sub-modules on their associating relation graphs, which reads the features from source nodes and writes the updated ones to destination nodes. If multiple relations have the same destination node types, their results are aggregated by the specified method.

If the relation graph has no edge, the corresponding module will not be called.

Parameters
  • mods (dict[str, nn.Module]) – Modules associated with every edge type. The forward function of each module must have a DGLHeteroGraph object as the first argument, and its second argument is either a tensor object representing the node features or a pair of tensor objects representing the source and destination node features.

  • aggregate (str, callable, optional) –

    Method for aggregating node features generated by different relations. Allowed string values are ‘sum’, ‘max’, ‘min’, ‘mean’, ‘stack’. The ‘stack’ aggregation is performed along the second dimension, whose order is deterministic. Users can also customize the aggregator by providing a callable instance. For example, aggregation by summation is equivalent to the following:

    import torch

    def my_agg_func(tensors, dsttype):
        # tensors: a list of tensors to aggregate
        # dsttype: string name of the destination node type for which the
        #          aggregation is performed
        stacked = torch.stack(tensors, dim=0)
        return torch.sum(stacked, dim=0)
    

mods

Modules associated with every edge type.

Type

dict[str, nn.Module]

forward(g, inputs, mod_args=None, mod_kwargs=None)[source]

Forward computation

Invoke the forward function with each module and aggregate their results.

Parameters
  • g (DGLHeteroGraph) – Graph data.

  • inputs (dict[str, Tensor] or pair of dict[str, Tensor]) – Input node features.

  • mod_args (dict[str, tuple[any]], optional) – Extra positional arguments for the sub-modules.

  • mod_kwargs (dict[str, dict[str, any]], optional) – Extra key-word arguments for the sub-modules.

Returns

Output representations for every type of node.

Return type

dict[str, Tensor]
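The aggregation half of this forward pass can be illustrated without a graph library. The sketch below assumes each relation's sub-module has already produced an output for its destination nodes, and that relations are keyed by a (source type, edge type, destination type) triple; both the function name aggregate_by_dsttype and that key layout are assumptions for illustration.

```python
from collections import defaultdict

import torch


def aggregate_by_dsttype(rel_outputs, agg="sum"):
    """rel_outputs maps (srctype, etype, dsttype) -> tensor of shape (N_dst, D)."""
    grouped = defaultdict(list)
    for (_, _, dsttype), h in rel_outputs.items():
        grouped[dsttype].append(h)

    result = {}
    for dsttype, tensors in grouped.items():
        stacked = torch.stack(tensors, dim=0)  # (R, N_dst, D), R = relations into dsttype
        if agg == "sum":
            result[dsttype] = stacked.sum(dim=0)
        elif agg == "mean":
            result[dsttype] = stacked.mean(dim=0)
        elif agg == "max":
            result[dsttype] = stacked.max(dim=0).values
        else:
            raise ValueError(f"unknown aggregator: {agg}")
    return result


# Stand-in per-relation outputs: two relations end at "paper", one at "author".
rel_outputs = {
    ("author", "writes", "paper"): torch.ones(5, 4),
    ("paper", "cites", "paper"): 2 * torch.ones(5, 4),
    ("paper", "written-by", "author"): torch.ones(3, 4),
}
out = aggregate_by_dsttype(rel_outputs, agg="sum")
```

Relations that share a destination type are reduced together, while a destination type reached by a single relation passes through unchanged.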

class ATTConv(ntypes, dim)[source]

This is the macro layer of the HetGNN model [HetGNN]. It is presented in Section 3.3.2 (Types Combination) of the paper.

In this framework, to keep the embedding dimension consistent and make model tuning easy, we use the same dimension d for the content embedding in Section 3.2, the aggregated content embedding in Section 3.3, and the output node embedding in Section 3.3.

Therefore, only a single dim parameter is needed.

Parameters
  • dim (int) – Input feature dimension.

  • ntypes (list) – Node types.

Note

We don’t implement a multi-head version.

atten_w is specific to the center node type and agnostic to the neighbor node type.
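A single-head sketch of this kind of type combination is given below. It follows our reading of the attention described in the HetGNN paper, where the center node's own embedding competes with one aggregated embedding per neighbor type. The function name combine_types, its argument layout, and the use of a single weight vector are assumptions; this is not the signature or body of ATTConv.forward.

```python
import torch
import torch.nn.functional as F


def combine_types(h_center, h_neigh, atten_w):
    """h_center: (N, d); h_neigh: list of (N, d), one per neighbor type; atten_w: (2d,)."""
    candidates = [h_center] + h_neigh                # the center node competes too
    scores = []
    for h in candidates:
        pair = torch.cat([h, h_center], dim=1)       # (N, 2d)
        scores.append(F.leaky_relu(pair @ atten_w))  # (N,)
    alpha = torch.softmax(torch.stack(scores, dim=1), dim=1)  # (N, T + 1)
    stacked = torch.stack(candidates, dim=1)                  # (N, T + 1, d)
    return (alpha.unsqueeze(-1) * stacked).sum(dim=1)         # (N, d)


d = 4
h_center = torch.randn(2, d)
h_neigh = [torch.randn(2, d), torch.randn(2, d)]  # e.g. two neighbor node types
atten_w = torch.randn(2 * d)                      # one vector per center node type
h_out = combine_types(h_center, h_neigh, atten_w)
```

Because every candidate shares the dimension d, a single dim parameter is enough to size the attention vector.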

forward(hg, h_neigh, h_center)[source]


class MacroConv(in_feats: int, out_feats: int, num_heads: int, dropout: float = 0.0, negative_slope: float = 0.2)[source]
Parameters
  • in_feats (int) – Input feature size.

  • out_feats (int) – Output feature size.

  • num_heads (int) – Number of heads in Multi-Head Attention.

  • dropout (float, optional) – Dropout rate. Default: 0.0

forward(graph, input_dst: dict, relation_features: dict, edge_type_transformation_weight: ParameterDict, central_node_transformation_weight: ParameterDict, edge_types_attention_weight: Parameter)[source]
Parameters
  • graph – dgl.DGLHeteroGraph

  • input_dst – dict: {ntype: features}

  • relation_features – dict: {(stype, etype, dtype): features}

  • edge_type_transformation_weight – ParameterDict {etype: (n_heads * hidden_dim, n_heads * hidden_dim)}

  • central_node_transformation_weight – ParameterDict {ntype: (input_central_node_dim, n_heads * hidden_dim)}

  • edge_types_attention_weight – Parameter (n_heads, 2 * hidden_dim)

Returns

output_features: dict, {“type”: features}

class SemanticAttention(in_size, hidden_size=128)[source]
forward(z, nty=None)[source]


class CompConv(comp_fn, norm='right', linear=False, in_feats=None, out_feats=None, bias=False, activation=None, _allow_zero_in_degree=False)[source]

Composition-based convolution, introduced in the paper Composition-based Multi-Relational Graph Convolutional Networks.

Parameters

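comp_fn (str, one of 'sub', 'mul', 'ccorr') – Composition function used to combine node features with the edge type feature: subtraction, multiplication, or circular correlation.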

forward(graph, feat, h_e, Linear=None)[source]

Compute Composition-based convolution.

Parameters
  • graph (DGLGraph) – The graph.

  • feat (torch.Tensor or pair of torch.Tensor) – If a torch.Tensor is given, it represents the input feature of shape \((N, D_{in})\) where \(D_{in}\) is size of input feature, \(N\) is the number of nodes. If a pair of torch.Tensor is given, which is the case for bipartite graph, the pair must contain two tensors of shape \((N_{in}, D_{in_{src}})\) and \((N_{out}, D_{in_{dst}})\).

  • Linear (a Linear nn.Module, optional) – Optional external weight tensor.

  • h_e (torch.Tensor) – The edge type feature, of shape \((1, D_{in})\).

Returns

The output feature

Return type

torch.Tensor

Raises

DGLError – Raised in two cases. Case 1: The input graph contains 0-in-degree nodes. No message is passed to those nodes, which causes invalid output. This error can be suppressed by setting the _allow_zero_in_degree parameter to True. Case 2: An external weight is provided while the module has also defined its own weight parameter.

Note

h_e is a tensor of shape \((1, D_{in})\).

  • Input shape: \((N, *, \text{in_feats})\) where * means any number of additional dimensions, \(N\) is the number of nodes.

  • Output shape: \((N, *, \text{out_feats})\) where all but the last dimension are the same shape as the input.

  • Linear shape: \((\text{in_feats}, \text{out_feats})\).
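The three comp_fn options can be sketched as plain tensor operations, using the shapes given in the notes above. The helper name compose and the argument order inside the circular correlation are assumptions for illustration; the sketch covers only the composition step, not message passing or the linear transform.

```python
import torch
import torch.fft


def compose(h_node, h_e, comp_fn="sub"):
    """Combine node features (N, D) with an edge type feature (1, D) by broadcasting."""
    if comp_fn == "sub":
        return h_node - h_e
    if comp_fn == "mul":
        return h_node * h_e
    if comp_fn == "ccorr":
        d = h_node.shape[-1]
        # Circular correlation via the FFT: irfft(conj(rfft(a)) * rfft(b)).
        spec = torch.conj(torch.fft.rfft(h_node, dim=-1)) * torch.fft.rfft(h_e, dim=-1)
        return torch.fft.irfft(spec, n=d, dim=-1)
    raise ValueError(f"unknown comp_fn: {comp_fn}")


h_node = torch.tensor([[1.0, 2.0, 3.0]])  # one node, D_in = 3
h_e = torch.tensor([[4.0, 5.0, 6.0]])     # edge type feature of shape (1, D_in)
out_sub = compose(h_node, h_e, "sub")
out_mul = compose(h_node, h_e, "mul")
out_ccorr = compose(h_node, h_e, "ccorr")
```

All three variants keep the feature dimension unchanged, so the composed result can feed a Linear of shape (in_feats, out_feats) afterwards.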

class AttConv(in_feats: tuple, out_feats: int, num_heads: int, dropout: float = 0.0, negative_slope: float = 0.2)[source]

Attention-based convolution, introduced in the paper Hybrid Micro/Macro Level Convolution for Heterogeneous Graph Learning.

forward(graph: DGLHeteroGraph, feat: tuple, dst_node_transformation_weight: Parameter, src_node_transformation_weight: Parameter, src_nodes_attention_weight: Parameter)[source]

Compute graph attention network layer.

Parameters
  • graph – specific relational DGLHeteroGraph

  • feat (pair of torch.Tensor) – The pair contains two tensors of shape \((N_{in}, D_{in_{src}})\) and \((N_{out}, D_{in_{dst}})\).

  • dst_node_transformation_weight – Parameter (input_dst_dim, n_heads * hidden_dim)

  • src_node_transformation_weight – Parameter (input_src_dim, n_heads * hidden_dim)

  • src_nodes_attention_weight – Parameter (n_heads, 2 * hidden_dim)

Return type

torch.Tensor, shape \((N, H, D_{out})\) where \(H\) is the number of heads and \(D_{out}\) is the size of the output feature.

class LSTMConv(dim)[source]

Aggregate the neighbors with LSTM

reset_parameters()[source]

Reinitialize learnable parameters.

Note

The LSTM module uses the Xavier initialization method for its weights.
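A minimal sketch of LSTM-based neighbor aggregation with Xavier-initialized weights is shown below. It works on a dense (nodes, neighbors, features) tensor instead of a graph, and the class name TinyLSTMAggregator, the fixed number of neighbors per node, and the choice of the final hidden state as the summary are all assumptions for illustration.

```python
import torch
import torch.nn as nn


class TinyLSTMAggregator(nn.Module):
    """Hypothetical sketch: summarize each node's neighbor sequence with an LSTM."""

    def __init__(self, dim):
        super().__init__()
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.reset_parameters()

    def reset_parameters(self):
        # Xavier initialization for the weight matrices; biases keep their defaults.
        for name, param in self.lstm.named_parameters():
            if "weight" in name:
                nn.init.xavier_uniform_(param)

    def forward(self, neigh_feats):
        # neigh_feats: (N, K, dim), K neighbors per node in some fixed order.
        _, (h_n, _) = self.lstm(neigh_feats)
        return h_n.squeeze(0)  # (N, dim): final hidden state per node


agg = TinyLSTMAggregator(dim=8)
out = agg(torch.randn(5, 3, 8))
```

Unlike mean or sum, an LSTM is sensitive to the order in which neighbors are presented, so the result depends on how the neighbor sequence is arranged.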

forward(g, inputs)[source]
