openhgnn.dataset.AsNodeClassificationDataset

class AsNodeClassificationDataset(data, name=None, labeled_nodes_split_ratio=None, prediction_ratio=None, target_ntype=None, label_feat_name='label', label_mask_feat_name=None, **kwargs)

Repurpose a dataset for a standard semi-supervised transductive node prediction task.

The class converts a given dataset into a new dataset object that:

  • Contains only one heterogeneous graph, accessible from dataset[0].

  • The graph stores:

    • Node labels in g.nodes[target_ntype].data['label'].

    • Train/val/test masks in g.nodes[target_ntype].data['train_mask'], g.nodes[target_ntype].data['val_mask'], and g.nodes[target_ntype].data['test_mask'] respectively.

  • In addition, the dataset contains the following attributes:

    • num_classes, the number of classes to predict.

    • train_idx, val_idx, test_idx, the indices of the training, validation and test nodes.

The class keeps only the first graph in the provided dataset and generates train/val/test masks according to the given split ratio. The generated masks are cached to disk for fast re-loading. If the provided split ratio differs from the cached one, the dataset is re-processed with the new ratio.
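The ratio-based split described above can be illustrated with a minimal, library-free sketch. It is not the OpenHGNN implementation: the helper names `split_labeled_nodes` and `ids_to_mask`, the `seed` argument, and the choice to give any rounding remainder to the test set are all assumptions made for illustration.

```python
import random


def split_labeled_nodes(labeled_ids, split_ratio, seed=0):
    """Shuffle labeled node IDs and cut them into train/val/test lists.

    Hypothetical helper: the real class may shuffle and round differently.
    """
    train_r, val_r, test_r = split_ratio
    if abs(train_r + val_r + test_r - 1.0) > 1e-6:
        raise ValueError("split ratios must sum to 1")
    ids = list(labeled_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_train = int(len(ids) * train_r)
    n_val = int(len(ids) * val_r)
    train_ids = ids[:n_train]
    val_ids = ids[n_train:n_train + n_val]
    test_ids = ids[n_train + n_val:]  # assumption: remainder goes to test
    return train_ids, val_ids, test_ids


def ids_to_mask(ids, num_nodes):
    """Turn a list of node IDs into a boolean mask of length num_nodes."""
    mask = [False] * num_nodes
    for i in ids:
        mask[i] = True
    return mask


# Ten labeled nodes split 60/20/20.
train_ids, val_ids, test_ids = split_labeled_nodes(range(10), (0.6, 0.2, 0.2))
train_mask = ids_to_mask(train_ids, 10)
val_mask = ids_to_mask(val_ids, 10)
test_mask = ids_to_mask(test_ids, 10)
```

In this sketch the three ID lists are disjoint and together cover every labeled node, which is the property the generated masks are expected to have.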

Parameters:
  • data (DGLDataset or DGLHeteroGraph) – The dataset or graph to be converted.

  • name (str) – The dataset name. Optional when data is DGLDataset. Required when data is DGLHeteroGraph.

  • labeled_nodes_split_ratio ((float, float, float), optional) – Split ratios for training, validation and test sets. Must sum to 1. If None, we will use the train_mask, val_mask and test_mask from the original graph.

  • prediction_ratio (float, optional) – The ratio of the number of prediction nodes to the number of all unlabeled nodes. Must be between 0 and 1. If None, we will use the pred_mask from the original graph.

  • target_ntype (str) – The node type to add split mask for.

  • label_feat_name (str, optional) – The feature name of label. If None, we will use the name “label”.

  • label_mask_feat_name (str, optional) – The feature name of the mask indicating the indices of nodes with labels. None means that all nodes are labeled.
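To make the interplay of label_mask_feat_name and prediction_ratio concrete, here is a small sketch in plain Python. It assumes that prediction nodes are drawn at random from the nodes whose label mask is false; the helper `pick_prediction_nodes` and its `seed` argument are hypothetical and do not come from the library.

```python
import random


def pick_prediction_nodes(label_mask, prediction_ratio, seed=0):
    """Select a fraction of the unlabeled nodes as prediction nodes.

    Hypothetical helper; the sampling strategy is an assumption.
    """
    if not 0.0 <= prediction_ratio <= 1.0:
        raise ValueError("prediction_ratio must be between 0 and 1")
    # Nodes whose label mask entry is False carry no label.
    unlabeled = [i for i, has_label in enumerate(label_mask) if not has_label]
    random.Random(seed).shuffle(unlabeled)
    n_pred = int(len(unlabeled) * prediction_ratio)
    return sorted(unlabeled[:n_pred])


# Eight nodes, four of them unlabeled; half of the unlabeled ones are selected.
label_mask = [True, True, False, False, True, False, False, True]
pred_ids = pick_prediction_nodes(label_mask, 0.5)
```

With a ratio of 0 no node is selected, and with a ratio of 1 every unlabeled node becomes a prediction node.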

num_classes

Number of classes to predict.

Type:

int

train_idx

A 1-D integer tensor of training node IDs.

Type:

Tensor

val_idx

A 1-D integer tensor of validation node IDs.

Type:

Tensor

test_idx

A 1-D integer tensor of test node IDs.

Type:

Tensor

pred_idx

A 1-D integer tensor of prediction node IDs.

Type:

Tensor
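The index attributes and the boolean masks stored on the graph carry the same information in two forms. The following sketch assumes that each index tensor lists the positions where the corresponding mask is true; it uses plain Python lists instead of tensors, and `mask_to_idx` is a hypothetical helper name.

```python
def mask_to_idx(mask):
    """Return the positions at which a boolean mask is True."""
    return [i for i, flag in enumerate(mask) if flag]


# Toy masks over five nodes of the target node type.
train_mask = [True, False, False, True, False]
val_mask = [False, True, False, False, False]
test_mask = [False, False, True, False, True]

train_idx = mask_to_idx(train_mask)
val_idx = mask_to_idx(val_mask)
test_idx = mask_to_idx(test_mask)
```

Index form is convenient for slicing model outputs and labels, while mask form is convenient for storing the split as a node feature.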