GNNs usually require task-specific labeled data, which is arduous to obtain.
Idea: pre-train an expressive GNN on unlabeled data with self-supervision, then transfer the learned model to downstream tasks with only a few labels.
The likelihood of graph generation is decomposed into two components: attribute generation and edge generation.
Introduction
GNNs are used for semi-supervised node classification, recommendation systems, and knowledge graph inference.
Input: a graph with node attributes; convolutional filters generate node-level representations layer by layer.
Analogous to the BERT pre-trained model: train on an unlabeled corpus, then transfer the model to downstream tasks with few labels.
Existing neural graph-generation techniques don't work for GNN pre-training because they generate graph structure without attributes; they also scale poorly to large graphs.
Diagram showing the GPT-GNN framework: attribute generation and edge generation as a joint optimization problem.
Jointly optimizing attribute generation and edge generation is equivalent to maximizing the likelihood of the whole attributed graph.
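This joint objective can be written as an autoregressive factorization of the attributed-graph likelihood. A hedged sketch following the GPT-GNN formulation (node ordering and notation assumed):

```latex
% Likelihood of the attributed graph, factorized over a node ordering,
% where X_i are node i's attributes, E_i its edges, and \theta the GNN parameters:
p_\theta(X, E) \;=\; \prod_{i} p_\theta\big(X_i, E_i \mid X_{<i}, E_{<i}\big)
```

Each conditional then splits into the two pre-training components: attribute generation, $p_\theta(X_i \mid X_{<i}, E_{<i})$, and edge generation given the already-generated attributes.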
Pre-training was successfully run on OAG (Open Academic Graph), with 179M nodes and 2B edges.
Preliminaries and Related Work
Let $H_t^{(l)}$ be the representation of node $t$ at the $l$-th GNN layer.
$N(t)$ is the set of source nodes of node $t$; $E(s,t)$ is the set of edges from $s$ to $t$.
GNN layer update, combining Aggregate and Extract over the neighborhood:
$$H_t^{(l)} = \underset{\forall s \in N(t),\, \forall e \in E(s,t)}{\text{Aggregate}}\Big(\text{Extract}\big(H_s^{(l-1)};\, H_t^{(l-1)},\, e\big)\Big)$$
Extract: pulls useful information out of each source node's representation, conditioned on the target node and the edge between them. Aggregate: combines the extracted neighborhood messages (e.g., mean, max, sum).
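The Extract/Aggregate update above can be sketched concretely. This is a minimal illustration, not the paper's implementation: Extract is assumed to be a shared linear map on the source representation (ignoring edge features), Aggregate is a mean, and the adjacency format and ReLU nonlinearity are illustrative choices.

```python
import numpy as np

def gnn_layer(H, neighbors, W):
    """One GNN layer: for each target node t, Extract messages from its
    source nodes s in N(t), then Aggregate them by mean.

    H         : (num_nodes, d_in) representations from layer l-1
    neighbors : dict mapping target node t -> list of source nodes s
    W         : (d_in, d_out) weights of the (assumed linear) Extract step
    """
    H_next = np.zeros((H.shape[0], W.shape[1]))
    for t in range(H.shape[0]):
        srcs = neighbors.get(t, [])
        if not srcs:
            H_next[t] = H[t] @ W          # isolated node: self message only
            continue
        messages = H[srcs] @ W            # Extract: transform each source state
        H_next[t] = messages.mean(axis=0) # Aggregate: mean over the neighborhood
    return np.maximum(H_next, 0.0)        # elementwise ReLU nonlinearity

# Usage: 4 nodes with 3-d features, projected to 2-d representations.
rng = np.random.default_rng(0)
H0 = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))
nbrs = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
H1 = gnn_layer(H0, nbrs, W)
print(H1.shape)  # (4, 2)
```

Stacking such layers gives the layer-by-layer node representations described above; real GNNs would also condition Extract on the target state $H_t^{(l-1)}$ and the edge $e$.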