Setting Up Graph Neural Network Prediction Tasks: Part 13 of my Graph Series of Blogs
Stanford University Course on "Machine Learning with Graphs"



1. Introduction:

This is the continuation of my Graph Series of Blogs and is the thirteenth blog in the series. In this blog, we will be discussing a very interesting aspect of Graph Machine Learning! This blog will further dig into the theory of Graph Neural Networks beyond the discussions in some of my earlier blogs in the ongoing Graph Series.

In the discussions so far, we have seen how to design Graph Neural Networks and how to set up the training and inference pipeline, and we have talked about:

a) Setting up the input Graph.

b) Defining the Graph Neural Network Architecture.

c) Using the Graph Neural Network Architecture to create node embeddings.

d) Getting to the prediction head from the node embeddings.

e) The different prediction heads for node classification, link prediction and graph classification.

f) Making node predictions and then comparing the predictions with the ground-truth labels to optimize the loss and back-propagate all the way down through the graph neural network structure.

g) Assessing the performance of the model with different evaluation metrics.

?

The above points are illustrated in the Figure below and were part of the discussions in Part 8, Part 9 and Part 12 of the ongoing series.



Figure 1: A Typical GNN Training Pipeline


One thing that remains unanswered is how we set up the tasks: how do we split the dataset effectively into training, validation and test sets?

We will now address this question of splitting Graph datasets into training/validation/test sets.



Figure 2: How do we split the dataset into Training/Validation/Test Set?


With this objective in mind, this article is organized as follows:

· Section 2 talks about splitting of datasets into training/validation/test sets and explains Fixed and Random Splits.

· Section 3 explains why splitting into training/validation/test sets in Graphs is different compared to a problem involving just images or documents.

· Section 4 explains the Transductive and Inductive settings for splitting of datasets in Graphs.

· Section 5 covers some examples of splitting of datasets for problems involving node classification, graph classification and edge prediction.


2. Splitting of Graph Dataset into Training / Validation and Test Set

We have two options for splitting the dataset:

Option 1: Fixed Split

We split the dataset once into three disjoint, independent pieces:

a) A “Training Set”, which we use to optimize the GNN model parameters.

b) A “Validation Set”, which we use to tune the model hyperparameters and the various constants / design choices in the model architecture.

c) Then, once we have the final model parameters and hyperparameters, we apply the model to the independent “Test Set” that was held out all along, in order to evaluate the final performance.

This is a fair way to evaluate the model because we use the training and validation sets to build and fine-tune the model, and we use the untouched test set to evaluate the model performance.
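The fixed split above can be sketched in a few lines of plain Python. This is a minimal sketch, not code from the course: the 80/10/10 fractions and the function name `fixed_split` are illustrative assumptions.

```python
import random

def fixed_split(num_items, frac_train=0.8, frac_valid=0.1, seed=0):
    """Split item indices once into disjoint train/validation/test sets.

    The 80/10/10 fractions are an illustrative assumption.
    """
    rng = random.Random(seed)
    idx = list(range(num_items))
    rng.shuffle(idx)
    n_train = int(frac_train * num_items)
    n_valid = int(frac_valid * num_items)
    train = idx[:n_train]                      # optimize model parameters
    valid = idx[n_train:n_train + n_valid]     # tune hyperparameters
    test = idx[n_train + n_valid:]             # held out until the very end
    return train, valid, test

train, valid, test = fixed_split(100)
```

The three index lists are disjoint and together cover the whole dataset, which is exactly the "three disjoint pieces" property described above.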


Option 2: Random Split

In a Random Split, we randomly split the data into training, validation and test sets and, rather than relying on a single split, we report the average performance over different random seeds. That is, we try different instantiations of the training/validation/test split and report the average performance to get a more robust result.
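Averaging over seeds can be sketched as below. `evaluate` is a hypothetical stand-in for "draw a random split with this seed, train the GNN, return its test score"; it is not a real library function.

```python
def average_over_seeds(evaluate, seeds=(0, 1, 2, 3, 4)):
    """Report the mean score over several random train/valid/test splits.

    `evaluate(seed)` is a hypothetical callable: it should build a split
    from `seed`, train and evaluate a model, and return a score.
    """
    scores = [evaluate(seed) for seed in seeds]
    return sum(scores) / len(scores)
```

Reporting the mean (and, in practice, often the standard deviation) over seeds is what makes the random-split result robust to one lucky or unlucky partition.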


3. Concern with Graphs: Why is splitting Graphs special?

Something that is unique and interesting in Graphs is that we cannot guarantee that the test set will really be held out – that is, we cannot guarantee that there will be no information leakage from the training and validation sets into the test set. Let us understand this in some detail:

Let us say we have an image dataset or a document dataset as shown in the figure below. In such datasets, the datapoints – the images or documents – are independent of one another.


Figure 3: Datapoints independent of one-another in Image or Document dataset


Since the datapoints are independent of one another, it is easy to split them into training, validation and test sets. There is no leakage of information from the training set into the validation or test sets.

However, splitting in Graphs is different – the problem with Graphs is that the nodes are connected to each other, as shown below:


Figure 4: Datapoints in Graphs – Nodes – connected to each other


These nodes are not independent of one another – they are connected, and each node represents a datapoint. For example, if we take node 5, then in order to make predictions on node 5, the GNN will also have to take information from nodes 1 and 2.

This means that nodes 1 and 2 will affect the predictions on node 5. So if nodes 1 and 2 are in the training set and node 5 is in the test set, then clearly there will be some information leakage. This is why the splitting of the dataset in Graphs is interesting; let us see what the various options are.


4. Options in Graphs for Splitting of Dataset:

Let us now see the options for splitting of the dataset in Graphs. We can do the following:

Solution 1: Transductive Setting

The first solution is the “Transductive Setting”, where the input graph can be seen in all the dataset splits. That is, we work with the same graph structure in the training, validation and test sets. We only split the node labels – this means we keep some node labels in the training set, some in the validation set and some in the test set – keeping the same graph structure throughout, as shown in the figure below:


Figure 5: Transductive Setting – Split only the Node Labels – Whole Graph Structure retained for Training


This means that during training we compute the embeddings using the entire graph structure but train using the labels of nodes 1 and 2.


At validation time, we again compute the embeddings using the entire graph but tune the hyperparameters using the labels of nodes 3 and 4 – for the example shown in Figure 5.
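The transductive setting can be sketched as one shared graph with the node labels partitioned into three sets. The node numbering (1 to 6) and the label assignment follow the spirit of Figure 5 and are assumptions for illustration, not exact values from the course.

```python
# Transductive split sketch: one graph, three disjoint label sets.
nodes = [1, 2, 3, 4, 5, 6]
train_nodes = {1, 2}   # labels used to train the model parameters
valid_nodes = {3, 4}   # labels used to tune the hyperparameters
test_nodes = {5, 6}    # labels held out for the final evaluation

# Every split sees the same full graph for message passing; only the
# supervision labels differ, so each node's label appears in exactly
# one split.
splits_per_node = {
    n: (n in train_nodes) + (n in valid_nodes) + (n in test_nodes)
    for n in nodes
}
```

The key invariant is that the label sets partition the nodes while the graph structure itself is never split.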

Solution 2: Inductive Setting

Another approach is called the Inductive Setting. Here we break the edges between the splits to obtain multiple graphs or multiple independent components, as shown in Figure 6 below:



Figure 6: Three independent (separate) Graphs for training, validation and test sets


Here we will have three independent Graphs – three different Graphs – and we call one the Training Graph and the others the Validation Graph and the Test Graph.

We can observe from Figure 6 that if we are making a prediction on node 5, it will not be affected by information from nodes 1 and 2. This means that:

  • During training time, we compute the embeddings using nodes 1 and 2 – this is for training the model parameters.

  • During validation time, we compute the embeddings using nodes 3 and 4 and evaluate on the labels of nodes 3 and 4.


The drawback of the Inductive setting is that we have thrown away a lot of Graph information (edges) during the formulation of the training, validation and test sets, and if the Graph is small this is not preferred.
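The edge dropping behind the inductive setting can be shown in a toy sketch: assign each node to a split and delete every edge whose endpoints fall into different splits, leaving independent graphs. The node-to-split assignment and the edge list here are illustrative assumptions, not data from the lecture.

```python
# Inductive split sketch: keep only intra-split edges.
split_of = {1: "train", 2: "train", 3: "valid", 4: "valid", 5: "test", 6: "test"}
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]

# Edges within one split survive; cross-split edges are thrown away,
# which is exactly the information loss discussed above.
kept = [(u, v) for (u, v) in edges if split_of[u] == split_of[v]]
dropped = [(u, v) for (u, v) in edges if split_of[u] != split_of[v]]
```

Here two of the five edges are dropped, which illustrates why this setting is costly for small graphs.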

Therefore, the trade-off is: either we accept some leakage of information between the training, validation and test sets while keeping the labels independent, as in the Transductive setting, or we throw away information and chop the Graph into independent pieces, as in the Inductive setting.


Summary of Solution 1 and Solution 2:

Transductive Setting:

1. In the Transductive Setting, we have the training, validation and test sets on the same Graph.

2. The entire Graph is part of the training, validation and test sets.

3. We only split the node labels into the training, validation and test sets.

This setting is applicable to node prediction as well as edge prediction tasks.

Inductive Setting:

1. In the Inductive Setting, the training, validation and test sets are all on different Graphs.

2. These Graphs are independent of each other.

3. We compute the node embeddings and train using the Graph in the training set, validate on the Graph and node labels in the validation set, and test on the unseen Graph in the test set.


5. Examples – Transductive and Inductive Setting

Now, let us see some examples involving node classification, graph classification and link prediction with the Transductive and Inductive approaches.

Node Classification:

In Transductive node classification, all the splits retain the entire graph structure, but the labels of the respective nodes are split into the training, validation and test sets as shown below:



Figure 7: Splitting of Node Labels into Training/Validation/Test Set while retaining the entire Graph in all sets – Node Classification Problem


In the Inductive setting, we split the Graph structure into multiple Graphs, incorporating some graphs in the training set, some in the validation set and others in the test set.


Multiple Graphs will have to be created by dropping/chopping edges from the Graph structure, and the resulting graphs are then split into training/validation/test sets as shown in Figure 8.


Figure 8: Inductive Setting – Splitting into Training/Validation/Test involves 3 independent Graphs for the respective sets

Graph Classification:

Now that we have talked about Node Classification, let us switch to Graph Classification. In Graph Classification, the Inductive Setting is well defined because we already have independent Graphs, and we can simply put them into the training, validation and test sets. Suppose we have a dataset of 5 graphs; then each split will contain independent graph(s), as shown in the Figure below. There is no chance of information leakage as the Graphs are already independent.



Figure 9: Graph Classification – problem is simpler as we already have independent Graphs and assign them appropriately into Training/Validation/Test Set


Edge Prediction:

The trickiest setting up of training, validation and test sets in Graphs is in a link prediction problem. In link prediction, the goal is to predict missing edges, and setting up the training, validation and test sets requires a bit of thought. That is because link prediction is a self-supervised (or semi-supervised) task, and we need to create the data splits ourselves.

I have discussed setting up of Link Prediction in section 8 of Part 11 of the ongoing Graph series of my blogs. In Link Prediction, we will have to hide some edges from the GNN and let the GNN predict whether those edges exist.

For link prediction, we will split the edges twice:

  • Step 1: Assign two types of edges in the original graph:

  1. Message-passing edges: used for GNN message passing.
  2. Supervision edges: used for computing the objective.

  • After Step 1:

  1. Only the message-passing edges remain in the graph.
  2. The supervision edges are used as supervision for the edge predictions made by the model and are not fed into the GNN!

  • Step 2: Split the edges into train / validation / test sets.

Transductive Setting:

The Transductive approach is illustrated below:



Figure 10: Transductive Setting – Link Prediction Problem


Inductive Link Prediction:

In Inductive Link Prediction, each inductive split contains an independent Graph. Within the Training, Validation and Test sets, we have two types of edges:

  • Message-passing edges
  • Supervision edges

The supervision edges are not input to the GNN.


This is illustrated in the Figure below:


Figure 11: Inductive Setting for Link Prediction tasks

