An Introduction to Graph Networks – my notes: Part 1 of X

An Introduction to Graph Networks – my notes: Part 1 of X


1. Introduction

?

I had announced in a LinkedIn post some days ago [https://www.dhirubhai.net/posts/ajay-taneja-47727817_machinelearning-deeplearning-graphnetworks-activity-7132788256881917952-ncXP?utm_source=share&utm_medium=member_desktop ] that I will be publishing my notes on Graph networks as a series of articles/blogs. This is my first blog in the series and here I talk of some fundamental aspects of Graph Networks. This article is organized as follows:


  • Section 2 and section 3 of this article are about the definition of Graphs and some real examples of graph networks.
  • Having discussed some examples of the relational structure between data explicitly expressed through a graph network in section 3, section 4 throws some light on why is it difficult to process a graph structure for Machine or Deep Learning compared to that of image processing or text processing.

?

  • Section 5 goes into the details of the anatomy of a graph and throws light on vertices/nodes, edges and talks of design choices of directed and undirected graphs.

?

  • Section 6 talks of some interesting graph networks such as “Bipartite Graphs” and Folded Networks which are a consequence of Bipartite Graphs.

?

  • Section 7 is all about numerical representation of graphs and hence goes into the basic definitions and mathematics of “Adjacency Matrices” and representation of graphs through an edge List and Adjacency List

?

  • Section 8 talks about Node Attributes and Multigraphs.


2. What are Graphs and why Graphs?

?

Graphs are a general language for describing and analysing entities with relationships and interactions. This means that rather than thinking of a domain as specific data points we can think of it in terms of networks and relationships between the entities. This means that there is an underlying graph of relation between the entities. The structure of the graph will denote the relationship between the entity’s as shown in the figure below.

Figure: Graph Network – entities related to each other in accordance with the connections




2. Data as Graphs: Examples

?

There are many types of data that can be represented as graphs and modelling the graphical relations allows us to build accurate models of the underlying phenomenon. Some of the examples of graph networks include:

  • Event Networks
  • Disease Pathways
  • Food webs
  • Particle Networks
  • Underground Networks
  • Social Networks
  • Communication Networks
  • Citation Networks
  • Network of Neurons (Brain)
  • Econmoic Networks
  • Molecules
  • 3D Shapes
  • Computer code- graph between different functions


Stated below is a concise explanation of the above graph networks.


Event Graphs:

The figure below is an example of an event graph where the circles indicate individual events, the solid arrows are causal links extracted from the game, and the dashed arrows are player-links derived from the flock of cameras.

Figure: An example of an event graph.


Disease Pathways:

This is an example of a Protein-Protein interaction wherein a system of interacting proteins collectively produces some disease. In the figure below, proteins associated with a disease are projected onto the protein-protein interaction (PPI) network. Disease pathway is then a subgraph of the PPI network defined by the set of disease-associated proteins.

Figure: Disease pathways [Reference


Food webs:

?

The figure shows a simple food web for a community of seven species: robin, fox, grasshopper, raccoon, salamander, milksnake, and toad. It may be noticed that, here, that we can de?ne competition using the food web. Two species compete if and only if they have a common prey. Thus, in the example of the figure below, raccoon and fox compete (since robin is a common prey), milksnake and raccoon compete, while salamander and robin do not compete. We use this competition relation to de?ne a graph called the competition graph.

Figure: A simple food web


?Particle networks [Ref: https://commons.wikimedia.org/wiki/File:Elementary_particle_interactions.svg ] :

The diagram below summarizes the tree-level interactions between elementary particles described in the Standard Model. Vertices (darkened circles) represent types of particles, and edges (blue arcs) connecting them represent interactions that can take place.

The organization of the diagram is as follows: the top row of vertices (leptons and quarks) are the matter particles; the second row of vertices (photon, W/Z, gluons) are the force mediating particles; and the bottom row is the Higgs boson.


Figure: Example of a Particle network in Physics


Underground Network:

The London Underground Tube Map is also a graph. The stations are the vertices and the train lines joining them are the edges. Interchanges are shown where different lines connect.

Figure: Topology map of the London Underground Tube


Social Network Graph

Model of a social network describes a graph with several types of nodes and edges with respect to each activity on the online platform. According to each model, many sub graphs can be extracted and analyzed in order to predict behavioural patterns of the users within the network.

The figure below shows a typical graph of a social media platform as LinkedIn or Facebook or X (previously, Twitter).

Figure: A typical Social Network Graph model (e.g., LinkedIn)


Citation network graph:

Here is a nice interactive example of a citation network graph: https://signsat40.signsjournal.org/cocitation/interactive/

?

Network of neurons:

This is graph showing how neurons in human brain are connected.

Figure: Network of neurons in the brain


Knowledge Graphs:

And then, we can take knowledge and represent facts as relationship between different entities. These are called as Knowledge Graphs. ?Knowledge graphs, as shown in Figure, are networks of entities, their semantic types, properties and relationships (Nickel et al. 2016).

Figure: Knowledge Graphs [


Software Code:

Figure: Software Code as a Graph Network


4. Graphs for Machine and Deep Learning: Image vs text vs graph processing for ML/DL

?

With the several examples described above, the question to be answered is how we take advantage of the relational structure inherent in a graph for better and accurate prediction. This is because explicit modelling of relationship can help us achieve better and more accurate predictions.

This leads us to the next topic of discussion – let us understand why graph networks are difficult to process for Machine and Deep Learning.?

?

Why are graph networks difficult to process for Deep Learning?

?

It should be underscored that compared to images or text; graphs are relatively difficult to process for deep learning. This is because graph networks have arbitrary size and complex topology as shown in the figure below.


Figure: Graph networks – no fixed node ordering or reference point


Contrary to images which are represented as fixed grids or text which are sequences represented as higher dimension word embedding – in case of graphs there is no spatial locality. It may be recalled that in case of text, we have a reference point – thinking of the state-of-the-art transformers - we have the concept of positional encoding which allows us to model the order of the words [https://www.dhirubhai.net/pulse/deep-dive-positional-encodings-transformer-neural-network-ajay-taneja/?trackingId=mSLcJxc7TlOj18qu3bBvEQ%3D%3D ]

?

Figure: Graph vs Image (grid of pixels) vs text (Sequence)


What is the Deep Learning process in Graphs?

Deep Learning process in graphs will involve taking the entire graph network, passing it through the “designed” neural network architecture that allows us to do this end-to-end without any human feature engineering.


Figure: End-to-End Deep Learning process in Graphs (without any human feature engineering)


For the Deep Learning process in graphs, each node will be represented as “d” dimensional embedding such that similar nodes in the network are embedded close together in the embedding space. depicting a feature representation of a node or an embedding of a link or an embedding of an entire graph.


Figure: Mapping the nodes of a graph into a “d” dimensional embedding for the Deep Learning process


Methods for Machine Learning and Representation Learning (Representation Learning implies the automatic extraction of features from the graph without any human feature engineering) for Graph Structural Data:

?

  • Traditional Methods – These include Graphlets, Graph Kernels
  • Methods for Node Embedding – DeepWalk and Node2Vec
  • Graph Neural Network – Graph Convolution Neural Network
  • Knowledge Graph and Learning and then,
  • Deep Generartive Models for Graph


I will be talking of all of the above methods in my subsequent blogs in this series.


5. Components of a Graph and Design Choices:

?

Next, let us see what the components of a graph are or a network – the network is composed of two types of objects; first we have objects or entities known as nodes or vertices and then we have interactions between these nodes which are represented as links or edges – the entire domain is called a network or a graph.

?

It should be understood that the choice of a proper network representation of a given domain/problem determines our ability to use the network successfully. In some cases, there is a unique, unambiguous representation but in other cases the representation is by no means unique. The way you assess the links will determine the nature of the problem you study.


Figure: Components of a Graph


Design Choices:

Let us now see what the design choices are we’re faced with whilst creating graphs from data. Firstly, let us define Undirected Graphs and Directed Graphs

Undirected vs Directed Graphs:

Undirected graphs are graphs used for modelling symmetric or reciprocal relationships e.g., collaboration, friendship, interaction between proteins, etc.

Figure: Undirected Graphs


Contrary, to undirected graphs, directed graphs are captured by directed links where every link has a source and a destination Examples include phone calls, financial transactions, following on LinkedIn / Instagram. This is denoted by an arrow as shown in the figure below.


Figure: Directed Graphs


Undirected Graphs: Node Degree

Node Degree denotes the number of edges adjacent to a given node. For example, the node A in the figure below has degree 4 as it’s got 4 edges connected to it.


Figure: Node Degree in an undirected graph


Undirected Graphs: Average Node Degree:

Average Node Degree simply denotes the average degree of all the nodes in the network. Mathematically, the average node degree is represented as:

It may be noticed that mathematically, the average node degree will work out to twice the number of edges in the network divided by the number of nodes in the network- one can understand here that when we’re computing the node degree each edge gets counted twice because every edge has two end points.

The above is valid for “Undirected Graphs”.


Figure: Justification on why the average node degree twice the number of edges divided by the number of nodes.


Directed Graphs: Node Degree

In Directed Graphs, we have the in-degree and out-degree. In-degree is the number of edges pointing towards the node whereas the out-degree is the number of edges pointing outward.

It may be noticed from the figure below, that the node C has an in-degree of 2 and out-degree of 1.

Figure: Directed Graphs – in-degree and out-degree



6. Bipartite Graphs and Folded Bipartite Graphs

?

Bipartite Graph [Bipartite ~ 2 partitions]:

Another very popular type of graph structure that is used quite often is called “Bipartite Graph”. Bipartite Graph is a graph where the nodes can be divided into two disjoin sets U and V such every link connects to a node in U and to one node in V such U and V are independent sets,

Examples of Bipartite Graph include:

  • Scientific Authors linked to the papers they authored.
  • ?Actors linked to the movies they appeared in.
  • ?Users linked to the movies they watched.


Figure: Bipartite Graph

?

Folded Bipartite Graph:

Having discussed about the Bipartite Graph, it is worthwhile talking about Folded Bipartite Graph – considering an example where we have Bipartite Graph as shown below which denotes scientific authors (1, 2, 3, 4, 5, 6) linked to the publications (A, B, C, D).

?

Now, if we just take the vertices of the scientific authors and connect these vertices if they have co-authored certain publication as shown in the network below.

Figure: Folded network – scientific authors connected based on the publications that they’ve co-authored


Then this type of graph is called Folded or Projected Bipartite Graph. As shown in the figure above, since 1,2,3 have co-authored Publication A, they are connected. Similarly, since nodes 2 and 5 have co-authored publications 4, 5, 6, 7 they are connected as shown above.

?

Analogous, to the above we can create a folded network corresponding to the publications A, B, C and D and we will obtain a folded network as shown in the figure below.


Figure: Folded network


It can be seen from the figure above, that, since the publications B, C, D have been co-authored by 5, they are connected and since publications B and A are co-authored by 2, they are connected.

?

7.How do we represent graphs numerically?

?

Adjacency Matrix for Undirected Graphs:


Representing graphs numerically is again an interesting problem. One way to represent graphs numerically is through an “Adjacency Matrix”. For example, in case of a network s shown in the figure below which ahs 4 nodes, the Adjacency matrix will be a square matrix of size 4 x 4 and will have entries 0 or 1. The element ij of the matrix will have an entry 1 if the nodes I and j are connected and an entry 0 if the nodes I and j are not connected.

With the above rule, the Adjacency matrix of a 4 noded network will be represented as below:


Figure: Adjacency matrix of a 4 noded undirected graph


It should be noted – the Adjacency Matrix of an Undirected Graph will naturally be symmetric – as implied through the definition above.

?

Adjacency Matrix of a Directed Graph:

If the graph is Directed, then, its Adjacency Matrix will not be symmetric. This is because if the node I is connected to j, then, the element Aij will be equal to 1 but since the node j is not connected to I, the element Aji will be 0.


Figure: Adjacency matrix for a 4 noded directed network is unsymmetric.


Node Degree in terms of Adjacency Matrix:

I have talked about the node degree in section 5. Having obtained the Adjacency matrix, it is possible express the node degree in terms of Adjacency Matix.

For an undirected graph,

Figure: Node degree in terms of Adjacency matrix for an Undirected graph


For a directed graph,


Figure: Node degree in terms of Adjacency matrix for a Directed graph


Adjacency Matrices are Sparse Because Networks are Sparse Graphs:

?

One important consequence of real-world graphs is that they are extremely sparse. Thus, if we look at the Adjacency Matrix of a real-world network, it is extremely sparse – that is: many vertices/nodes are not connected and thus the Adjacency matrix has many 0’s.

One can imagine that – human social network, citation network is extremely sparse [it is not possible for everybody to be connected to everybody or all authors have all publications!]. Thus, very rarely the Adjacency matrix will be a dense matrix.

?

Representing Graph as an Edge List:

There are 2 other ways to represent a graph numerically – one is to represent graph as an edge list. This representation is also very popular in Deep Learning frameworks.

For a network shown in the figure below, the graph can be represented as a list of edges as illustrated below.


Figure: Graph Representation as Edge List


However, unlike the Adjacency matrix, it is not trivial to do any kind of analysis of a graph through this type of representation because even computing a degree of a given node is not trivial.

?

Representing Graph as an Adjacency List:

Another common representation of a graph is Adjacency List. Adjacency List is easier to work if the network is sparse and large and allows us to quickly retrieve all neighbours of a given node.

For example, for the network shown, its Adjacency List is shown to the right in the figure:


Figure: Representing Graph as an Adjacency List


?8 .Nodes and Edge Attributes:

?

A last important thing that is noteworthy is that graphs can have attributes attached to them. Nodes, edges as well s the entire graph can have attributes/properties attached to them.

Example:

  • An Edge can have a weight denoting how strong is the relationship.
  • It can have a type – Friend based relationship, co-worker relationship.
  • Sign – friend or foe

?

Edges can have different properties – if it’s a phone call- its duration.

Nodes can have attributes such as: age, gender, location depending upon the entity under consideration.

?

Representing properties within the Adjacency Matrix:

It is also possible to represent such as weights within the Adjacency Matrix as shown in the figure below – here the weight represents the strength of the connection.


Figure: Weighted undirected graph


Multi-graphs

Sometimes one may want to represent a graph as a multi-graph as illustrated in the figure below where there is more than one connection between nodes. This can be common where the entities under consideration have more than one transaction involved.


Figure: Example of a Multigraph


要查看或添加评论,请登录

Ajay Taneja的更多文章

社区洞察

其他会员也浏览了