登录查看更多内容

An Introduction to Graph Networks – my notes: Part 1 of X

Ajay Taneja

Senior Data Engineer | Generative AI Engineer at Jaguar Land Rover | Ex - Rolls-Royce | Data Engineering, Data Science, Finite Element Methods Development, Stress Analysis, Fatigue and Fracture Mechanics

发布日期: 2023年11月26日

1. Introduction

I had announced in a LinkedIn post some days ago [https://www.dhirubhai.net/posts/ajay-taneja-47727817_machinelearning-deeplearning-graphnetworks-activity-7132788256881917952-ncXP?utm_source=share&utm_medium=member_desktop ] that I will be publishing my notes on Graph networks as a series of articles/blogs. This is my first blog in the series and here I talk of some fundamental aspects of Graph Networks. This article is organized as follows:

Section 2 and section 3 of this article are about the definition of Graphs and some real examples of graph networks.
Having discussed some examples of the relational structure between data explicitly expressed through a graph network in section 3, section 4 throws some light on why is it difficult to process a graph structure for Machine or Deep Learning compared to that of image processing or text processing.

Section 5 goes into the details of the anatomy of a graph and throws light on vertices/nodes, edges and talks of design choices of directed and undirected graphs.

Section 6 talks of some interesting graph networks such as “Bipartite Graphs” and Folded Networks which are a consequence of Bipartite Graphs.

Section 7 is all about numerical representation of graphs and hence goes into the basic definitions and mathematics of “Adjacency Matrices” and representation of graphs through an edge List and Adjacency List

Section 8 talks about Node Attributes and Multigraphs.

2. What are Graphs and why Graphs?

Graphs are a general language for describing and analysing entities with relationships and interactions. This means that rather than thinking of a domain as specific data points we can think of it in terms of networks and relationships between the entities. This means that there is an underlying graph of relation between the entities. The structure of the graph will denote the relationship between the entity’s as shown in the figure below.

Figure: Graph Network – entities related to each other in accordance with the connections

2. Data as Graphs: Examples

There are many types of data that can be represented as graphs and modelling the graphical relations allows us to build accurate models of the underlying phenomenon. Some of the examples of graph networks include:

Event Networks
Disease Pathways
Food webs
Particle Networks
Underground Networks
Social Networks
Communication Networks
Citation Networks
Network of Neurons (Brain)
Econmoic Networks
Molecules
3D Shapes
Computer code- graph between different functions

Stated below is a concise explanation of the above graph networks.

Event Graphs:

The figure below is an example of an event graph where the circles indicate individual events, the solid arrows are causal links extracted from the game, and the dashed arrows are player-links derived from the flock of cameras.

Disease Pathways:

This is an example of a Protein-Protein interaction wherein a system of interacting proteins collectively produces some disease. In the figure below, proteins associated with a disease are projected onto the protein-protein interaction (PPI) network. Disease pathway is then a subgraph of the PPI network defined by the set of disease-associated proteins.

Food webs:

The figure shows a simple food web for a community of seven species: robin, fox, grasshopper, raccoon, salamander, milksnake, and toad. It may be noticed that, here, that we can de?ne competition using the food web. Two species compete if and only if they have a common prey. Thus, in the example of the figure below, raccoon and fox compete (since robin is a common prey), milksnake and raccoon compete, while salamander and robin do not compete. We use this competition relation to de?ne a graph called the competition graph.

?Particle networks [Ref: https://commons.wikimedia.org/wiki/File:Elementary_particle_interactions.svg ] :

The diagram below summarizes the tree-level interactions between elementary particles described in the Standard Model. Vertices (darkened circles) represent types of particles, and edges (blue arcs) connecting them represent interactions that can take place.

The organization of the diagram is as follows: the top row of vertices (leptons and quarks) are the matter particles; the second row of vertices (photon, W/Z, gluons) are the force mediating particles; and the bottom row is the Higgs boson.

Figure: Example of a Particle network in Physics

Underground Network:

The London Underground Tube Map is also a graph. The stations are the vertices and the train lines joining them are the edges. Interchanges are shown where different lines connect.

Figure: Topology map of the London Underground Tube

Social Network Graph

Model of a social network describes a graph with several types of nodes and edges with respect to each activity on the online platform. According to each model, many sub graphs can be extracted and analyzed in order to predict behavioural patterns of the users within the network.

The figure below shows a typical graph of a social media platform as LinkedIn or Facebook or X (previously, Twitter).

Figure: A typical Social Network Graph model (e.g., LinkedIn)

Citation network graph:

Here is a nice interactive example of a citation network graph: https://signsat40.signsjournal.org/cocitation/interactive/

Network of neurons:

This is graph showing how neurons in human brain are connected.

Knowledge Graphs:

And then, we can take knowledge and represent facts as relationship between different entities. These are called as Knowledge Graphs. ?Knowledge graphs, as shown in Figure, are networks of entities, their semantic types, properties and relationships (Nickel et al. 2016).

Software Code:

4. Graphs for Machine and Deep Learning: Image vs text vs graph processing for ML/DL

With the several examples described above, the question to be answered is how we take advantage of the relational structure inherent in a graph for better and accurate prediction. This is because explicit modelling of relationship can help us achieve better and more accurate predictions.

This leads us to the next topic of discussion – let us understand why graph networks are difficult to process for Machine and Deep Learning.?

Why are graph networks difficult to process for Deep Learning?

It should be underscored that compared to images or text; graphs are relatively difficult to process for deep learning. This is because graph networks have arbitrary size and complex topology as shown in the figure below.

Figure: Graph networks – no fixed node ordering or reference point

Contrary to images which are represented as fixed grids or text which are sequences represented as higher dimension word embedding – in case of graphs there is no spatial locality. It may be recalled that in case of text, we have a reference point – thinking of the state-of-the-art transformers - we have the concept of positional encoding which allows us to model the order of the words [https://www.dhirubhai.net/pulse/deep-dive-positional-encodings-transformer-neural-network-ajay-taneja/?trackingId=mSLcJxc7TlOj18qu3bBvEQ%3D%3D ]

Figure: Graph vs Image (grid of pixels) vs text (Sequence)

What is the Deep Learning process in Graphs?

Deep Learning process in graphs will involve taking the entire graph network, passing it through the “designed” neural network architecture that allows us to do this end-to-end without any human feature engineering.

Figure: End-to-End Deep Learning process in Graphs (without any human feature engineering)

For the Deep Learning process in graphs, each node will be represented as “d” dimensional embedding such that similar nodes in the network are embedded close together in the embedding space. depicting a feature representation of a node or an embedding of a link or an embedding of an entire graph.

Figure: Mapping the nodes of a graph into a “d” dimensional embedding for the Deep Learning process

Methods for Machine Learning and Representation Learning (Representation Learning implies the automatic extraction of features from the graph without any human feature engineering) for Graph Structural Data:

Traditional Methods – These include Graphlets, Graph Kernels
Methods for Node Embedding – DeepWalk and Node2Vec
Graph Neural Network – Graph Convolution Neural Network
Knowledge Graph and Learning and then,
Deep Generartive Models for Graph

I will be talking of all of the above methods in my subsequent blogs in this series.

5. Components of a Graph and Design Choices:

Next, let us see what the components of a graph are or a network – the network is composed of two types of objects; first we have objects or entities known as nodes or vertices and then we have interactions between these nodes which are represented as links or edges – the entire domain is called a network or a graph.

It should be understood that the choice of a proper network representation of a given domain/problem determines our ability to use the network successfully. In some cases, there is a unique, unambiguous representation but in other cases the representation is by no means unique. The way you assess the links will determine the nature of the problem you study.

Design Choices:

Let us now see what the design choices are we’re faced with whilst creating graphs from data. Firstly, let us define Undirected Graphs and Directed Graphs

Undirected vs Directed Graphs:

Undirected graphs are graphs used for modelling symmetric or reciprocal relationships e.g., collaboration, friendship, interaction between proteins, etc.

领英推荐

Data Phoenix Digest - ISSUE 6.2023

Dmytro Spodarets 1 年前

FiftyOne Computer Vision Community Update – September…

Jimmy Guerrero 1 年前

Heatmaps: FiftyOne Computer Vision Tips and Tricks –…

Voxel51 1 年前

Contrary, to undirected graphs, directed graphs are captured by directed links where every link has a source and a destination Examples include phone calls, financial transactions, following on LinkedIn / Instagram. This is denoted by an arrow as shown in the figure below.

Undirected Graphs: Node Degree

Node Degree denotes the number of edges adjacent to a given node. For example, the node A in the figure below has degree 4 as it’s got 4 edges connected to it.

Figure: Node Degree in an undirected graph

Undirected Graphs: Average Node Degree:

Average Node Degree simply denotes the average degree of all the nodes in the network. Mathematically, the average node degree is represented as:

It may be noticed that mathematically, the average node degree will work out to twice the number of edges in the network divided by the number of nodes in the network- one can understand here that when we’re computing the node degree each edge gets counted twice because every edge has two end points.

The above is valid for “Undirected Graphs”.

Figure: Justification on why the average node degree twice the number of edges divided by the number of nodes.

Directed Graphs: Node Degree

In Directed Graphs, we have the in-degree and out-degree. In-degree is the number of edges pointing towards the node whereas the out-degree is the number of edges pointing outward.

It may be noticed from the figure below, that the node C has an in-degree of 2 and out-degree of 1.

Figure: Directed Graphs – in-degree and out-degree

6. Bipartite Graphs and Folded Bipartite Graphs

Bipartite Graph [Bipartite ~ 2 partitions]:

Another very popular type of graph structure that is used quite often is called “Bipartite Graph”. Bipartite Graph is a graph where the nodes can be divided into two disjoin sets U and V such every link connects to a node in U and to one node in V such U and V are independent sets,

Examples of Bipartite Graph include:

Scientific Authors linked to the papers they authored.
?Actors linked to the movies they appeared in.
?Users linked to the movies they watched.

Folded Bipartite Graph:

Having discussed about the Bipartite Graph, it is worthwhile talking about Folded Bipartite Graph – considering an example where we have Bipartite Graph as shown below which denotes scientific authors (1, 2, 3, 4, 5, 6) linked to the publications (A, B, C, D).

Now, if we just take the vertices of the scientific authors and connect these vertices if they have co-authored certain publication as shown in the network below.

Figure: Folded network – scientific authors connected based on the publications that they’ve co-authored

Then this type of graph is called Folded or Projected Bipartite Graph. As shown in the figure above, since 1,2,3 have co-authored Publication A, they are connected. Similarly, since nodes 2 and 5 have co-authored publications 4, 5, 6, 7 they are connected as shown above.

Analogous, to the above we can create a folded network corresponding to the publications A, B, C and D and we will obtain a folded network as shown in the figure below.

It can be seen from the figure above, that, since the publications B, C, D have been co-authored by 5, they are connected and since publications B and A are co-authored by 2, they are connected.

7.How do we represent graphs numerically?

Adjacency Matrix for Undirected Graphs:

Representing graphs numerically is again an interesting problem. One way to represent graphs numerically is through an “Adjacency Matrix”. For example, in case of a network s shown in the figure below which ahs 4 nodes, the Adjacency matrix will be a square matrix of size 4 x 4 and will have entries 0 or 1. The element ij of the matrix will have an entry 1 if the nodes I and j are connected and an entry 0 if the nodes I and j are not connected.

With the above rule, the Adjacency matrix of a 4 noded network will be represented as below:

Figure: Adjacency matrix of a 4 noded undirected graph

It should be noted – the Adjacency Matrix of an Undirected Graph will naturally be symmetric – as implied through the definition above.

Adjacency Matrix of a Directed Graph:

If the graph is Directed, then, its Adjacency Matrix will not be symmetric. This is because if the node I is connected to j, then, the element Aij will be equal to 1 but since the node j is not connected to I, the element Aji will be 0.

Figure: Adjacency matrix for a 4 noded directed network is unsymmetric.

Node Degree in terms of Adjacency Matrix:

I have talked about the node degree in section 5. Having obtained the Adjacency matrix, it is possible express the node degree in terms of Adjacency Matix.

For an undirected graph,

For a directed graph,

Adjacency Matrices are Sparse Because Networks are Sparse Graphs:

One important consequence of real-world graphs is that they are extremely sparse. Thus, if we look at the Adjacency Matrix of a real-world network, it is extremely sparse – that is: many vertices/nodes are not connected and thus the Adjacency matrix has many 0’s.

One can imagine that – human social network, citation network is extremely sparse [it is not possible for everybody to be connected to everybody or all authors have all publications!]. Thus, very rarely the Adjacency matrix will be a dense matrix.

Representing Graph as an Edge List:

There are 2 other ways to represent a graph numerically – one is to represent graph as an edge list. This representation is also very popular in Deep Learning frameworks.

For a network shown in the figure below, the graph can be represented as a list of edges as illustrated below.

Figure: Graph Representation as Edge List

However, unlike the Adjacency matrix, it is not trivial to do any kind of analysis of a graph through this type of representation because even computing a degree of a given node is not trivial.

Representing Graph as an Adjacency List:

Another common representation of a graph is Adjacency List. Adjacency List is easier to work if the network is sparse and large and allows us to quickly retrieve all neighbours of a given node.

For example, for the network shown, its Adjacency List is shown to the right in the figure:

Figure: Representing Graph as an Adjacency List

?8 .Nodes and Edge Attributes:

A last important thing that is noteworthy is that graphs can have attributes attached to them. Nodes, edges as well s the entire graph can have attributes/properties attached to them.

Example:

An Edge can have a weight denoting how strong is the relationship.
It can have a type – Friend based relationship, co-worker relationship.
Sign – friend or foe

Edges can have different properties – if it’s a phone call- its duration.

Nodes can have attributes such as: age, gender, location depending upon the entity under consideration.

Representing properties within the Adjacency Matrix:

It is also possible to represent such as weights within the Adjacency Matrix as shown in the figure below – here the weight represents the strength of the connection.

Multi-graphs

Sometimes one may want to represent a graph as a multi-graph as illustrated in the figure below where there is more than one connection between nodes. This can be common where the entities under consideration have more than one transaction involved.

要查看或添加评论，请登录

Ajay Taneja的更多文章

The Marriage of Retrieval-Augmented Generation (RAGs) with Knowledge Graphs: Part 15 of my Graph Series of Blogs

2024年10月24日

The Marriage of Retrieval-Augmented Generation (RAGs) with Knowledge Graphs: Part 15 of my Graph Series of Blogs

1. Introduction: The general idea of Retrieval-Augmented Generation (RAGs) is now well understood in LLM community and…

2 条评论
Knowledge Graph Completion and Knowledge Graph Embeddings: Part 14 of my Graph Series of Blogs

2024年9月23日

Knowledge Graph Completion and Knowledge Graph Embeddings: Part 14 of my Graph Series of Blogs

1. Introduction: This is the continuation of my series of blogs on Graphs and is the 14th article in the series.

3 条评论
Setting Up Graph Neural Network Prediction Tasks: Part 13 of my Graph Series of Blogs

2024年8月26日

Setting Up Graph Neural Network Prediction Tasks: Part 13 of my Graph Series of Blogs

1. Introduction: This is the continuation of my Graph Series of Blogs and is the thirteenth blog in the series.
Training Graph Neural Networks: Part 12 of my Graph series of blogs

2024年8月18日

Training Graph Neural Networks: Part 12 of my Graph series of blogs

1. Introduction: This is the continuation of my series of blogs on Graphs and is the twelfth article in the series.
Heterogeneous Graphs and Relational Graph Convolutional Neural Networks (RGCNs): Part 11 of my Graph series of blogs

2024年6月30日

Heterogeneous Graphs and Relational Graph Convolutional Neural Networks (RGCNs): Part 11 of my Graph series of blogs

1. Introduction: This article is the continuation of my series of blogs on “Graphs” and is the eleventh article in the…
Knowledge Graphs for RAG: Part 10 of my Graph series of blogs

2024年5月26日

Knowledge Graphs for RAG: Part 10 of my Graph series of blogs

1.Introduction: Few weeks ago, I had taken a course titled: “Knowledge Graphs for RAG” – the course is offered by…
Deep Learning with Graphs: Part 9 of my Graph Series of blogs

2024年4月28日

Deep Learning with Graphs: Part 9 of my Graph Series of blogs

1. Introduction: This is the continuation of my series of blogs on Graphs.
An Introduction to Graph Neural Networks: Part 8 of my Graph Series

2024年4月7日

An Introduction to Graph Neural Networks: Part 8 of my Graph Series

1. Introduction: This is the continuation of my series of blogs on Graphs and is the 8th blog in the series.
Embedding Entire Graphs or Sub-Graphs: Part 7 of X of my notes

2024年3月18日

Embedding Entire Graphs or Sub-Graphs: Part 7 of X of my notes

1. Introduction: This is the continuation of my series of blogs on Graphs and is the 7th blog in the series.
Formulation of Node Embeddings in Graphs: Node2Vec Algorithm - Part 6 of X of my notes

2024年3月3日

Formulation of Node Embeddings in Graphs: Node2Vec Algorithm - Part 6 of X of my notes

1.Introduction: This is the continuation of my series of articles on Graphs and is the sixth article in the series.

See all articles

An Introduction to Graph Networks – my notes: Part 1 of X

Ajay Taneja

Senior Data Engineer | Generative AI Engineer at Jaguar Land Rover | Ex - Rolls-Royce | Data Engineering, Data Science, Finite Element Methods Development, Stress Analysis, Fatigue and Fracture Mechanics

1. Introduction

2. What are Graphs and why Graphs?

2. Data as Graphs: Examples

4. Graphs for Machine and Deep Learning: Image vs text vs graph processing for ML/DL

5. Components of a Graph and Design Choices:

领英推荐

6. Bipartite Graphs and Folded Bipartite Graphs

7.How do we represent graphs numerically?

?8 .Nodes and Edge Attributes:

Ajay Taneja的更多文章

社区洞察

其他会员也浏览了

New $1 Million Biennial Prize to Revitalize Statistics

What is the Future of Data Analysis?

Norman Alex Newsletter

Piecewise Linear Regression

Simple and intuitive visualization method of how deep networks "see" the world (Code Included)

Data Analytics Models: Unraveling Insights from the Data Symphony

Math is the Language of Prediction

Using Claude 3.5 for Research: Rapid Access to Vast Knowledge with Significant Limits

Perceptron

Unlocking the Power of Bayesian Networks

1. Introduction

2. What are Graphs and why Graphs?

2. Data as Graphs: Examples

4. Graphs for Machine and Deep Learning: Image vs text vs graph processing for ML/DL

5. Components of a Graph and Design Choices:

领英推荐

6. Bipartite Graphs and Folded Bipartite Graphs

7.How do we represent graphs numerically?

?8 .Nodes and Edge Attributes:

Ajay Taneja的更多文章

The Marriage of Retrieval-Augmented Generation (RAGs) with Knowledge Graphs: Part 15 of my Graph Series of Blogs

Knowledge Graph Completion and Knowledge Graph Embeddings: Part 14 of my Graph Series of Blogs

Setting Up Graph Neural Network Prediction Tasks: Part 13 of my Graph Series of Blogs

Training Graph Neural Networks: Part 12 of my Graph series of blogs

Heterogeneous Graphs and Relational Graph Convolutional Neural Networks (RGCNs): Part 11 of my Graph series of blogs

Knowledge Graphs for RAG: Part 10 of my Graph series of blogs

Deep Learning with Graphs: Part 9 of my Graph Series of blogs

An Introduction to Graph Neural Networks: Part 8 of my Graph Series

Embedding Entire Graphs or Sub-Graphs: Part 7 of X of my notes

Formulation of Node Embeddings in Graphs: Node2Vec Algorithm - Part 6 of X of my notes

社区洞察

其他会员也浏览了

New $1 Million Biennial Prize to Revitalize Statistics

What is the Future of Data Analysis?

Norman Alex Newsletter

Piecewise Linear Regression

Simple and intuitive visualization method of how deep networks "see" the world (Code Included)

Data Analytics Models: Unraveling Insights from the Data Symphony

Math is the Language of Prediction

Using Claude 3.5 for Research: Rapid Access to Vast Knowledge with Significant Limits

Perceptron

Unlocking the Power of Bayesian Networks