Graph Database Fundamentals: An Introduction to the World of Graphs


As machine learning has become widely understood and utilized across multiple industries, graph-based technology has been garnering popularity.

So, what is a "Graph"?

A graph is a data structure made up of elements called vertices (also known as nodes) and edges. A graph database stores its data in this form, in contrast to the familiar rows, columns, tables, and joins of relational databases.

For my fellow math people out there, if you'd like to think of a graph in terms of a formula, this is what it would look like: G = (V, E), where G is the graph, V is the set of vertices, and E is the set of edges.

Vertices can connect to one or many other vertices or even exist unconnected (though this wouldn't be taking advantage of the relational nature of graphs).
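To make G = (V, E) concrete, here is a minimal sketch in plain Python rather than in any particular graph database. The vertex names and the helper functions `neighbors` and `isolated` are made up for illustration.

```python
# V: the set of vertices (hypothetical identifiers).
V = {"v1", "v2", "v3", "v4"}

# E: the set of edges, each written as a (source, target) pair over V.
E = {("v1", "v2"), ("v2", "v1"), ("v2", "v3")}

# The graph itself is just the pair (V, E).
G = (V, E)


def neighbors(graph, vertex):
    """Return the vertices reachable from `vertex` by following one edge."""
    _, edges = graph
    return {target for source, target in edges if source == vertex}


def isolated(graph):
    """Return the vertices that do not appear in any edge."""
    vertices, edges = graph
    connected = {v for edge in edges for v in edge}
    return vertices - connected


print(neighbors(G, "v2"))  # v2 connects to more than one vertex
print(isolated(G))         # v4 exists but is unconnected
```

Here `v2` connects to several vertices, while `v4` is a valid member of V that simply has no edges.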

Sample Graph Database Visualization


There are several types of graph databases, but this article will focus on one of the most common: a directional graph database, which is built on what mathematicians call a directed graph. As the name suggests, in a directional graph database, every edge that connects two vertices has a specific direction. In other words, an edge goes from one vertex to another vertex.

Notice in the diagram above how Edge 1 goes from Vertex 1 to Vertex 2, and Edge 3 goes from Vertex 2 to Vertex 1. Though these edges look similar and connect the same two vertices, they are distinct because they point in opposite directions.
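The difference between Edge 1 and Edge 3 can be sketched in a few lines of Python. The `Edge` class and the two comparison helpers are hypothetical names chosen for this illustration, not part of any graph database API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A directed edge: it goes from `source` to `target`."""
    label: str
    source: str
    target: str


# The two edges described in the text.
edge_1 = Edge("Edge 1", source="Vertex 1", target="Vertex 2")
edge_3 = Edge("Edge 3", source="Vertex 2", target="Vertex 1")


def same_endpoints(a, b):
    """True if both edges connect the same two vertices, ignoring direction."""
    return {a.source, a.target} == {b.source, b.target}


def same_direction(a, b):
    """True if both edges run from the same source to the same target."""
    return (a.source, a.target) == (b.source, b.target)


print(same_endpoints(edge_1, edge_3))  # they connect the same pair of vertices
print(same_direction(edge_1, edge_3))  # but they point in opposite directions
```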

Furthermore, in graphs, each vertex and edge can store properties, or information, inside. These properties are stored as key-value pairs. For example, in the diagram below, Vertex 1 has the following properties: Name (Vertex 1), Age (3 years, 1 day), Color (Blue), Hex Code (#B4C7E), Shape (Circle), and Created Date (2/24/2019).

Vertex with property information displayed.
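One simple way to picture these key-value pairs is as a Python dictionary. This is only a sketch: the vertex values are copied from the example above, while the edge and its properties are invented for illustration.

```python
# Properties of Vertex 1, copied from the example in the text.
vertex_1 = {
    "Name": "Vertex 1",
    "Age": "3 years, 1 day",
    "Color": "Blue",
    "Hex Code": "#B4C7E",  # reproduced exactly as given in the article
    "Shape": "Circle",
    "Created Date": "2/24/2019",
}

# Edges can carry properties too. This edge and its values are hypothetical.
edge_1 = {
    "Label": "Edge 1",
    "From": "Vertex 1",
    "To": "Vertex 2",
    "Created Date": "2/24/2019",
}

# Reading a property is a lookup by key.
print(vertex_1["Color"])

# Listing every key-value pair stored on the vertex.
for key, value in vertex_1.items():
    print(f"{key}: {value}")
```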

You can access the property information inside vertices and edges in two ways: by querying the graph database or through an interactive GUI that provides a visual representation of the graph. For relational databases, you might be familiar with querying in a language like SQL. For graph databases, you will need a different query language, such as Gremlin, GQL, or SPARQL, that lets you query by vertex and edge identifiers or properties. Depending on the type of visualization you have for your graph database, these properties may not be visible until you click on the vertex or edge you're inquiring about.
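The idea of querying by property can be sketched in plain Python. This is not Gremlin, GQL, or SPARQL; it only mimics the kind of filter such a query expresses (in Gremlin, roughly a step like `g.V().has('Color', 'Blue')`). The sample vertices and the `find_vertices` helper are assumptions made for this example.

```python
# A tiny in-memory collection of vertices (hypothetical sample data).
vertices = [
    {"Name": "Vertex 1", "Color": "Blue", "Shape": "Circle"},
    {"Name": "Vertex 2", "Color": "Green", "Shape": "Circle"},
    {"Name": "Vertex 3", "Color": "Blue", "Shape": "Square"},
]


def find_vertices(vertices, criteria):
    """Return every vertex whose properties match all key-value pairs in `criteria`."""
    return [
        vertex
        for vertex in vertices
        if all(vertex.get(key) == value for key, value in criteria.items())
    ]


# Query by a single property.
blue = find_vertices(vertices, {"Color": "Blue"})
print([v["Name"] for v in blue])

# Query by several properties at once.
blue_circles = find_vertices(vertices, {"Color": "Blue", "Shape": "Circle"})
print([v["Name"] for v in blue_circles])
```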

When a vertex has properties, it is important that one of them is unique among all vertices. For example, in the diagram above, the unique identifier for Vertex 1 is Name. This means that no other vertex in the graph can have a Name with the value Vertex 1. This is similar to how primary keys work in relational databases, though the nuanced technicalities of unique identifiers for graph databases are beyond the scope of this article.
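A rough sketch of this uniqueness rule in Python follows. The `Graph` class and its `add_vertex` method are hypothetical; real graph databases enforce identifiers in their own ways.

```python
class Graph:
    """A toy vertex store that treats one property as the unique identifier."""

    def __init__(self, id_key="Name"):
        self.id_key = id_key
        self.vertices = {}  # maps identifier value -> property dictionary

    def add_vertex(self, properties):
        """Add a vertex, rejecting it if its identifier is already taken."""
        identifier = properties[self.id_key]
        if identifier in self.vertices:
            raise ValueError(f"{self.id_key} {identifier!r} is already in use")
        self.vertices[identifier] = dict(properties)


graph = Graph()
graph.add_vertex({"Name": "Vertex 1", "Color": "Blue"})
graph.add_vertex({"Name": "Vertex 2", "Color": "Blue"})  # sharing a Color is fine

# A second vertex named "Vertex 1" is rejected.
try:
    graph.add_vertex({"Name": "Vertex 1", "Color": "Green"})
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(duplicate_rejected)
```

Only the identifier has to be unique; other properties, such as Color here, can repeat freely across vertices.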

Common Applications of Graph Databases

Just as relational databases have typical use cases, such as retail purchase orders and supply chain information, graph databases have some typical applications of their own. These include mapping social media connections, fraud detection, network mapping, and recommender systems. In each of these use cases, the relationships between vertices in the graph are of utmost importance. For example, to determine whether a transaction is fraudulent or legitimate, it would be helpful to look at edge information such as where the transaction took place, how big the purchase was, or whether the user had purchased from a certain vendor before. It would also be helpful to look at vertex information such as the purchaser's name or the vendor's address.
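As a toy illustration of the fraud example, each purchase can be modeled as an edge from a purchaser to a vendor with properties attached. Everything here is assumed for the sketch: the sample data, the helper names, and especially the review rule and its threshold, which are not a real fraud detection method.

```python
# Each purchase is an edge from a purchaser vertex to a vendor vertex,
# with edge properties for amount and location (hypothetical data).
purchases = [
    {"purchaser": "Alice", "vendor": "Corner Cafe", "amount": 12.50, "location": "Boston"},
    {"purchaser": "Alice", "vendor": "Corner Cafe", "amount": 9.75, "location": "Boston"},
    {"purchaser": "Alice", "vendor": "Gadget Hub", "amount": 1800.00, "location": "Lisbon"},
]


def prior_purchases(history, purchaser, vendor):
    """Return earlier edges between this purchaser and this vendor."""
    return [
        edge
        for edge in history
        if edge["purchaser"] == purchaser and edge["vendor"] == vendor
    ]


def needs_review(history, new_edge, amount_threshold=500.0):
    """Flag a large purchase from a vendor the purchaser has never used before.

    The rule and the threshold are illustrative assumptions only.
    """
    first_time = not prior_purchases(history, new_edge["purchaser"], new_edge["vendor"])
    return first_time and new_edge["amount"] > amount_threshold


history, latest = purchases[:2], purchases[2]
print(needs_review(history, latest))              # large purchase, new vendor
print(needs_review(purchases[:1], purchases[1]))  # small purchase, known vendor
```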

Why a Graph Database over a Relational Database?

While some use cases lend themselves perfectly to traditional relational databases, graphs have some advantages that should be mentioned. Querying relationships in a graph database is fast: in a relational database you have to join tables to gather the information you need, whereas the vertices in a graph database are already connected through edges. In addition, graph databases are highly scalable, better suited to changing schemas, great at organizing and displaying complex relationships, and visually easier to understand and interpret. There are also powerful machine learning and AI techniques that run on top of graph databases (but this point lends itself to another article entirely).
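The contrast between joining and following edges can be sketched in plain Python. This is a conceptual illustration with made-up data, not a benchmark and not how any particular database is implemented.

```python
# Relational style: two separate "tables" that must be matched on ids.
people = [(1, "Ana"), (2, "Ben"), (3, "Cy")]
follows = [(1, 2), (2, 3)]  # (follower_id, followed_id)


def followed_names_join(person_id):
    """Find who a person follows by joining `follows` back to `people`."""
    return [
        name
        for follower_id, followed_id in follows
        if follower_id == person_id
        for pid, name in people
        if pid == followed_id
    ]


# Graph style: each vertex already holds its outgoing edges.
adjacency = {"Ana": ["Ben"], "Ben": ["Cy"], "Cy": []}


def followed_names_graph(name):
    """Find who a person follows by reading that vertex's edges directly."""
    return adjacency[name]


print(followed_names_join(1))       # matches rows across two tables
print(followed_names_graph("Ana"))  # follows the stored edges
```

Both calls give the same answer. The difference is that the join has to match rows across tables, while the graph lookup starts at a vertex and reads its edges.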



