All the world's a Graph (theory)...
…and all the data merely players
I’ll never forget a conversation with my Chief Data Scientist, a person whose abilities I greatly respected. Poring over data late one night, and struggling to assess the fit of our probabilistic risk model, he turned and said, “It’s a graph problem.” Taking in my quizzical expression, he smiled serenely. “You know, most of the interesting problems are.” At the time I didn’t fully appreciate that wisdom, but I’ve come to see how insightful that comment really was.
Graph theory is the mathematical study of graphs, their properties and applications.
As used in graph theory, the term “graph” does not refer to data charts such as line graphs or bar graphs. No! Instead, a graph can be thought of as an ordered pair of sets: a set of vertices (also called points or nodes) and a set of edges (or lines) that connect pairs of vertices. When two vertices are allowed to be joined by more than one edge, the graph is called a “multigraph”.
Graphs are really about the connectedness of things: the correlation and dependence between seemingly random variables, and the degrees of freedom and of separation between objects. Graph theory is one of the most exciting and visual areas of mathematics, and it has important applications. Thinking in graphs also helps in understanding that correlation does not mean causation. Graphs allow us to create predictive models that actually work!
The bridges of Königsberg
Brief Background: The history of graph theory can be traced to 1735, when the Swiss mathematician Leonhard Euler solved the Königsberg bridge problem. The formal, mathematical definition of a graph is simply: G = (V, E). That’s it! And again, it works because the ordered pair (V, E) is made up of two objects: V, a set of vertices, and E, a set of edges.
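To make G = (V, E) concrete, here is a minimal Python sketch of the Königsberg bridges as a multigraph. The vertex labels and the function names (degrees, has_euler_walk) are my own, chosen for illustration. The check relies on the standard result that grew out of Euler's work: a connected graph can be walked using every edge exactly once only if it has zero or two vertices of odd degree.

```python
from collections import Counter

# V: the four land masses of Königsberg (labels are illustrative).
V = {"north_bank", "south_bank", "kneiphof", "east_island"}

# E: the seven bridges. A list, not a set, because a multigraph
# may join the same two vertices with more than one edge.
E = [
    ("north_bank", "kneiphof"),
    ("north_bank", "kneiphof"),
    ("south_bank", "kneiphof"),
    ("south_bank", "kneiphof"),
    ("north_bank", "east_island"),
    ("south_bank", "east_island"),
    ("kneiphof", "east_island"),
]

# The graph itself is just the ordered pair.
G = (V, E)


def degrees(graph):
    """Count how many edge ends touch each vertex."""
    vertices, edges = graph
    deg = Counter({v: 0 for v in vertices})
    for u, w in edges:
        deg[u] += 1
        deg[w] += 1
    return deg


def has_euler_walk(graph):
    """True if a walk crossing every edge exactly once can exist.

    Assumes the graph is connected (Königsberg is).
    """
    odd = [v for v, d in degrees(graph).items() if d % 2 == 1]
    return len(odd) in (0, 2)


if __name__ == "__main__":
    print(dict(degrees(G)))
    print(has_euler_walk(G))
```

All four land masses touch an odd number of bridges, so the check fails, which matches Euler's conclusion that no such walk through the city exists.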
So can you think of a few things that can be represented by a graph? Yes! How about your brain, the internet, integrated circuits, trains, planes and automobiles?! It’s really amazing if you open yourself up to the theory. Let’s take a look at just a few.
Computer Science
For those of us deeply involved in software and computer science, many of the problems we’re familiar with fall under this broad category. Here’s a small list:
- Search engines - ranking algorithms based on graph theory
- Optimizing video delivery - Netflix, Amazon
- Path planning in robotics - drones
- Database design - scaling large data sets
- Data mining - reducing the search space
- Network systems - shortest path (self-driving vehicles)
But it extends beyond that. Graphs are an excellent modeling tool for many types of relations, and many real-world problems can be represented by them. The way that human beings are connected is an interesting example.
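As a small illustration of how connections between people can be modeled, here is a sketch that counts the “degrees of separation” between two people with a breadth-first search. The friendship data and the function name degrees_of_separation are invented for this example. It is one simple way to find a shortest path in an unweighted graph, not a description of how any real social network does it.

```python
from collections import deque

# Hypothetical friendship graph stored as an adjacency list.
# Every name here is invented for illustration.
friends = {
    "ana": ["ben", "cho"],
    "ben": ["ana", "dev"],
    "cho": ["ana", "dev"],
    "dev": ["ben", "cho", "eli"],
    "eli": ["dev"],
    "fay": [],
}


def degrees_of_separation(graph, start, goal):
    """Return the fewest hops from start to goal, or None if unreachable.

    Breadth-first search visits people in order of distance from start,
    so the first time we reach goal we have found a shortest path.
    """
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        for other in graph.get(person, []):
            if other == goal:
                return dist + 1
            if other not in seen:
                seen.add(other)
                queue.append((other, dist + 1))
    return None


if __name__ == "__main__":
    print(degrees_of_separation(friends, "ana", "eli"))
    print(degrees_of_separation(friends, "ana", "fay"))
```

In this toy data, ana reaches eli in three hops (through ben or cho, then dev), while fay has no connections and cannot be reached at all.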
Twitter Social Graph
Social Graphs and more
By now, most of us have heard of social graphs. Twitter and Facebook are two famous examples. And this very platform, LinkedIn, is a social graph! (This is why Microsoft paid $26.2 billion for it.) Say what you will about Mark Zuckerberg; it’s mind-blowing to me that he created a social network site in his college dorm room and managed to grow it into a $100 billion company. Compared to what most of us did in our college dorm rooms, Zuckerberg is a genius. And he helped the Internet evolve into what it is now.
Here’s an interesting article in the Harvard Business Review about the rise of social graphs for business.
Conclusion: Graphs are amazing. And important. Study them!
Cheap advice: Life is short. Choose wisely. Continue learning and exploring. And use that power for good.
Joseph Prindle is a gentleman, entrepreneur, and scientist deep within his bones.