Connected by Degrees: The Science Behind Network Separation
Aaron Maroja, CFP®
Data Scientist | Machine Learning Engineer | PhD Candidate in Mathematics
Have you ever wondered if we are all connected somehow?
Well, some might think that, as cosmic dust, we have to be intertwined, whether through our DNA or through the number of contacts in our LinkedIn accounts. Our desire to live in a closely knit society predates our ability to hunt for food, and today networking helps us expand our personal and professional circles, increase our visibility, and access new opportunities.
Do you think it is possible to measure how many degrees of separation lie between any two of us?
In 1994, three Albright College students, Brian Turtle, Mike Ginelli, and Craig Fass, were snowed in at home watching a TV movie marathon starring Kevin Bacon and decided to create a game that would measure the degree of separation between any actor and Kevin Bacon. The idea is simple: every actor is either directly connected to Kevin Bacon by appearing in a movie with him, or indirectly connected through a chain of co-stars that leads back to him. An actor's Bacon Number is the length of the shortest such chain.
The Bacon Number is an interesting application of graph theory, a branch of mathematics that studies networks of relationships between objects. One of the most famous examples in this area is the Erdős Number, which measures the degree of separation between a mathematician and Paul Erdős through co-authored papers. In the same way, the Bacon Number measures the degree of separation between an actor and Kevin Bacon.
To calculate the Bacon Number, we can use a graph algorithm called breadth-first search (BFS), which is commonly used in AI applications. BFS explores the vertices of a graph in order of distance from a starting vertex: it visits every vertex at distance 1 before any vertex at distance 2, and so on. In this case, the starting vertex is Kevin Bacon, the vertices are actors, and two actors are joined by an edge when they have appeared in the same movie. Because BFS reaches each actor along a shortest path, the number of edges (movies) traversed to reach an actor is that actor's Bacon Number.
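The search described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis: the function name bacon_numbers, the movies mapping, and the actor names are all hypothetical, and it assumes the data is available as a dictionary from movie title to cast list.

```python
from collections import deque

def bacon_numbers(movies, source="Kevin Bacon"):
    """Return {actor: degrees of separation from `source`} using BFS."""
    # Build the co-star graph: two actors are adjacent if they share a movie.
    costars = {}
    for cast in movies.values():
        for actor in cast:
            costars.setdefault(actor, set()).update(a for a in cast if a != actor)

    # Standard BFS: actors are discovered in order of increasing distance.
    distance = {source: 0}
    queue = deque([source])
    while queue:
        actor = queue.popleft()
        for neighbor in costars.get(actor, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[actor] + 1
                queue.append(neighbor)
    return distance

# Hypothetical toy data, for illustration only.
movies = {
    "Movie A": ["Kevin Bacon", "Actor 1"],
    "Movie B": ["Actor 1", "Actor 2"],
    "Movie C": ["Actor 2", "Actor 3"],
    "Movie D": ["Actor 4"],  # not linked to anyone else
}
numbers = bacon_numbers(movies)
```

Actors who cannot be linked to the source (Actor 4 here) simply do not appear in the result, which is one way to separate "linkable" actors from the rest.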
To test this out, I scraped IMDb's list of the top 1,000 actors and used this algorithm to calculate their average Bacon Number. Here is the code used for this analysis. [You may also want to play with the degrees.py file, where you can calculate the degree of separation between any two linkable actors.]
With a margin of error of 3.4% at a 90% confidence level, we found that the average Bacon Number is 2.20. It is worth noting that the results may change as the IMDb list gets refreshed. Moreover, this sample covers only that list, which consists predominantly of Hollywood actors and actresses. The average Bacon Number changes with a larger sample: The Oracle of Bacon, for example, gave an average of 3.16 at the time of writing, based on a database of roughly 1.5 million linkable actors.
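For readers who want to reproduce this kind of estimate, here is one common way to attach a margin of error to a sample mean. It is a sketch under assumptions: it uses a normal approximation with z = 1.645 for a 90% confidence level, the function name mean_with_margin is hypothetical, the sample values are made up, and it may not match the exact method behind the figures above.

```python
import math
import statistics

def mean_with_margin(values, z=1.645):
    """Return (mean, margin of error) for a sample mean.

    Assumes a normal approximation; z = 1.645 corresponds to a
    two-sided 90% confidence level.
    """
    n = len(values)
    mean = statistics.fmean(values)
    std_err = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    return mean, z * std_err

# Hypothetical Bacon Numbers for a handful of actors (illustrative only).
sample = [1, 2, 2, 3, 2, 2, 3, 2]
mean, margin = mean_with_margin(sample)
print(f"mean = {mean:.3f}, margin = {margin:.3f} ({margin / mean:.1%} of the mean)")
```

The last figure expresses the margin relative to the mean, which is one possible reading of a margin of error quoted as a percentage.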
Can we generalize this idea?
Of course! All we need is a large dataset describing the group we want to measure and what links its members: say, people on LinkedIn as the vertices and shared employers as the edges. However, you will need a large amount of memory to process all of this information at once. Meta (then Facebook) did this in 2016 and found the average degree of separation between its users to be 3.57, measured across the 1.59 billion active Facebook users at the time.
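To make the generalization concrete, the sketch below builds a network from group memberships (here, hypothetical companies and people) and averages the separation over every connected pair. All names are invented for illustration, and running a BFS from every vertex is only practical for small graphs. At the scale of a real social network one would have to sample or estimate instead.

```python
from collections import deque
from itertools import combinations

def build_graph(memberships):
    """Link every pair of people who share at least one group."""
    graph = {}
    for members in memberships.values():
        for person in members:
            graph.setdefault(person, set())
        for a, b in combinations(members, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def distances_from(graph, start):
    """BFS distances from `start` to every reachable vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def average_separation(graph):
    """Mean shortest-path length over all connected pairs of distinct vertices."""
    total = pairs = 0
    for person in graph:
        for other, d in distances_from(graph, person).items():
            if other != person:
                total += d
                pairs += 1
    return total / pairs if pairs else float("nan")

# Hypothetical memberships, for illustration only.
memberships = {
    "Company X": ["Ana", "Ben"],
    "Company Y": ["Ben", "Cal"],
    "Company Z": ["Cal", "Dee"],
}
graph = build_graph(memberships)
avg = average_separation(graph)
```

Swapping companies for movies, papers, or any other shared affiliation changes the data but not the code, which is the sense in which the Bacon Number generalizes.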
Overall, the idea of interconnectedness creates a sense of community and progress. With advancements in technology and social networking, we are closer than ever before. So let's continue to bring ourselves together and share knowledge and experiences. Whether through our DNA code, LinkedIn connections, or shared interests, the bonds that connect us are stronger than ever before.
If you have any questions about the methods used, please feel free to ask for more details.