Predicting an event using Bayesian Network
Samir Paul
Senior Data Science and Analytics Leader | Marketing Analytics | Machine Learning & AI
It's been a while I have published a write-up on Bayesian Estimation and the possibilities of a Bayesian Network or Bayesian Belief Network (BBN). Thought it was time to talk more on that. Bayesian Network is a part of Probabilistic Graphical Model (PGM), quite a powerful framework for representing complex domains using probability distributions. Uncertainties are part of lives, so are of business. These models capture evidence in terms of dependencies between random variables and provide a flexible way to model with intricate interactions towards measuring uncertainties.
Two common PGMs are
While I am going to illustrate on predicting an event, PGMS can be used in customer segmentation, recommendation systems, churn predictions, anomaly detection and in many more areas. These models naturally incorporate uncertainty through probability distributions. Graphical structure provides insights into variable relationships to make interpretability easier and is known for its generative capabilities to new data samples.
When juxtaposed with other modeling like decision trees, it goes beyond simple rule-based decisions using complex dependencies probabilistically.
Let's move into the code:
As usually I do, I have created a hypothetical dataset where buying a smartphone depends on some other evidence including having a job, being active on social network, education and income level as well as whether the individuals were exposed to an ad. It is truly random data and the dependencies there should not be judged.
领英推荐
The objective is to see how we can build a model and predict any event which is hypothetically dependent on the other evidence.
Income and having job are the dependent on education whereas buying smartphone is conditioned on watching ad and being active on social network
Once we fit the model, we can check the conditional probability distributions of the events.
Now we are predicting the events on the data that has not been exposed to the model yet which is analogous to any future data on the same relationship to predict on later.
We can build the confusion matrix as we have both the actual event and predicted data on buying smart phone.
Bayesian approach was always considered to be a bit complex and difficult to interpret, but I trust with the advancement of the machine learning era, the fear is being diminished fast.