Understanding Bayesian with Examples In Python

Bayesian estimation is a statistical method that involves updating the probability estimate for a hypothesis as additional evidence is acquired. This approach to probability allows us to systematically improve our estimates in light of new data.

Note: This article is a continuation of the previous article:

Bayes' Theorem

Bayes' Theorem is a formula that describes how to update the probabilities of hypotheses when given evidence. It relies on incorporating prior knowledge and observed data. The theorem is mathematically stated as:

P(A∣B) = [P(B∣A) × P(A)] / P(B)

where P(A∣B) is the posterior probability of hypothesis A given evidence B, P(B∣A) is the likelihood of the evidence given the hypothesis, P(A) is the prior probability of the hypothesis, and P(B) is the overall probability of the evidence.

Example in Python: Bayesian Updating

To understand the concept, let's solve the snowfall problem below:

A father claims that it snowed last night. His first daughter says that the probability of snowfall on any particular night is 1/8. His second daughter says that 5 out of 6 times, the father is lying! What is the probability that there actually was a snowfall?

To solve this problem using Bayes' theorem, let's define the events:

  • Let S be the event that there was snowfall last night.
  • Let C be the event that the father claims there was snowfall.

We are given:

  • P(S), the probability of snowfall on a particular night, which is 1/8.
  • P(¬S), the probability that there was no snowfall, which is 1 − P(S) = 7/8.
  • P(L), the probability that the father is lying, which is 5/6.
  • P(T), the probability that the father is telling the truth, which is 1 − P(L) = 1/6.

We want to find P(S∣C), the probability that there was snowfall, given that the father claims there was snowfall.

Bayes' theorem states that:

P(S∣C) = [P(C∣S) × P(S)] / P(C)

We have P(S), but we still need P(C∣S) and P(C).

Since the father's claim is about snowfall, we consider two cases for P(C):

  1. The father claims snowfall when it actually snowed, P(C∣S). We assume that if it snowed, he would claim snowfall only when telling the truth, so P(C∣S) = P(T) = 1/6.
  2. The father claims snowfall when it did not snow, P(C∣¬S). Since this would be a lie, P(C∣¬S) = P(L) = 5/6.

Now we can find P(C), the total probability that the father claims there was snowfall:

P(C) = P(C∣S) × P(S) + P(C∣¬S) × P(¬S) = (1/6)(1/8) + (5/6)(7/8) = 1/48 + 35/48 = 36/48 = 3/4

Now, we can apply Bayes' theorem:

P(S∣C) = [(1/6) × (1/8)] / (3/4) = (1/48) / (36/48) = 1/36 ≈ 0.028
The probability that there was actually snowfall given the father's claim is 1/36, which is quite low.

Insights from the result using Bayes' theorem:

This low probability might seem counterintuitive at first since the father did claim there was snowfall. However, the father's credibility is low (he tells the truth only 1/6 of the time), and the prior probability of snowfall on any given night is also low (1/8).

Bayes' theorem combines these pieces of information to update our belief in the event of snowfall based on the father's claim. It reflects the fact that a claim from an unreliable source does not significantly increase the likelihood of the event being true, especially when the event itself is already known to be rare.

Thus, despite the claim of snowfall, the probability of it having actually snowed remains low when considering the father's tendency to lie and the low likelihood of snowfall in general. This demonstrates how Bayes' theorem can adjust our beliefs based on the reliability of new information and the prior likelihood of the event.


Now, let's solve it with Python:

# Prior probability of snowfall
P_S = 1/8

# Probability that the father is lying
P_L = 5/6

# Probability that the father is telling the truth
P_T = 1 - P_L

# Probability of the father claiming snowfall, given there was snowfall
P_C_S = 1/6

# Probability of the father claiming snowfall, given there was no snowfall (lying)
P_C_NS = P_L

# Using Bayes' Theorem to find the posterior probability of snowfall given the claim
P_S_C = (P_C_S * P_S) / ((P_C_S * P_S) + (P_C_NS * (1 - P_S)))

print(f"Probability of actual snowfall given the father's claim: {P_S_C:.2f}")
        
Probability of actual snowfall given the father's claim: 0.03
        


Understanding the Problem

  • A father claims that it snowed last night.
  • The first daughter says that the probability of snowfall (P_S) on a particular night is 1/8.
  • The second daughter says that the father lies 5 out of 6 times (P_L).

Variables Defined in the Program

  1. Prior Probability of Snowfall (P_S): The probability that it snows on a given night without considering the father's claim, given as 1/8.
  2. Probability that the Father is Lying (P_L): Given as 5/6, it's the probability that the father is not telling the truth.
  3. Probability that the Father is Telling the Truth (P_T): Calculated as 1 - P_L, the complement of the father lying, which is 1/6.
  4. Probability of the Father Claiming Snowfall Given There Was Snowfall (P_C_S): This is the probability that the father is telling the truth, P_T = 1 - P_L = 1/6.
  5. Probability of the Father Claiming Snowfall Given There Was No Snowfall (P_C_NS): This is equal to P_L, the probability that the father is lying. It represents the scenario where the father claims it snowed when it actually did not.

Bayes' Theorem

Bayes' Theorem is used to update the probability of snowfall based on the father's claim. The theorem is stated as:

P(Snowfall∣Claim) = [P(Claim∣Snowfall) × P(Snowfall)] / P(Claim)
Where:

  • P(Snowfall∣Claim) is the posterior probability of snowfall given the father's claim.
  • P(Claim∣Snowfall) is the probability of the father claiming snowfall if it actually snowed, which is 1/6 (the probability that he tells the truth).
  • P(Snowfall) is the prior probability of snowfall, 1/8.
  • P(Claim) is the total probability of the father claiming snowfall, whether it snowed or not. This is calculated as P(Claim∣Snowfall) × P(Snowfall) + P(Claim∣No Snowfall) × P(No Snowfall).

Calculating the Posterior Probability

  • The program calculates the posterior probability P(Snowfall∣Claim) using the values defined above.
  • This probability represents how likely it is that it actually snowed, given the father's claim and considering both the likelihood of snowfall and the father's truthfulness.


Analyzing the results:

The result of the program, which calculates the probability of actual snowfall given the father's claim, came out to be approximately 0.03, or 3%. Let's delve into what this means and the insights we can gain from it:

  1. Low Probability of Actual Snowfall Despite the Claim: Despite the father's claim that it snowed, there's only a 3% probability that it actually did. This low probability is influenced significantly by two factors: the low prior probability of snowfall (1/8, or 12.5%), and the high probability of the father lying (5/6, or approximately 83.33%).
  2. Impact of the Father's Credibility: The father's tendency to lie has a substantial impact on the posterior probability. Even though he claims that it snowed, his lack of credibility (he lies 5 out of 6 times) greatly diminishes the likelihood that his claim is true.
  3. Bayesian Updating Reflects Prior Beliefs and New Evidence: This outcome is a classic example of Bayesian inference, where the prior belief (the natural likelihood of snowfall) is updated with new evidence (the father's claim). However, the weight given to this new evidence is moderated by the reliability of the source (the father's credibility).
  4. Interplay Between Prior Probability and Likelihood: The calculation shows how Bayesian analysis balances prior probability (our initial belief or the baseline likelihood of an event) and the likelihood (the probability of observing the evidence, given the event). In this case, the low prior probability of snowfall is slightly offset by the father's claim, but not significantly enough due to his high likelihood of lying.
  5. Real-World Implications: If this scenario were in a real-world context, the outcome suggests skepticism towards the claim of snowfall. A decision maker using this probability would likely lean towards the conclusion that it didn't snow, but would still acknowledge a non-negligible chance that it did.

In summary, the probability of 0.03 (3%) reflects a situation where a claim is made under questionable credibility, combined with an event (snowfall) that was initially unlikely. This result demonstrates the power of Bayesian analysis in updating beliefs and making decisions under uncertainty.
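As a sanity check, the analytic 1/36 result can also be approximated by a quick Monte Carlo simulation. The sketch below encodes the same model as above (the father claims snowfall either when it snowed and he is truthful, or when it did not snow and he is lying) and counts how often a claim coincides with actual snow:

```python
import random

random.seed(42)

trials = 200_000
claims = 0            # nights on which the father claims snowfall
claims_with_snow = 0  # ... and it actually snowed

for _ in range(trials):
    snowed = random.random() < 1 / 8    # prior probability of snowfall
    truthful = random.random() < 1 / 6  # the father tells the truth 1/6 of the time
    # He claims snowfall if it snowed and he is truthful,
    # or if it did not snow and he is lying.
    claim = snowed if truthful else not snowed
    if claim:
        claims += 1
        claims_with_snow += snowed

print(f"Estimated P(S|C): {claims_with_snow / claims:.3f}")
```

The estimate should land close to 1/36 ≈ 0.028, agreeing with the exact calculation.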


The Chocolate Example

Imagine a box containing 50% dark chocolates and 50% white chocolates. Half of the dark chocolates are wrapped in gold paper and the other half in silver paper. All the white chocolates are wrapped in silver paper. A kid picks a chocolate from the box, and it is wrapped in silver paper. What is the probability that the picked one is dark chocolate?

  • The event D is that the picked chocolate is dark.
  • The event S is that the picked chocolate is wrapped in silver paper.

To solve this problem using Bayes' theorem, we want to find the probability that the picked chocolate is dark, given that it is wrapped in silver paper, P(D∣S).

Here's what we know:

  • P(D) is the probability that any chocolate picked is dark, which is 50% or 0.5.
  • P(S∣D) is the probability that a chocolate is wrapped in silver paper, given that it is dark, which is 50% or 0.5 (since half of the dark chocolates are wrapped in silver paper).
  • P(S) is the probability of picking a silver-wrapped chocolate from the box. All white chocolates are wrapped in silver paper, and half of the dark chocolates are. Since there is an equal number of white and dark chocolates, the probability that a randomly picked chocolate is wrapped in silver is 0.5 × 1 + 0.5 × 0.5 = 0.75, or 75%.

Bayes' theorem states that:

P(D∣S) = [P(S∣D) × P(D)] / P(S)

Substituting the given probabilities into the equation gives us:

P(D∣S) = (0.5 × 0.5) / 0.75
Now, calculate the result:

P(D∣S) = 0.25 / 0.75

P(D∣S) = 1/3

Therefore, the probability that the chocolate is dark, given that it is wrapped in silver paper, is 1/3 or approximately 33.33%.

The result we obtained using Bayes’ theorem in the context of the chocolate problem is insightful in understanding how prior knowledge can update our beliefs about a particular event.

Bayes' theorem, in essence, provides a mathematical framework for updating our beliefs based on new evidence. It allows us to refine our predictions about the probability of an event by incorporating our existing knowledge (the prior) and the new evidence (the likelihood).

In the chocolate example, the prior knowledge is that there is a 50% chance of picking a dark chocolate at random from the box P(D). However, this probability is not very informative when we are presented with additional evidence: the chocolate we picked is wrapped in silver paper P(S).

Before considering the wrapping, the chances of picking a dark chocolate were equal to picking a white chocolate. But once we account for the fact that we've picked a silver-wrapped chocolate, we need to update this probability because not all chocolates are wrapped in silver. All white chocolates are in silver wrappers, but only half of the dark chocolates are. This shifts the likelihood toward a white chocolate because there are more silver-wrapped white chocolates than silver-wrapped dark chocolates.

By applying Bayes’ theorem, we integrate this new evidence to update our belief. The likelihood of picking a silver wrapper if the chocolate is dark is P(S∣D) = 0.5, but the overall probability of picking a silver-wrapped chocolate is P(S) = 0.75. This overall probability is greater because it includes all the white chocolates as well.

The updated probability P(D∣S) = 1/3 tells us that, given the evidence that a silver-wrapped chocolate has been picked, there is a roughly 33.33% chance that this chocolate is dark. This is less than our initial belief P(D) = 0.5, which makes sense intuitively: since there are more silver-wrapped white chocolates, picking a silver wrapper makes it less likely that we’ve picked a dark chocolate compared to our initial assumption before we knew the color of the wrapper.

Thus, Bayes' theorem crucially allows us to move from our prior (initial guess based on overall proportions) to our posterior (revised guess, taking the evidence into account), giving us a more accurate assessment of the situation based on all available information.

Considering the chocolate problem, we can use Python to calculate the posterior probability that the selected chocolate is dark, given that it is wrapped in silver paper.

# Probability of picking dark chocolate (prior)
P_D = 0.5

# Probability of picking silver paper chocolate
P_S = 0.75  # = 0.5 * 1 + 0.5 * 0.5: all white plus half of the dark chocolates

# Probability of silver paper given dark chocolate
P_S_D = 0.5

# Posterior probability of dark chocolate given silver paper
P_D_S = (P_S_D * P_D) / P_S

print(f"Probability that the chocolate is dark given it's wrapped in silver paper: {P_D_S:.2f}")
        
Probability that the chocolate is dark given it's wrapped in silver paper: 0.33
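The same 1/3 answer can be checked by simulation. This sketch draws random chocolates according to the proportions stated in the problem and estimates P(D∣S) empirically:

```python
import random

random.seed(1)

trials = 200_000
silver = 0           # silver-wrapped chocolates drawn
dark_and_silver = 0  # ... that were also dark

for _ in range(trials):
    dark = random.random() < 0.5
    # White chocolates are always silver-wrapped; dark ones only half the time.
    silver_wrapped = True if not dark else random.random() < 0.5
    if silver_wrapped:
        silver += 1
        dark_and_silver += dark

print(f"Estimated P(D|S): {dark_and_silver / silver:.3f}")
```

The estimate should approach 1/3 as the number of trials grows.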
        


Bayesian Equation:

The basic equation of Bayesian inference is given by P(θ∣y) ∝ P(y∣θ) × P(θ), where:

  • P(θ∣y): The posterior probability of the parameter θ given the data y.
  • P(y∣θ): The likelihood of observing the data y given the parameter θ.
  • P(θ): The prior probability of the parameter θ before observing the data y.
  • The symbol ∝ means "proportional to": the true posterior is the product of the likelihood and the prior, normalized by the evidence (or marginal likelihood), so P(θ∣y) = P(y∣θ) × P(θ) / P(y).
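The proportionality can be made concrete with a small grid approximation. The sketch below uses a hypothetical coin example (uniform prior, 7 heads in 10 flips): it multiplies likelihood by prior at each grid point and divides by the sum over the grid, which plays the role of the evidence.

```python
import numpy as np

# Grid of candidate parameter values θ
theta = np.linspace(0.001, 0.999, 999)

# Prior: uniform over [0, 1]
prior = np.ones_like(theta)

# Likelihood of observing 7 heads in 10 flips for each θ (up to a constant)
likelihood = theta**7 * (1 - theta)**3

# Unnormalized posterior ∝ likelihood × prior
unnormalized = likelihood * prior

# Normalize by the evidence (the sum over the grid) to get a proper distribution
posterior = unnormalized / unnormalized.sum()

# With a uniform prior the exact posterior is Beta(8, 4), whose mean is 8/12 ≈ 0.667
print(f"Posterior mean: {(theta * posterior).sum():.3f}")
```

The grid mean should match the analytic Beta(8, 4) mean, illustrating that normalizing likelihood × prior recovers the posterior.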


Prior and Posterior Distributions

The concept of prior and posterior distributions is fundamental to Bayesian inference. The prior distribution encapsulates our beliefs about an unknown parameter before we have seen the data. The posterior distribution, on the other hand, is the updated belief after observing the data.

Conjugate Priors

In Bayesian statistics, conjugate priors are a way to simplify the process of updating beliefs with new evidence. A conjugate prior is a prior distribution that, when combined with a likelihood function from the same distribution family, yields a posterior distribution that is also from the same distribution family. This property makes the analytical computation of the posterior distribution much simpler.

Several common likelihood functions along with their corresponding conjugate priors:

  • Normal Likelihood: The conjugate prior for a normal likelihood is also a normal distribution. This is the case when the variance is known, and you are updating the mean.
  • Binomial/Bernoulli Likelihood: The conjugate prior for a binomial or Bernoulli likelihood is the Beta distribution, as previously discussed.
  • Exponential/Poisson Likelihood: For data modeled with an exponential or Poisson distribution, the conjugate prior is the Gamma distribution. This is often used in the context of modeling wait times or the arrival of events.
  • Uniform Likelihood: When the likelihood is uniform, a Pareto distribution serves as its conjugate prior. The uniform likelihood could represent complete uncertainty over a certain range.
  • Multinomial Likelihood: The Dirichlet distribution is the conjugate prior for a multinomial likelihood, which is a generalization of the binomial distribution to more than two outcomes.

The use of conjugate priors simplifies the Bayesian updating process, because the posterior distribution can be derived algebraically from the prior and the likelihood without the need for numerical integration or advanced sampling methods. This makes the computation of the posterior distribution more manageable, especially in cases where there's a need for repeated updates with new data.
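As an illustration of one of these pairs, the sketch below applies the Gamma-Poisson conjugate update. The prior parameters and the observed counts are made-up numbers for the example; the update rule itself (shape gains the total count, rate gains the number of observation windows) is the standard one.

```python
# Gamma(shape=a, rate=b) prior over a Poisson rate λ; the prior mean is a / b
a_prior, b_prior = 2.0, 1.0

# Hypothetical observed event counts over five equal time windows
counts = [3, 5, 4, 6, 2]

# Conjugate update: the shape gains the total count,
# the rate gains the number of observation windows
a_post = a_prior + sum(counts)
b_post = b_prior + len(counts)

print(f"Posterior: Gamma(shape={a_post}, rate={b_post})")
print(f"Posterior mean rate: {a_post / b_post:.2f}")
```

As with the Beta-Binomial case, the posterior stays in the same family, so repeated updates are just repeated additions.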


Beta and Binomial Distributions

The Beta distribution is often used as a conjugate prior for the Binomial distribution because of this simplification. In the case of the Beta distribution, the probability parameter is itself a random variable, while for the Binomial distribution, the probability of success is a fixed parameter.


Example for Binomial/Bernoulli Likelihood - Beta Distribution

Here’s an example involving the beta distribution as a conjugate prior for the binomial distribution:

Problem: Suppose you have a biased (unfair) coin and you are interested in estimating the probability p that the coin lands heads up. You don't know p, so you decide to model your beliefs about p using a Beta distribution, which is the conjugate prior for the binomial likelihood function you will use.

Why Did We Use a Beta Distribution?

The Beta distribution is used as the prior in the coin flip problem for several key reasons:

  1. Range of Values: The Beta distribution is defined for values between 0 and 1, which perfectly matches the range of possible values for a probability like p, the probability of a coin landing heads up.
  2. Conjugacy with the Binomial Distribution: The Beta distribution is a conjugate prior for the binomial likelihood, which means the posterior distribution is also a Beta distribution. This conjugacy simplifies the calculation of the posterior distribution, as you can easily update the parameters of the Beta distribution with the number of successes (heads) and failures (tails) from your data. Note: The binomial likelihood function arises when you are dealing with events that have two possible outcomes (binary outcomes), like success/failure, yes/no, or, in this example, heads/tails for coin flips.
  3. Flexibility in Encoding Beliefs: The Beta distribution is very flexible and can encode a variety of beliefs about p. For example, a Beta distribution with parameters α=1 and β=1 represents a uniform distribution, encoding no prior knowledge about p, while parameters greater than 1 can encode different shapes of distributions, allowing you to represent different degrees of belief about the likelihood of heads or tails.

This graph shows a series of Beta distributions with various α (alpha) and β (beta) parameter values. The Beta distribution is a continuous probability distribution defined on the interval [0, 1], and is frequently used in Bayesian statistics to model the distribution of probability values.

The parameters α and β control the shape of the distribution:

  • When α = β = 1, the Beta distribution is uniform across the interval [0, 1], meaning all values are equally likely.
  • When α > 1 and β > 1, the distribution is unimodal (has a single peak) and can take on different shapes: if α = β, the distribution is symmetric around 0.5; if α > β, the distribution is skewed towards 1, indicating a belief that higher values are more likely; if α < β, the distribution is skewed towards 0, indicating a belief that lower values are more likely.
  • When either α or β is less than 1, the distribution becomes skewed towards the respective boundaries (0 or 1), and can become U-shaped if both are less than 1.

Let's go through the curves on the graph:

  • Blue Curve (α = 0.5, β = 0.5): This is actually a special case of the Beta distribution known as the arcsine distribution. It suggests that values near 0 and 1 are more likely than those near the center (0.5).
  • Orange Curve (α = 5, β = 1): This distribution is skewed towards 1. It indicates a belief that the probability is likely to be close to 1.
  • Green Curve (α = 1, β = 3): This distribution is skewed towards 0. It indicates a belief that the probability is likely to be close to 0.
  • Red Curve (α = 2, β = 2): This distribution is symmetric and bell-shaped, centered around 0.5. This is a common choice for a "neutral" prior, suggesting that the probability is equally likely to be above or below 0.5.
  • Purple Curve (α = 2, β = 5): This distribution is skewed towards 0, suggesting that lower probabilities are more likely than higher ones, but less extreme than the green curve.

In summary, this graph visually demonstrates how the choice of α and β parameters in the Beta distribution can express different prior beliefs about a probability value. This flexibility makes the Beta distribution very useful for Bayesian inference, especially for binary or proportion data like the rate of success in a series of experiments or the probability of an event occurring.
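A graph like the one described can be regenerated from the parameter pairs listed above (this sketch assumes SciPy and matplotlib are available):

```python
from scipy.stats import beta
import matplotlib.pyplot as plt
import numpy as np

# The (α, β) pairs discussed for the curves above
params = [(0.5, 0.5), (5, 1), (1, 3), (2, 2), (2, 5)]
x = np.linspace(0.001, 0.999, 500)

for a, b in params:
    plt.plot(x, beta.pdf(x, a, b), label=f'α={a}, β={b}')

plt.title('Beta Distributions for Various α and β')
plt.xlabel('p')
plt.ylabel('Density')
plt.legend()
plt.show()
```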


4. Analytical Tractability: Working with the Beta distribution in conjunction with the binomial likelihood allows for closed-form solutions for the posterior distribution's parameters. This is particularly advantageous computationally, as it avoids the need for numerical methods or simulation-based approaches like Markov Chain Monte Carlo (MCMC).

In the context of an unfair coin, where p is not known and you cannot assume it to be 0.5 (as you would with a fair coin), the Beta distribution lets you incorporate your prior beliefs about p. For instance, if you believe the coin is biased towards heads, you might choose a Beta distribution with α>β to reflect this belief.

After you observe data from flipping the coin (your empirical evidence), you can update your Beta distribution parameters by adding the number of observed heads to α and the number of tails to β. This results in a new Beta distribution that reflects both your prior beliefs and the evidence from the coin flips, giving you the posterior distribution for p.

Prior: You choose a beta distribution with parameters α and β (written as Beta(α,β)) to model your prior beliefs about p. Suppose you want to start with a relatively uniform prior, which expresses uncertainty about whether the coin is biased or not, so you choose α=2 and β=2.

Data: You flip the coin 10 times, resulting in 7 heads and 3 tails.

Likelihood: The likelihood of observing 7 heads in 10 tosses, if the true probability of heads is p, is given by the binomial distribution:

P(k = 7 ∣ n = 10, p) = C(10, 7) × p^7 × (1 − p)^3

Posterior: Using the conjugate prior property, you can calculate the posterior distribution analytically. The posterior parameters α′ and β′ for a beta distribution are updated as follows:

  • α′ = α + k (number of successes)
  • β′ = β + (n − k) (number of failures)

So after observing the data, the updated α′ and β′ are:

  • α′=2+7=9
  • β′=2+3=5

The posterior distribution is therefore Beta(α′,β′)=Beta(9,5).

This posterior distribution now represents your updated beliefs about the probability p after taking into account the evidence from your coin flips. It is more peaked around the ratio of heads to total flips, indicating increased confidence that p is closer to 0.7 than to 0.5. However, it still reflects some uncertainty due to the relatively small number of flips.

The advantage of conjugate priors is that you can continue to update your beliefs as you gather more data by simply updating the parameters α′ and β′ without having to perform complex integrations or use numerical methods to find the posterior distribution.
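For instance, continuing the coin example, successive batches of flips (the batch counts here are hypothetical) can be folded in one at a time by simple addition:

```python
# Start from the Beta(2, 2) prior used above
alpha, beta_param = 2, 2

# Hypothetical batches of coin flips observed over time: (heads, tails)
batches = [(7, 3), (4, 6), (9, 1)]

for heads, tails in batches:
    # Conjugate update: add successes to α and failures to β
    alpha += heads
    beta_param += tails
    mean = alpha / (alpha + beta_param)
    print(f"After batch ({heads}H, {tails}T): Beta({alpha}, {beta_param}), mean ≈ {mean:.3f}")
```

Each batch shifts the posterior mean toward the running proportion of heads, with no integration required at any step.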


Example 2:

Let’s dig into the concept of prior and posterior distributions through an example.

Prior Distribution

Imagine you are a scientist studying a new drug meant to reduce blood pressure. Before any tests are done, based on previous studies and theoretical knowledge, you have some belief about the effectiveness of this drug. Perhaps you believe there is a 70% chance that this drug can effectively reduce blood pressure in patients with hypertension.

To represent this belief in a Bayesian framework, you would use a prior distribution. A common distribution to use for a probability like this is the Beta distribution because it is defined on the interval [0, 1], which is suitable for probabilities.

You might choose a Beta distribution with parameters that reflect your belief. In this case, Beta(α=7, β=3) might be a reasonable choice because its expected value,

α / (α + β) = 7 / (7 + 3) = 0.7,

reflects your belief that there is a 70% chance of the drug being effective.

Posterior Distribution

Now, let’s say you run a clinical trial with 100 patients and find that 80 of them show a significant reduction in blood pressure after taking the drug. With this data, you want to update your belief about the drug's effectiveness.

In a Bayesian context, this update is done by calculating the posterior distribution. The data you collected (80 out of 100) will be modeled as a likelihood function, which in this case could be a binomial likelihood because you're counting the number of successes out of a fixed number of trials.

Given that the Beta distribution is a conjugate prior for the binomial likelihood, you can easily update your Beta parameters. The posterior α (alpha) will be your prior α plus the number of successes (80), and the posterior β (beta) will be your prior β plus the number of failures (20).

Note: The Beta distribution is used as the prior in this scenario because of its mathematical convenience and its relevance to problems involving proportions or probabilities.

So, your posterior distribution is Beta(α = 7 + 80, β = 3 + 20), which simplifies to Beta(α = 87, β = 23).


Interpretation of the Posterior Distribution

The posterior distribution represents your updated belief about the drug's effectiveness after considering the new evidence from the clinical trial. In this case, the posterior distribution is centered around 87 / (87 + 23) ≈ 0.79, which suggests that, given the trial results, you now believe there's a 79% chance of the drug being effective.

Insights and Decisions

The shift from the prior to the posterior distribution is a crucial part of Bayesian inference. It quantifies how evidence changes our beliefs. In practice, the posterior distribution can then be used for decision-making. For example, if the posterior probability that the drug is effective exceeds a certain threshold, it might be considered for approval by medical authorities.

This Bayesian approach contrasts with frequentist statistics, where you would use the data to perform hypothesis testing and calculate p-values without considering prior beliefs. In Bayesian inference, both prior beliefs and new data are integrated to inform the final analysis.
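As a quick numeric check, the Beta(87, 23) posterior from this example can be summarized with its mean and a 95% credible interval (this sketch assumes SciPy is available):

```python
from scipy.stats import beta

# Posterior from the drug-trial example: Beta(87, 23)
a_post, b_post = 87, 23

mean = beta.mean(a_post, b_post)
ci_low, ci_high = beta.interval(0.95, a_post, b_post)

print(f"Posterior mean effectiveness: {mean:.3f}")
print(f"95% credible interval: ({ci_low:.3f}, {ci_high:.3f})")
```

A decision maker could compare the lower end of this interval against an approval threshold rather than relying on the point estimate alone.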


Example 3: Bayesian Estimation with Beta-Binomial Conjugacy

Let's say we have a coin and we want to estimate the probability of it landing heads (H) using a Beta prior and updating this belief as we observe new data (coin flips).

from scipy.stats import beta
import matplotlib.pyplot as plt
import numpy as np

# Prior belief: Beta distribution with alpha=2, beta=2 (symmetric belief)
alpha_prior = 2
beta_prior = 2

# We flip the coin 10 times and observe 6 heads
heads, tails = 6, 4

# Update the prior to get the posterior distribution
alpha_posterior = alpha_prior + heads
beta_posterior = beta_prior + tails

# Plot the prior and posterior Beta distributions
x = np.linspace(0, 1, 100)
plt.plot(x, beta.pdf(x, alpha_prior, beta_prior), label='Prior')
plt.plot(x, beta.pdf(x, alpha_posterior, beta_posterior), label='Posterior')
plt.title('Beta Distributions of Prior and Posterior')
plt.xlabel('Probability of Heads')
plt.ylabel('Density')
plt.legend()
plt.show()
        


In the plot, you observe that the peak of the posterior distribution has shifted towards the observed relative frequency of heads, reflecting our updated belief about the probability of the coin landing heads up.

Performing Bayesian Estimation in Python

With these concepts in mind, Bayesian estimation can be performed in Python using probabilistic programming libraries such as PyMC (formerly PyMC3) or Stan. These libraries allow for more complex models and inferences, handling the computational complexity under the hood.

Summary

Bayesian estimation offers a powerful framework for understanding uncertainty and making decisions under uncertainty. By starting with prior beliefs and systematically updating these beliefs as new data becomes available, Bayesian methods can provide more nuanced and adaptive inferences than traditional frequentist statistics. The use of conjugate priors, where applicable, simplifies computations and is especially useful when real-time updates to the probability distributions are required.

=====================

Practice Problem1: Medical Diagnostic Test

Scenario:

Imagine you are evaluating a new diagnostic test for a disease that is present in 1% of the population. The test has a 95% probability of correctly identifying an individual with the disease (true positive rate). However, the test also has a 2% probability of incorrectly indicating the disease in an individual who doesn't have it (false positive rate).

Task:

A patient takes the test and receives a positive result. Use Bayesian estimation to calculate the probability that the patient actually has the disease, given the positive test result.

Given:

  • The prior probability of having the disease, P(Disease) = 0.01 (prevalence of the disease).
  • The probability of a true positive, P(Positive∣Disease) = 0.95 (sensitivity of the test).
  • The probability of a false positive, P(Positive∣No Disease) = 0.02 (1 − specificity of the test).

To Calculate:

  • The posterior probability that the patient has the disease given a positive test result, P(Disease∣Positive).

=======

Solution:



Let’s walk through the problem step-by-step. We'll apply Bayes' Theorem to find the posterior probability, which is the probability that the patient actually has the disease, given that they tested positive.

Given:

  • P(Disease) = 0.01: The prior probability of having the disease (the prevalence).
  • P(Positive∣Disease) = 0.95: The true positive rate (sensitivity).
  • P(Positive∣No Disease) = 0.02: The false positive rate (1 − specificity).


We want to find P(Disease∣Positive), the probability that the patient has the disease given a positive test result.

We can calculate it using Bayes' Theorem:

P(Disease∣Positive) = [P(Positive∣Disease) × P(Disease)] / P(Positive)

We have everything except P(Positive), which is the total probability of testing positive. We can calculate this using the Law of Total Probability:

P(Positive) = P(Positive∣Disease) × P(Disease) + P(Positive∣No Disease) × P(No Disease)

Where:

  • P(No Disease) = 1 − P(Disease), which is the probability of not having the disease.

Let’s calculate this in Python:

# Given probabilities
P_Disease = 0.01
P_Positive_Disease = 0.95
P_Positive_NoDisease = 0.02
P_NoDisease = 1 - P_Disease

# Total probability of testing positive
P_Positive = (P_Positive_Disease * P_Disease) + (P_Positive_NoDisease * P_NoDisease)

# Posterior probability of having the disease given a positive test result
P_Disease_Positive = (P_Positive_Disease * P_Disease) / P_Positive

print(f"The probability of the patient having the disease given a positive test result: {P_Disease_Positive:.4f}")
        
The probability of the patient having the disease given a positive test result: 0.3242        

Now, let’s plot the prior and posterior probabilities to visualize how our belief about the presence of the disease changes after we receive a positive test result.

import matplotlib.pyplot as plt

# Probabilities for the plot
probabilities = {
    'Prior': P_Disease,
    'Posterior': P_Disease_Positive
}

# Create bar plot
plt.bar(probabilities.keys(), probabilities.values(), color=['blue', 'green'])

# Adding the text labels on the bars
for i, value in enumerate(probabilities.values()):
    plt.text(i, value, f"{value:.4f}", ha='center', va='bottom')

plt.title('Prior vs Posterior Probability of Disease')
plt.ylabel('Probability')
plt.ylim(0, 1)
plt.show()
        

This diagram shows the stark difference between our initial belief (prior) and updated belief (posterior) regarding the likelihood of the patient having the disease.

The plot shows that despite a positive test result, the actual chance of having the disease is still not extremely high due to the low prior probability (prevalence) of the disease and the test's false positive rate. This is a classic example of how Bayesian updating can yield counterintuitive but more accurate results by incorporating prior knowledge and evidence.
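To see how strongly the prior (prevalence) drives this result, the same update can be repeated for a few hypothetical prevalence values while keeping the test's sensitivity and false positive rate fixed:

```python
# Test characteristics from the problem above
sens, fpr = 0.95, 0.02  # sensitivity and false positive rate

# Hypothetical prevalences to compare against the stated 1%
for prev in [0.001, 0.01, 0.05, 0.20]:
    p_pos = sens * prev + fpr * (1 - prev)   # total probability of a positive test
    posterior = sens * prev / p_pos          # Bayes update
    print(f"Prevalence {prev:.1%} -> P(Disease|Positive) = {posterior:.3f}")
```

The rarer the disease, the less a single positive result should move our belief, which is exactly the base-rate effect the bar plot illustrates.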


Additional Resources:






